NetBackup and VCS

VCS is Symantec's cluster software that provides high availability for applications like NetBackup. It works by monitoring applications and initiating failovers to redundant nodes when failures occur. For NetBackup, VCS monitors the NetBackup application and can automatically stop it on the active node, mount shared storage on the passive node, and restart NetBackup if an issue is detected. Administrators can also manually configure and manage NetBackup as a resource within VCS.

Uploaded by

mohantys

Many of us who have been working with NetBackup for a long time come across a
situation where we need to work on NetBackup that is configured with VCS. Not
everyone who knows NetBackup would necessarily know VCS. So here is a small
overview of VCS and how it works with NetBackup.

CLUSTERS:

A computer cluster is a group of linked computers working together so closely that in
many respects they form a single computer. Clusters are usually deployed to improve
performance and/or availability over that provided by a single computer. There are
many types of clusters, such as HA clusters and load-balancing clusters.

High-availability clusters (also known as failover clusters, and the most common type
for NetBackup) are implemented primarily for the purpose of improving the availability of
services which the cluster provides. They operate by having redundant nodes, which
are then used to provide service when system components fail. The most common size
for an HA cluster is two nodes, which is the minimum requirement to provide
redundancy. HA cluster implementations attempt to use redundancy of cluster
components to eliminate single points of failure.

For NetBackup, you would usually have a two-node cluster: an active node and a
failover node.

Symantec's cluster software for High-Availability is VCS.

HOW IT WORKS:

Consider a server that is installed and configured as a NetBackup master server.
Assume that a disaster happens and the host becomes unavailable. All the services
are unavailable and backups stop running until corrective action is taken to bring the
host back up (possibly a complete DR). To avoid this, an identical host is configured
with exactly the same installation and configuration of NetBackup. These two hosts can
be configured as an HA cluster. The two nodes must run the same version of NetBackup
and have the same LUN assigned from the SAN. The NetBackup databases (image db, voldb
and mediadb for 5.x; image db and EMM for 6.x) reside on this shared LUN. Binaries
are installed on each node separately. One node runs normally with NetBackup and VCS
on it. This is termed the "Active" node, and the host that is not running NetBackup is
the "Passive" or "Fail-over" node.

In this case, VCS monitors the NetBackup application on the active node at all times. If
NetBackup becomes unavailable, VCS detects the failure, gracefully stops everything,
unmounts the shared volume from the active node, mounts it on the passive node and
starts NetBackup there. The failed node can now be worked on for disaster recovery,
and backups are interrupted for just a few minutes.
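
To make the order of those failover steps concrete, here is a minimal Python sketch. It
is only a toy model, not how VCS is implemented: the Node class, the fail_over function
and the node names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Toy model of a cluster node (hypothetical, for illustration only)."""
    name: str
    volume_mounted: bool = False
    netbackup_running: bool = False


def fail_over(active: Node, passive: Node) -> List[str]:
    """Move NetBackup from `active` to `passive` and return the ordered steps."""
    steps = []
    # 1. Stop the application first so nothing writes to the shared volume.
    active.netbackup_running = False
    steps.append(f"stop NetBackup on {active.name}")
    # 2. Release the shared volume on the failed node.
    active.volume_mounted = False
    steps.append(f"unmount shared volume on {active.name}")
    # 3. Only now mount it on the passive node (never on both at once).
    passive.volume_mounted = True
    steps.append(f"mount shared volume on {passive.name}")
    # 4. Start the application on the node that now owns the volume.
    passive.netbackup_running = True
    steps.append(f"start NetBackup on {passive.name}")
    return steps


node1 = Node("node1", volume_mounted=True, netbackup_running=True)
node2 = Node("node2")
for step in fail_over(node1, node2):
    print(step)
```

The point of the ordering is that the shared volume is released on one node before it is
mounted on the other, so the two nodes never write to it at the same time.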

For NetBackup to work in a cluster, the following criteria need to be met:

- Shared storage between hosts
- At least 3 NICs on each host
- Identical hardware
- Same OS and NetBackup version
- VCS installed and configured

VCS:

VCS (Veritas Cluster Server) is Symantec's solution for high availability. It works on
SLES, RHEL, Solaris and Windows. It is responsible for startup, shutdown, monitoring
and failover of the configured applications. For an application to be configured for
failover in VCS, VCS must know the steps to start the application, monitor it and shut
it down. A user can define the logic by which VCS handles the applications.

Terminology:

Heartbeat: Heartbeats are a communication mechanism for nodes to exchange
information concerning hardware and software status, keep track of cluster
membership, and keep this information synchronized across all cluster nodes. It is
recommended to have at least two heartbeats.

Resource: A resource is an entity that may be brought online, offline, or monitored on a
particular system. Each separate resource is of a resource type. Examples of resource
types are mount points, IP addresses, etc.

There are three categories of VCS resources: on-off, on-only, and persistent.
- On-off: VCS can fully control the resource.
- On-only: VCS can restart the resource but not shut it down.
- Persistent: VCS only monitors the resource and cannot control it (for example, a NIC).
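
One way to picture the three categories is as a table of which operations VCS is allowed
to perform. The sketch below is just that table written in Python; the Category enum and
the can() helper are hypothetical names, not part of VCS.

```python
from enum import Enum


class Category(Enum):
    ON_OFF = "on-off"
    ON_ONLY = "on-only"
    PERSISTENT = "persistent"


# Operations VCS may perform on a resource of each category,
# following the three bullet points above.
ALLOWED = {
    Category.ON_OFF: {"online", "offline", "monitor"},
    Category.ON_ONLY: {"online", "monitor"},
    Category.PERSISTENT: {"monitor"},
}


def can(category: Category, operation: str) -> bool:
    """Return True if VCS may perform `operation` on this category."""
    return operation in ALLOWED[category]


for category in Category:
    print(category.value, sorted(ALLOWED[category]))
```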

Resource agent: Every resource has an associated agent. The agent is responsible for
various actions on the resource, such as online, offline and monitor.
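
The idea of an agent with online, offline and monitor actions can be sketched as a small
interface. This is only an illustration: real VCS agents are separate programs driven by
HAD, and the ResourceAgent, InMemoryMountAgent and ensure_online names below are made up.

```python
from abc import ABC, abstractmethod


class ResourceAgent(ABC):
    """The three entry points the article attributes to an agent."""

    @abstractmethod
    def online(self) -> None: ...

    @abstractmethod
    def offline(self) -> None: ...

    @abstractmethod
    def monitor(self) -> bool: ...


class InMemoryMountAgent(ResourceAgent):
    """Pretend mount agent that only tracks state in memory."""

    def __init__(self, mount_point: str) -> None:
        self.mount_point = mount_point
        self._mounted = False

    def online(self) -> None:
        self._mounted = True

    def offline(self) -> None:
        self._mounted = False

    def monitor(self) -> bool:
        return self._mounted


def ensure_online(agent: ResourceAgent) -> bool:
    """Monitor first; call online only if the resource is not already up."""
    if not agent.monitor():
        agent.online()
    return agent.monitor()


agent = InMemoryMountAgent("/shared")  # hypothetical mount point
print(ensure_online(agent))
```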

Service group: A service group is a logical collection of resources. These resources
will be taken online and offline together. Service groups come in two varieties: failover
and parallel. The service group for NetBackup will be a failover group.

Dependency: A dependency relationship tells the cluster in what order to bring
resource entities online and offline. In each resource dependency relationship there is a
parent and a child. A parent resource will not be brought online until all of its children
are online.

Split brain: Split brain occurs when two or more systems within the cluster think they
have exclusive access to a shared resource at the same time. This can be very
damaging because data corruption is common in this situation.
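
Going back to the dependency relationship for a moment: "a parent comes online only after
all of its children" is a topological ordering. The sketch below works it out with
Python's graphlib (Python 3.9 or later). The resource names are the ones cluster_config
creates, but the dependency tree shown is an assumed, typical layout, so check your own
main.cf for the real one.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Tuple

# parent -> children it requires (assumed layout, for illustration only)
DEPENDS_ON: Dict[str, Tuple[str, ...]] = {
    "NETBACKUP": ("IP", "MOUNT"),
    "IP": ("NIC",),
    "MOUNT": ("VOL",),
    "VOL": ("DG",),
}


def online_order(depends_on: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Children first, parents last."""
    return list(TopologicalSorter(depends_on).static_order())


def offline_order(depends_on: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Offline is the reverse: parents first, children last."""
    return list(reversed(online_order(depends_on)))


print("online: ", online_order(DEPENDS_ON))
print("offline:", offline_order(DEPENDS_ON))
```

Several valid orders can exist (for example, NIC and DG do not depend on each other); what
matters is that NETBACKUP always comes online last and goes offline first.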

Jeopardy: A system is in jeopardy when only one of its heartbeat connections is still
functioning. A loss of the remaining heartbeat network will not allow VCS to know
whether the host has crashed or the last heartbeat network has been disabled.
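
The jeopardy rule is easy to express as a function of how many heartbeat links are still
working. This is a simplified sketch of the reasoning above, not VCS logic; the function
name and the returned labels are made up.

```python
def membership_state(working_links: int) -> str:
    """Classify a peer by the number of heartbeat links still working."""
    if working_links < 0:
        raise ValueError("working_links cannot be negative")
    if working_links >= 2:
        return "regular"      # redundancy intact
    if working_links == 1:
        return "jeopardy"     # one more link failure and we are blind
    # No links left: the peer may have crashed, or only the network failed.
    # The two cases cannot be told apart, which is how split brain can start.
    return "unknown"


for links in (2, 1, 0):
    print(links, membership_state(links))
```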

VCS can be divided into two important parts:

1. Cluster Communication:
Low Latency Transport (LLT) and Group Membership Services/Atomic Broadcast (GAB)
are responsible for heartbeat and cluster communication. These are kernel modules
and are installed with VCS. LLT provides fast, high-priority internal cluster
communication. LLT does not run over TCP/IP; it is a different communication
technology. GAB runs over LLT and is primarily responsible for cluster
membership. So, LLT on each node does the communication and GAB on each
node maintains the cluster membership.

LLT (Low Latency Transport) -

Configuration files:
/etc/llttab
/etc/llthosts

Commands:
lltstat
lltconfig

GAB -
Configuration file:
/etc/gabtab
Command:
gabconfig
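
If you have never opened these files, the sketch below shows roughly what to expect in
two of them. It assumes the usual layouts: /etc/llthosts with one "node-id hostname" pair
per line, and /etc/gabtab with a single gabconfig line whose -nN option is the number of
nodes needed to seed the cluster. The sample contents and node names are hypothetical, so
compare with the files on your own nodes.

```python
import shlex
from typing import Dict, Optional

SAMPLE_LLTHOSTS = """\
0 node1
1 node2
"""

SAMPLE_GABTAB = "/sbin/gabconfig -c -n2\n"


def parse_llthosts(text: str) -> Dict[int, str]:
    """Map LLT node IDs to host names."""
    hosts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            hosts[int(fields[0])] = fields[1]
    return hosts


def gab_seed_count(text: str) -> Optional[int]:
    """Return N from a '-nN' argument on the gabconfig line, if present."""
    for line in text.splitlines():
        for arg in shlex.split(line):
            if arg.startswith("-n") and arg[2:].isdigit():
                return int(arg[2:])
    return None


print(parse_llthosts(SAMPLE_LLTHOSTS))
print(gab_seed_count(SAMPLE_GABTAB))
```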

2. HAD (High Availability Daemon):
This is also known as the VCS engine. This is the heart of VCS. HAD is responsible
for all the cluster functionality. HAD talks to all the agents and holds all the
configuration/logic in memory. There is another process called hashadow,
whose primary job is to monitor HAD.

Configuration files for HAD:
/etc/VRTSvcs/conf/config/main.cf
/etc/VRTSvcs/conf/config/types.cf

Commands for HAD:
/opt/VRTS/bin/hastop (stops HAD)
/opt/VRTS/bin/hastart (starts HAD)
/opt/VRTS/bin/hastatus (monitors HAD status)
/opt/VRTS/bin/hagrp (monitors/manages service groups)
/opt/VRTS/bin/hares (monitors/manages resources)
/opt/VRTS/bin/hacf -verify /etc/VRTSvcs/conf/config (checks main.cf for syntax
issues)

NOTE: LLT, GAB and HAD depend on each other. At system startup, LLT starts first,
then GAB and then HAD. HAD will not run without GAB, and GAB will not run
without LLT.
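
The layering in the note can be written down in a few lines. This is a toy check, not
something VCS provides; can_start is a hypothetical helper. Stopping goes in the opposite
order: HAD first, then GAB, then LLT.

```python
from typing import Set

START_ORDER = ("LLT", "GAB", "HAD")
STOP_ORDER = tuple(reversed(START_ORDER))


def can_start(layer: str, running: Set[str]) -> bool:
    """A layer may start only if every layer below it is already running."""
    below = START_ORDER[:START_ORDER.index(layer)]
    return all(name in running for name in below)


print("start:", " -> ".join(START_ORDER))
print("stop: ", " -> ".join(STOP_ORDER))
print("HAD with only LLT running:", can_start("HAD", {"LLT"}))
```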

VCS startup:

1. LLT starts. It reads /etc/llttab and /etc/llthosts.
2. GAB starts (it executes /etc/gabtab). It checks for other GABs to establish
cluster membership.
3. Once GAB is loaded, hashadow starts, which loads HAD.
4. HAD reads /etc/VRTSvcs/conf/config/main.cf and all the .cf files included from
main.cf.
5. HAD checks whether other HADs are available. It registers itself with GAB.
6. If there are no other HADs, it loads main.cf again into HAD memory.
7. The same process happens when HAD starts on the other nodes. The HAD on the
first node loads main.cf and the other .cf files from the local system (called a
"local build"), and all other HADs load the configuration from the first
HAD (called a "remote build").
8. After starting up, HAD knows all the service groups and resources from
main.cf. It calls the respective agents to check whether the resources are currently
online or offline.
9. Based on main.cf, HAD brings the service groups online or offline on the respective
nodes.
10. Check that all the service groups are running with the command hastatus -sum.
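
The local build / remote build decision in the steps above boils down to "the first HAD
reads from disk, everyone else copies from a running HAD". A minimal sketch, assuming
every HAD that started earlier is still running (build_modes and the node names are
hypothetical):

```python
from typing import Dict, List


def build_modes(start_sequence: List[str]) -> Dict[str, str]:
    """Return how each node's HAD gets its configuration, in start order."""
    modes: Dict[str, str] = {}
    for node in start_sequence:
        # No HAD running yet: read main.cf from local disk.
        # Otherwise: copy the in-memory configuration from a running HAD.
        modes[node] = "remote build" if modes else "local build"
    return modes


print(build_modes(["node1", "node2"]))
```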

Important actions that can be taken by an admin while working on VCS:

VCS -
Start: Follow steps above.
Stop: Stop the HAD, unload GAB and then unload LLT.
Service Groups -
Online: Manually bring a specific service group online on a specific node or all nodes.
Offline: Manually take a service group offline on a specific node or all nodes.
Freeze: If NetBackup has problems, you might want to stop and start NetBackup a
couple of times. It is necessary to freeze the service group at that time.
By freezing the service group, we are telling VCS not to take any action on it.

Resource -
Online: Manually online a resource
Offline: Manually offline a resource
Probe: Ask the resource agent to probe for the resource and get its current status.
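
For reference, the actions above map onto hagrp and hares options. The sketch below only
builds the command lines as argument lists and prints them; nothing is executed unless
you call run_all yourself on a cluster node. The group name nbu_group comes from this
article, while the node name, the resource name and the helper functions are
placeholders, so check the options against your VCS version before relying on them.

```python
import subprocess
from typing import List

Command = List[str]


def group_online(group: str, system: str) -> Command:
    return ["hagrp", "-online", group, "-sys", system]


def group_offline(group: str, system: str) -> Command:
    return ["hagrp", "-offline", group, "-sys", system]


def group_freeze(group: str) -> Command:
    return ["hagrp", "-freeze", group]


def group_unfreeze(group: str) -> Command:
    return ["hagrp", "-unfreeze", group]


def resource_probe(resource: str, system: str) -> Command:
    return ["hares", "-probe", resource, "-sys", system]


def with_freeze(group: str, maintenance: List[Command]) -> List[Command]:
    """Freeze the group, do the maintenance, then unfreeze it again."""
    return [group_freeze(group), *maintenance, group_unfreeze(group)]


def run_all(commands: List[Command]) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Placeholder maintenance step; replace it with your NetBackup stop/start.
plan = with_freeze("nbu_group", [["echo", "stop and start NetBackup here"]])
plan.append(resource_probe("nbu_server", "node1"))  # hypothetical names
for command in plan:
    print(" ".join(command))
```

Keeping the freeze and unfreeze in one helper makes it harder to forget the unfreeze,
which would leave VCS ignoring the group after the maintenance is over.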

NetBackup in VCS:
Install NetBackup on the nodes the way you would normally do. The NetBackup
installation wizard asks for the EMM server name and the master server name; at that
point, give the "virtual name" for the installation on both nodes. Note that at this
stage, nothing goes on the shared LUN.

Once the installation is done, run the following script:

/usr/openv/netbackup/bin/cluster/cluster_config

This script prompts for all the information that it needs and does the following:
- Creates an agent "NetBackup" and its cf file at
/usr/openv/netbackup/bin/cluster/vcs/NetBackupTypes.cf
- Creates the service group (usually nbu_group)
- Creates the resources (NIC, IP, DG, VOL, MOUNT and NETBACKUP)
- Moves the databases to the shared location
- Creates the file /usr/openv/netbackup/bin/cluster/NBU_RSP, which holds information
about the cluster configuration

The good part about the cluster_config script is that if anything fails in the script, it
undoes everything, which means that the next time you run the script, it won't create
any duplicates in the config.

Basic Tasks:
Create service group (hagrp -add)
Modify service group (hagrp -modify)
Delete service group (hagrp -delete)
Add resource(s) to a service group (hares -add)
Modify resources (hares -modify)
Delete resources (hares -delete)
Monitor the cluster (hastatus)
Switch a service group over from one node to the other (hagrp -switch)

Config files:
/etc/VRTSvcs/conf/config/main.cf
/etc/VRTSvcs/conf/config/types.cf
/usr/openv/netbackup/bin/cluster/vcs/NetBackupTypes.cf
/usr/openv/netbackup/bin/cluster/NBU_RSP
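
One thing the task list does not mention: the add, modify and delete tasks change the
cluster configuration, and VCS normally keeps it read-only. The usual pattern is to open
it with haconf -makerw, make the changes, and then write them back to main.cf with
haconf -dump -makero. The sketch below only prints such a sequence; the resource name
extra_ip and the address are made-up placeholders, and config_change is a hypothetical
helper.

```python
from typing import List

Command = List[str]


def config_change(commands: List[Command]) -> List[Command]:
    """Wrap configuration commands between makerw and dump/makero."""
    return [
        ["haconf", "-makerw"],            # open the configuration for writing
        *commands,
        ["haconf", "-dump", "-makero"],   # save to main.cf and make read-only
    ]


# Hypothetical example: add an extra IP resource to the NetBackup group.
plan = config_change([
    ["hares", "-add", "extra_ip", "IP", "nbu_group"],
    ["hares", "-modify", "extra_ip", "Address", "192.0.2.10"],
])
for command in plan:
    print(" ".join(command))
```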

Logs:
System log
/var/VRTSvcs/logs/engine_A.log
/usr/openv/netbackup/bin/cluster/AGENT_DEBUG.log

I hope you enjoyed reading through it and hope it helps you in your day to day work.
