Emr 201807na-A00052276en Us PDF

Uploaded by

aeldeeb7
Copyright
© All Rights Reserved

Managing HPE Serviceguard for Linux A.12.30.00

Part Number: P04496-002


Published: July 2018
Contents

Serviceguard for Linux at a Glance.....................................................16


What is Serviceguard for Linux? ................................................................................................ 16
Failover.............................................................................................................................17
Using Serviceguard for Configuring in an Extended Distance Cluster Environment.................. 18
Using Serviceguard Manager..................................................................................................... 18
Launching Serviceguard Manager................................................................................... 18
About the Online Help System ........................................................................................ 18
Configuration Roadmap.............................................................................................................. 19

Understanding Hardware Configurations for Serviceguard for Linux.......................................................................................................20
Redundant Cluster Components.................................................................................................20
Redundant Network Components ..............................................................................................20
Rules and Restrictions..................................................................................................... 21
Redundant Ethernet Configuration ..................................................................................21
Cross-Subnet Configurations........................................................................................... 22
Redundant Disk Storage.............................................................................................................24
Supported Disk Interfaces ...............................................................................................24
Disk Monitoring.................................................................................................................25
Sample Disk Configurations ............................................................................................25
Redundant Power Supplies ....................................................................................................... 25

Understanding Serviceguard Software Components........................26


Serviceguard Architecture...........................................................................................................26
Serviceguard Daemons....................................................................................................27
Serviceguard WBEM Provider..........................................................................................31
How the Cluster Manager Works ............................................................................................... 33
Configuration of the Cluster ............................................................................................ 33
Heartbeat Messages ....................................................................................................... 33
Manual Startup of Entire Cluster...................................................................................... 34
Automatic Cluster Startup ............................................................................................... 34
Dynamic Cluster Re-formation ........................................................................................ 34
Cluster Quorum to Prevent Split-Brain Syndrome........................................................... 34
Cluster Lock..................................................................................................................... 35
Use of a Lock LUN as the Cluster Lock........................................................................... 35
Use of the Quorum Server as a Cluster Lock.................................................................. 36
No Cluster Lock ...............................................................................................................38
What Happens when You Change the Quorum Configuration Online............................. 39
Using the Cluster Generic Resources Monitoring Service............................................... 39
How the Package Manager Works..............................................................................................41
Package Types.................................................................................................................41
Using the Generic Resources Monitoring Service........................................................... 47
Using Older Package Configuration Files........................................................................ 48
How Packages Run.................................................................................................................... 48
What Makes a Package Run?..........................................................................................48
Before the Control Script Starts....................................................................................... 50
During Run Script Execution............................................................................................ 50
Normal and Abnormal Exits from the Run Script............................................................. 51
Service Startup with cmrunserv ....................................................................................52
While Services are Running.............................................................................................52
When a Service or Subnet Fails or Generic Resource or a Dependency is Not Met....... 52
When a Package is Halted with a Command...................................................................53
During Halt Script Execution............................................................................................ 53
Normal and Abnormal Exits from the Halt Script..............................................................54
How the Network Manager Works ............................................................................................. 56
Stationary and Relocatable IP Addresses and Monitored Subnets................................. 56
Types of IP Addresses..................................................................................................... 57
Adding and Deleting Relocatable IP Addresses ............................................................. 57
Bonding of LAN Interfaces .............................................................................................. 58
Bonding for Load Balancing............................................................................................. 60
Monitoring LAN Interfaces and Detecting Failure: Link Level.......................................... 60
Monitoring LAN Interfaces and Detecting Failure: IP Level............................................. 60
Reporting Link-Level and IP-Level Failures..................................................................... 63
Package Switching and Relocatable IP Addresses......................................................... 64
Address Resolution Messages after Switching on the Same Subnet ............................. 64
VLAN Configurations........................................................................................................64
About Persistent Reservations....................................................................................................65
Rules and Limitations.......................................................................................................66
How Persistent Reservations Work..................................................................................67
Volume Managers for Data Storage............................................................................................68
Storage on Arrays............................................................................................................ 68
Monitoring Disks...............................................................................................................69
More Information on LVM................................................................................................. 69
Veritas Volume Manager (VxVM)..................................................................................... 69
Using VMware Virtual Machine File System Disks...........................................................70
Storage configuration type in a VMware environment..................................................... 70
Root Disk Monitoring...................................................................................................................76
Responses to Failures ............................................................................................................... 77
Reboot When a Node Fails ............................................................................................. 77
Responses to Hardware Failures ....................................................................................78
Responses to Root Disk failures...................................................................................... 79
Responses to Generic Resources Failures at cluster level..............................................79
Responses to Package and Service Failures ................................................................. 79
Responses to Package and Generic Resources Failures................................................80

Planning and Documenting an HA Cluster ........................................81


General Planning ....................................................................................................................... 81
Serviceguard Memory Requirements...............................................................................82
Planning for Expansion ................................................................................................... 82
Using Serviceguard with Virtual Machines..................................................................................82
Rules and Restrictions..................................................................................................... 82
Supported cluster configuration options...........................................................................83
Serviceguard support for VMware Migrate (vMotion).......................................................84
Migrating a node using Serviceguard Manager............................................................... 85
Configuring Serviceguard and VMware HA in a cluster................................................... 87
Serviceguard support for VMware DRS........................................................................... 87
Using Serviceguard with VMware Site Recovery Manager..............................................90
Hardware Planning .................................................................................................................... 97
SPU Information ..............................................................................................................97
LAN Information .............................................................................................................. 97
Shared Storage................................................................................................................ 98
Disk I/O Information .........................................................................................................98
Hardware Configuration Worksheet ................................................................................ 99
Power Supply Planning ..............................................................................................................99
Power Supply Configuration Worksheet ........................................................................100
Cluster Lock Planning............................................................................................................... 100
Cluster Lock Requirements............................................................................................101
Planning for Expansion.................................................................................................. 101
Using a Quorum Server..................................................................................................101
Configuring Asymmetric nodes in a Disaster Recovery Deployment............................. 102
Volume Manager Planning .......................................................................................................104
Volume Groups and Physical Volume Worksheet.......................................................... 104
VxVM Planning ........................................................................................................................ 104
Cluster Configuration Planning ................................................................................................ 104
Easy Deployment........................................................................................................... 105
Heartbeat Subnet and Cluster Re-formation Time ........................................................ 107
About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode................ 108
Cluster Configuration Parameters.................................................................................. 111
Cluster Configuration: Next Step....................................................................................125
Package Configuration Planning...............................................................................................125
Logical Volume and File System Planning .................................................................... 125
Planning for NFS-mounted File Systems....................................................................... 126
Planning for Expansion.................................................................................................. 128
Choosing Switching and Failover Behavior....................................................................128
Configuring DLS based VMDK (VMFS/RDM) in the Package....................................... 129
Parameters for Configuring Generic Resources............................................................ 130
Configuring a Generic Resource....................................................................................131
About Package Dependencies.......................................................................................135
What Happens When a Package Fails.......................................................................... 142
For More Information......................................................................................................143
About Package Weights................................................................................................. 143
About External Scripts....................................................................................................162
About Cross-Subnet Failover......................................................................................... 165
Configuring a Package: Next Steps............................................................................... 167
Planning for Changes in Cluster Size....................................................................................... 168

Building an HA Cluster Configuration.............................................. 169


Preparing Your Systems .......................................................................................................... 169
Installing and Updating Serviceguard ........................................................................... 169
Understanding the Location of Serviceguard Files........................................................ 169
Enabling Serviceguard Command Access.....................................................................170
Configuring Root-Level Access......................................................................................171
Configuring Name Resolution........................................................................................ 172
Ensuring Consistency of Kernel Configuration ..............................................................174
Enabling the Network Time Protocol ............................................................................. 174
Channel Bonding............................................................................................................174
Setting up a Lock LUN................................................................................................... 178
Setting Up and Running the Quorum Server................................................................. 181
Creating the Logical Volume Infrastructure ................................................................... 181
Creating a Storage Infrastructure with VxVM.................................................................190
Configuring the Cluster............................................................................................................. 193
cmquerycl Options..........................................................................................................194
Specifying a Lock LUN...................................................................................................195
Specifying a VCENTER_SERVER or ESX_HOST........................................................ 196
Specifying a Quorum Server.......................................................................................... 197
Obtaining Cross-Subnet Information..............................................................................198
Identifying Heartbeat Subnets........................................................................................200
Specifying Maximum Number of Configured Packages ................................................200
Modifying the MEMBER_TIMEOUT Parameter............................................................. 200
Configuring Root Disk Monitoring parameter................................................................. 200
Controlling Access to the Cluster................................................................................... 200
Configuring Cluster Generic Resources.........................................................................206
Verifying the Cluster Configuration ................................................................................212
Cluster Lock Configuration Messages........................................................................... 213
Distributing the Binary Configuration File ......................................................................213
Managing the Running Cluster................................................................................................. 213
Checking Cluster Operation with Serviceguard Commands.......................................... 214
Setting up Autostart Features ....................................................................................... 215
Changing the System Message .................................................................................... 216
Managing a Single-Node Cluster................................................................................... 216
Disabling identd..............................................................................................................217
Deleting the Cluster Configuration ................................................................................ 217

Configuring Packages and Their Services ...................................... 219


Choosing Package Modules..................................................................................................... 219
Types of Package: Failover, Multi-Node, System Multi-Node........................................ 219
Differences between Failover and Multi-Node Packages...............................................220
Package Modules and Parameters................................................................................ 221
Package Parameter Explanations..................................................................................226
Generating the Package Configuration File.............................................................................. 245
Before You Start............................................................................................................. 245
cmmakepkg Examples................................................................................................... 245
Next Step........................................................................................................................246
Editing the Configuration File....................................................................................................247
Adding or Removing a Module from an Existing Package........................................................250
Verifying and Applying the Package Configuration................................................................... 250
Alert Notification for Serviceguard Environment....................................................................... 252
Adding the Package to the Cluster........................................................................................... 253
How Control Scripts Manage VxVM Disk Groups.....................................................................253
Creating a Disk Monitor Configuration...................................................................................... 254

Cluster and Package Maintenance.................................................... 255


Reviewing Cluster and Package Status ...................................................................................255
Reviewing Cluster and Package Status with the cmviewcl Command .......................255
Viewing Package Dependencies....................................................................................255
Cluster Status ................................................................................................................255
Node Status and State .................................................................................................. 256
Package Status and State..............................................................................................256
Package Switching Attributes.........................................................................................258
Service Status ............................................................................................................... 258
Generic resource status for cluster and package...........................................................258
Network Status...............................................................................................................259
Failover and Failback Policies........................................................................................259
Examples of Cluster and Package States .....................................................................259
Checking the Cluster Configuration and Components................................................... 264
Managing the Cluster and Nodes ............................................................................................ 270
Starting the Cluster When all Nodes are Down..............................................................271
Adding Previously Configured Nodes to a Running Cluster...........................................271
Removing Nodes from Participation in a Running Cluster............................................. 271
Halting the Entire Cluster .............................................................................................. 272
Automatically Restarting the Cluster ............................................................................. 272
Halting a Node or the Cluster while Keeping Packages Running............................................. 272
What You Can Do...........................................................................................................273
Rules and Restrictions................................................................................................... 273
Additional Points To Note............................................................................................... 274
Halting a Node and Detaching its Packages..................................................................276
Halting a Detached Package..........................................................................................276
Halting the Cluster and Detaching its Packages............................................................ 276
Example: Halting the Cluster for Maintenance on the Heartbeat Subnets.....................277
Managing Packages and Services ...........................................................................................277
Starting a Package ........................................................................................................277
Halting a Package ......................................................................................................... 278
Moving a Failover Package ...........................................................................................280
Changing Package Switching Behavior ........................................................................ 280
Maintaining a Package: Maintenance Mode............................................................................. 280
Characteristics of a Package Running in Maintenance Mode or Partial-Startup Maintenance Mode ........................................................................................................281
Performing Maintenance Using Maintenance Mode...................................................... 283
Performing Maintenance Using Partial-Startup Maintenance Mode.............................. 284
Reconfiguring a Cluster............................................................................................................ 285
Previewing the Effect of Cluster Changes......................................................................286
Reconfiguring a Halted Cluster ..................................................................................... 288
Reconfiguring a Running Cluster................................................................................... 288
Changing the Cluster Networking Configuration while the Cluster Is Running.............. 290
Updating the Cluster Lock LUN Configuration Online....................................................293
Resetting the cluster generic resource restart counter.................................................. 294
Changing MAX_CONFIGURED_PACKAGES............................................................... 294
Changing the VxVM Storage Configuration .................................................................. 294
Reconfiguring a Package..........................................................................................................294
Reconfiguring a Package on a Running Cluster ........................................................... 295
Renaming or Replacing an External Script Used by a Running Package......................295
Reconfiguring a Package on a Halted Cluster .............................................................. 296
Adding a Package to a Running Cluster........................................................................ 296
Deleting a Package from a Running Cluster ................................................................. 296
Resetting the Service Restart Counter...........................................................................297
Allowable Package States During Reconfiguration .......................................................297
Online Reconfiguration of Modular package.................................................................. 301
Migrate generic resources from package to cluster....................................................... 308
Responding to Cluster Events ................................................................................................. 310
Single-Node Operation .............................................................................................................311
Removing Serviceguard from a System....................................................................................311

Understanding Site Aware Disaster Tolerant Architecture............. 312


Terms and Concepts................................................................................................................. 312
Site................................................................................................................................. 312
Complex Workload......................................................................................................... 313
Redundant Configuration............................................................................................... 313
Site Controller Package................................................................................................. 314
Site Safety Latch and its Status..................................................................................... 315
How to Deploy and Configure the Complex Workloads for Disaster Recovery using SADTA.. 317
Configuring the Workload Packages and its Recovery Packages................................. 317
Managing SADTA Configuration............................................................................................... 322
Moving the Site Controller Package Without Affecting Workloads ................................323
Rules for a Site Controller Package in Maintenance Mode............................................323
Detaching a Node When Running Site Controller Package...........................................324
Understanding the Smart Quorum............................................................................................ 324
How to Use the Smart Quorum...................................................................................... 325
Examples........................................................................................................................325
Limitation...................................................................................................................................329

Simulating a Serviceguard Cluster....................................................330


Simulating the Cluster...............................................................................................................331
Creating the Simulated Cluster...................................................................................... 331
Importing the Cluster State............................................................................................ 331
Managing the Cluster..................................................................................................... 332
Managing the Nodes in the Simulated Cluster...............................................................333
Simulation Scenarios for the Package...................................................................................... 333
Creating a Simulated Package.......................................................................................334
Running a Package .......................................................................................................334
Halting a Package.......................................................................................................... 334
Deleting a Package........................................................................................................ 334
Enabling or Disabling Switching Attributes for a Package............................................. 334
Simulating Failure Scenarios.................................................................................................... 335

Cluster Analytics.................................................................................337
Upgrading the Cluster Analytics Software................................................................................ 339
Pre-requisites................................................................................................................. 339
Upgrading serviceguard-analytics Software...................................................................339
Verifying serviceguard-analytics Installation.................................................................. 340
Removing serviceguard-analytics Software................................................................... 340
Configuring NFS as Shared Storage........................................................................................ 340
Cluster Analytics Database Migration to Shared Storage.........................................................341
Starting Cluster Analytics Daemon........................................................................................... 341
Cluster Event Message Consolidation........................................................................... 341
Stopping Cluster Analytics Daemon......................................................................................... 342
Verifying Cluster Analytics Daemon..........................................................................................342
Removing Cluster Analytics State Configuration File............................................................... 343
Command to Retrieve KPIs...................................................................................................... 343
Limitation...................................................................................................................................344

Integrating Application Tuner Express............................................. 345


Serviceguard utility functions.................................................................................................... 345

Troubleshooting Your Cluster............................................................346


Testing Cluster Operation ........................................................................................................ 346
Testing the Package Manager .......................................................................................346
Testing the Cluster Manager ......................................................................................... 347
Monitoring Hardware ................................................................................................................348
Replacing Disks........................................................................................................................ 348
Replacing a Faulty Mechanism in a Disk Array..............................................................348
Replacing a Lock LUN................................................................................................... 348
Revoking Persistent Reservations after a Catastrophic Failure................................................349
Examples........................................................................................................................349
Replacing LAN Cards............................................................................................................... 350
Replacing a Failed Quorum Server System..............................................................................351
Troubleshooting Approaches ................................................................................................... 352
Reviewing Package IP Addresses ................................................................................ 352
Reviewing the System Log File .....................................................................................353

Reviewing Configuration Files .......................................................................................354
Using the cmquerycl and cmcheckconf Commands ............................................... 354
Reviewing the LAN Configuration ................................................................................. 355
Solving Problems ..................................................................................................................... 355
Name Resolution Problems........................................................................................... 355
Halting a Detached Package..........................................................................................356
Cluster Re-formations Caused by Temporary Conditions.............................................. 356
Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low................. 356
System Administration Errors ........................................................................................357
Node and Network Failures ...........................................................................................359
Troubleshooting the Quorum Server.............................................................................. 359
Lock LUN Messages...................................................................................................... 360
Host IO Timeout Messages............................................................................................360
Troubleshooting serviceguard-xdc package............................................................................. 361
Troubleshooting cmvmusermgmt Utility.................................................................................... 361

Support and other resources.............................................................362


Accessing Hewlett Packard Enterprise Support....................................................................... 362
Accessing updates....................................................................................................................362
Websites................................................................................................................................... 363
Related documents................................................................................................................... 363
Customer self repair..................................................................................................................364
Remote support........................................................................................................................ 364
Documentation feedback.......................................................................................................... 364

Designing Highly Available Cluster Applications ........................... 365


Automating Application Operation ........................................................................................... 365
Insulate Users from Outages ........................................................................................ 366
Define Application Startup and Shutdown .....................................................................366
Controlling the Speed of Application Failover .......................................................................... 366
Replicate Non-Data File Systems ................................................................................. 367
Evaluate the Use of a Journaled Filesystem (JFS)........................................................ 367
Minimize Data Loss .......................................................................................................367
Use Restartable Transactions ....................................................................................... 368
Use Checkpoints ........................................................................................................... 368
Design for Multiple Servers ........................................................................................... 369
Design for Replicated Data Sites .................................................................................. 369
Designing Applications to Run on Multiple Systems ................................................................369
Avoid Node Specific Information ................................................................................... 370
Avoid Using SPU IDs or MAC Addresses ..................................................................... 370
Assign Unique Names to Applications .......................................................................... 371
Use uname(2) With Care ...............................................................................................371
Bind to a Fixed Port .......................................................................................................372
Bind to Relocatable IP Addresses .................................................................................372
Give Each Application its Own Volume Group .............................................................. 372
Use Multiple Destinations for SNA Applications ............................................................373
Avoid File Locking ......................................................................................................... 373
Restoring Client Connections .................................................................................................. 373
Handling Application Failures .................................................................................................. 374
Create Applications to be Failure Tolerant .................................................................... 374
Be Able to Monitor Applications .................................................................................... 375
Minimizing Planned Downtime .................................................................................................375
Reducing Time Needed for Application Upgrades and Patches ................................... 375
Providing Online Application Reconfiguration ............................................................... 376

Documenting Maintenance Operations .........................................................................376

Integrating HA Applications with Serviceguard...............................377


Checklist for Integrating HA Applications .................................................................................377
Defining Baseline Application Behavior on a Single System ........................................ 377
Integrating HA Applications in Multiple Systems ...........................................................378
Testing the Cluster ........................................................................................................ 378

Blank Planning Worksheets ..............................................................380


Hardware Worksheet ............................................................................................................... 380
Power Supply Worksheet .........................................................................................................381
Quorum Server Worksheet ...................................................................................................... 381
Volume Group and Physical Volume Worksheet ......................................................................382
Cluster Configuration Worksheet ............................................................................................. 382
Package Configuration Worksheet ...........................................................................................383

IPv6 Network Support.........................................................................385


IPv6 Address Types.................................................................................................................. 385
Textual Representation of IPv6 Addresses.................................................................... 385
IPv6 Address Prefix........................................................................................................386
Unicast Addresses......................................................................................................... 386
IPv4 and IPv6 Compatibility........................................................................................... 386
Network Configuration Restrictions...........................................................................................388
Configuring IPv6 on Linux.........................................................................................................389
Enabling IPv6 on Red Hat Linux.................................................................................... 389
Adding persistent IPv6 Addresses on Red Hat Linux.................................................... 389
Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on Red Hat Linux.... 389
Adding Persistent IPv6 Addresses on SUSE................................................................. 390
Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on SUSE....390

Maximum and Minimum Values for Parameters...............................391

Monitoring Script for Generic Resources.........................................392


Launching Monitoring Scripts....................................................................................................392
Template of a Monitoring Script................................................................................................ 395

Monitoring Script for Cluster Generic Resources........................... 398


Cluster Generic Resources template scripts.............................................................................398

Using Serviceguard RESTful Application Programming Interface............ 402
Launching Serviceguard RESTful Application Programming Interface.....................................402

Serviceguard Toolkit for Linux.......................................................... 403

Serviceguard Manager for Linux....................................................... 404

Disaster recovery rehearsal overview.......................................................................................404
Prerequisites to deploy DRR on VMware environment..................................................404
Cluster configuration properties................................................................................................ 405
Toolkit Studio overview............................................................................................................. 407
Workload overview....................................................................................................................407

The information contained herein is subject to change without notice. The only warranties for Hewlett
Packard Enterprise products and services are set forth in the express warranty statements accompanying
such products and services. Nothing herein should be construed as constituting an additional warranty.
Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained
herein.
Links to third-party websites take you outside the Hewlett Packard Enterprise website. Hewlett Packard
Enterprise has no control over and is not responsible for information outside the Hewlett Packard
Enterprise website.
Confidential computer software. Valid license from Hewlett Packard Enterprise required for possession,
use, or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer
Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government
under vendor's standard commercial license.
Copyright © 2016 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are
trademarks or registered trademarks of Veritas Technologies LLC or its affiliates in the U.S. and other
countries. Other names may be trademarks of their respective owners.
NIS™ is a trademark of Sun Microsystems, Inc.
UNIX® is a registered trademark of The Open Group.
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
Red Hat® is a registered trademark of Red Hat, Inc. in the United States and other countries.
SUSE® is a registered trademark of SUSE AG, a Novell Business.
VMware and vCenter Server are registered trademarks or trademarks of VMware, Inc. in the United
States and/or other jurisdictions.
Publication History
Publication Date    Part Number    Edition
August 2017         701460-402b    NA
July 2017           701460-402a    NA
July 2016           701460-009a    NA
June 2016           701460-009     NA
March 2016          701460-008a    NA
December 2015       701460-008     NA
August 2015         701460-007     NA
March 2015          701460-006e    NA
December 2014       701460-006d    NA
December 2014       701460-006c    NA
November 2014       701460-006b    NA
November 2014       701460-006a    NA
October 2014        701460-006     NA
August 2014         701460-005b    NA
June 2014           701460-005a    NA
March 2014          701460-005     NA
December 2015       701460-005R    NA

The last publication date and part number indicate the current edition, which applies to the A.12.00.40
version of HPE Serviceguard for Linux.
The publication date changes when a new edition is published. (Minor corrections and updates which are
incorporated at reprint do not cause the date to change.) The part number is revised when extensive
technical changes are incorporated.
New editions of this manual will incorporate all material updated since the previous edition.

Preface
This guide describes how to configure and manage Serviceguard for Linux on HPE ProLiant servers under
the Linux operating system. It is intended for experienced Linux system administrators. (For Linux system
administration tasks that are not specific to Serviceguard, use the system administration documentation
and manpages for your distribution of Linux.)

NOTE: Starting with Serviceguard A.12.00.00, legacy packages are obsolete and only modular packages are
supported. For more information about how to migrate to modular packages, see the white paper
Migrating packages from legacy to modular style available at
http://www.hpe.com/info/linux-serviceguard-docs.

The contents are as follows:

• Serviceguard for Linux at a Glance describes a Serviceguard cluster and provides a roadmap for
using this guide.
• Understanding Hardware Configurations for Serviceguard for Linux provides a general view of
the hardware configurations used by Serviceguard.
• Understanding Serviceguard Software Components describes the software components of
Serviceguard and shows how they function within the Linux operating system.
• Planning and Documenting an HA Cluster steps through the planning process.
• Building an HA Cluster Configuration describes the creation of the cluster configuration.
• Configuring Packages and Their Services describes the creation of high availability packages.
• Cluster and Package Maintenance presents the basic cluster administration tasks.
• Understanding Site Aware Disaster Tolerant Architecture describes Site Aware Disaster Tolerant
Architecture for deploying complex workloads with inter-dependent packages that are managed
collectively for disaster tolerance.
• Simulating a Serviceguard Cluster describes how to simulate Serviceguard clusters.
• Cluster Analytics provides a mechanism for users to perform "availability audits" on their
Serviceguard clusters, nodes, and the application packages running on the clusters.
• Troubleshooting Your Cluster explains cluster testing and troubleshooting strategies.
• Designing Highly Available Cluster Applications gives guidelines for creating cluster-aware
applications that provide optimal performance in a Serviceguard environment.
• Integrating HA Applications with Serviceguard provides suggestions for integrating your existing
applications with Serviceguard for Linux.
• Blank Planning Worksheets contains a set of empty worksheets for preparing a Serviceguard
configuration.
• IPv6 Network Support provides information about IPv6.
• Maximum and Minimum Values for Parameters provides a reference to the supported ranges for
Serviceguard parameters.

• Monitoring Script for Generic Resources provides the monitoring script template for Generic
Resources.
• Serviceguard Toolkit for Linux describes a group of tools to simplify the integration of popular
applications with Serviceguard.

Serviceguard for Linux at a Glance
This chapter introduces Serviceguard for Linux and shows where to find different kinds of information in
this book. It includes the following topics:

• What is Serviceguard for Linux?


• Using Serviceguard for Configuring in an Extended Distance Cluster Environment
• Using Serviceguard Manager
• Configuration Roadmap

If you are ready to start setting up Serviceguard clusters, skip ahead to Planning and Documenting an
HA Cluster. Specific steps for setup are in Building an HA Cluster Configuration.

What is Serviceguard for Linux?


Serviceguard for Linux allows you to create high availability clusters of ProLiant servers. A high availability
computer system allows application services to continue in spite of a hardware or software failure. Highly
available systems protect users from software failures as well as from failure of a system processing unit
(SPU), disk, or local area network (LAN) component. In the event that one component fails, the redundant
component takes over. Serviceguard and other high availability subsystems coordinate the transfer
between components.
A Serviceguard cluster is a networked grouping of ProLiant servers (host systems known as nodes) having
sufficient redundancy of software and hardware that a single point of failure will not significantly disrupt
service. Application services (individual Linux processes) are grouped together in packages; in the event
of a single service, node, network, or other resource failure, Serviceguard can automatically transfer
control of the package to another node within the cluster, allowing services to remain available with
minimal interruption.

[Diagram: Node 1 (running PkgA) and Node 2 (running PkgB), each with a root disk and a copy, both
cabled to two disk arrays and to a hub.]

Figure 1: Typical Cluster Configuration

In the figure, node 1 (one of two SPUs) is running package A, and node 2 is running package B. Each
package has a separate group of disks associated with it, containing data needed by the package's
applications, and a copy of the data. Note that both nodes are physically connected to disk arrays.
However, only one node at a time may access the data for a given group of disks. In the figure, node 1 is
shown with exclusive access to the top two disks (solid line), and node 2 is shown as connected without
access to the top disks (dotted line). Similarly, node 2 is shown with exclusive access to the bottom two
disks (solid line), and node 1 is shown as connected without access to the bottom disks (dotted line).
Disk arrays provide redundancy in case of disk failures. In addition, a total of four data buses are shown
for the disks that are connected to node 1 and node 2. This configuration provides the maximum
redundancy and also gives optimal I/O performance, since each package is using different buses.
Note that the network hardware is cabled to provide redundant LAN interfaces on each node.
Serviceguard uses TCP/IP network services for reliable communication among nodes in the cluster,
including the transmission of heartbeat messages, signals from each functioning node which are central
to the operation of the cluster. TCP/IP services also are used for other types of inter-node communication.
(See Understanding Serviceguard Software Components for more information about heartbeat.)

Failover
Under normal conditions, a fully operating Serviceguard cluster simply monitors the health of the cluster's
components while the packages are running on individual nodes. Any host system running in the
Serviceguard cluster is called an active node. When you create the package, you specify a primary node
and one or more adoptive nodes. When a node or its network communications fail, Serviceguard can
transfer control of the package to the next available adoptive node. This situation is shown in the following
figure.

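[Diagram: the same cluster after failover. PkgA and PkgB both run on Node 2; each node has a root
disk and a copy, and both nodes are cabled to two disk arrays and to a hub.]

Figure 2: Typical Cluster after Failover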

After this transfer, the package typically remains on the adoptive node as long as the adoptive node
continues running. If you wish, however, you can configure the package to return to its primary node as
soon as the primary node comes back online. Alternatively, you may manually transfer control of the
package back to the primary node at the appropriate time.
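
The placement behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not Serviceguard code: the function names, the `auto_failback` flag, and the node names are hypothetical and do not correspond to actual Serviceguard parameters or commands.

```python
from typing import Optional, Sequence, Set


def choose_node(node_order: Sequence[str], running: Set[str]) -> Optional[str]:
    """Return the first running node in the package's configured order.

    node_order[0] is treated as the primary node; the remaining entries
    are the adoptive nodes, in the order they should be tried.
    """
    for node in node_order:
        if node in running:
            return node
    return None  # no configured node is running, so the package cannot start


def place_package(current: str, node_order: Sequence[str],
                  running: Set[str], auto_failback: bool) -> Optional[str]:
    """Decide where the package should run after cluster membership changes."""
    primary = node_order[0]
    if auto_failback and primary in running:
        return primary  # configured to return once the primary is back online
    if current in running:
        return current  # default: stay on the node that currently runs it
    return choose_node(node_order, running)  # current node failed: fail over


order = ["node1", "node2"]  # hypothetical primary and adoptive node
print(place_package("node1", order, {"node2"}, auto_failback=False))  # node2
```

With `auto_failback=False`, the package stays on `node2` even after `node1` rejoins, which mirrors the default behavior described above; a manual transfer would then be a separate administrative action.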
Figure 2: Typical Cluster after Failover on page 17 does not show the power connections to the cluster,
but these are important as well. In order to remove all single points of failure from the cluster, you should
provide as many separate power circuits as needed to prevent a single point of failure of your nodes,
disks and disk mirrors. Each power circuit should be protected by an uninterruptible power source. For
more details, see the Power Supply Planning section.

Serviceguard is designed to work in conjunction with other high availability products, such as disk arrays,
which use various RAID levels for data protection; and Hewlett Packard Enterprise-supported
uninterruptible power supplies (UPS), which eliminate failures related to power outage. Hewlett Packard
Enterprise recommends these products; in conjunction with Serviceguard they provide the highest degree
of availability.

Using Serviceguard for Configuring in an Extended Distance Cluster Environment
An extended distance cluster (also known as extended campus cluster) is a normal Serviceguard cluster
that has alternate nodes located in different data centers separated by some distance, with a third
location supporting the quorum service. Extended distance clusters are connected using a high-speed
cable that guarantees network access between the nodes as long as all guidelines for disaster recovery
architecture are followed. For more information, see the following documents at
http://www.hpe.com/info/linux-serviceguard-docs:

• HPE Serviceguard Extended Distance Cluster for Linux A.12.00.40 Deployment Guide
• Understanding and Designing Serviceguard Disaster Recovery Architectures

Using Serviceguard Manager


Serviceguard Manager is a web-based GUI management application for Serviceguard clusters.
Serviceguard Manager is used to configure, monitor, and administer Serviceguard clusters, and
Serviceguard Disaster Recovery clusters like Extended Distance Cluster and Metrocluster. Serviceguard
Manager protects critical applications from planned and unplanned downtime.

Launching Serviceguard Manager


Serviceguard Manager listens on two ports: 5511 for HTTP and 5522 for HTTPS. To access Serviceguard
Manager, the URL must be in the following format:
http://<fully_qualified_domain>:5511/
Serviceguard Manager uses the default keystore (certificate) that comes with the Jetty installation.
To use a custom SSL certificate on port 5522, update the keystore path in
/opt/hp/cmcluster/serviceguardmanager/jetty/etc/jetty-ssl.xml to point to your own certificate.
When the Serviceguard Manager login screen appears, enter a username and password: log in as the root
user, as the sgmgr user, or as a user with admin permissions. For information on user types and user
roles, see the Security section in the online help.
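
As a small illustration of the URL format, the following Python sketch builds the address from a host name. The port numbers come from the text above; the function name and the example host are hypothetical, and the HTTPS form is an assumption based on 5522 being the https port.

```python
HTTP_PORT = 5511   # http port stated above
HTTPS_PORT = 5522  # https port stated above


def manager_url(fqdn: str, secure: bool = False) -> str:
    """Build a Serviceguard Manager URL for a fully qualified domain name."""
    if "." not in fqdn.strip("."):
        raise ValueError(f"expected a fully qualified domain name, got {fqdn!r}")
    # Assumption: the https URL mirrors the documented http format.
    scheme, port = ("https", HTTPS_PORT) if secure else ("http", HTTP_PORT)
    return f"{scheme}://{fqdn}:{port}/"


# "sgmgr.example.com" is a placeholder host, not a real system.
print(manager_url("sgmgr.example.com"))               # http://sgmgr.example.com:5511/
print(manager_url("sgmgr.example.com", secure=True))  # https://sgmgr.example.com:5522/
```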

About the Online Help System


The Serviceguard Manager online help provides required information to configure, administer, and
monitor multiple clusters in the landscape (scope of cluster management).
Some of the important features are:

• Provides context-sensitive online help.


• Provides a series of steps to navigate through specific screens and configuration procedures.

The online help provides multiple ways to navigate through help topics. To find the information that you
need quickly and easily, you can:



• Click the Help link on any Serviceguard Manager page.
• Click the help on this page link on a help page.
• Click the Browse help link on a help page.
• Select a topic in the Serviceguard Manager online help.

Configuration Roadmap
This manual presents the tasks you need to perform in order to create a functioning HA cluster using
Serviceguard. These tasks are shown in the following image:

Plan the Cluster (Chapter 1-4)
• Hardware
• LVM
• Cluster
• Packages

Set Up Hardware (Chapter 5)
• SPU
• Disks
• LANs
• Power

Configuration LVM (Chapter 5)
• Physical Volumes
• Volume Groups
• Logical Volumes
• Copying

Configure Cluster (Chapter 5)
• Gather Data
• Edit Cluster Configuration File
• Check File
• Send to Other Nodes

Configure Packages (Chapter 6)
• Edit Pkg Config File
• Check Files
• Send to Other Nodes

Maintenance (Chapter 7,8)
• Cluster
• Packages
• Hardware
• Diagnostics
• Monitors

Figure 3: Tasks in Configuring a Serviceguard Cluster

Hewlett Packard Enterprise recommends that you gather all the data that is needed for configuration
before you start. See Planning and Documenting an HA Cluster for tips on gathering data.

Understanding Hardware Configurations for Serviceguard for Linux
This chapter gives a broad overview of how the server hardware components operate with Serviceguard
for Linux. The following topics are presented:

• Redundant Cluster Components


• Redundant Network Components
• Redundant Disk Storage
• Redundant Power Supplies

Refer to the next chapter for information about Serviceguard software components.

Redundant Cluster Components


In order to provide a high level of availability, a typical cluster uses redundant system components, for
example, two or more SPUs and two or more independent disks. Redundancy eliminates single points of
failure. In general, the more redundancy, the greater your access to applications, data, and supportive
services in the event of a failure. In addition to hardware redundancy, you need software support to
enable and control the transfer of your applications to another SPU or network after a failure.
Serviceguard provides this support as follows:

• In the case of LAN failure, the Linux bonding facility provides a standby LAN, or Serviceguard moves
packages to another node.
• In the case of SPU failure, your application is transferred from a failed SPU to a functioning SPU
automatically and in a minimal amount of time.
• For software failures, an application can be restarted on the same node or another node with minimum
disruption.

Serviceguard also gives you the advantage of easily transferring control of your application to another
SPU in order to bring the original SPU down for system administration, maintenance, or version
upgrades.
The maximum number of nodes supported in a Serviceguard Linux cluster is 32; the actual number
depends on the storage configuration.
A package that does not use data from shared storage can be configured to fail over to as many nodes
as you have configured in the cluster (up to the maximum of 32), regardless of disk technology. For
instance, a package that runs only local executables, and uses only local data, can be configured to fail
over to all nodes in the cluster.

Redundant Network Components


To eliminate single points of failure for networking, each subnet accessed by a cluster node is required to
have redundant network interfaces. Redundant cables are also needed to protect against cable failures.
Each interface card is connected to a different cable and hub or switch.
Network interfaces are allowed to share IP addresses through a process known as channel bonding. See
Implementing Channel Bonding (Red Hat) on page 175 or Implementing Channel Bonding (SUSE)
on page 177.



Serviceguard supports a maximum of 30 network interfaces per node. For this purpose an interface is
defined as anything represented as a primary interface in the output of ifconfig, so the total of 30 can
comprise any combination of physical LAN interfaces or bonding interfaces. (A node can have more than
30 such interfaces, but only 30 can be part of the cluster configuration.)

Rules and Restrictions


• A single subnet cannot be configured on different network interfaces (NICs) on the same node.
• In the case of subnets that can be used for communication between cluster nodes, the same network
interface must not be used to route more than one subnet configured on the same node.
• For IPv4 subnets, Serviceguard does not support different subnets on the same LAN interface.

◦ For IPv6, Serviceguard supports up to two subnets per LAN interface (site-local and global).

• Serviceguard does support different subnets on the same bridged network (this applies at both the
node and the cluster level).
• Serviceguard does not support using networking tools such as ifconfig to add IP addresses to
network interfaces that are configured into the Serviceguard cluster, unless those IP addresses
themselves will be immediately configured into the cluster as stationary IP addresses.

CAUTION: If you configure any address other than a stationary IP address on a Serviceguard
network interface, it could collide with a relocatable package IP address assigned by
Serviceguard. See Stationary and Relocatable IP Addresses and Monitored Subnets on
page 56.

◦ Similarly, Serviceguard does not support using networking tools to move or reconfigure any IP
addresses configured into the cluster.
Doing so leads to unpredictable results because the Serviceguard view of the configuration is
different from the reality.

NOTE: If you will be using a cross-subnet configuration, see also the Restrictions that apply specifically
to such configurations.
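Some of the per-node rules above can be checked mechanically. The following Python sketch is illustrative only and is not part of Serviceguard. The function name check_node_subnets and the interface names and addresses are hypothetical. It covers three of the rules: a subnet must not appear on more than one interface of a node, an interface must not carry more than one IPv4 subnet, and an interface may carry at most two IPv6 subnets. It only counts IPv6 subnets and does not verify that they are site-local and global.

```python
import ipaddress
from collections import defaultdict


def check_node_subnets(assignments):
    """Check one node's interface/address pairs against three subnet rules.

    assignments: list of (interface_name, "address/prefix") tuples
    (hypothetical input format). Returns a list of violation messages.
    """
    nics_by_subnet = defaultdict(set)
    subnets_by_nic = defaultdict(set)
    for nic, cidr in assignments:
        net = ipaddress.ip_interface(cidr).network
        nics_by_subnet[net].add(nic)
        subnets_by_nic[nic].add(net)

    problems = []
    # Rule: a single subnet cannot be configured on different interfaces.
    for net, nics in nics_by_subnet.items():
        if len(nics) > 1:
            problems.append(f"subnet {net} is on several interfaces: {sorted(nics)}")
    # Rules: one IPv4 subnet per interface, at most two IPv6 subnets.
    for nic, nets in subnets_by_nic.items():
        v4 = [n for n in nets if n.version == 4]
        v6 = [n for n in nets if n.version == 6]
        if len(v4) > 1:
            problems.append(f"{nic} carries {len(v4)} IPv4 subnets")
        if len(v6) > 2:
            problems.append(f"{nic} carries {len(v6)} IPv6 subnets")
    return problems


if __name__ == "__main__":
    for message in check_node_subnets(
        [("eth0", "192.168.1.10/24"), ("eth1", "192.168.1.11/24")]
    ):
        print(message)
```

A check of this kind could be run against planning data before the cluster configuration file is edited. It does not replace the verification that Serviceguard itself performs.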

Redundant Ethernet Configuration


The use of redundant network components is shown in the following figure, which is an Ethernet
configuration.



[Diagram: Node 1 and Node 2, running packages PkgA, PkgB, and PkgC. Each node has bonded LAN cards (eth0 and eth1) connected to two hubs, and both nodes are connected to shared disk arrays.]

Figure 4: Redundant LANs

In Linux configurations, the use of symmetrical LAN configurations is strongly recommended, with the use
of redundant hubs or switches to connect Ethernet segments. The software bonding configuration should
be identical on each node, with the active interfaces connected to the same hub or switch.

Cross-Subnet Configurations
As of Serviceguard A.11.18, it is possible to configure multiple subnets, joined by a router, both for
the cluster heartbeat and for data, with some nodes using one subnet and some another.
A cross-subnet configuration allows:

• Automatic package failover from a node on one subnet to a node on another


• A cluster heartbeat that spans subnets.

Configuration Tasks
Cluster and package configuration tasks are affected as follows:

• You must use the -w full option of cmquerycl to discover actual or potential nodes and subnets
across routers.
• You must configure two parameters in the package configuration file to allow packages to fail over
across subnets:

◦ ip_subnet_node - to indicate which nodes the subnet is configured on


◦ monitored_subnet_access - to indicate whether the subnet is configured on all nodes (FULL) or
only some (PARTIAL)

• You should not use the wildcard (*) for node_name in the package configuration file, as this could
allow the package to fail over across subnets when a node on the same subnet is eligible; failing over
across subnets can take longer than failing over on the same subnet. List the nodes in order of
preference instead of using the wildcard.
• You should configure IP monitoring for each subnet; see Monitoring LAN Interfaces and Detecting
Failure: IP Level.

Restrictions
The following restrictions apply:

• All nodes in the cluster must belong to the same network domain (that is, the domain portion of the
fully-qualified domain name must be the same.)
• The nodes must be fully connected at the IP level.
• A minimum of two heartbeat paths must be configured for each cluster node.
• There must be less than 200 milliseconds of latency in the heartbeat network.
• Each heartbeat subnet on each node must be physically routed separately to the heartbeat subnet on
another node; that is, each heartbeat path must be physically separate:

◦ The heartbeats must be statically routed; static route entries must be configured on each node to
route the heartbeats through different paths.
◦ Failure of a single router must not affect both heartbeats at the same time.

• IPv6 heartbeat subnets are not supported in a cross-subnet configuration.


• IPv6–only and mixed modes are not supported in a cross-subnet configuration. For more information
about these modes, see About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed
Mode.
• Deploying applications in this environment requires careful consideration; see Implications for
Application Deployment.
• cmrunnode will fail if the “hostname LAN” is down on the node in question. (“Hostname LAN” refers to
the public LAN on which the IP address that the node’s hostname resolves to is configured.)
• If a monitored_subnet is configured for PARTIAL monitored_subnet_access in a package’s
configuration file, it must be configured on at least one of the nodes on the node_name list for that
package. Conversely, if all of the subnets that are being monitored for this package are configured for
PARTIAL access, each node on the node_name list must have at least one of these subnets
configured.

◦ As in other configurations, a package will not start on a node unless the subnets configured on that
node, and specified in the package configuration file as monitored subnets, are up.

NOTE: See also the Rules and Restrictions that apply to all cluster networking configurations.
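The restriction on PARTIAL monitored_subnet_access can be expressed as two checks. The Python sketch below is a hedged illustration, not Serviceguard code. The function name check_partial_access, the dictionary layout, and the node and subnet values are hypothetical. It models only the two conditions stated in the restriction.

```python
def check_partial_access(node_names, monitored_subnets, subnets_on_node):
    """Check the PARTIAL-access rule for one package.

    node_names: nodes on the package's node_name list.
    monitored_subnets: dict mapping subnet -> "FULL" or "PARTIAL".
    subnets_on_node: dict mapping node -> set of subnets configured on it.
    Returns a list of violation messages (empty if none were found).
    """
    problems = []
    # A PARTIAL subnet must be configured on at least one listed node.
    for subnet, access in monitored_subnets.items():
        if access != "PARTIAL":
            continue
        if not any(subnet in subnets_on_node.get(n, set()) for n in node_names):
            problems.append(f"{subnet}: not configured on any node in the node_name list")
    # If every monitored subnet is PARTIAL, each node needs at least one of them.
    if monitored_subnets and all(a == "PARTIAL" for a in monitored_subnets.values()):
        for node in node_names:
            if not subnets_on_node.get(node, set()) & set(monitored_subnets):
                problems.append(f"{node}: has none of the monitored subnets configured")
    return problems


if __name__ == "__main__":
    print(check_partial_access(
        ["nodeA", "nodeB"],
        {"10.1.0.0": "PARTIAL", "10.2.0.0": "PARTIAL"},
        {"nodeA": {"10.1.0.0"}, "nodeB": {"10.2.0.0"}},
    ))
```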

For More Information


For more information on the details of configuring the cluster and packages in a cross-subnet context, see
About Cross-Subnet Failover and Obtaining Cross-Subnet Information.
See also the white paper Technical Considerations for Creating a Serviceguard Cluster that Spans
Multiple IP Subnets, which you can find at the address below. This paper discusses and illustrates
supported configurations, and also potential mis-configurations.

IMPORTANT: Although cross-subnet topology can be implemented on a single site, it is most
commonly used by extended-distance clusters and Metrocluster. For more information about such
clusters, see the following documents at http://www.hpe.com/info/linux-serviceguard-docs:

• Understanding and Designing Serviceguard Disaster Recovery Architectures


• HPE Serviceguard Extended Distance Cluster for Linux A.12.00.40 Deployment Guide
• Building Disaster Recovery Serviceguard Solutions Using Metrocluster with 3PAR Remote Copy
for Linux
• Building Disaster Recovery Serviceguard Solutions Using Metrocluster with Continuous Access
XP P9000 for Linux
• Building Disaster Recovery Serviceguard Solutions Using Metrocluster with Continuous Access
EVA P6000 for Linux

Redundant Disk Storage


Each node in a cluster has its own root disk, but each node may also be physically connected to several
other disks in such a way that more than one node can obtain access to the data and programs
associated with a package it is configured for. This access is provided by the Logical Volume Manager
(LVM). A volume group must be activated by no more than one node at a time, but when the package is
moved, the volume group can be activated by the adoptive node.

NOTE: As of release A.11.16.07, Serviceguard for Linux provides functionality similar to HP-UX exclusive
activation. This feature is based on LVM2 hosttags, and is available only for Linux distributions that
officially support LVM2.

All of the disks in the volume group owned by a package must be connected to the original node and to
all possible adoptive nodes for that package.
Shared disk storage in Serviceguard Linux clusters is provided by disk arrays, which have redundant
power and the capability for connections to multiple nodes. Disk arrays use RAID modes to provide
redundancy.
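The connectivity requirement above determines which nodes can adopt a package: a node qualifies only if it is connected to every disk in the package's volume group. The following Python sketch illustrates that rule under assumed inputs. The function name eligible_nodes and the node and disk names are hypothetical, and Serviceguard does not expose such a function.

```python
def eligible_nodes(package_disks, disks_on_node):
    """Return the nodes that are connected to every disk the package uses.

    package_disks: iterable of disk identifiers in the package's volume group.
    disks_on_node: dict mapping node name -> set of connected disk identifiers.
    """
    needed = set(package_disks)
    return sorted(
        node for node, disks in disks_on_node.items() if needed <= set(disks)
    )


if __name__ == "__main__":
    connections = {
        "node1": {"disk1", "disk2"},
        "node2": {"disk1"},  # missing disk2, so it cannot adopt the package
        "node3": {"disk1", "disk2", "disk3"},
    }
    print(eligible_nodes(["disk1", "disk2"], connections))
```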

Supported Disk Interfaces


The following interfaces are supported by Serviceguard for disks that are connected to two or more nodes
(shared data disks):

• FibreChannel
• iSCSI

Using iSCSI LUNs as Shared Storage


The following guidelines are applicable when iSCSI LUNs are used as shared storage:

• The iSCSI storage can be configured over a channel-bonded interface. For more information about channel
bonding, see Implementing Channel Bonding (Red Hat) on page 175 or Implementing Channel
Bonding (SUSE) on page 177.
• iSCSI storage is supported with software initiators.

NOTE: Ensure that the iSCSI daemon is persistent across reboots.



• A LAN configured for iSCSI storage is treated like the other LANs that are part of the cluster.
• Hewlett Packard Enterprise recommends that you do not use the heartbeat LAN for iSCSI storage devices.

The following restrictions are applicable when iSCSI LUNs are used as shared storage:

• An iSCSI storage device does not support configuring a lock LUN.
• iSCSI storage devices that are exposed using SCSI targets are not supported.

Disk Monitoring
You can configure monitoring for disks and configure packages to be dependent on the monitor. For each
package, you define a package service that monitors the disks that are activated by that package. If a
disk failure occurs on one node, the monitor will cause the package to fail, with the potential to fail over to
a different node on which the same disks are available.

Sample Disk Configurations


The following figure shows a two-node cluster. Each node has one root disk, which is mirrored, and one
package for which it is the primary node. Resources have been allocated to each node so that each node
can adopt the package from the other node. Each package has one disk volume group assigned to it, and
the logical volumes in that volume group are mirrored.

[Diagram: Node 1 (PkgA) and Node 2 (PkgB), each with a root disk and its copy, both connected to two disk arrays.]

Figure 5: Mirrored Disks Connected for High Availability

Redundant Power Supplies


You can extend the availability of your hardware by providing battery backup to your nodes and disks.
Hewlett Packard Enterprise-supported uninterruptible power supplies (UPS) can provide this protection
from momentary power loss.
Disks should be attached to power circuits in such a way that disk array copies are attached to different
power sources. The boot disk should be powered from the same circuit as its corresponding node.
Quorum server systems should be powered separately from cluster nodes. Your Hewlett Packard
Enterprise representative can provide more details about the layout of power supplies, disks, and LAN
hardware for clusters.

Understanding Serviceguard Software
Components
This chapter gives a broad overview of how the Serviceguard software components work. It includes the
following topics:

• Serviceguard Architecture
• How the Cluster Manager Works
• How the Package Manager Works
• How Packages Run
• How the Network Manager Works
• Volume Managers for Data Storage
• Responses to Failures

If you are ready to start setting up Serviceguard clusters, skip ahead to Chapter 4, “Planning and
Documenting an HA Cluster.”

Serviceguard Architecture
The following figure shows the main software components used by Serviceguard for Linux. This chapter
discusses these components in some detail.



[Diagram: software layers, from top to bottom:
• Packages: Apps/Services/Resources
• MC/ServiceGuard Components: Package Manager, Cluster Manager, Network Manager
• Operating System: Linux Kernel (with LVM)]

Figure 6: Serviceguard Software Components on Linux

Serviceguard Daemons
Serviceguard for Linux uses the following daemons:

• cmclconfd: configuration daemon
• cmcld: cluster daemon
• cmnetd: Network Manager daemon
• cmlogd: cluster system log daemon
• cmdisklockd: cluster lock LUN daemon
• cmresourced: Serviceguard Generic Resource Assistant daemon
• cmprd: Persistent Reservation daemon
• cmserviced: Service Assistant daemon
• qs: Quorum Server daemon
• cmlockd: utility daemon
• cmsnmpd: cluster SNMP subagent (optionally running)
• cmwbemd: WBEM daemon
• cmproxyd: proxy daemon

Each of these daemons logs to the Linux system logging files. The quorum server daemon logs to a
user-specified log file, such as /usr/local/qs/log/qs.log on Red Hat or /var/log/qs/sq.log on SUSE.

NOTE: The file cmcluster.conf contains the mappings that resolve symbolic references to $SGCONF,
$SGROOT, $SGLBIN, etc, used in the path names in the subsections that follow. See Understanding the
Location of Serviceguard Files on page 169 for details.

Configuration Daemon: cmclconfd


This daemon is used by the Serviceguard commands to gather information from all the nodes within the
cluster. It gathers configuration information such as information on networks and volume groups. It also
distributes the cluster binary configuration file to all nodes in the cluster. This daemon is started by the
internet daemon, xinetd(1M).
Parameters are in the /etc/xinetd.d/hacl-cfg and /etc/xinetd.d/hacl-cfgudp files. The
path for this daemon is $SGLBIN/cmclconfd.

Cluster Daemon: cmcld


This daemon determines cluster membership by sending heartbeat messages to cmcld daemons on
other nodes in the Serviceguard cluster. It runs at a real-time priority and is locked in memory. The cmcld
daemon sets a safety timer in the kernel, which is used to detect kernel hangs. If cmcld does not reset
this timer periodically, the kernel causes a system reboot. This could occur because cmcld could not
communicate with the majority of the cluster’s members, or because cmcld exited unexpectedly, aborted,
or was unable to run for a significant amount of time and so could not update the kernel timer, indicating
a kernel hang. Before a system reset that results from the expiration of the safety timer, messages are
written to syslog and, if possible, to the kernel’s message buffer, and a system dump is performed.
The duration of the safety timer depends on the cluster configuration parameter MEMBER_TIMEOUT,
and also on the characteristics of the cluster configuration, such as whether it uses a quorum server or a
cluster lock (and what type of lock) and whether or not standby LANs are configured.
For further discussion, see What Happens when a Node Times Out. For advice on setting
MEMBER_TIMEOUT, see Cluster Configuration Parameters. For troubleshooting, see Cluster Re-
formations Caused by MEMBER_TIMEOUT Being Set too Low.
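The safety timer behaves like a watchdog: it expires unless it is reset before its deadline. The Python sketch below models only that idea with logical timestamps. The class name SafetyTimer and the duration of 3 time units are hypothetical. The real timer lives in the kernel, and its duration is derived from MEMBER_TIMEOUT and the cluster configuration in ways that this sketch does not attempt to reproduce.

```python
class SafetyTimer:
    """Toy watchdog: expires unless reset() is called before the deadline."""

    def __init__(self, duration):
        self.duration = duration
        self.deadline = None  # not armed until the first reset

    def reset(self, now):
        # Called periodically by the monitored daemon while it is healthy.
        self.deadline = now + self.duration

    def expired(self, now):
        # True once the deadline has passed without a reset.
        return self.deadline is not None and now >= self.deadline


if __name__ == "__main__":
    timer = SafetyTimer(duration=3)
    timer.reset(now=0)
    timer.reset(now=2)       # daemon is still running and resets the timer
    # The daemon then stops resetting the timer (for example, it hangs).
    for now in (3, 4, 5):
        state = "expired, node would be reset" if timer.expired(now) else "ok"
        print(now, state)
```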
cmcld also manages Serviceguard packages, determining where to run them and when to start them.
The path for this daemon is: $SGLBIN/cmcld.

NOTE: Two of the central components of Serviceguard, the Package Manager and the Cluster Manager, run
as parts of the cmcld daemon. This daemon runs at priority 94 and is in the SCHED_RR class. No other
process is allowed a higher real-time priority.

Network Manager Daemon: cmnetd


This daemon monitors the health of cluster networks. It also handles the addition and deletion of
relocatable package IPs, for both IPv4 and IPv6 addresses.



Log Daemon: cmlogd
cmlogd is used by cmcld to write messages to the system log file. Any message written to the system
log by cmcld is written through cmlogd. This prevents any delays in writing to syslog from affecting
the timing of cmcld. The path for this daemon is $SGLBIN/cmlogd.

Lock LUN Daemon: cmdisklockd


If a lock LUN is being used, cmdisklockd runs on each node in the cluster, providing tie-breaking
services when needed during cluster re-formation. It is started by cmcld when the node joins the cluster.
The path for this daemon is $SGLBIN/cmdisklockd.

NOTE:
An iSCSI storage device does not support configuring a lock LUN.

Generic Resource Assistant Daemon: cmresourced


This daemon is responsible for setting and getting the status or value of generic resources configured as
part of the cluster or a package, and for influencing the availability of the package based on the
availability of the resource at the cluster or package level.
Generic resources allow integration of custom-defined monitors in Serviceguard. They provide better
control, options, and flexibility in terms of getting and setting the status of a resource.
This daemon is used by the Serviceguard commands cmgetresource(1m) and cmsetresource(1m)
to get or set the status or value of a simple or extended generic resource configured in a cluster or package
and is local to a node. This daemon runs on every node on which cmcld is running.

Persistent Reservation Daemon: cmprd


This daemon is responsible for managing persistent reservations for FibreChannel or iSCSI storage that
is configured in a multi-node package. The daemon clears the reservations during halt or failure of the
multi-node package. It also ensures that the reservation on the disk is always held by a node where the
multi-node package is up and running. This daemon runs on every node where the cmcld cluster
daemon is executed.

Service Assistant Daemon: cmserviced


This daemon forks and execs any scripts or processes as required by the cluster daemon, cmcld. There
are two types of forks that this daemon carries out:

• Executing package run and halt scripts


• Launching services

For services, cmcld monitors the service process and, depending on the number of service retries,
cmcld either restarts the service through cmserviced or causes the package to halt and moves the
package to an available alternate node. The path for this daemon is: $SGLBIN/cmserviced.
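The retry behavior described above amounts to a simple decision each time a monitored service exits. The Python sketch below is a minimal model of that decision, not the actual cmcld logic. The function name on_service_exit, its parameters, and the returned labels are hypothetical.

```python
def on_service_exit(restarts_so_far, max_restarts):
    """Decide what happens when a monitored service process exits.

    restarts_so_far: how many times the service has already been restarted.
    max_restarts: number of restarts allowed for the service (hypothetical name).
    """
    if restarts_so_far < max_restarts:
        return "restart"          # restart the service on the same node
    return "halt_package"         # retries used up: halt and fail over


if __name__ == "__main__":
    restarts = 0
    while True:
        action = on_service_exit(restarts, max_restarts=2)
        print(f"service exited after {restarts} restart(s): {action}")
        if action == "halt_package":
            break
        restarts += 1
```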

Quorum Server Daemon: qs


Using a quorum server is one way to break a tie and establish a quorum when the cluster is re-forming;
the other way is to use a Lock LUN. See Cluster Quorum to Prevent Split-Brain Syndrome on page
34 and the sections that follow it.
The quorum server, if used, runs on a system external to the cluster. It is normally started from
/etc/inittab with the respawn option, which means that it automatically restarts if it fails or is killed. It can
also be configured as a Serviceguard package in a cluster other than the one(s) it serves; see Figure 9:
Quorum Server to Cluster Distribution on page 38.
All members of the cluster initiate and maintain a connection to the quorum server; if it dies, the
Serviceguard nodes will detect this and then periodically try to reconnect to it. If there is a cluster re-
formation while the quorum server is down and tie-breaking is needed, the re-formation will fail and all the
nodes will halt (system reset). For this reason it is important to bring the quorum server back up as soon
as possible.
For more information about the Quorum Server software and how it works, including instructions for
configuring the Quorum Server as a Serviceguard package, see the latest version of the HPE
Serviceguard Quorum Server release notes at http://www.hpe.com/info/linux-serviceguard-docs
(Select HP Serviceguard Quorum Server Software). See also Use of the Quorum Server as a
Cluster Lock on page 36.
The path for this daemon is:

• For SUSE: /opt/qs/bin/qs

• For Red Hat: /usr/local/qs/bin/qs

Utility Daemon: cmlockd


This daemon runs on every node on which cmcld is running. It maintains the active and pending cluster
resource locks.

Cluster SNMP Agent Daemon: cmsnmpd


This daemon collaborates with the SNMP Master Agent to provide instrumentation for the cluster
Management Information Base (MIB).
The SNMP Master Agent and the cmsnmpd provide notification (traps) for cluster-related events. For
example, a trap is sent when the cluster configuration changes, or when a Serviceguard package has
failed. To configure the agent to send traps to one or more specific destinations, add the trap destinations
to /etc/snmp/snmptrapd.conf (SUSE and Red Hat). Make sure traps are turned on with trap2sink
in /etc/snmp/snmpd.conf (SUSE and Red Hat).
The installation of the cmsnmpd rpm configures snmpd and cmsnmpd to start up automatically. Their
startup scripts are in /etc/init.d/. The scripts can be run manually to start and stop the daemons.
For more information, see the cmsnmpd(1) manpage.

Cluster WBEM Agent Daemon: cmwbemd


This daemon collaborates with the Serviceguard WBEM provider plug-in module (SGProviders) and
WBEM services cimserver (for Red Hat Enterprise Linux server) or sfcbd (SUSE Linux Enterprise
Server) to provide notification (WBEM Indications) of Serviceguard cluster events to Serviceguard WBEM
Indication subscribers that have registered a subscription with the cimserver (for Red Hat Enterprise
Linux server) or sfcbd (SUSE Linux Enterprise Server). For example, an Indication is sent when the
cluster configuration changes, or when a Serviceguard package has failed.
You can start and stop cmwbemd with the commands /etc/init.d/cmwbemd start and
/etc/init.d/cmwbemd stop.

Proxy Daemon: cmproxyd


This daemon is used to proxy or cache Serviceguard configuration data for use by certain Serviceguard
commands running on the local node. This allows these commands to get the data more quickly and
relieves cmcld of the burden of responding to certain requests.



Disaster Recover Daemon: cmrehearsald
This daemon is responsible for managing Disaster Recovery Rehearsal (DRR) operations performed in a
cluster. DRR can be carried out at the package, node, or site level. This daemon stores, updates,
and provides the status for all the DRR operations performed in the cluster. It also performs a sequence
of steps if there is a disaster during a DRR operation and ensures that the DRR operation is non-disruptive.
This daemon runs on every node where the cmcld cluster daemon is executed.

Serviceguard WBEM Provider

What is WBEM?
Web-Based Enterprise Management (WBEM) is a set of management and Internet standard technologies
developed to unify the management of distributed computing environments, facilitating the exchange of
data across otherwise disparate technologies and platforms. WBEM is based on Internet standards and
Distributed Management Task Force (DMTF) open standards: Common Information Model (CIM)
infrastructure and schema, CIM-XML, CIM operations over HTTP, and WS-Management.
For more information, see the following:
Common Information Model (CIM)
Web-Based Enterprise Management (WBEM)

Support for Serviceguard WBEM Provider


The Serviceguard WBEM provider allows you to get basic Serviceguard cluster information via the
Common Information Model Object Manager (CIMOM) technology. It also sends notification (WBEM
Indications) of Serviceguard cluster events to Serviceguard WBEM Indication subscribers that have
registered a subscription with the SFCB (for SUSE Linux Enterprise Server) or cimserver (for Red Hat
Enterprise Linux Server). For example, an indication is sent when the cluster configuration changes, or
when a Serviceguard package fails.
To use the Serviceguard WBEM provider:

Procedure

1. Verify whether the SGProviders rpm is installed:


rpm -qa | fgrep sgproviders

2. If it is not installed, install the rpm. For example, on SUSE Linux Enterprise Server 11:

rpm -ivh sgproviders-A.04.00.10-0.sles11.x86_64.rpm

WBEM Query
Serviceguard WBEM provider implements the following classes that can be queried to retrieve the cluster
information:

• HP_Cluster
• HP_Node
• HP_ParticipatingCS
• HP_ClusterSoftware
• HP_NodeIdentity
• HP_SGCluster
• HP_SGNode
• HP_SGParticipatingCS
• HP_SGClusterSoftware
• HP_SGNodeIdentity
• HP_SGIPProtocolEndpoint
• HP_SGClusterIPProtocolEndpoint
• HP_SGPackage
• HP_SGClusterPackage
• HP_SGNodePackage
• HP_SGPService
• HP_SGPackagePService
• HP_SGNodePService
• HP_SGLockLunDisk
• HP_SGRemoteQuorumService
• HP_SGLockObject
• HP_SGQuorumServer
• HP_SGLockLun
• HP_SGLockDisk

For more information about the WBEM provider classes and their properties, see the Managed Object
Format (MOF) files. When SGProviders is installed, the MOF files are copied to the
/opt/sgproviders/mof/ directory on SUSE Linux Enterprise Server and to the
/usr/local/sgproviders/mof/ directory on Red Hat Enterprise Linux server.

NOTE: WBEM queries for the previous classes on SUSE Linux Enterprise Server might fail with
access-denied errors if Serviceguard is not able to validate the credentials of the WBEM request.
Small Footprint CIM Broker (SFCB), which is the CIM server in SUSE Linux Enterprise Server 11 SP1 and
later, has a configuration parameter doBasicAuth, which enables basic authentication for HTTP and
HTTPS connections. This parameter must be set to true in the /etc/sfcb/sfcb.cfg file. Otherwise,
the user credentials of a WBEM request are not passed to the Serviceguard WBEM Provider.
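Whether doBasicAuth is enabled can be checked before you troubleshoot failed queries. The Python sketch below is a hedged example. It assumes that sfcb.cfg uses "name: value" lines with # comments, which this guide does not state, and it treats a missing entry as not enabled. The function name basic_auth_enabled is hypothetical.

```python
def basic_auth_enabled(cfg_text):
    """Return True if the text sets doBasicAuth to true.

    Assumes "name: value" lines and '#' comments (an assumption about the
    sfcb.cfg format); a missing entry is treated as not enabled.
    """
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() == "doBasicAuth":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    sample = "httpPort: 5988\ndoBasicAuth: true\n"  # illustrative content only
    print(basic_auth_enabled(sample))
```

To check a real system, read the contents of /etc/sfcb/sfcb.cfg into a string and pass it to the function.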

WBEM Indications
For an indication to be received on occurrence of a Serviceguard event, a WBEM subscription must exist
for one of the following indication classes:

• CIM_AlertIndication
• HP_ServiceguardIndication

How the Cluster Manager Works
The cluster manager is used to initialize a cluster, to monitor the health of the cluster, to recognize node
failure if it should occur, and to regulate the re-formation of the cluster when a node joins or leaves the
cluster. The cluster manager operates as a daemon process that runs on each node. During cluster
startup and re-formation activities, one node is selected to act as the cluster coordinator. Although all
nodes perform some cluster management functions, the cluster coordinator is the central point for inter-
node communication.

Configuration of the Cluster


The system administrator sets up cluster configuration parameters and does an initial cluster startup;
thereafter, the cluster regulates itself without manual intervention in normal operation. Configuration
parameters for the cluster include the cluster name, nodes, networking parameters for the cluster
heartbeat, cluster lock information, timing parameters (discussed in detail in Planning and Documenting
an HA Cluster), and vCenter server or ESXi host parameters when using VMware Virtual Machine File
System (VMFS) volumes (described in Specifying a VCENTER_SERVER or ESX_HOST on page
196). Cluster parameters are entered by editing the cluster configuration file (see Configuring the
Cluster on page 193). The parameters you enter are used to build a binary configuration file, which is
propagated to all nodes in the cluster. This binary cluster configuration file must be the same on all the
nodes in the cluster.

Heartbeat Messages
Central to the operation of the cluster manager is the sending and receiving of heartbeat messages
among the nodes in the cluster. Each node in the cluster exchanges UDP heartbeat messages with every
other node over each IP network configured as a heartbeat device.
If a cluster node does not receive heartbeat messages from all other cluster nodes within the prescribed
time, a cluster re-formation is initiated; see What Happens when a Node Times Out. At the end of the
re-formation, if a new set of nodes forms a cluster, that information is passed to the package coordinator
(described later in this chapter, under How the Package Manager Works). Failover packages that were
running on nodes that are no longer in the new cluster are transferred to their adoptive nodes.
If heartbeat and data are sent over the same LAN subnet, data congestion may cause Serviceguard to
miss heartbeats and initiate a cluster re-formation that would not otherwise have been needed. For this
reason, Hewlett Packard Enterprise recommends that you dedicate a LAN for the heartbeat as well as
configuring heartbeat over the data network.
Each node sends its heartbeat message at a rate calculated by Serviceguard on the basis of the value of
the MEMBER_TIMEOUT parameter, set in the cluster configuration file, which you create as a part of
cluster configuration.

IMPORTANT: When multiple heartbeats are configured, heartbeats are sent in parallel;
Serviceguard must receive at least one heartbeat to establish the health of a node. Hewlett Packard
Enterprise recommends that you configure all subnets that interconnect cluster nodes as heartbeat
networks; this increases protection against multiple faults at no additional cost.
Heartbeat IP addresses must be on the same subnet on each node, but it is possible to configure a
cluster that spans subnets; see Cross-Subnet Configurations. See HEARTBEAT_IP, under
Cluster Configuration Parameters on page 111, for more information about heartbeat
requirements. For timeout requirements and recommendations, see the MEMBER_TIMEOUT
parameter description in the same section. For troubleshooting information, see Cluster Re-
formations Caused by MEMBER_TIMEOUT Being Set too Low. See also Cluster Daemon:
cmcld on page 28.
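When several heartbeat networks are configured, a peer counts as healthy as long as at least one of them has delivered a heartbeat within the prescribed time. The following Python sketch illustrates that rule with logical timestamps. The function name node_is_healthy, the network labels, and the timeout value are hypothetical, and the sketch does not model how Serviceguard derives its timing from MEMBER_TIMEOUT.

```python
def node_is_healthy(last_heartbeat_by_network, now, timeout):
    """Return True if any heartbeat network delivered a recent heartbeat.

    last_heartbeat_by_network: dict mapping network label -> time of the
    last heartbeat received from the peer on that network.
    """
    return any(
        now - received <= timeout
        for received in last_heartbeat_by_network.values()
    )


if __name__ == "__main__":
    # The dedicated heartbeat LAN is silent, but the data LAN still carries heartbeats.
    print(node_is_healthy({"hb_lan": 2.0, "data_lan": 10.0}, now=12.0, timeout=5.0))
    # Both networks are silent for longer than the timeout.
    print(node_is_healthy({"hb_lan": 2.0, "data_lan": 3.0}, now=12.0, timeout=5.0))
```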



Manual Startup of Entire Cluster
A manual startup forms a cluster out of all the nodes in the cluster configuration. Manual startup is
normally done the first time you bring up the cluster, after cluster-wide maintenance or upgrade, or after
reconfiguration.
Before startup, the same binary cluster configuration file must exist on all nodes in the cluster. The
system administrator starts the cluster with the cmruncl command issued from one node. The cmruncl
command can only be used when the cluster is not running, that is, when none of the nodes is running
the cmcld daemon.
During startup, the cluster manager software checks to see if all nodes specified in the startup command
are valid members of the cluster, are up and running, are attempting to form a cluster, and can
communicate with each other. If they can, then the cluster manager forms the cluster.

Automatic Cluster Startup


An automatic cluster startup occurs any time a node reboots and joins the cluster. This can follow the
reboot of an individual node, or it may be when all nodes in a cluster have failed, as when there has been
an extended power failure and all SPUs went down.
Automatic cluster startup will take place if the flag AUTOSTART_CMCLD is set to 1 in the
$SGCONF/cmcluster.rc file. When any node reboots with this parameter set to 1, it will rejoin an
existing cluster, or if none exists it will attempt to form a new cluster.

Dynamic Cluster Re-formation


A dynamic re-formation is a temporary change in cluster membership that takes place as nodes join or
leave a running cluster. Re-formation differs from reconfiguration, which is a permanent modification of
the configuration files. Re-formation of the cluster occurs under the following conditions (not a complete
list):

• An SPU or network failure was detected on an active node.


• An inactive node wants to join the cluster. The cluster manager daemon has been started on that
node.
• A node has been added to or deleted from the cluster configuration.
• The system administrator halted a node.
• A node halts because of a package failure.
• A node halts because of a service failure.
• Heavy network traffic prohibited the heartbeat signal from being received by the cluster.
• The heartbeat network failed, and another network is not configured to carry heartbeat.

Typically, re-formation results in a cluster with a different composition. The new cluster may contain fewer
or more nodes than in the previous incarnation of the cluster.

Cluster Quorum to Prevent Split-Brain Syndrome


In general, the algorithm for cluster re-formation requires a cluster quorum of a strict majority (that is,
more than 50%) of the nodes previously running. If both halves (exactly 50%) of a previously running
cluster were allowed to re-form, there would be a split-brain situation in which two instances of the same
cluster were running. In a split-brain scenario, different incarnations of an application could end up
simultaneously accessing the same disks. One incarnation might well be initiating recovery activity while
the other is modifying the state of the disks. Serviceguard’s quorum requirement is designed to prevent a
split-brain situation.

Cluster Lock
Although a cluster quorum of more than 50% is generally required, exactly 50% of the previously running
nodes may re-form as a new cluster provided that the other 50% of the previously running nodes do
not also re-form. This is guaranteed by the use of a tie-breaker to choose between the two equal-sized
node groups, allowing one group to form the cluster and forcing the other group to shut down. This tie-
breaker is known as a cluster lock. The cluster lock is implemented either by means of a lock LUN or a
quorum server. A cluster lock is required on two-node clusters.
The cluster lock is used as a tie-breaker only for situations in which a running cluster fails and, as
Serviceguard attempts to form a new cluster, the cluster is split into two sub-clusters of equal size. Each
sub-cluster will attempt to acquire the cluster lock. The sub-cluster which gets the cluster lock will form the
new cluster, preventing the possibility of two sub-clusters running at the same time. If the two sub-clusters
are of unequal size, the sub-cluster with greater than 50% of the nodes will form the new cluster, and the
cluster lock is not used.
If you have a two-node cluster, you are required to configure a cluster lock. If communications are lost
between these two nodes, the node that obtains the cluster lock will take over the cluster and the other
node will halt (system reset). Without a cluster lock, a failure of either node in the cluster will cause the
other node, and therefore the cluster, to halt. Note also that if the cluster lock fails during an attempt to
acquire it, the cluster will halt.
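The quorum and tie-breaking rules described above can be sketched as a small decision function. This is a simplified model for illustration, not Serviceguard's implementation; the function name reformation_outcome and its parameters are hypothetical.

```python
def reformation_outcome(group_size: int, previous_size: int,
                        holds_lock: bool = False) -> str:
    """Decide whether a group of surviving nodes may re-form the cluster.

    group_size:    nodes in this sub-cluster that can still communicate
    previous_size: nodes that were running in the previous cluster
    holds_lock:    True if this sub-cluster acquired the cluster lock
    """
    if not 0 < group_size <= previous_size:
        raise ValueError("group_size must be between 1 and previous_size")
    if 2 * group_size > previous_size:
        return "form"   # strict majority: the cluster lock is not used
    if 2 * group_size == previous_size and holds_lock:
        return "form"   # exactly 50%: the tie-breaker decides
    return "halt"       # minority, or a tie without the lock


# Two-node cluster that loses communication: only the lock holder survives.
print(reformation_outcome(1, 2, holds_lock=True))   # form
print(reformation_outcome(1, 2, holds_lock=False))  # halt
```

The model also shows why a two-node cluster requires a lock: a single surviving node is never a strict majority of two.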

NOTE: In Serviceguard for Linux 12.00.30, a new feature called “Smart Quorum” is introduced to handle
the split-brain failure scenario. By default, the Smart Quorum feature is disabled. For more information
about how to enable this feature, see Understanding the Smart Quorum on page 324.

Use of a Lock LUN as the Cluster Lock


A lock LUN can be used for clusters up to and including four nodes in size. The cluster lock LUN is a
special piece of storage (known as a partition) that is shareable by all nodes in the cluster. When a node
obtains the cluster lock, this partition is marked so that other nodes will recognize the lock as “taken.”

NOTE:

• The lock LUN is dedicated for use as the cluster lock, and, in addition, Hewlett Packard Enterprise
recommends that this LUN comprise the entire disk; that is, the partition should take up the entire disk.
• An iSCSI storage device does not support configuring a lock LUN.
• A storage device of type Dynamically linked storage configuration does not support configuring lock
LUN. For description about Dynamically linked storage configuration, see Storage configuration type
in a VMware Environment.

The complete path name of the lock LUN is identified in the cluster configuration file.
The operation of the lock LUN is shown in the following figure.

Figure 7: Lock LUN Operation

Serviceguard periodically checks the health of the lock LUN and writes messages to the syslog file if the
disk fails the health check. This file should be monitored for early detection of lock disk problems.

Use of the Quorum Server as a Cluster Lock


The cluster lock in Linux can also be implemented by means of a quorum server. A quorum server can be
used in clusters of any size. The quorum server software can be configured as a Serviceguard package,
or standalone, but in either case it must run on a system outside of the cluster for which it is providing
quorum services.
The quorum server listens to connection requests from the Serviceguard nodes on a known port. The
server maintains a special area in memory for each cluster, and when a node obtains the cluster lock, this
area is marked so that other nodes will recognize the lock as “taken.”
If the quorum server is not available when its tie-breaking services are needed during a cluster re-
formation, the cluster will halt.
The operation of the quorum server is shown in the figure. When there is a loss of communication
between node 1 and node 2, the quorum server chooses one node (in this example, node 2) to continue
running in the cluster. The other node halts.

Figure 8: Quorum Server Operation

A quorum server can provide quorum services for multiple clusters. Figure 9: Quorum Server to Cluster
Distribution on page 38 illustrates quorum server use across four clusters.

Figure 9: Quorum Server to Cluster Distribution (a standalone quorum server serves clusters 1 and 2; a
quorum server package serves clusters 3 and 4)

IMPORTANT: For more information about the quorum server, see the latest version of the HPE
Serviceguard Quorum Server release notes at http://www.hpe.com/info/linux-serviceguard-docs
(Select HP Serviceguard Quorum Server Software).

NOTE: In Serviceguard for Linux 12.00.30, a new feature called “Smart Quorum” is introduced to handle
the split-brain failure scenario. By default, the Smart Quorum feature is disabled. For more information
about how to enable this feature, see Understanding the Smart Quorum on page 324.

No Cluster Lock
Normally, you should not configure a cluster of three or fewer nodes without a cluster lock. In two-node
clusters, a cluster lock is required. You may consider using no cluster lock with configurations of three or
more nodes, although the decision should be affected by the fact that any cluster may require tie-
breaking. For example, if one node in a three-node cluster is removed for maintenance, the cluster re-
forms as a two-node cluster. If a tie-breaking scenario later occurs due to a node or communication
failure, the entire cluster will become unavailable.

In a cluster with four or more nodes, you may not need a cluster lock since the chance of the cluster
being split into two halves of equal size is very small. However, be sure to configure your cluster to
prevent the failure of exactly half the nodes at one time. For example, make sure there is no potential
single point of failure such as a single LAN between equal numbers of nodes, and that you don’t have
exactly half of the nodes on a single power circuit.
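One way to review a planned layout against this advice is to list, for each node, the failure domain it depends on (a power circuit or a LAN segment, for example) and flag any domain that holds exactly half of the nodes. The sketch below is a planning aid only; the function name and the sample mapping are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def half_cluster_domains(node_domains: Dict[str, str]) -> List[str]:
    """Return the failure domains that contain exactly half of the nodes.

    node_domains maps each node name to the domain it depends on,
    for example the power circuit that feeds it.
    """
    total = len(node_domains)
    counts = Counter(node_domains.values())
    return sorted(d for d, n in counts.items() if 2 * n == total)


# Hypothetical four-node layout: two nodes on each of two power circuits.
layout = {"node1": "circuitA", "node2": "circuitA",
          "node3": "circuitB", "node4": "circuitB"}
print(half_cluster_domains(layout))  # ['circuitA', 'circuitB']
```

Any domain the function returns is a single point of failure that could leave exactly half of the nodes running, which is the case that needs a tie-breaker.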

What Happens when You Change the Quorum Configuration Online


You can change the quorum configuration while the cluster is up and running. This includes changes to
the quorum method (for example, from a lock disk to a quorum server), the quorum device (for example,
from one quorum server to another), and the parameters that govern them (for example, the quorum
server polling interval). For more information about the quorum server and lock parameters, see Cluster
Configuration Parameters on page 111.
When you make quorum configuration changes, Serviceguard goes through a two-step process:

Procedure

1. All nodes switch to a strict majority quorum (turning off any existing quorum devices).
2. All nodes switch to the newly configured quorum method, device and parameters.

IMPORTANT: During Step 1, while the nodes are using a strict majority quorum, node failures can
cause the cluster to go down unexpectedly if the cluster has been using a quorum device before the
configuration change. For example, suppose you change the quorum server polling interval while a
two-node cluster is running. If a node fails during Step 1, the cluster will lose quorum and go down,
because a strict majority of prior cluster members (two out of two in this case) is required. The
duration of Step 1 is typically around a second, so the chance of a node failure occurring during that
time is very small.
In order to keep the time interval as short as possible, make sure you are changing only the quorum
configuration, and nothing else, when you apply the change.
If this slight risk of a node failure leading to cluster failure is unacceptable, halt the cluster before
you make the quorum configuration change.

Using the Cluster Generic Resources Monitoring Service


Cluster generic resource is a resource monitoring mechanism provided by Serviceguard to monitor critical
resources of a cluster. Use the cluster generic resource to integrate custom or user-defined monitoring
scripts in the cluster configuration. Cluster Generic Resource feature can be configured only when all the
nodes in the cluster are running on Serviceguard version A.12.30.00 or later.
You can specify the resources that are common across all nodes or resources that are required for one or
more packages in a cluster to be monitored at the cluster level. All the packages or workloads that require
this common resource can reference it as part of their respective package configuration. Each of these
packages can have different criteria to declare the health of resource and act accordingly. Cluster generic
resource provides a single monitoring agent at the cluster level and various packages that are dependent
can use it appropriately.
You can configure a maximum of 10 cluster generic resources of node scope. These count toward the
overall limit of 100 generic resources; that is, if 10 cluster generic resources are configured in a
cluster, at most 90 package generic resources can be configured.
You can either generate a new cluster configuration file containing the generic resource parameters or
add the following cluster generic resource parameters to an existing cluster configuration. When you
generate a cluster configuration file, Serviceguard provides the following parameters for configuring
cluster generic resources:



• GENERIC_RESOURCE_NAME
• GENERIC_RESOURCE_TYPE
• GENERIC_RESOURCE_CMD
• GENERIC_RESOURCE_SCOPE
• GENERIC_RESOURCE_RESTART
• GENERIC_RESOURCE_HALT_TIMEOUT

You can then configure cluster generic resources using these parameters. For details on the parameters,
see Cluster Configuration Parameters and the cmquerycl (1m) manpage. For steps to configure
cluster generic resources, see Configuring Cluster Generic Resources. You can also add, delete, or
modify generic resources depending on certain conditions. For information, see Online reconfiguration
of cluster generic resources.
Monitoring of these resources takes place outside of the Serviceguard environment. The resources are
monitored by writing monitoring scripts that can be launched within the Serviceguard environment by
configuring them as cluster generic resource command (GENERIC_RESOURCE_CMD) in cluster
configuration.
These scripts are written by end users and must contain the core logic to monitor a resource and set the
status of the generic resource accordingly using cmsetresource(1m). The scripts are started as
part of cluster start and will continue to run until cluster services are halted. For more information, see
Monitoring Script for Cluster Generic Resources.
See the recommendation from Hewlett Packard Enterprise and an example under Cluster Generic
Resources template scripts.
Cluster Generic resources can be of two types - Simple and Extended. The type of generic resource
should be specified by using GENERIC_RESOURCE_TYPE parameter of cluster configuration.
Simple Generic resource:

• For a simple resource, the monitoring mechanism is based on the status of the resource.
• The status can be UP, DOWN, or UNKNOWN.
• The default status is UNKNOWN; UP and DOWN can be set using the cmsetresource(1m)
command.

Extended Generic Resource:

• For an extended resource, the monitoring mechanism is based on the current value of the resource.
• The default current value is 0.
• Valid values are positive integer values ranging from 1 to 2147483647.

NOTE: You can get or set the status/value of a simple/extended generic resource using the
cmgetresource(1m) and cmsetresource(1m) commands respectively. See Getting and Setting
the Status/Value of a Simple/Extended Cluster Generic Resource on page 211 and the manpages for
more information.
A single cluster can have a combination of simple and extended resources, but a given generic resource
cannot be configured as a simple resource in cluster and as an extended resource in package. It must be
either simple generic resource or extended generic resource in both cluster and packages.



How the Package Manager Works
Packages are the means by which Serviceguard starts and halts configured applications. A package is a
collection of services, disk volumes, generic resources, and IP addresses, that are managed by
Serviceguard to ensure they are available.
Each node in the cluster runs an instance of the package manager; the package manager residing on the
cluster coordinator is known as the package coordinator.
The package coordinator does the following:

• Decides when and where to run, halt, or move packages.

The package manager on all nodes does the following:

• Executes the control scripts that run and halt packages and their services.
• Reacts to changes in the status of monitored resources.

Package Types
Three different types of packages can run in the cluster; the most common is the failover package. There
are also special-purpose packages that run on more than one node at a time, and so do not fail over.
They are typically used to manage resources of certain failover packages.

Non-failover Packages
There are two types of special-purpose packages that do not fail over and that can run on more than one
node at the same time: the system multi-node package, which runs on all nodes in the cluster, and the
multi-node package, which can be configured to run on all or some of the nodes in the cluster. System
multi-node packages are reserved for use by Hewlett Packard Enterprise-supplied applications.
The rest of this section describes failover packages.

Failover Packages
A failover package starts up on an appropriate node (see node_name) when the cluster
starts. In the case of a service, network, or other resource or dependency failure, package failover takes
place. A package failover involves both halting the existing package and starting the new instance of the
package on a new node.
Failover is shown in the following figure:

Figure 10: Package Moving during Failover


Configuring Failover Packages
You configure each package separately. You create a failover package by generating and editing a
package configuration file template, then adding the package to the cluster configuration database;
details are in Configuring Packages and Their Services.
Modular packages are managed by a master control script that is installed with Serviceguard; see
Configuring Packages and Their Services for instructions for creating modular packages.
Deciding When and Where to Run and Halt Failover Packages
The package configuration file assigns a name to the package and includes a list of the nodes on which
the package can run.
Failover packages list the nodes in order of priority (i.e., the first node in the list is the highest priority
node). In addition, failover packages’ files contain three parameters that determine failover behavior.
These are the auto_run parameter, the failover_policy parameter, and the failback_policy parameter.
Failover Packages’ Switching Behavior
The auto_run parameter (known in earlier versions of Serviceguard as the pkg_switching_enabled
parameter) defines the default global switching attribute for a failover package at cluster startup: that is,
whether Serviceguard can automatically start the package when the cluster is started, and whether
Serviceguard should automatically restart the package on a new node in response to a failure. Once the
cluster is running, the package switching attribute of each package can be temporarily set with the
cmmodpkg command; at reboot, the configured value will be restored.
The auto_run parameter is set in the package configuration file.
A package switch normally involves moving failover packages and their associated IP addresses to a new
system. The new system must already have the same subnet configured and working properly, otherwise
the packages will not be started.

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with some nodes using
one subnet and some another. This is known as a cross-subnet configuration. In this context, you can
configure packages to fail over from a node on one subnet to a node on another, and you will need to
configure a relocatable IP address for each subnet the package is configured to start on; see About
Cross-Subnet Failover, and in particular the subsection Implications for Application Deployment.

When a package fails over, TCP connections are lost. TCP applications must reconnect to regain
connectivity; this is not handled automatically. Note that if the package is dependent on multiple subnets,
normally all of them must be available on the target node before the package will be started. (In a cross-
subnet configuration, all the monitored subnets that are specified for this package, and configured on the
target node, must be up.)
If the package has a dependency on a resource or another package, the dependency must be met on the
target node before the package can start.
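The start conditions above amount to two subset checks on the target node. The following sketch models the ordinary (non-cross-subnet) case; the function and parameter names are hypothetical, and the real package manager evaluates more than this.

```python
from typing import Set


def can_start_on_node(required_subnets: Set[str], subnets_up_on_node: Set[str],
                      dependencies: Set[str], dependencies_met_on_node: Set[str]) -> bool:
    """Return True if every subnet and dependency the package needs is
    available on the target node (simplified, non-cross-subnet model)."""
    subnets_ok = required_subnets <= subnets_up_on_node
    dependencies_ok = dependencies <= dependencies_met_on_node
    return subnets_ok and dependencies_ok


# Hypothetical package that needs one subnet and depends on package pkgX.
print(can_start_on_node({"192.168.1.0"}, {"192.168.1.0", "10.0.0.0"},
                        {"pkgX"}, {"pkgX"}))   # True
print(can_start_on_node({"192.168.1.0"}, {"10.0.0.0"},
                        {"pkgX"}, {"pkgX"}))   # False
```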
The switching of relocatable IP addresses is shown in the figures that follow. Users connect to each node
with the IP address of the package they wish to use. Each node has a stationary IP address associated
with it, and each package has an IP address associated with it.

Figure 11: Before Package Switching (Pkg1, with package IP address 127.15.12.147, runs on node 1
(192.168.1.1); Pkg2, with package IP address 127.15.12.149, runs on node 2 (192.168.1.2); client 1
connects to 127.15.12.147 and client 2 connects to 127.15.12.149)

In the After Package Switching diagram, node1 has failed and pkg1 has been transferred to node2. pkg1's
IP address was transferred to node2 along with the package. pkg1 continues to be available and is now
running on node2. Also note that node2 now has access both to pkg1's disk and pkg2's disk.

NOTE: For design and configuration information about clusters that span subnets, see the documents
listed under Cross-Subnet Configurations.

Figure 12: After Package Switching (Pkg1 and Pkg2 both run on node 2 (192.168.1.2); client 1 still
connects to 127.15.12.147 and client 2 to 127.15.12.149)

Failover Policy
The Package Manager selects a node for a failover package to run on based on the priority list included
in the package configuration file together with the failover_policy parameter, also in the configuration file.
The failover policy governs how the package manager selects which node to run a package on when a
specific node has not been identified and the package needs to be started. This applies not only to
failovers but also to startup for the package, including the initial startup. The failover policies are
configured_node (the default), min_package_node, site_preferred, and
site_preferred_manual. The parameter is set in the package configuration file. For more information,
see failback_policy.
Automatic Rotating Standby
Using the min_package_node failover policy, it is possible to configure a cluster that lets you use one
node as an automatic rotating standby node for the cluster. Consider the following package configuration
for a four node cluster. Note that all packages can run on all nodes and have the same node_name lists.
Although the example shows the node names in a different order for each package, this is not required.

Table 1: Package Configuration Data

Package Name    NODE_NAME List                FAILOVER_POLICY
pkgA            node1, node2, node3, node4    min_package_node
pkgB            node2, node3, node4, node1    min_package_node
pkgC            node3, node4, node1, node2    min_package_node

When the cluster starts, each package starts as shown in the figure.

Figure 13: Rotating Standby Configuration before Failover (PkgA on node1, PkgB on node2, PkgC on
node3; node4 runs no packages)

If a failure occurs, the failing package would fail over to the node containing fewest running packages:

Figure 14: Rotating Standby Configuration after Failover (PkgA on node1, PkgC on node3, PkgB on
node4; node2 runs no packages)


NOTE: Under the min_package_node policy, when node2 is repaired and brought back into the cluster,
it will then be running the fewest packages, and thus will become the new standby node.
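The min_package_node selection can be sketched as follows, using the node list of pkgB from Table 1. This is an illustrative model, not Serviceguard's code: the function name is hypothetical, and breaking ties by node-list order is an assumption.

```python
from typing import Dict, List, Optional, Set


def pick_min_package_node(node_list: List[str], running: Dict[str, int],
                          up_nodes: Set[str]) -> Optional[str]:
    """Pick the available node from node_list that runs the fewest packages.

    ASSUMPTION: ties are broken by the order of node_list.
    """
    candidates = [n for n in node_list if n in up_nodes]
    if not candidates:
        return None
    # min() returns the first minimal element, so list order breaks ties.
    return min(candidates, key=lambda n: running.get(n, 0))


# node2 has failed; node1 and node3 each run one package, node4 runs none.
pkgB_nodes = ["node2", "node3", "node4", "node1"]
running = {"node1": 1, "node3": 1, "node4": 0}
print(pick_min_package_node(pkgB_nodes, running, {"node1", "node3", "node4"}))  # node4
```

The result matches the rotating standby behavior: pkgB moves to node4, the node that was running no packages.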

If these packages had been set up using the configured_node failover policy, they would start initially
as in Figure 13: Rotating Standby Configuration before Failover, but the failure of node2 would cause
the package to start on node3, as shown in Figure 15: configured_node Policy Packages after Failover.

Figure 15: configured_node Policy Packages after Failover (PkgA on node1; PkgB and PkgC on node3)

If you use configured_node as the failover policy, the package will start up on the highest-priority
eligible node in its node list. When a failover occurs, the package will move to the next eligible node in the
list, in the configured order of priority.
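For comparison, the configured_node policy reduces to a walk down the node list in priority order. In this sketch "eligible" is simplified to "up"; the real eligibility checks (switching enabled, dependencies met, and so on) are left out, and the function name is hypothetical.

```python
from typing import List, Optional, Set


def pick_configured_node(node_list: List[str], up_nodes: Set[str]) -> Optional[str]:
    """Return the highest-priority node in node_list that is available.

    Simplification: a node counts as eligible if it is in up_nodes.
    """
    for node in node_list:
        if node in up_nodes:
            return node
    return None


# pkgB's node list from Table 1, after node2 has failed.
pkgB_nodes = ["node2", "node3", "node4", "node1"]
print(pick_configured_node(pkgB_nodes, {"node1", "node3", "node4"}))  # node3
```

Unlike min_package_node, this choice ignores how many packages node3 already runs, which is why node3 ends up with two packages in the example.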
Failback Policy
The use of the failback_policy parameter allows you to decide whether a package will return to its primary
node if the primary node becomes available and the package is not currently running on the primary
node. The configured primary node is the first node listed in the package’s node list.
The two possible values for this policy are automatic and manual. The parameter is set in the package
configuration file.
As an example, consider the following four-node configuration, in which failover_policy is set to
configured_node and failback_policy is automatic:

Figure 16: Automatic Failback Configuration before Failover (PkgA on node1, PkgB on node2, PkgC on
node3; node4 runs no packages)


Table 2: Node Lists in Sample Cluster

Package Name    NODE_NAME List    FAILOVER POLICY    FAILBACK POLICY
pkgA            node1, node4      configured_node    automatic
pkgB            node2, node4      configured_node    automatic
pkgC            node3, node4      configured_node    automatic

node1 panics, and after the cluster re-forms, pkgA starts running on node4:

Figure 17: Automatic Failback Configuration after Failover (PkgB on node2, PkgC on node3, PkgA on
node4)

After rebooting, node1 rejoins the cluster. At that point, pkgA will be automatically stopped on node4 and
restarted on node1.
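The failback decision in this example can be modeled in a few lines. The sketch is a simplification for illustration; the function name and parameters are hypothetical.

```python
from typing import List, Optional, Set


def failback_target(node_list: List[str], current_node: str,
                    up_nodes: Set[str], failback_policy: str) -> Optional[str]:
    """Return the node the package should move back to, or None to stay put."""
    primary = node_list[0]  # the configured primary node is first in the list
    if (failback_policy == "automatic"
            and current_node != primary
            and primary in up_nodes):
        return primary
    return None


# pkgA from Table 2, running on node4 after node1 has rejoined the cluster.
all_up = {"node1", "node2", "node3", "node4"}
print(failback_target(["node1", "node4"], "node4", all_up, "automatic"))  # node1
print(failback_target(["node1", "node4"], "node4", all_up, "manual"))     # None
```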

Figure 18: Automatic Failback Configuration after Restart of node1 (PkgA on node1, PkgB on node2,
PkgC on node3)

NOTE: Setting the failback_policy to automatic can result in a package failback and application outage
during a critical production period. If you are using automatic failback, you may want to wait to add the
package’s primary node back into the cluster until you can allow the package to be taken out of service
temporarily while it switches back to the primary node.
Serviceguard automatically chooses a primary node for a package when the NODE_NAME is set to '*'.
When you set the NODE_NAME to '*' and the failback_policy is automatic, if you add, delete, or
rename a node in the cluster, the primary node for the package might change, resulting in the automatic
failover of that package.



On Combining Failover and Failback Policies
Combining a failover_policy of min_package_node with a failback_policy of automatic can result in a
package’s running on a node where you did not expect it to run, since the node running the fewest
packages will probably not be the same host every time a failover occurs.

Using the Generic Resources Monitoring Service


The Generic Resources module is a resource monitoring mechanism in Serviceguard that allows you to
monitor critical resources for a package. It integrates custom, user-defined monitors into Serviceguard
by configuring generic resources as part of the package configuration. With generic resources, different
kinds of monitoring mechanisms, such as custom monitors, can be used, and they can coexist in a
single package.
Generic resources have the following advantages:

• Custom-defined monitors can be integrated.
• They provide better control, options, and flexibility in getting and setting the status of a resource.

Generic resources can be configured into any modular style package. They can be configured for failover
or multi-node packages and are included in modular failover packages by default. A single resource can
be specified across multiple packages.
You can either generate a new package configuration file containing the generic resource parameters or
add the module to an existing package to include the generic resource parameters. When you generate a
package with the generic resource module, Serviceguard provides the following parameters for
configuring generic resources:

• generic_resource_name
• generic_resource_evaluation_type
• generic_resource_up_criteria

You can then configure generic resources using these parameters. For details on the parameters, see
Package Parameter Explanations and the cmmakepkg (1m) manpage. For steps to configure
generic resources, see Configuring a Generic Resource.
You can also add, delete, or modify generic resources depending on certain conditions. For information,
see Online Reconfiguration of Generic Resources.
Monitoring of these resources takes place outside of the Serviceguard environment. It is done by
writing monitoring scripts that can be launched either within the Serviceguard environment, by configuring
them as services, or outside of the Serviceguard environment.
These scripts are written by end users and must contain the core logic to monitor a resource and set the
status of the generic resource accordingly using cmsetresource(1m). They are started as part of
package start and will continue to run until package services are halted. For more information, see
Monitoring Script for Generic Resources.
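A monitoring script of this kind is essentially a loop that probes the resource and reports the result. The Python sketch below shows that shape for a simple generic resource. The -r and -s options passed to cmsetresource are an assumption and should be confirmed against the cmsetresource(1m) manpage; the resource name, the probe function, and the polling interval are all hypothetical.

```python
import subprocess
import time
from typing import Callable, List, Optional


def report_status(resource: str, is_up: bool,
                  run: Callable = subprocess.run) -> List[str]:
    """Report the status of a simple generic resource to Serviceguard."""
    status = "up" if is_up else "down"
    # ASSUMPTION: option letters; verify them in the cmsetresource(1m) manpage.
    cmd = ["cmsetresource", "-r", resource, "-s", status]
    run(cmd, check=True)
    return cmd


def monitor(resource: str, probe: Callable[[], bool], interval_s: float = 30.0,
            iterations: Optional[int] = None,
            run: Callable = subprocess.run,
            sleep: Callable[[float], None] = time.sleep) -> None:
    """Probe the resource repeatedly and report each result.

    iterations=None loops until the process is stopped, which matches a
    script that runs until package services are halted.
    """
    count = 0
    while iterations is None or count < iterations:
        report_status(resource, probe(), run=run)
        count += 1
        sleep(interval_s)


# Example call (not executed here): monitor("my_resource", my_probe)
# where my_probe is a user-written function returning True when healthy.
```

The run and sleep parameters exist only so the loop can be exercised without a cluster; a production script would normally use the defaults.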
If there is a common generic resource that needs to be monitored as a part of multiple packages, then the
monitoring script for that resource can be launched as part of one package and all other packages can
use the same monitoring script. There is no need to launch multiple monitors for a common resource. If
the package that has started the monitoring script fails or is halted, then all the other packages that are
using this common resource also fail.
See the recommendation from Hewlett Packard Enterprise and an example under Launching
Monitoring Scripts.
Generic resources can be of two types - Simple and Extended.



A given generic resource is considered to be a simple generic resource when the up criteria parameter is
not specified.

• For a simple resource, the monitoring mechanism is based on the status of the resource.
• The status can be UP, DOWN, or UNKNOWN.
• The default status is UNKNOWN; UP and DOWN can be set using the cmsetresource(1m)
command.

A given generic resource is considered to be an extended generic resource when the up criteria
parameter is specified.

• For an extended resource, the monitoring mechanism is based on the current value of the resource.
• The current value is matched with the generic_resource_up_criteria specified for the resource in a
package and this determines whether the generic resource status is UP or DOWN.
• The default current value is 0.
• Valid values are positive integer values ranging from 1 to 2147483647.

NOTE: You can get or set the status/value of a simple/extended generic resource using the
cmgetresource(1m) and cmsetresource(1m) commands respectively. See Getting and Setting
the Status/Value of a Simple/Extended Generic Resource and the manpages for more information.
A single package can have a combination of simple and extended resources, but a given generic
resource cannot be configured as a simple resource in one package and as an extended resource in
another package. It must be either simple generic resource or extended generic resource in all packages.
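The difference between the two resource types can be illustrated with a small model. Here the up criteria is represented as an ordinary Python function, which is a stand-in chosen for illustration and not the syntax of generic_resource_up_criteria; the function names are hypothetical.

```python
from typing import Callable, Optional

MAX_VALUE = 2147483647  # largest valid value for an extended resource


def simple_status(reported: Optional[str] = None) -> str:
    """Status of a simple resource: UNKNOWN until UP or DOWN is set."""
    if reported is None:
        return "UNKNOWN"
    if reported not in ("UP", "DOWN"):
        raise ValueError("a simple resource can only be set to UP or DOWN")
    return reported


def extended_status(current_value: int, up_criteria: Callable[[int], bool]) -> str:
    """Status of an extended resource, derived from its current value.

    0 is accepted because it is the default value before any value is set.
    """
    if current_value != 0 and not 1 <= current_value <= MAX_VALUE:
        raise ValueError("value must be between 1 and 2147483647")
    return "UP" if up_criteria(current_value) else "DOWN"


print(simple_status())                         # UNKNOWN
print(extended_status(25, lambda v: v >= 10))  # UP
print(extended_status(0, lambda v: v >= 10))   # DOWN
```

Because each package supplies its own criteria, the same current value can be UP for one package and DOWN for another.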

Using Older Package Configuration Files


If you are using package configuration files that were generated using a previous version of Serviceguard,
Hewlett Packard Enterprise recommends you use the cmmakepkg command to open a new template,
and then copy the parameter values into it. In the new template, read the descriptions and defaults of the
choices that did not exist when the original configuration was made. For example, the default for
failover_policy is now configured_node and the default for failback_policy is now manual.
For full details of the current parameters and their default values, see Configuring Packages and Their
Services, and the package configuration file template itself.

How Packages Run


Packages are the means by which Serviceguard starts and halts configured applications. Failover
packages are also units of failover behavior in Serviceguard. A package is a collection of services, disk
volumes, generic resources, and IP addresses that are managed by Serviceguard to ensure they are
available. There can be a maximum of 300 packages per cluster, a total of 900 services, and a total of 100
generic resources per cluster. The generic resource count includes the 10 cluster generic resources.
That is, you can configure a maximum of 90 generic resources at package level and 10 cluster generic
resources at cluster level.
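
The limits above can be written down as a simple pre-check. The following is a minimal sketch and is not part of Serviceguard: the function and argument names are hypothetical, and only the numeric limits come from the text.

```python
# Limits stated in the text above; every name here is illustrative only.
MAX_PACKAGES = 300
MAX_SERVICES = 900
MAX_PACKAGE_GENERIC_RESOURCES = 90   # 100 in total, minus the 10 cluster-level ones
MAX_CLUSTER_GENERIC_RESOURCES = 10


def check_cluster_limits(packages, services, package_resources, cluster_resources):
    """Return a list of limit violations; the list is empty if all counts fit."""
    checks = [
        ("packages", packages, MAX_PACKAGES),
        ("services", services, MAX_SERVICES),
        ("package-level generic resources", package_resources,
         MAX_PACKAGE_GENERIC_RESOURCES),
        ("cluster-level generic resources", cluster_resources,
         MAX_CLUSTER_GENERIC_RESOURCES),
    ]
    problems = []
    for label, count, limit in checks:
        if count > limit:
            problems.append(f"{count} {label} exceeds the maximum of {limit}")
    return problems
```

A planning script could call check_cluster_limits() with the counts from a proposed configuration before any configuration file is applied.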

What Makes a Package Run?


There are 3 types of packages:

• The failover package is the most common type of package. It runs on one node at a time. If a failure
occurs, it can switch to another node listed in its configuration file. If switching is enabled for several
nodes, the package manager will use the failover policy to determine where to start the package.
• A system multi-node package runs on all the active cluster nodes at the same time. It can be started or
halted on all nodes, but not on individual nodes.
• A multi-node package can run on several nodes at the same time. If auto_run is set to yes,
Serviceguard starts the multi-node package on all the nodes listed in its configuration file. It can be
started or halted on all nodes, or on individual nodes, either by user command (cmhaltpkg) or
automatically by Serviceguard in response to a failure of a package component, such as a service or
subnet.

System multi-node packages are supported only for use by applications supplied by Hewlett Packard
Enterprise.
A failover package can be configured to have a dependency on a multi-node or system multi-node
package. The package manager cannot start a package on a node unless the package it depends on is
already up and running on that node.
The package manager will always try to keep a failover package running unless there is something
preventing it from running on any node. The most common reasons for a failover package not being able
to run are that auto_run is disabled so Serviceguard is not allowed to start the package, that node
switching is disabled for the package on particular nodes, or that the package has a dependency that is
not being met. When a package has failed on one node and is enabled to switch to another node, it will
start up automatically in a new location where its dependencies are met. This process is known as
package switching, or remote switching.
A failover package starts on the first available node in its configuration file; by default, it fails over to the
next available one in the list. Note that you do not necessarily have to use a cmrunpkg command to
restart a failed failover package; in many cases, the best way is to enable package and/or node switching
with the cmmodpkg command.
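
The default placement rule described above (first available node in the configured list, then the next available one) can be sketched as follows. This is an illustration under simplifying assumptions, not Serviceguard code: select_node and its arguments are hypothetical names, and the sketch ignores dependencies, switching flags, and failover policies other than the default list order.

```python
def select_node(node_list, available, exclude=()):
    """Return the first node, in configured order, that is available.

    node_list -- node names in the order they appear in the package
                 configuration file
    available -- set of nodes currently able to run the package
    exclude   -- nodes to skip, for example the node the package just failed on
    """
    for node in node_list:
        if node in available and node not in exclude:
            return node
    return None  # no eligible node; the package stays down


nodes = ["node1", "node2", "node3"]
first_choice = select_node(nodes, {"node1", "node2", "node3"})
after_failure = select_node(nodes, {"node1", "node2", "node3"}, exclude={"node1"})
```

With all three nodes available, the first call picks node1; excluding node1 after a failure moves the choice to node2.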
When you create the package, you indicate the list of nodes on which it is allowed to run. System multi-
node packages must list all cluster nodes in their cluster. Multi-node packages and failover packages can
name some subset of the cluster’s nodes or all of them.
If the auto_run parameter is set to yes in a package’s configuration file Serviceguard automatically starts
the package when the cluster starts. System multi-node packages are required to have auto_run set to
yes. If a failover package has auto_run set to no, Serviceguard cannot start it automatically at cluster
startup time; you must explicitly enable this kind of package using the cmmodpkg command.

NOTE: If you configure the package while the cluster is running, the package does not start up
immediately after the cmapplyconf command completes. To start the package without halting and
restarting the cluster, issue the cmrunpkg or cmmodpkg command.

How does a failover package start up, and what is its behavior while it is running? Some of the many
phases of package life are shown in Figure 19: Modular Package Time Line Showing Important
Events on page 50.

[Figure: package time line. A run command or cluster startup puts the package in Starting status while the master control script operates. When the master control script completes, the status is Running and services are running. A halt command or a detected failure puts the package in Halting status while the halt script operates, until the halt script completes.]

Figure 19: Modular Package Time Line Showing Important Events

The following are the most important moments in a package’s life:

1. Before the master control script starts.
2. During master control script execution to start the package.
3. While services are running.
4. If there is a generic resource configured at cluster or package level and it fails, then the package will
be halted.
5. When a service or subnet fails, or a dependency is not met.
6. During master control script execution to halt the package.
7. When the package or the node is halted with a command.
8. When the node fails.

Before the Control Script Starts


First, a node is selected. This node must be in the package’s node list, it must conform to the package’s
failover policy, and any resources required by the package must be available on the chosen node. One
resource is the subnet that is monitored for the package. If the subnet is not available, the package
cannot start on this node. Another type of resource is a dependency on another package. If monitoring
shows a value for a configured resource that is outside the permitted range, the package cannot start.
If a generic resource of type BPS is configured, it must be up; if not, the package cannot start on this
node.
Once a node is selected, a check is then done to make sure the node allows the package to start on it.
Then services are started up for a package by the control script on the selected node. The master control
script on the selected node is used to start a modular package.

During Run Script Execution


Once the package manager has determined that the package can start on a particular node, it launches
the script that starts the package (that is, a package’s control script or master control script is executed
with the start parameter). This script carries out the following steps:

1. Executes any external_pre_script. For more information, see About External Scripts.
2. Activates volume groups or disk groups.
3. Mounts file systems.
4. Assigns package IP addresses to the LAN card on the node (failover packages only).
5. Executes any external_script. For more information, see About External Scripts.
6. Starts each package service.
7. Exits with an exit code of zero (0).

[Figure: time line of run script execution, triggered by cmmodpkg -e, cluster startup, or cmrunpkg. While the master control script operates, it activates volume groups, mounts file systems, assigns package IP addresses, executes customer run commands, and launches services.]

Figure 20: Modular Package Time Line

At any step along the way, an error will result in the script exiting abnormally (with an exit code of 1). For
example, if a package service is unable to be started, the control script will exit with an error.
If the run script execution is not complete before the time specified in the run_script_timeout parameter,
the package manager will kill the script. During run script execution, messages are written to a log file. For
modular packages, the pathname is determined by the script_log_file parameter in the package
configuration file. Normal starts are recorded in the log, together with error messages or warnings related
to starting the package.

NOTE: After the package run script has finished its work, it exits, which means that the script is no longer
executing once the package is running normally. After the script exits, the PIDs of the services started by
the script are monitored by the package manager directly. If the service dies, the package manager will
then run the package halt script or, if service_fail_fast_enabled is set to yes, it will halt the node on
which the package is running. If a number of restarts is specified for a service in the package control
script, the service may be restarted if the restart count allows it, without re-running the package run script.

Normal and Abnormal Exits from the Run Script


Exit codes on leaving the run script determine what happens to the package next. A normal exit means
the package startup was successful, but all other exits mean that the start operation did not complete
successfully.

• 0—normal exit. The package started normally, so all services are up on this node.
• 1—abnormal exit, also known as no_restart exit. The package did not complete all startup steps
normally. Services are killed, and the package is disabled from failing over to other nodes.
• 2—alternative exit, also known as restart exit. There was an error, but the package is allowed to
start up on another node. You might use this kind of exit from a customer-defined procedure if there
was an error, but starting the package on another node might succeed. A package with a restart
exit is disabled from running on the local node, but can still run on other nodes.
• Timeout—Another type of exit occurs when the run_script_timeout is exceeded. In this scenario,
the package is killed and disabled globally. It is not disabled on the current node, however. The
package script may not have been able to clean up some of its resources such as LVM volume groups
or package mount points, so before attempting to start up the package on any node, be sure to check
whether any resources for the package need to be cleaned up.
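
The run script exit conditions listed above can be summarized as a small lookup. The sketch below only restates the list in code form, as an aid to reading it; RunOutcome and run_script_outcome are hypothetical names, and a value of None means the text does not say the setting is changed.

```python
from typing import NamedTuple, Optional


class RunOutcome(NamedTuple):
    started: bool
    package_switching: Optional[bool]   # False = disabled from failing over
    local_node_enabled: Optional[bool]  # False = disabled on the local node


def run_script_outcome(exit_code=None, timed_out=False):
    """Map a run script exit (or a run_script_timeout expiry) to its effect."""
    if timed_out:
        # Package is killed and disabled globally, but not on the current node.
        return RunOutcome(False, False, None)
    if exit_code == 0:
        # Normal exit: all services are up on this node.
        return RunOutcome(True, None, None)
    if exit_code == 1:
        # no_restart exit: the package may not fail over to other nodes.
        return RunOutcome(False, False, None)
    if exit_code == 2:
        # restart exit: disabled locally, still allowed on other nodes.
        return RunOutcome(False, None, False)
    raise ValueError(f"exit code {exit_code!r} is not described in the text")
```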

Service Startup with cmrunserv


Within the package control script, the cmrunserv command starts up the individual services. This
command is executed once for each service that is coded in the file. You can configure a number of
restarts for each service. The cmrunserv command passes this number to the package manager, which
will restart the service the appropriate number of times if the service should fail. For more information
about configuring services in modular packages, see the discussion starting with service_name in
Chapter 6, and the comments in the package configuration template file.

While Services are Running


During the normal operation of cluster services, the package manager continuously monitors the
following:

• Process IDs of the services
• Subnets configured for monitoring in the package configuration file
• Generic resources configured for monitoring in the package configuration file

If a service fails but the restart parameter for that service is set to a value greater than 0, the service will
restart, up to the configured number of restarts, without halting the package.
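
The restart rule can be illustrated with a short sketch. It assumes a plain integer restart limit and leaves out the fail-fast case; handle_service_failure is a hypothetical name, not a Serviceguard interface.

```python
def handle_service_failure(restarts_used, restart_limit):
    """Decide what happens when a monitored service exits.

    Returns ("restart", new_count) while restarts remain, otherwise
    ("halt_package", restarts_used). The package run script is not
    re-run for a restart.
    """
    if restarts_used < restart_limit:
        return "restart", restarts_used + 1
    return "halt_package", restarts_used


# A service configured with two restarts fails three times in a row.
used = 0
actions = []
for _ in range(3):
    action, used = handle_service_failure(used, restart_limit=2)
    actions.append(action)
```

In this run the first two failures lead to a restart and the third halts the package.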
During normal operation, while all services are running, you can see the status of the services in the
“Script Parameters” section of the output of the cmviewcl command.

When a Service, Subnet, or Generic Resource Fails, or a Dependency is Not Met
What happens when something goes wrong? If a service fails and there are no more restarts, or if a
configured dependency on another package is not met, then a failover package will halt on its current
node and, depending on the setting of the package switching flags, may be restarted on another node. If
a multi-node or system multi-node package fails, all of the packages that have configured a dependency
on it will also fail.
Package halting normally means that the package halt script executes (see the next section). However, if
a failover package’s configuration has the service_fail_fast_enabled flag set to yes for the service that
fails, then the node will halt as soon as the failure is detected. If this flag is not set, the loss of a service
will result in halting the package gracefully by running the halt script.
If auto_run is set to yes, the package will start up on another eligible node, if it meets all the
requirements for startup. If auto_run is set to no, then the package simply halts without starting up
anywhere else.

NOTE: If a package is dependent on a subnet, and the subnet on the primary node fails, the package will
start to shut down. If the subnet recovers immediately (before the package is restarted on an adoptive
node), the package manager restarts the package on the same node; no package switch occurs.

When a Package is Halted with a Command
The Serviceguard cmhaltpkg command has the effect of executing the package halt script, which halts
the services that are running for a specific package. This provides a graceful shutdown of the package
that is followed by disabling automatic package startup (see auto_run).
You cannot halt a multi-node or system multi-node package unless all the packages that have a
configured dependency on it are down. Use cmviewcl to check the status of dependents. For example, if
pkg1 and pkg2 depend on PKGa, both pkg1 and pkg2 must be halted before you can halt PKGa.
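
The ordering rule in the example above can be computed from the dependency relationships. The following sketch only works out an order; it does not run cmhaltpkg or query the cluster, and halt_order and the depends_on mapping are hypothetical names for illustration.

```python
def halt_order(target, depends_on):
    """Return the packages to halt, dependents first, ending with target.

    depends_on maps each package name to the list of packages it depends on.
    """
    order = []
    visited = set()

    def visit(pkg):
        if pkg in visited:
            return
        visited.add(pkg)
        # Any package that depends on pkg must be halted before pkg.
        for other, deps in sorted(depends_on.items()):
            if pkg in deps:
                visit(other)
        order.append(pkg)

    visit(target)
    return order


dependencies = {"pkg1": ["PKGa"], "pkg2": ["PKGa"], "PKGa": []}
order_for_pkga = halt_order("PKGa", dependencies)
```

For the example in the text, the computed order halts pkg1 and pkg2 before PKGa.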

NOTE: If you use the cmhaltpkg command with the -n <nodename> option, the package is halted only if it
is running on that node.

The cmmodpkg command cannot be used to halt a package, but it can disable switching either on
particular nodes or on all nodes. A package can continue running when its switching has been disabled,
but it will not be able to start on other nodes if it stops running on its current node.

During Halt Script Execution


Once the package manager has detected the failure of a service or package that a failover package
depends on, or when the cmhaltpkg command has been issued for a particular package, the package
manager launches the halt script. That is, the master control script is executed with the stop parameter. This
script carries out the following steps:

1. Halts all package services.
2. Executes any external_script. For more information, see external_script.
3. Removes package IP addresses from the LAN card on the node.
4. Unmounts file systems.
5. Deactivates volume groups.
6. Revokes persistent registrations and reservations, if any.
7. Executes any external_pre_script. For more information, see external_pre_script.
8. Exits with an exit code of zero (0).

[Figure: time line of halt script execution, triggered by cmhaltpkg, cmhaltcl, loss of a resource, or loss of a service. While the halt script operates, it halts services, executes customer halt commands, removes package IP addresses, unmounts file systems, and deactivates volume groups.]

Figure 21: Modular Package Time Line for Halt Script Execution


At any step along the way, an error will result in the script exiting abnormally (with an exit code of 1). If the
halt script execution is not complete before the time specified in the halt_script_timeout parameter, the package
manager will kill the script. During halt script execution, messages are written to a log file. For modular
packages, the pathname is determined by the script_log_file parameter in the package configuration file.
Normal halts are recorded in the log, together with error messages or warnings related to halting the
package.

Normal and Abnormal Exits from the Halt Script


The package’s ability to move to other nodes is affected by the exit conditions on leaving the halt script.
The following are the possible exit codes:

• 0—normal exit. The package halted normally, so all services are down on this node.
• 1—abnormal exit, also known as no_restart exit. The package did not halt normally. Services are
killed, and the package is disabled globally. It is not disabled on the current node, however.
• 2—abnormal exit, also known as restart exit. The package did not halt normally. Services are
killed, and the package is disabled globally. It is not disabled on the current node, however. The
package is allowed to run on an alternate node.
• 3—abnormal exit. The package did not halt normally and will be placed in the halt_aborted state.
Package switching is disabled and the package will not fail over to other nodes.
• Timeout—Another type of exit occurs when the halt_script_timeout is exceeded. In this scenario, the
package is killed and disabled globally. It is not disabled on the current node, however.

Package Control Script Error and Exit Conditions


The table shows the possible combinations of error condition, failfast setting and package movement for
failover packages.

Table 3: Error Conditions and Package Movement for Failover Packages

The first three columns describe the package error condition; the remaining four columns describe the results.

Error or Exit Code | Node Failfast Enabled | Service Failfast Enabled | Linux Status on Primary after Error | Halt Script Runs after Error or Exit | Package Allowed to Run on Primary Node after Error | Package Allowed to Run on Alternate Node
Service Failure | Either Setting | Yes | system reset | No | N/A (system reset) | Yes
Service Failure | Either Setting | No | Running | Yes | No | Yes
Run Script Exit 1 | Either Setting | Either Setting | Running | No | Not changed | No
Run Script Exit 2 | Yes | Either Setting | system reset | No | N/A (system reset) | Yes
Run Script Exit 2 | No | Either Setting | Running | No | No | Yes
Run Script Timeout | Yes | Either Setting | system reset | No | N/A (system reset) | Yes
Run Script Timeout | No | Either Setting | Running | No | Not changed | No
Halt Script Exit 1 | Yes | Either Setting | Running | N/A | Yes | No
Halt Script Exit 1 | No | Either Setting | Running | N/A | Yes | No
Halt Script Exit 2 | No | Either Setting | Running | N/A | Yes | Yes
Halt Script Exit 3 | No | No | Running | N/A | Yes | No
Halt Script Timeout | Yes | Either Setting | system reset | N/A | N/A (system reset) | Yes, unless the timeout happened after the cmhaltpkg command was executed.
Halt Script Timeout | No | Either Setting | Running | N/A | Yes | No
Service Failure | Either Setting | Yes | system reset | No | N/A (system reset) | Yes
Service Failure | Either Setting | No | Running | Yes | No | Yes
Loss of Network | Yes | Either Setting | system reset | No | N/A (system reset) | Yes
Loss of Network | No | Either Setting | Running | Yes | Yes | Yes
package depended on failed | Either Setting | Either Setting | Running | Yes | Yes when dependency is again met | Yes if dependency met

How the Network Manager Works


The purpose of the network manager is to detect and recover from network card failures so that network
services remain highly available to clients. In practice, this means assigning IP addresses for each
package to LAN interfaces on the node where the package is running and monitoring the health of all
interfaces, switching them when necessary.

NOTE: Serviceguard monitors the health of the network interfaces (NICs) and can monitor the IP level
(layer 3) network.

Stationary and Relocatable IP Addresses and Monitored Subnets


Each node (host system) should have an IP address for each active network interface. This address,
known as a stationary IP address, is configured in the file
/etc/sysconfig/network-scripts/ifcfg-<interface> on Red Hat or
/etc/sysconfig/network/ifcfg-<mac_address> on SUSE. The stationary IP address is not associated
with packages, and it is not transferable to another node.
Stationary IP addresses are used to transmit data, heartbeat messages (described under How the
Cluster Manager Works), or both. They are configured into the cluster via the cluster configuration file;
see the entries for HEARTBEAT_IP and STATIONARY_IP under Cluster Configuration Parameters on
page 111.
Serviceguard monitors the subnets represented by these IP addresses. They are referred to as monitored
subnets, and you can see their status at any time in the output of the cmviewcl command; see Network
Status on page 259 for an example.
You can also configure these subnets to be monitored for packages, using the monitored_subnet
parameter in the package configuration file. A package will not start on a node unless the subnet(s)
identified by monitored_subnet in its package configuration file are up and reachable from that node.

IMPORTANT: Any subnet identified as a monitored_subnet in the package configuration file must
be configured into the cluster via NETWORK_INTERFACE and either STATIONARY_IP or
HEARTBEAT_IP in the cluster configuration file. See Cluster Configuration Parameters on page
111 and Package Parameter Explanations.

In addition to the stationary IP address, you normally assign one or more unique IP addresses to each
package. The package IP address is assigned to a LAN interface when the package starts up.
The IP addresses associated with a package are called relocatable IP addresses (also known as IP
aliases, package IP addresses or floating IP addresses) because the addresses can actually move from
one cluster node to another. You can use up to 200 relocatable IP addresses in a cluster spread over as
many as 300 packages. These addresses can be IPv4, IPv6, or a combination of both address families.
Because system multi-node and multi-node packages do not fail over, they do not have relocatable IP
addresses.
A relocatable IP address is like a virtual host IP address that is assigned to a package. Hewlett Packard
Enterprise recommends that you configure names for each package through DNS (Domain Name
System). A program then can use the package’s name like a host name as the input to
gethostbyname(3), which will return the package’s relocatable IP address.
Relocatable addresses (but not stationary addresses) can be taken over by an adoptive node if control of
the package is transferred. This means that applications can access the package via its relocatable
address without knowing which node the package currently resides on.
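
A client-side lookup of a package name might look like the sketch below. It uses Python's socket.getaddrinfo rather than gethostbyname(3), because getaddrinfo also returns IPv6 addresses. The function name package_addresses and the DNS name pkg1.example.com are hypothetical; the sketch assumes the package name has already been registered in DNS as recommended above.

```python
import socket


def package_addresses(package_name, port):
    """Resolve a package's DNS name to its relocatable IP address(es)."""
    infos = socket.getaddrinfo(package_name, port, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]  # first element is the IP address string
        if address not in addresses:
            addresses.append(address)
    return addresses


# Example call (hypothetical name; requires a matching DNS entry):
#   package_addresses("pkg1.example.com", 80)
```

Because the client resolves the package name each time it connects, it does not need to know which node currently runs the package.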

IMPORTANT: Any subnet that is used by a package for relocatable addresses should be configured
into the cluster via NETWORK_INTERFACE and either STATIONARY_IP or HEARTBEAT_IP in the
cluster configuration file. For more information about those parameters, see Cluster Configuration
Parameters on page 111. For more information about configuring relocatable addresses, see the
descriptions of the package ip_ parameters, beginning with ip_subnet.

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with some nodes using
one subnet and some another. This is called a cross-subnet configuration. In this context, you can
configure packages to fail over from a node on one subnet to a node on another, and you will need to
configure a relocatable address for each subnet the package is configured to start on; see About Cross-
Subnet Failover, and in particular the subsection Implications for Application Deployment.

Types of IP Addresses
Both IPv4 and IPv6 address types are supported in Serviceguard. IPv4 addresses are the traditional
addresses of the form n.n.n.n where n is a decimal number between 0 and 255. IPv6 addresses have the
form x:x:x:x:x:x:x:x where x is the hexadecimal value of each of eight 16-bit pieces of the 128-bit address.
You can define heartbeat IPs, stationary IPs, and relocatable (package) IPs as IPv4 or IPv6 addresses (or
certain combinations of both).
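
The two address forms can be told apart with Python's standard ipaddress module, which may be handy when checking values before they go into a configuration file. This is a minimal sketch; the function names are hypothetical, and the /24 prefix length in the usage lines is an assumption made for the example.

```python
import ipaddress


def address_family(text):
    """Return "IPv4" or "IPv6"; raises ValueError for a malformed address."""
    return f"IPv{ipaddress.ip_address(text).version}"


def in_subnet(address, subnet):
    """Check whether an address belongs to a subnet given in prefix notation."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(subnet)


family_v4 = address_family("192.168.1.254")
family_v6 = address_family("3ffe:1000:0:a801::254")
same_subnet = in_subnet("192.168.1.254", "192.168.1.0/24")
```

An address such as 256.1.1.1 is rejected with ValueError, because each IPv4 field must lie between 0 and 255.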

Adding and Deleting Relocatable IP Addresses


When a package is started, any relocatable IP addresses configured for that package are added to the
specified IP subnet. When the package is stopped, the relocatable IP address is deleted from the subnet.
These functions are performed by the cmmodnet command in the package master control script.
IP addresses are configured only on each primary network interface card. Multiple IPv4 addresses on the
same network card must belong to the same IP subnet.

CAUTION: Hewlett Packard Enterprise strongly recommends that you add relocatable addresses to
packages only by editing ip_address in the package configuration file and running
cmapplyconf (1m).

Load Sharing
Serviceguard allows you to configure several services into a single package, sharing a single IP address;
in that case all those services will fail over when the package does. If you want to be able to load-balance
services (that is, move a specific service to a less loaded system when necessary) you can do so by
putting each service in its own package and giving it a unique IP address.

Bonding of LAN Interfaces
Several LAN interfaces on a node can be grouped together in a process known in Linux as channel
bonding. In the bonded group, typically one interface is used to transmit and receive data, while the
others are available as backups. If one interface fails, another interface in the bonded group takes over.
Hewlett Packard Enterprise strongly recommends you use channel bonding in each critical IP subnet to
achieve highly available network services.
Host Bus Adapters (HBAs) do not have to be identical. Ethernet LANs must be the same type, but can be
of different bandwidth (for example, 1 Gb and 100 Mb). Serviceguard for Linux supports the use of
bonding of LAN interfaces at the driver level. The Ethernet driver is configured to employ a group of
interfaces.
Once bonding is enabled, each interface can be viewed as a single logical link of multiple physical ports
with only one IP and MAC address. There is no limit to the number of slaves (ports) per bond, and the
number of bonds per system is limited to the number of Linux modules you can load.
You can bond the ports within a multi-ported networking card (cards with up to four ports are currently
available). Alternatively, you can bond ports from different cards. Hewlett Packard Enterprise
recommends that you use different cards. The figure shows an example of four separate interfaces bonded
into one aggregate.

[Figure: Node 1 shows four individual LANICs without bonding (eth0, eth1, eth2, and eth3). Node 2 shows a group of bonded LANICs presented as bond0 (15.13.122.34).]

Figure 22: Bonded Network Interfaces

The LANs in the non-bonded configuration have four LAN cards, each associated with a separate
non-aggregated IP address and MAC address, and each with its own LAN name (eth0, eth1, eth2, or
eth3). When these ports are aggregated, all four ports are associated with a single IP address and MAC
address. In this example, the aggregated ports are collectively known as bond0, and this is the name by
which the bond is known during cluster configuration.
Figure 23: Bonded NICs shows a bonded configuration using redundant hubs with a crossover cable.

Figure 23: Bonded NICs

In the bonding model, individual Ethernet interfaces are slaves, and the bond is the master. In the basic
high availability configuration (mode 1), one slave in a bond assumes an active role, while the others
remain inactive until a failure is detected. (In Figure 23: Bonded NICs, both eth0 slave interfaces are active.) It is
important that during configuration, the active slave interfaces on all nodes are connected to the same
hub. If this were not the case, then normal operation of the LAN would require the use of the crossover
between the hubs and the crossover would become a single point of failure.
After the failure of a card, messages are still carried on the bonded LAN and are received on the other
node, but now eth1 has become active in bond0 on node1. This situation is shown in Figure 24:
Bonded NICs after Failure on page 59.

[Figure: Node 1 and Node 2 each have a bond0 made up of eth0 and eth1, connected through two hubs; the active links are marked.]

Figure 24: Bonded NICs after Failure

Various combinations of Ethernet card types (single or dual-ported) and bond groups are possible, but it is
vitally important to remember that at least two physical cards (or physically separate on-board LAN
interfaces) must be used in any combination of channel bonds to avoid a single point of failure for
heartbeat connections.
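
On Linux, the bonding driver reports the state of a bond in /proc/net/bonding/<bond name>, which is one way to see which slave is currently active. The sketch below parses that text; it is a Linux-level illustration rather than a Serviceguard interface, parse_bonding_status is a hypothetical name, and the sample is an abbreviated, made-up excerpt of the file format.

```python
# Abbreviated, illustrative excerpt of /proc/net/bonding/bond0 after eth0 fails.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth1
MII Status: up

Slave Interface: eth0
MII Status: down

Slave Interface: eth1
MII Status: up
"""


def parse_bonding_status(text):
    """Extract the bonding mode, the active slave, and the slave list."""
    status = {"mode": None, "active_slave": None, "slaves": []}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if not separator:
            continue  # blank line or a line without a "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            status["slaves"].append(value)
    return status


bond0 = parse_bonding_status(SAMPLE)
```

On a live system the text would come from reading /proc/net/bonding/bond0 instead of the SAMPLE string.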

Bonding for Load Balancing
It is also possible to configure bonds in load balancing mode, which allows all slaves to transmit data in
parallel, in an active/active arrangement. In this case, high availability is provided by the fact that the bond
still continues to function (with less throughput) if one of the component LANs should fail. The user should
check the Bonding documentation to determine if the hardware configuration must use Ethernet switches
such as the HPE Procurve switch, which supports trunking of switch ports. The bonding driver
configuration must specify mode 0 for the bond type.
An example of this type of configuration is shown in Figure 25: Bonded NICs Configured for Load
Balancing on page 60.

[Figure: Node 1 and Node 2 each have a bond0 made up of eth0 and eth1. All four links are active and connect to two switches joined by Fast Ethernet trunks.]

Figure 25: Bonded NICs Configured for Load Balancing

Monitoring LAN Interfaces and Detecting Failure: Link Level


At regular intervals, determined by the NETWORK_POLLING_INTERVAL (see Cluster Configuration
Parameters on page 111), Serviceguard polls all the network interface cards specified in the cluster
configuration file (both bonded and non-bonded). If the link status of an interface is down, Serviceguard
marks the interface, and all subnets running on it, as down; this is shown in the output of
cmviewcl (1m); see Reporting Link-Level and IP-Level Failures on page 63. When the link comes back up,
Serviceguard marks the interface, and all subnets running on it, as up.

Monitoring LAN Interfaces and Detecting Failure: IP Level


Serviceguard can also monitor the IP level, checking Layer 3 health and connectivity for both IPv4 and
IPv6 subnets. This is done by the IP Monitor, which is configurable: you can enable IP monitoring for any
subnet configured into the cluster, but you do not have to monitor any. You can configure IP monitoring for
a subnet, or turn off monitoring, while the cluster is running.
The IP Monitor:

• Detects when a network interface fails to send or receive IP messages, even though it is still up at the
link level.
• Handles the failure, failover, recovery, and failback.

Reasons To Use IP Monitoring
Beyond the capabilities already provided by link-level monitoring, IP monitoring can:

• Monitor network status beyond the first level of switches; see How the IP Monitor Works
• Detect and handle errors such as:

◦ IP packet corruption on the router or switch
◦ Link failure between switches and a first-level router
◦ Inbound failures
◦ Errors that prevent packets from being received but do not affect the link-level health of an interface

IMPORTANT: You should configure the IP Monitor in a cross-subnet configuration, because IP
monitoring will detect some errors that link-level monitoring will not. See also Cross-Subnet
Configurations.

How the IP Monitor Works


Using Internet Control Message Protocol (ICMP) and ICMPv6, the IP Monitor sends polling messages to
target IP addresses and verifies that responses are received. When the IP Monitor detects a failure, it
marks the network interface down at the IP level, as shown in the output of cmviewcl (1m); see
Reporting Link-Level and IP-Level Failures on page 63 and Failure and Recovery Detection Times
on page 62.
The monitor can perform two types of polling:

• Peer polling.
In this case the IP Monitor sends ICMP ECHO messages from each IP address on a subnet to all other
IP addresses on the same subnet on other nodes in the cluster.

• Target polling.
In this case the IP Monitor sends ICMP ECHO messages from each IP address on a subnet to an
external IP address specified in the cluster configuration file; see POLLING_TARGET under Cluster
Configuration Parameters on page 111. cmquerycl (1m) will detect gateways available for use
as polling targets, as shown in the example below.
Target polling enables monitoring beyond the first level of switches, allowing you to detect if the route
is broken anywhere between each monitored IP address and the target.

NOTE: In a cross-subnet configuration, nodes can configure peer interfaces on nodes on the other
routed subnet as polling targets.

Hewlett Packard Enterprise recommends that you configure target polling if the subnet is not private to
the cluster.
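
The difference between the two polling types is easiest to see by listing who polls whom. The sketch below enumerates source and destination pairs only; it sends no ICMP traffic, and the function names and addresses are hypothetical.

```python
def peer_polling_pairs(addresses_by_node):
    """Peer polling: each address polls every address on the other nodes.

    addresses_by_node maps a node name to its IP addresses on one subnet.
    """
    pairs = []
    for source_node, sources in addresses_by_node.items():
        for source in sources:
            for dest_node, destinations in addresses_by_node.items():
                if dest_node == source_node:
                    continue  # peers are addresses on other nodes only
                for destination in destinations:
                    pairs.append((source, destination))
    return pairs


def target_polling_pairs(addresses_by_node, polling_target):
    """Target polling: each address polls one external target address."""
    return [(source, polling_target)
            for sources in addresses_by_node.values()
            for source in sources]


subnet_members = {
    "node1": ["192.168.2.1"],
    "node2": ["192.168.2.2"],
    "node3": ["192.168.2.3"],
}
peer_pairs = peer_polling_pairs(subnet_members)
target_pairs = target_polling_pairs(subnet_members, "192.168.2.254")
```

With three nodes and one address each, peer polling yields six pairs, while target polling yields three, one per monitored address.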
The IP Monitor section of the cmquerycl output looks similar to this:

Route Connectivity (no probing was performed):

IPv4:

1 16.89.143.192
16.89.120.0

Possible IP Monitor Subnets:

IPv4:

16.89.112.0 Polling Target 16.89.112.1

IPv6:

3ffe:1000:0:a801:: Polling Target 3ffe:1000:0:a801::254


IMPORTANT: By default, cmquerycl does not verify that the gateways it detects will work
correctly for monitoring. But if you use the -w full option, cmquerycl will validate them as polling
targets.

The IP Monitor section of the cluster configuration file will look similar to the following for a subnet on
which IP monitoring is configured with target polling.

SUBNET 192.168.1.0
IP_MONITOR ON
POLLING_TARGET 192.168.1.254
By default, the IP_MONITOR parameter is set to OFF. If cmquerycl detects a gateway for the subnet in
question, it populates POLLING_TARGET with that gateway, commented out, and sets the IP_MONITOR
parameter to OFF.

SUBNET 192.168.1.0
IP_MONITOR OFF
#POLLING_TARGET 192.168.1.254
To configure a subnet for IP monitoring with peer polling, edit the IP Monitor section of the cluster
configuration file to look similar to this:

SUBNET 192.168.2.0
IP_MONITOR ON
The IP Monitor section of the cluster configuration file will look similar to the following in the case of a
subnet on which IP monitoring is disabled:

SUBNET 192.168.3.0
IP_MONITOR OFF
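
The three cases above (target polling, peer polling, and monitoring disabled) can be told apart mechanically. The following is a minimal sketch, not a Serviceguard tool: it assumes the IP Monitor section uses exactly the SUBNET, IP_MONITOR, and POLLING_TARGET lines shown above, and the function name parse_ip_monitor and the sample text are hypothetical.

```python
def parse_ip_monitor(config_text):
    """Map each SUBNET to its IP monitoring settings and polling mode."""
    subnets = {}
    current = None
    for raw in config_text.splitlines():
        # Drop comments, so a commented-out POLLING_TARGET is ignored.
        line = raw.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) < 2:
            continue
        key, value = parts[0], parts[1]
        if key == "SUBNET":
            current = {"ip_monitor": False, "polling_targets": []}
            subnets[value] = current
        elif current is not None and key == "IP_MONITOR":
            current["ip_monitor"] = value.upper() == "ON"
        elif current is not None and key == "POLLING_TARGET":
            current["polling_targets"].append(value)
    for entry in subnets.values():
        if not entry["ip_monitor"]:
            entry["mode"] = "disabled"
        elif entry["polling_targets"]:
            entry["mode"] = "target polling"
        else:
            entry["mode"] = "peer polling"
    return subnets


# Sample text assembled from the fragments shown in this section.
SAMPLE = """\
SUBNET 192.168.1.0
IP_MONITOR ON
POLLING_TARGET 192.168.1.254
SUBNET 192.168.2.0
IP_MONITOR ON
SUBNET 192.168.3.0
IP_MONITOR OFF
#POLLING_TARGET 192.168.3.254
"""

if __name__ == "__main__":
    for subnet, entry in parse_ip_monitor(SAMPLE).items():
        print(subnet, entry["mode"])
```

A commented-out POLLING_TARGET is deliberately ignored, which matches the default template in which the detected gateway is present but inactive.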

Failure and Recovery Detection Times


With the default NETWORK_POLLING_INTERVAL of 2 seconds (see Cluster Configuration
Parameters on page 111), the IP monitor will detect IP failures typically within 8–10 seconds for Ethernet
and within 16–18 seconds for InfiniBand. Similarly, with the default NETWORK_POLLING_INTERVAL,
the IP monitor will detect the recovery of an IP address typically within 8–10 seconds for Ethernet and
within 16–18 seconds for InfiniBand.

The minimum time for detecting a failure/recovery of an IP address is 8 seconds for Ethernet and 15
seconds for InfiniBand.

IMPORTANT: Hewlett Packard Enterprise strongly recommends that you do not change the default
NETWORK_POLLING_INTERVAL value of 2 seconds.

See also Reporting Link-Level and IP-Level Failures on page 63.

Constraints and Limitations

• A subnet must be configured into the cluster in order to be monitored.


• Polling targets are not detected beyond the first-level router.
• Polling targets must accept and respond to ICMP (or ICMPv6) ECHO messages.

• A peer IP on the same subnet should not be a polling target because a node can always ping itself.

The following constraints apply to peer polling when there are only two interfaces on a subnet:

• If one interface fails, both interfaces and the entire subnet will be marked down on each node, unless
bonding is configured and there is a working standby.
• If the node that has one of the interfaces goes down, the subnet on the other node will be marked
down.
• In a 2-node cluster, there is only a single peer for polling. When POLLING_TARGET is not defined, if
either of the nodes fails (for example, a node is rebooted or all the interfaces of a node are down), IP
monitoring fails and all the subnets are marked down on the operational node. This results in failure of
packages running on the operational node.
Therefore, peer polling is not suitable when there is only a single peer as exists in a 2-node cluster. In
such scenarios, a polling target should always be defined so that a single LAN failure does not affect
polling of other LANs.
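
The recommendation above can be turned into a simple pre-deployment check. This is a sketch under stated assumptions: the function name peer_polling_warnings and the dictionary layout are hypothetical, and the check only flags the 2-node case described above.

```python
def peer_polling_warnings(node_count, subnets):
    """Flag subnets that rely on peer polling when there is only one peer.

    subnets maps a subnet address to a dict with the keys "ip_monitor" (bool)
    and "polling_targets" (list of addresses). This layout is hypothetical.
    """
    warnings = []
    for subnet, cfg in sorted(subnets.items()):
        peer_polling_only = cfg["ip_monitor"] and not cfg["polling_targets"]
        if peer_polling_only and node_count == 2:
            warnings.append(
                f"{subnet}: peer polling with a single peer; define POLLING_TARGET"
            )
    return warnings


if __name__ == "__main__":
    cluster_subnets = {
        "192.168.1.0": {"ip_monitor": True, "polling_targets": ["192.168.1.254"]},
        "192.168.2.0": {"ip_monitor": True, "polling_targets": []},
    }
    for message in peer_polling_warnings(2, cluster_subnets):
        print(message)
```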

Reporting Link-Level and IP-Level Failures


Any given failure may occur at the link level or the IP level; a failure is reported slightly differently in the
output of cmviewcl (1m) depending on whether link-level or IP monitoring detects the failure.
If a failure is detected at the link level, output from cmviewcl -v will look something like this:
Network_Parameters:
INTERFACE    STATUS                PATH       NAME
PRIMARY      down (Link and IP)    0/3/1/0    eth2
PRIMARY      up                    0/5/1/0    eth3
cmviewcl -v -f line will report the same failure like this:
node:gary|interface:lan2|status=down
node:gary|interface:lan2|disabled=false
node:gary|interface:lan2|failure_type=link+ip
If a failure is detected by IP monitoring, output from cmviewcl -v will look something like this:
Network_Parameters:
INTERFACE    STATUS                PATH       NAME
PRIMARY      down (IP only)        0/3/1/0    eth2
PRIMARY      up                    0/5/1/0    eth3

cmviewcl -v -f line will report the same failure like this:
node:gary|interface:lan2|status=down
node:gary|interface:lan2|disabled=false
node:gary|interface:lan2|failure_type=ip_only
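
Because the line format is designed for scripts, the failure type can be extracted without parsing the tabular output. The sketch below assumes only the scope|scope|attribute=value layout visible in the examples above; the function names are hypothetical, and the sample text is copied from this section.

```python
def parse_line_output(text):
    """Group attribute=value pairs by their scope, e.g. ('node:gary', 'interface:lan2')."""
    records = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank lines and anything not in key=value form
        path, value = line.split("=", 1)
        *scope, attribute = path.split("|")
        records.setdefault(tuple(scope), {})[attribute] = value
    return records


def failed_interfaces(records):
    """Return (scope, failure_type) for every interface whose status is down."""
    failures = []
    for scope, attrs in records.items():
        is_interface = any(part.startswith("interface:") for part in scope)
        if is_interface and attrs.get("status") == "down":
            failures.append((scope, attrs.get("failure_type", "unknown")))
    return failures


SAMPLE = """\
node:gary|interface:lan2|status=down
node:gary|interface:lan2|disabled=false
node:gary|interface:lan2|failure_type=ip_only
"""

if __name__ == "__main__":
    for scope, failure_type in failed_interfaces(parse_line_output(SAMPLE)):
        print("|".join(scope), failure_type)
```

A failure_type of link+ip indicates that link-level monitoring detected the failure; ip_only indicates that only the IP Monitor detected it.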

Package Switching and Relocatable IP Addresses


A package switch involves moving the package to a new system. In the most common configuration, in
which all nodes are on the same subnet(s), the package IP (relocatable IP; see Stationary and
Relocatable IP Addresses and Monitored Subnets on page 56) moves as well, and the new system
must already have the subnet configured and working properly, otherwise the packages will not be
started.

NOTE: It is possible to configure a cluster that spans subnets joined by a router, with some nodes using
one subnet and some another. This is called a cross-subnet configuration. In this context, you can
configure packages to fail over from a node on one subnet to a node on another, and you will need to
configure a relocatable address for each subnet the package is configured to start on; see About Cross-
Subnet Failover, and in particular the subsection Implications for Application Deployment.

When a package switch occurs, TCP connections are lost. TCP applications must reconnect to regain
connectivity; this is not handled automatically. Note that if the package is dependent on multiple subnets
(specified as monitored_subnets in the package configuration file), all those subnets must normally be
available on the target node before the package will be started. (In a cross-subnet configuration, all
subnets configured on that node, and identified as monitored subnets in the package configuration file,
must be available.)
The switching of relocatable IP addresses is shown in the Before Package Switching and After Package
Switching diagrams in Failover Packages’ Switching Behavior.

Address Resolution Messages after Switching on the Same Subnet


When a relocatable IP address is moved to a new interface, either locally or remotely, an ARP message is
broadcast to indicate the new mapping between IP address and link layer address. An ARP message is
sent for each IP address that has been moved. All systems receiving the broadcast should update the
associated ARP cache entry to reflect the change. Currently, the ARP messages are sent at the time the
IP address is added to the new system. An ARP message is sent in the form of an ARP request. The
sender and receiver protocol address fields of the ARP request message are both set to the same
relocatable IP address. This ensures that nodes receiving the message will not send replies.
Unlike IPv4, IPv6 uses Neighbor Discovery Protocol (NDP) messages to determine the link-layer
addresses of neighbors.
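
To make the ARP message described above concrete, the following sketch builds the 28-byte ARP request payload for Ethernet and IPv4, with the sender and target protocol address fields set to the same relocatable IP address. It illustrates the packet layout only and is not Serviceguard's implementation; the MAC address, the IP address, and the all-zero target hardware address are assumptions chosen for the example.

```python
import socket
import struct


def build_arp_announcement(mac, ip):
    """Build an ARP request whose sender and target protocol addresses are both ip."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    address = socket.inet_aton(ip)
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # operation: request
        sender_mac,   # sender hardware address (the new interface)
        address,      # sender protocol address (the relocatable IP)
        b"\x00" * 6,  # target hardware address (assumed all zeros here)
        address,      # target protocol address (same relocatable IP)
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    packet = build_arp_announcement("00:11:22:33:44:55", "192.168.1.10")
    print(len(packet), packet.hex())
```

Because the target protocol address equals the sender's own address, no other host owns the requested address, so receivers update their caches without replying.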

VLAN Configurations
Virtual LAN configuration (VLAN) is supported in Serviceguard clusters.

What is VLAN?
VLAN is a technology that allows logical grouping of network nodes, regardless of their physical locations.
VLAN can be used to divide a physical LAN into multiple logical LAN segments or broadcast domains,
helping to reduce broadcast traffic, increase network performance and security, and improve
manageability.
Multiple VLAN interfaces, each with its own IP address, can be configured from a physical LAN interface;
these VLAN interfaces appear to applications as ordinary network interfaces (NICs). See the
documentation for your Linux distribution for more information on configuring VLAN interfaces.

Support for Linux VLAN
VLAN interfaces can be used as heartbeat as well as data networks in the cluster. The Network Manager
monitors the health of VLAN interfaces configured in the cluster, and performs remote failover of VLAN
interfaces when failure is detected. Failure of a VLAN interface is typically the result of the failure of the
underlying physical NIC port or Channel Bond interface.

Configuration Restrictions
Linux allows up to 1024 VLANs to be created from a physical NIC port. A large pool of system resources
is required to accommodate such a configuration; Serviceguard could suffer performance degradation if
many network interfaces are configured in each cluster node. To prevent this and other problems,
Serviceguard imposes the following restrictions:

• A maximum of 30 network interfaces per node is supported. The interfaces can be physical NIC ports,
VLAN interfaces, Channel Bonds, or any combination of these.
• Only port-based and IP-subnet-based VLANs are supported. Protocol-based VLAN is not supported
because Serviceguard does not support any transport protocols other than TCP/IP.
• Each VLAN interface must be assigned an IP address in a unique subnet.
• Using VLAN in a Wide Area Network cluster is not supported.
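
Two of the restrictions above (the limit of 30 interfaces per node and a unique subnet per VLAN interface) lend themselves to an automated check. This is a sketch under stated assumptions: the inventory layout, the interface names, and the function name are hypothetical, and uniqueness is checked among the VLAN interfaces of a single node.

```python
import ipaddress

MAX_INTERFACES_PER_NODE = 30  # limit stated in the restrictions above


def check_node_interfaces(interfaces):
    """Return a list of problems for one node's interface inventory.

    Each interface is a dict with "name", "kind" ("nic", "vlan", or "bond"),
    and an optional "cidr" such as "192.168.10.5/24". Layout is hypothetical.
    """
    problems = []
    if len(interfaces) > MAX_INTERFACES_PER_NODE:
        problems.append(
            f"{len(interfaces)} interfaces configured; maximum is {MAX_INTERFACES_PER_NODE}"
        )
    subnet_owner = {}
    for interface in interfaces:
        if interface["kind"] != "vlan":
            continue
        cidr = interface.get("cidr")
        if cidr is None:
            problems.append(f"{interface['name']}: VLAN interface has no IP address")
            continue
        subnet = ipaddress.ip_interface(cidr).network
        if subnet in subnet_owner:
            problems.append(
                f"{interface['name']}: subnet {subnet} already used by {subnet_owner[subnet]}"
            )
        else:
            subnet_owner[subnet] = interface["name"]
    return problems


if __name__ == "__main__":
    inventory = [
        {"name": "eth0", "kind": "nic", "cidr": "10.0.0.5/24"},
        {"name": "eth1.100", "kind": "vlan", "cidr": "192.168.100.5/24"},
        {"name": "eth1.200", "kind": "vlan", "cidr": "192.168.100.6/24"},
    ]
    for problem in check_node_interfaces(inventory):
        print(problem)
```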

Additional Heartbeat Requirements


VLAN technology allows great flexibility in network configuration. To maintain Serviceguard’s reliability
and availability in such an environment, the heartbeat rules are tightened as follows when the cluster is
using VLANs:

1. VLAN heartbeat networks must be configured on separate physical NICs or Channel Bonds, to avoid
single points of failure.
2. Heartbeats are still recommended on all cluster networks, including VLANs.
3. If you are using VLANs, but decide not to use VLANs for heartbeat networks, heartbeats are
recommended for all other physical networks or Channel Bonds specified in the cluster configuration
file.

About Persistent Reservations


Serviceguard for Linux packages use persistent reservations (PR) to control access to LUNs. Persistent
reservations, defined by the SCSI Primary Commands version 3 (SPC-3) standard, provide a means to
register I/O initiators and specify who can access LUN devices (anyone, all registrants, only one
registrant) and how (read-only, write-only).
Unlike exclusive activation for volume groups, which does not prevent unauthorized access to the
underlying LUNs, PR controls access at the LUN level. Registration and reservation information is stored
on the device and enforced by its firmware; this information persists across device resets and system
reboots.

NOTE: Persistent Reservations coexist with, and are independent of, activation protection of volume
groups. You should continue to configure activation protection as instructed under Enabling Volume
Group Activation Protection on page 185. Subject to the Rules and Limitations on page 66 spelled
out below, Persistent Reservations will be applied to the cluster's LUNs, whether or not the LUNs are
configured into volume groups.

Advantages of PR:

• Consistent behavior.
Different volume managers may implement exclusive activation differently (or not at all). PR is
implemented at the device level and does not depend on volume-manager support for exclusive
activation.

• Packages can control access to LUN devices independently of a volume manager.

Rules and Limitations


Serviceguard automatically implements PR for packages that use LUN storage, subject to the following
constraints:

• You can use PR with the following restrictions:

◦ PR is supported with Device Mapper (DM) multipath in the modular package.


◦ During package startup, Serviceguard performs registration only on active paths. If a path
becomes active after package startup, it remains unusable until the package is restarted.
If you are aware of any inactive paths that became active after the package started, you can
either:

– Restart the package, or

– Obtain the key value of the registered path: sg_persist -k <Path>

– And register the active path: sg_persist --out -G --param-sark=<key value> <Path>

◦ If you are not using the udev alias names, multipath physical volume names must be in the
/dev/mapper/XXXX or /dev/mpath/XXXX format.

◦ The udev alias names must not be configured in the /dev/mapper/ or /dev/mpath/ directory.

◦ Multipath device alias names must not contain “pN”, “_partN”, or “-partN” strings, where N is a
number.
For example, /dev/mapper/evadskp1, /dev/mapper/evadsk_part1, and /dev/mapper/evadsk-part1
are not allowed.

• If you accidentally run the pr_cleanup command on LUNs belonging to a package that is already
running, PR protection is disabled. To enable PR protection, you must restart the package.
• The udev alias names must be created using symlinks. For more information about how to create udev
alias names using symlinks, see the Using udev to Simplify HPE Serviceguard for Linux Configuration
white paper at http://www.hpe.com/info/linux-serviceguard-docs.
• PR is available in Serviceguard-xdc packages. For more information, see HPE Serviceguard Extended
Distance Cluster for Linux A.12.00.40 Deployment Guide.
• The LUN device must support PR and be consistent with the SPC-3 specification.

• PR is available in modular failover and multi-node packages.

◦ All instances of a modular multi-node package must be able to use PR; otherwise it will be turned
off for all instances.

• The package must have access to real devices, not only virtualized ones.
• PR with iSCSI storage devices is supported only on LVM volume groups.
• Serviceguard does not support PR with disks that are part of VxVM disk groups.
• Serviceguard does not support PR with disks of type VMFS on VMware virtual machines.
• On Red Hat Enterprise Linux 7, Serviceguard supports only user-friendly named mapper devices. For
information about how to set up user-friendly named mapper devices, see Red Hat Enterprise Linux 7
DM Multipath Configuration and Administration available at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/pdf/DM_Multipath/Red_Hat_Enterprise_Linux-7-DM_Multipath-en-US.pdf.
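
The naming rules for multipath devices in the list above can be checked before a package is configured. The sketch below is an illustration only: the function name and the regular expression are assumptions that encode the rule that alias names must not contain "pN", "_partN", or "-partN", together with the required /dev/mapper/ or /dev/mpath/ location for names that are not udev aliases.

```python
import re

# Matches "p", "_part", or "-part" immediately followed by a number.
FORBIDDEN_ALIAS = re.compile(r"(?:p|_part|-part)\d+")


def check_multipath_name(path):
    """Return a list of rule violations for a multipath physical volume path."""
    problems = []
    if not (path.startswith("/dev/mapper/") or path.startswith("/dev/mpath/")):
        problems.append("not in /dev/mapper/XXXX or /dev/mpath/XXXX format")
    alias = path.rsplit("/", 1)[-1]
    if FORBIDDEN_ALIAS.search(alias):
        problems.append(f"alias '{alias}' contains a pN, _partN, or -partN string")
    return problems


if __name__ == "__main__":
    for name in ("/dev/mapper/evadsk", "/dev/mapper/evadskp1", "/dev/mapper/evadsk-part1"):
        print(name, check_multipath_name(name) or "ok")
```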

CAUTION: Serviceguard makes and revokes registrations and reservations during normal package
startup and shutdown, or package failover. Serviceguard also provides a script to clear reservations
in the event of a catastrophic cluster failure. You need to make sure that this script is run in that
case; the LUN devices could become unusable otherwise. See Revoking Persistent Reservations
after a Catastrophic Failure on page 349 for more information.

How Persistent Reservations Work


You do not need to do any configuration to enable or activate PR, and in fact you cannot enable it or
disable it, either at the cluster or the package level; Serviceguard makes the decision for each cluster and
package on the basis of the Rules and Limitations on page 66 described above.
When you run cmapplyconf (1m) to configure a new cluster, or add a new node, Serviceguard sets the
variable cluster_pr_mode to either pr_enabled or pr_disabled.

• pr_enabled means that packages can in principle use PR, but in practice will do so only if they meet
the conditions spelled out under Rules and Limitations on page 66.
• pr_disabled means that no packages can use PR.

You can see the setting of cluster_pr_mode in the output of cmviewcl -f line; for example:
...
cluster_pr_mode: pr_enabled

NOTE: You cannot change the setting of cluster_pr_mode.

If a package is qualified to use PR, Serviceguard automatically makes registrations and
reservations for the package's LUNs during package startup, and revokes them during package
shutdown, using the sg_persist command. This command is available, and has a manpage, on
Red Hat 5, Red Hat 6, and SUSE 11.
Serviceguard makes a PR of type Write Exclusive Registrants Only (WERO) on the package's LUN
devices. This gives read access to any initiator regardless of whether the initiator is registered or not, but
grants write access only to those initiators who are registered. (WERO is defined in the SPC-3 standard.)

All initiators on each node running the package register with LUN devices using the same PR Key, known
as the node_pr_key. Each node in the cluster has a unique node_pr_key, which you can see in the output
of cmviewcl -f line; for example:
...
node:bla2|node_pr_key=10001
When a failover package starts up, any existing PR keys and reservations are cleared from the underlying
LUN devices first; then the node_pr_key of the node that the package is starting on is registered with
each LUN.
In the case of a multi-node package, the PR reservation is made for the underlying LUNs by the first
instance of the package, and the node_pr_key is registered each time the package starts on a new node.
If a node fails, the instances of the package running on other nodes will remove the registrations of the
failed node.
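
The access rules described above can be modeled in a few lines. The following is a simplified, in-memory sketch of the WERO behavior and of the failover startup sequence (clear, then register, then reserve); it is not how Serviceguard or the device firmware is implemented, and the class and function names are hypothetical.

```python
class Lun:
    """Toy model of a LUN with Write Exclusive Registrants Only semantics."""

    def __init__(self):
        self.keys = set()       # registered PR keys
        self.reserved = False   # True once a WERO reservation exists

    def clear(self):
        """Remove all registrations and the reservation."""
        self.keys.clear()
        self.reserved = False

    def register(self, key):
        self.keys.add(key)

    def reserve_wero(self, key):
        if key not in self.keys:
            raise PermissionError("only a registrant can make the reservation")
        self.reserved = True

    def can_read(self, key):
        return True  # WERO gives read access to any initiator

    def can_write(self, key):
        return not self.reserved or key in self.keys


def start_failover_package(lun, node_pr_key):
    """Clear existing keys and reservations, then register and reserve."""
    lun.clear()
    lun.register(node_pr_key)
    lun.reserve_wero(node_pr_key)


if __name__ == "__main__":
    lun = Lun()
    start_failover_package(lun, 10001)
    print(lun.can_write(10001), lun.can_write(10002), lun.can_read(10002))
```

After a failover to a node with a different node_pr_key, calling start_failover_package again removes the old node's write access, because its key is cleared before the new key is registered.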
You can use cmgetpkgenv (1m) to see whether PR is enabled for a given package; for example:
cmgetpkgenv pkg1
...
PKG_PR_MODE="pr_enabled"
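
A script can read this setting from captured output. The sketch below assumes that the output consists of shell-style NAME="value" lines like the one shown above; the function name is hypothetical, and the sample text is taken from this section.

```python
import shlex


def parse_pkgenv(text):
    """Parse shell-style NAME="value" lines into a dictionary."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip lines that are not assignments
        for token in shlex.split(line):
            name, separator, value = token.partition("=")
            if separator:
                env[name] = value
    return env


SAMPLE = 'PKG_PR_MODE="pr_enabled"\n'

if __name__ == "__main__":
    env = parse_pkgenv(SAMPLE)
    print(env.get("PKG_PR_MODE") == "pr_enabled")
```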

Volume Managers for Data Storage


A volume manager lets you create units of disk storage that are more flexible than individual disk
partitions. These units can be used on single systems or in high-availability clusters. Serviceguard for
Linux uses the Linux Logical Volume Manager (LVM) which creates redundant storage groups. This
section provides an overview of volume management with LVM. See Creating the Logical Volume
Infrastructure on page 181 in Chapter 5 for information about configuring volume groups, logical
volumes, and file systems for use in Serviceguard packages.
In Serviceguard for Linux, the supported shared data storage type is disk arrays which configure
redundant storage in hardware.
In a disk array, the basic element of storage is a LUN, which already provides storage redundancy via
RAID1 or RAID5. Before you can use the LUNs, you must partition them using fdisk.
In LVM, you manipulate storage in one or more volume groups. A volume group is built by grouping
individual physical volumes. Physical volumes can be disk partitions or LUNs that have been marked as
physical volumes as described below.
You use the pvcreate command to mark the LUNs as physical volumes. Then you use the vgcreate
command to create volume groups out of one or more physical volumes. Once configured, a volume
group can be subdivided into logical volumes of different sizes and types. File systems or databases used
by the applications in the cluster are mounted on these logical volumes. In Serviceguard clusters, volume
groups are activated by package control scripts when an application starts up, and they are deactivated
by package control scripts when the application halts.
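
The order of operations described above can be sketched as a command plan. The example only builds and prints the commands; it does not run them. The device paths, volume group name, logical volume name, and size are hypothetical, and lvcreate is shown as one common way to create the logical volume.

```python
import shlex


def lvm_setup_commands(devices, vg_name, lv_name, lv_size):
    """Return the LVM commands, in order, to build one logical volume."""
    commands = [["pvcreate", device] for device in devices]       # mark physical volumes
    commands.append(["vgcreate", vg_name, *devices])               # group them into a VG
    commands.append(["lvcreate", "-L", lv_size, "-n", lv_name, vg_name])  # carve out an LV
    return commands


if __name__ == "__main__":
    # Hypothetical partitions created beforehand with fdisk.
    plan = lvm_setup_commands(["/dev/sdc1", "/dev/sdd1"], "vgpkgA", "lvol1", "1G")
    for command in plan:
        print(shlex.join(command))
```

See Creating the Logical Volume Infrastructure for the supported procedure; this sketch only shows the sequence of commands named in this overview.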

Storage on Arrays
Physical Disks Combined into LUNs shows LUNs configured on a storage array. Physical disks are
configured by an array utility program into logical units, or LUNs, which are seen by the operating system.

Figure 26: Physical Disks Combined into LUNs (the disk array's LUN configuration groups physical disk
mechanisms into the logical units LUN 0, LUN 1, and LUN 2)

NOTE: LUN definition is normally done using utility programs provided by the disk array manufacturer.
Since arrays vary considerably, you should refer to the documentation that accompanies your storage
unit.

Monitoring Disks
Each package configuration includes information about the disks that are to be activated by the package
at startup. If monitoring is used, the health of the disks is checked at package startup. The package will
fail if the disks are not available.
When this happens, the package may be restarted on another node. If auto_run is set to yes, the
package will start up on another eligible node, if it meets all the requirements for startup. If auto_run is set
to no, then the package simply halts without starting up anywhere else.
The process for configuring disk monitoring is described in Creating a Disk Monitor Configuration on
page 254.

More Information on LVM


Refer to the section “Creating the Logical Volume Infrastructure” in Chapter 5 for details about configuring
volume groups, logical volumes, and file systems for use in Serviceguard packages.
For a basic description of Linux LVM, see the article Logical Volume Manager HOWTO on the Linux
Documentation Project page at http://www.tldp.org.

Veritas Volume Manager (VxVM)


Veritas Volume Manager (VxVM) by Veritas is a storage management subsystem that allows you to
manage physical disks and LUNs (Logical Unit Numbers) as logical devices called volumes. A VxVM
volume appears to applications and the operating system as a physical device on which file systems,
databases, and other managed data objects can be configured.

NOTE: VxVM and VxFS are supported on HPE Serviceguard A.12.00.00 with Veritas Storage Foundation
6.0.1 and later.

For more information about how to create a storage infrastructure with VxVM, see Creating a Storage
Infrastructure with VxVM on page 190.

Limitations
Following are the limitations of VxVM:

• Serviceguard does not support VxVM with Virtual Machines (VMs).


• Serviceguard does not support VxVM on iSCSI storage.
• Serviceguard supports VxVM only with the OS Native naming scheme. To set the OS Native naming
scheme:
# /usr/sbin/vxddladm set namingscheme=osn

Using VMware Virtual Machine File System Disks



Storage configuration type in a VMware environment


Starting with Serviceguard for Linux A.12.00.40, you can configure VMFS-based or RDM-based VMDK disks
in packages for application use. This release also introduces a new dynamically linked storage (DLS)
mechanism, with which VMFS-based and RDM-based (physical or virtual compatibility mode) VMDK disks
can be configured in a guest that is a Serviceguard node and used in a Serviceguard package. With this
mechanism, the VMDK disks are exposed to only one VM in the cluster at any point in time. During
package failover, Serviceguard automatically detaches the VMDK disks from one VM and attaches them to
the adoptive VM where the package is going to run.
Before this release, Serviceguard supported only RDM-based physical disks in packages for application
data. The RDM physical disks were statically linked to the guests (Serviceguard nodes); that is, they
were accessible to all the guests at all times. In this configuration, SCSI-3 Persistent Reservation (PR)
is used to ensure data integrity. The way VMware native multipathing handled SCSI-3 PR during path
failover was not possible in a Serviceguard environment. Additionally, VMware restrictions did not allow
vMotion of VMs that were configured to have physical disks or LUNs accessed with RDM mapping.
Serviceguard clustered VMs using the new dynamically linked storage mechanism with VMDK disks can
now leverage VMware features, such as vMotion and VMware Native Multipathing (NMP).
The following table describes the differences between statically linked storage (RDM based only) and
dynamically linked storage (RDM or VMFS based) configurations.

Table 4: Storage configuration type in a VMware Environment

Storage Configuration Type       Description

Static Linked Storage (SLS)      In SLS, VMDK disks are configured to all VMs that are part of the
                                 cluster as RDM in physical compatibility mode. The Serviceguard node
                                 that is active for a given package places a PR for exclusive access
                                 to the RDM disks to ensure data integrity. This was the only supported
                                 storage configuration up to Serviceguard A.12.00.30 and continues to
                                 be available in later versions of Serviceguard.

Dynamic Linked Storage (DLS)     In a dynamically linked storage configuration, the disks are
                                 accessible to a single VMware virtual machine at a time in the
                                 Serviceguard cluster.

NOTE: Adding a new cluster and a package with DLS in a single command is not supported.
For example, the following is not supported:
cmapplyconf -C clus.ascii -P pkg.ascii
In such cases, you must apply the cluster configuration first, and then add the package with DLS.
For example:
cmapplyconf -C clus.ascii
cmapplyconf -P pkg.ascii

How does the DLS work


As part of package start, stop, and failover sequences, the Serviceguard packages attach or detach the
configured VMDK disks to the virtual machines accordingly.
Package Startup
As part of the startup sequence, the package attaches the configured VMDK disks to the node where
the package is starting.
Package Stop
As part of the stop sequence, the package detaches the configured VMDK disks from the node where the
package is halting.
During Failover (Application Failure)
In the event of an application failure, Serviceguard detects the application failure and fails over the
package to the adoptive node. If the package is configured with VMFS disks, then as part of the package
failover Serviceguard detaches the VMFS disk from the guest (cluster node) where the application failed,
and attaches the same VMFS disk on to the adoptive guest node.
During Failover (Node Failure)
When the VM node running the package fails, the package fails over to an adoptive node. While starting
on the adoptive node, the package attempts to detach the VMDK disk remotely from the failed VM, and
then attaches the disk to the VM to which the package has failed over.
During Failover (Host Failure)
When the host fails, it cannot receive any calls to detach a disk. In this scenario, Serviceguard attaches
the disk to an adoptive node without waiting for the issued detach disk request to be successfully
completed on a failed node. When the failed host becomes available, you must manually detach the disks
from the failed virtual machine and then power it up.
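
The sequences above can be summarized as a small state model in which a disk is attached to at most one VM at a time. This is an illustrative sketch only, not the package scripts themselves; the class, function, and VM names are hypothetical, and the host-failure case is modeled by recording the VM that still needs a manual detach.

```python
class VmdkDisk:
    """Toy model of a dynamically linked VMDK disk."""

    def __init__(self, name):
        self.name = name
        self.attached_to = None    # VM that currently uses the disk
        self.manual_detach = set() # failed VMs to clean up before power-up


def halt_package(disk, vm):
    """Package stop: detach the disk from the node where the package halts."""
    if disk.attached_to == vm:
        disk.attached_to = None


def start_package(disk, vm, previous_host_up=True):
    """Package start or failover: detach from the old VM, attach to the new one."""
    previous = disk.attached_to
    if previous is not None and previous != vm and not previous_host_up:
        # Host failure: the detach request cannot complete, so the attach
        # proceeds without waiting and the old VM needs a manual detach later.
        disk.manual_detach.add(previous)
    disk.attached_to = vm


if __name__ == "__main__":
    disk = VmdkDisk("data1")
    start_package(disk, "vm1")
    start_package(disk, "vm2", previous_host_up=False)
    print(disk.attached_to, sorted(disk.manual_detach))
```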

To perform the attach and detach operations, the packages issue attach and detach requests. These
requests are issued either to a VMware vCenter Server, which is already configured to manage the VMs
in the cluster, or directly to the ESX Host on which the VM is hosted. You must choose how
to perform the attach and detach operations before deploying the solution.
If you choose to use the vCenter server, you must then update the vCenter server details only once.
There are no changes to configuration required when the VMs are migrated to other hosts. When the
requests are issued to the vCenter, the vCenter propagates the request to the appropriate host. However,
for the successful failover of packages configured with VMDK disks the vCenter server is required. The
vCenter then can become a single point of failure if sufficient redundancy is not built for the vCenter
server.
If you choose to directly issue the request to the hosts, there is no dependency on other components like
vCenter server for a successful failover. However, you must update the cluster configuration every time a
VM in the cluster is migrated from one host to another. The host to guest mapping is required so that the
package can issue the requests to the right host or as required.

Prerequisites
Before you enable this feature, ensure the following prerequisites are met:
VMware Prerequisites

• Ensure that all ESXi hosts are version 5.5 or later.
• Ensure that the VMware vCenter server is version 5.5 or later.
• Ensure that the vCenter or ESXi hosts are listening on port 443 (https, 443/TCP).
• Ensure that the VMware virtual machines configured in the cluster have unique UUIDs across the
vCenter or ESXi hosts.
• Ensure that you have installed Open Java or IBM Java. Minimum supported versions are:

◦ Open Java: Open Java 7 Update Patch provided by Linux system

◦ IBM Java: IBM Java Version 7 Release 1 Service Refresh 3 (1.7.1_sr3.0)

• The cluster must be configured with the ESXi host or VMware vCenter server on which the VMware
virtual machine exists. For more information about how to configure an ESXi host or vCenter server in
the cluster, see Specifying a VCENTER_SERVER or ESX_HOST on page 196.
• The datastore must be created on a shared disk that is accessible to all virtual machines configured
in the cluster, as per VMware configuration guidelines.

NOTE: Ensure that the VMware virtual machines configured in the cluster have unique UUIDs across
the vCenter or ESXi host.

Serviceguard Prerequisites
Serviceguard for Linux A.12.00.40 must be installed on all the nodes in the cluster.
All the cluster nodes must have Open Java or IBM Java installed. Minimum supported versions are:

• Open Java: Open Java 7 Update Patch provided by Linux system

• IBM Java: IBM Java Version 7 Release 1 Service Refresh 3 (1.7.1_sr3.0)

Requirements to Perform Attach and Detach Disk Operations in a DLS mechanism.

The following are the prerequisites required to perform attach and detach disk operations in a DLS
mechanism:

• You must choose whether to perform attach and detach operations through the vCenter Server or
directly on the ESXi host.
• Whichever you choose, the vCenter Server or the ESXi host must be listening on port 443 (https,
443/TCP).
• Specify the vCenter Server login credentials or the ESXi host login credentials in the Serviceguard
Credential Store (SCS). For more information about how to configure an ESXi host or vCenter server in
the cluster, see Specifying a VCENTER_SERVER or ESX_HOST on page 196.

Procedure

Storage Requirements

1. The VMware Datastores containing the VMDK disks/files configured in the package must be created
on a shared disk which is accessible to all VMs in the cluster. The Datastores themselves must be
configured as per VMware configuration guidelines.
2. The Virtual machine’s SCSI controller type must be VMware paravirtual.

3. The Virtual machine’s SCSI controller's SCSI Bus Sharing flag must be configured as "None" which
imply that virtual disks cannot be attached to two virtual machines anytime.
4. The VMDK disk can be of type RDM or Virtual disk (VMFS). If it is RDM, then both compatibility Mode
Physical and Virtual are supported. In a fully virtualized environment all the disk types (RDM or Virtual
disk (VMFS)) are supported. However, in a hybrid cluster, that is, a mix of physical and virtual
machines, only RDM disks in Physical compatibility Mode are supported.
5. When configuring the disk, you are prompted to choose the "SCSI Controller Slot Number". The format
is "0:0", "1:2", and so on, where the first digit is the SCSI controller number and the second digit
is the slot on the controller where the disk is attached. Serviceguard requires that, for a given disk,
the "SCSI Controller Slot Number" where the disk will be attached in a VM be the same on all the VM
nodes in the cluster where the package may run. This means that when configuring the disks you must
ensure that the required "SCSI Controller Slot Number" for a disk is free on all the VMs in the
cluster.
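
The slot rule in item 5 can be checked mechanically before a disk is configured. The following Python
sketch is illustrative only and is not part of Serviceguard. The function names and the inventory
dictionary are hypothetical; in practice the list of occupied slots would have to be collected from
the settings of each VM.

```python
import re

# "controller:slot" strings such as "0:0" or "1:2".
SLOT_RE = re.compile(r"^(\d+):(\d+)$")


def parse_slot(slot):
    """Split a 'controller:slot' string into a (controller, slot) tuple of integers."""
    match = SLOT_RE.match(slot)
    if match is None:
        raise ValueError(f"not in controller:slot format: {slot!r}")
    return int(match.group(1)), int(match.group(2))


def nodes_with_slot_in_use(slot, used_slots_by_node):
    """Return the sorted names of the nodes on which the requested slot is already occupied."""
    wanted = parse_slot(slot)
    return sorted(
        node
        for node, used in used_slots_by_node.items()
        if wanted in {parse_slot(s) for s in used}
    )


# Hypothetical inventory: slots already occupied on each VM node.
inventory = {
    "nodeA": ["0:0", "0:1"],
    "nodeB": ["0:0", "1:2"],
}

print(nodes_with_slot_in_use("1:2", inventory))  # ['nodeB'], so 1:2 cannot be used
print(nodes_with_slot_in_use("1:3", inventory))  # [], so 1:3 is free on every node
```

An empty result means the slot is free on every node listed in the inventory and can be chosen for
the disk.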

NOTE: Ensure that each VMware virtual machine configured in the cluster has a unique UUID across the
vCenter or ESXi host.

Supported Configuration
Serviceguard supports the following configurations with VMware disks (RDM/VMFS):

Table 5: Supported Configuration in DLS with VMware disks (RDM/VMFS)

                                                  Cluster Type
Number   VMware Disk Type                         Hybrid Cluster   Fully Virtualized Cluster

1        RDM (Compatibility Mode in Physical)     Yes              Yes
2        RDM (Compatibility Mode in Virtual)      No               Yes
3        VMFS                                     No               Yes

• Configuring SLS based storage (RDM) and DLS based storage (RDM or VMFS) in the same package is not
supported.
• Configuring both DLS based RDM disks and VMFS disks in the same package is not supported.
• A package with SLS based storage (RDM) and another package with DLS based storage (RDM or
VMFS) can coexist in a cluster.
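
The table and the restrictions above can be expressed as a small validation routine. The following
Python sketch is illustrative only. The keys ("rdm_physical", "rdm_virtual", "vmfs", "hybrid",
"virtual") and the function name are labels invented for this example; they are not Serviceguard
parameter values.

```python
# Table 5 as a lookup: (DLS disk type, cluster type) -> supported?
SUPPORTED = {
    ("rdm_physical", "hybrid"): True,
    ("rdm_physical", "virtual"): True,
    ("rdm_virtual", "hybrid"): False,
    ("rdm_virtual", "virtual"): True,
    ("vmfs", "hybrid"): False,
    ("vmfs", "virtual"): True,
}


def check_package_disks(disks, cluster_type):
    """Return a list of rule violations for the disks of one package.

    disks is a list of (linkage, disk_type) pairs, where linkage is "sls" or
    "dls" and disk_type is one of the disk type keys used in SUPPORTED.
    """
    problems = []

    # SLS and DLS storage must not be mixed in the same package.
    linkages = {linkage for linkage, _ in disks}
    if linkages == {"sls", "dls"}:
        problems.append("SLS and DLS storage mixed in one package")

    # DLS based RDM disks and VMFS disks must not be mixed in the same package.
    dls_types = {dtype for linkage, dtype in disks if linkage == "dls"}
    if "vmfs" in dls_types and any(t.startswith("rdm") for t in dls_types):
        problems.append("DLS RDM and VMFS disks mixed in one package")

    # Each DLS disk type must be supported for the cluster type (Table 5).
    for dtype in sorted(dls_types):
        if not SUPPORTED[(dtype, cluster_type)]:
            problems.append(f"{dtype} not supported in a {cluster_type} cluster")

    return problems


print(check_package_disks([("dls", "vmfs")], "virtual"))  # []
print(check_package_disks([("dls", "vmfs")], "hybrid"))
```

An empty list means that the package passes the three checks modeled here; it does not replace the
verification performed by Serviceguard when the configuration is applied.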

Adding Dynamically linked storage (RDM or VMFS) to a package


The following high-level steps are required to configure VMDK based VMFS or RDM disks in Serviceguard
packages using the DLS mechanism.

Procedure

1. Create and populate the Serviceguard Credential Store (SCS) with entries for all the ESXi or
vCenter hosts on which the VMware virtual nodes that will form the cluster are configured. For
more details, see the cmvmusermgmt (1m) manpage.
# cmvmusermgmt -U -H <Esxi/vCenter>

2. Add the appropriate ESX_HOST or VCENTER_SERVER parameter in the cluster configuration file.
For more information about these parameters and their descriptions, see Specifying a
VCENTER_SERVER or ESX_HOST on page 196.
3. Create the package with VMFS module package parameters and apply the configuration. For more
information about how to add VMFS module package parameters, see Configuring DLS based
VMDK (VMFS/RDM) in the Package on page 129.

Reconfiguring from Serviceguard cluster with SLS configured package to DLS configured package
Cluster Reconfiguration
You must first upgrade all cluster nodes to SGLX A.12.00.40. This is done using LAD or rolling upgrades.
The cluster configuration must be updated with the ESX Host or vCenter Server details as listed in Adding
Dynamically linked storage (RDM or VMFS) to a package on page 74. This can be an online operation,
and the cluster services can be up and running. After this, you must populate the SCS. For more
information, see Specifying a VCENTER_SERVER or ESX_HOST on page 196. This can also be done online
and requires no down time.
Storage Layout Reconfiguration
The new dynamically linked storage mechanism (VMFS and RDM based VMDK disks) is the preferred
storage layout for application data in Serviceguard VMware deployments. HPE recommends moving to
the new storage layout.
The following are two of several possible approaches for moving to the new storage layout.
Option 1: Reusing the existing statically linked RDM disks
To reuse the same disks, you must reconfigure the existing statically linked RDM (Physical) disk as
per the requirements of the new DLS based storage layout. For a fully virtualized setup, the disk type
must be RDM in Physical mode or Virtual mode. In a hybrid cluster, only RDM in Physical mode can
be used. For more information, see Prerequisites.
The disks must be reconfigured to meet all the requirements as listed in the Supported Configuration on
page 73. While reconfiguring disks you must follow all the VMware recommendations.
The new storage layout also requires that, for a given disk, the "SCSI Controller Slot Number" where
the disk will be attached in a VM be the same on all the VM nodes in the cluster where the package
may run. This means that when configuring the disks you must ensure that the chosen "SCSI Controller
Slot Number" for a disk is free on all the VMs in the cluster. This was not required with the RDM
based SLS mechanism.
Reconfiguring the storage layout may require a reboot of the VM. This can be done in a rolling fashion,
performing the reconfiguration one VM node at a time.

NOTE: Before reconfiguring ensure that the disks are sufficiently backed up.

Migrate Data to New disks


Alternatively, you can create a new disk and replace the old statically linked RDM disk. To
migrate data, follow all storage and VMware prerequisites. For more information, see Storage
configuration type in a VMware environment on page 70.
Package Reconfiguration
The package reconfiguration procedure requires down time of the package. Add the new VMFS modules
to the packages. For more information about configuring the package parameters, see Configuring DLS
based VMDK (VMFS/RDM) in the Package on page 129.
For example, assume a two-node Serviceguard cluster whose node names are "nodeA" and "nodeB".
The package pkg1 is configured with one SLS configured RDM disk in physical sharing mode and is
up and running on nodeA.
You must complete all the requirements mentioned in the prerequisites before copying data to the
backup disk (RDM/VMFS).

Procedure

1. Configure the Serviceguard Credential Store (SCS) with entries for all the ESXi or vCenter hosts
on which the VMware virtual nodes participating in the cluster are configured. For more
information about adding an ESX host to the SCS, see the cmvmusermgmt manpage.
#cmvmusermgmt -U -H <Esxi or vCenter host>

2. Configure the ESXi or vCenter parameters in the cluster. For more information, see Specifying a
VCENTER_SERVER or ESX_HOST on page 196. You can add the parameters online in the cluster.
3. Halt the configured package.
#cmhaltpkg pkg1

4. Move all other running packages if any to Node B, and then halt node A.
#cmhaltnode

5. Power off the VM (node A).


6. Prepare the DLS based storage layout for application use on Node A.

a. Change the type of the SCSI controller to which the corresponding disk is attached to
"VMware Paravirtual".

b. Change the SCSI Controller Bus sharing mode from Physical to "None".

c. In the existing configuration, if the disk is attached to the same SCSI controller and slot number
on both the VMs (Node A and Node B), then the same can be reused. If the SCSI controller and
slot number are different for the disk on the two VMs, then you must reconfigure to ensure that
the disk attaches to the same SCSI controller and slot number on both the VMs (Node A and
Node B). If the slot numbers that you choose are already used by RDM disks in other
packages, then those packages must also be brought down.

7. Power on the VM (node A).


8. After the VM is up, start the cluster on node (nodeA).
#cmrunnode

9. If there are other packages running on Node B, move them to Node A, and then repeat steps 4
through 8 on each remaining node in the cluster.
10. Get the existing package configuration.
#cmgetconf -p pkg1 > pkg1.ascii

11. You must upgrade the existing package pkg1 to have the new VMFS related parameters. Run the
following command to create a new package ascii file as "pkg1_new.ascii":
#cmmakepkg -u pkg1.ascii -m sg/vmfs pkg1_new.ascii

12. Edit pkg1_new.ascii, and add the values for the VMFS related parameters (VMDK name, Datastore
name, type of VMware virtual disk, and SCSI controller slot number (in X:Y format)). For more
information on parameters, see Configuring DLS based VMDK (VMFS/RDM) in the Package on
page 129.
13. Apply the new package configuration with the VMFS parameters.
#cmapplyconf -P pkg1_new.ascii

14. Restart the package.
#cmrunpkg pkg1

Root Disk Monitoring


Serviceguard for Linux A.12.20.00 introduces support for Root Disk Monitoring. Serviceguard Root Disk
Monitoring provides protection against failure of root disks. A root disk failure leaves the system in an
inconsistent state, and the applications and workloads running on that system become vulnerable to
failures. After you enable the root disk monitoring option, if the system root disk fails or becomes
unresponsive to the applications, Serviceguard completes the following tasks:

1. Detects the failure


2. Sends a message about the failure to all member nodes in the cluster
3. Resets the node

After the node is reset, all the applications and workloads running on that node fail over to another
healthy node. You can enable or disable Root Disk Monitoring at the node level or at the cluster level.
By default, the root disk monitoring option is disabled.

NOTE:

• If a node is the only active node in the cluster and if the root disk fails, Serviceguard will not reset that
node.
• Serviceguard will not take any action on packages that are in the failed state when the root disk
monitoring feature detects a root disk failure and subsequently resets the system. Packages on the
affected node that are not in the failed state become eligible for a Serviceguard initiated
failover operation.
• If you are using mapper devices for the root disk, ensure that the disk names are configured with
user friendly names such as /dev/mapper/mpathX.

For more information about Root Disk Monitoring, see Configuring Root Disk Monitoring parameter.

Responses to Failures
Serviceguard responds to different kinds of failures in specific ways. For most hardware failures, the
response is not user-configurable, but for package and service failures, you can choose the system’s
response, within limits.

Reboot When a Node Fails


The most dramatic response to a failure in a Serviceguard cluster is a system reboot. This allows
packages to move quickly to another node, protecting the integrity of the data.
A reboot is done if a cluster node cannot communicate with the majority of cluster members for the
predetermined time, or under other circumstances such as a kernel hang or failure of the cluster daemon
(cmcld). When this happens, you may see the following message on the console:
DEADMAN: Time expired, initiating system restart.
This case is covered in more detail under What Happens when a Node Times Out. See also Cluster
Daemon: cmcld on page 28.
A reboot is also initiated by Serviceguard itself under specific circumstances; see Responses to
Package and Service Failures on page 79.

What Happens when a Node Times Out


Each node sends a heartbeat message to all other nodes at an interval equal to one-fourth of the value of
the configured MEMBER_TIMEOUT or 1 second, whichever is less. You configure MEMBER_TIMEOUT
in the cluster configuration file; see Cluster Configuration Parameters on page 111. The heartbeat
interval is not directly configurable. If a node fails to send a heartbeat message within the time set by
MEMBER_TIMEOUT, the cluster is re-formed without the node that is no longer sending heartbeat messages.
When a node detects that another node has failed (that is, no heartbeat message has arrived within
MEMBER_TIMEOUT microseconds), the following sequence of events occurs:

1. The node contacts the other nodes and tries to re-form the cluster without the failed node.
2. If the remaining nodes are a majority or can obtain the cluster lock, they form a new cluster without the
failed node.
3. If the remaining nodes are not a majority or cannot get the cluster lock, they halt (system reset).
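
The heartbeat interval rule and the re-formation sequence above can be sketched in a few lines of
Python. This is an illustration of the rules as stated in this section, not Serviceguard code. The
function names are invented for the example, and the MEMBER_TIMEOUT values are arbitrary sample
values in microseconds.

```python
ONE_SECOND_US = 1_000_000


def heartbeat_interval_us(member_timeout_us):
    """One-fourth of MEMBER_TIMEOUT or 1 second, whichever is less (microseconds)."""
    return min(member_timeout_us // 4, ONE_SECOND_US)


def reformation_outcome(surviving_nodes, previous_cluster_size, got_cluster_lock):
    """Decide what a group of surviving nodes does after a node timeout."""
    if 2 * surviving_nodes > previous_cluster_size:
        return "form new cluster"  # the survivors are a majority
    if got_cluster_lock:
        return "form new cluster"  # not a majority, but the cluster lock was obtained
    return "halt (system reset)"


print(heartbeat_interval_us(14_000_000))  # 1000000: capped at 1 second
print(heartbeat_interval_us(2_000_000))   # 500000: one-fourth of the timeout

# Two-node cluster in which each node can only see itself.
print(reformation_outcome(1, 2, got_cluster_lock=True))   # form new cluster
print(reformation_outcome(1, 2, got_cluster_lock=False))  # halt (system reset)
```

In the two-node case neither node is a majority on its own, so the cluster lock alone decides which
node continues.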

Example
Situation. Assume a two-node cluster, with Package1 running on SystemA and Package2 running on
SystemB. Volume group vg01 is exclusively activated on SystemA; volume group vg02 is exclusively
activated on SystemB. Package IP addresses are assigned to SystemA and SystemB respectively.
Failure. Only one LAN has been configured for both heartbeat and data traffic. During the course of
operations, heavy application traffic monopolizes the bandwidth of the network, preventing heartbeat
packets from getting through.
Since SystemA does not receive heartbeat messages from SystemB, SystemA attempts to re-form as a
one-node cluster. Likewise, since SystemB does not receive heartbeat messages from SystemA,
SystemB also attempts to re-form as a one-node cluster. During the election protocol, each node votes
for itself, giving both nodes 50 percent of the vote. Because both nodes have 50 percent of the vote,
both nodes now vie for the cluster lock. Only one node will get the lock.
Outcome. Assume SystemA gets the cluster lock. SystemA re-forms as a one-node cluster. After re-
formation, SystemA will make sure all applications configured to run on an existing clustered node are
running. When SystemA discovers Package2 is not running in the cluster it will try to start Package2 if
Package2 is configured to run on SystemA.
SystemB recognizes that it has failed to get the cluster lock and so cannot re-form the cluster. To release
all resources related to Package2 (such as exclusive access to volume group vg02 and the Package2 IP
address) as quickly as possible, SystemB halts (system reset).

NOTE: If AUTOSTART_CMCLD in /etc/rc.config.d/cmcluster ($SGAUTOSTART) is set to zero,
the node will not attempt to join the cluster when it comes back up.

For more information on cluster failover, see the white paper Optimizing Failover Time in a Serviceguard
Environment (version A.11.19 or later) at http://www.hpe.com/info/linux-serviceguard-docs (Select
“White Papers”). For troubleshooting information, see Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low.

Responses to Hardware Failures


If a serious system problem occurs, such as a system panic or physical disruption of the SPU's circuits,
Serviceguard recognizes a node failure and transfers the packages currently running on that node to an
adoptive node elsewhere in the cluster. (System multi-node and multi-node packages do not fail over.)
The new location for each package is determined by that package's configuration file, which lists primary
and alternate nodes for the package. Transfer of a package to another node does not transfer the
program counter. Processes in a transferred package will restart from the beginning. In order for an
application to be expeditiously restarted after a failure, it must be “crash-tolerant”; that is, all processes in
the package must be written so that they can detect such a restart. This is the same application design
required for restart after a normal system crash.
In the event of a LAN interface failure, bonding provides a backup path for IP messages. If a heartbeat
LAN interface fails and no redundant heartbeat is configured, the node fails with a reboot. If a monitored
data LAN interface fails, the node fails with a reboot only if node_fail_fast_enabled (described further
under Configuring a Package: Next Steps) is set to yes for the package. Otherwise any packages
using that LAN interface will be halted and moved to another node if possible (unless the LAN recovers
immediately; see When a Service or Subnet Fails or Generic Resource or a Dependency is Not Met
on page 52).
Disk monitoring provides additional protection. You can configure packages to be dependent on the
health of disks, so that when a disk monitor reports a problem, the package can fail over to another node.
See Creating a Disk Monitor Configuration on page 254.
Serviceguard does not respond directly to power failures, although a loss of power to an individual cluster
component may appear to Serviceguard like the failure of that component, and will result in the
appropriate switching behavior. Power protection is provided by Hewlett Packard Enterprise-supported
uninterruptible power supplies (UPS).

Responses to Root Disk failures


If the root disk is unresponsive to the applications or if the root disk completely fails, Serviceguard detects
the failure. It sends a message about the failure to all member nodes of the cluster and then resets the
node. An error message ERROR: Root Disk failure detected for node <nodename> is
logged in the syslog of all member nodes in the cluster.
After the node with the faulty root disk is reset, all the applications and workloads running on that node fail
over to another healthy node configured in the cluster.
You can use the error messages from Serviceguard on the member nodes to analyze the cause of the
system failure. These messages will not be available on the node that has failed, as the system level log
files are not visible due to root disk failure.
Sometimes the node detects a root disk failure before Serviceguard does and resets itself.
Serviceguard does not send out any message to member nodes about the failure when the node resets
by itself. The failed node might not contain the failure message in the local log files, as the root disk,
which usually saves the system log files, has failed.

Responses to Generic Resources Failures at cluster level


In a cluster that is configured with a generic resource and is running, failure of a resource doesn’t trigger
any action at cluster level unless the same generic resource is configured in the package. However the
generic resource status will be shown as down at cluster level.
When generic resource monitoring is configured in the cluster and a package is configured to use the
generic resource at the package level, failure of the generic resource prompts the Serviceguard Package
Manager to take appropriate action based on the style of the package. For more information, see
Responses to Package and Generic Resources Failures.

Responses to Package and Service Failures


In the default case, the failure of a package, or of a generic resource or service within the package,
causes the package to shut down by running the control script with the stop parameter, and then to
restart on an alternate node. A package will also fail if it is configured to have a
dependency on another package, and that package fails.
You can modify this default behavior by specifying that the node should halt (system reset) before the
transfer takes place. You do this by setting failfast parameters in the package configuration file.
In cases in which package shutdown might hang, leaving the node in an unknown state, failfast options
can provide a quick failover, after which the node will be cleaned up on reboot. Remember, however, that
a system reset causes all packages on the node to halt abruptly.
The settings of the failfast parameters in the package configuration file determine the behavior of the
package and the node in the event of a package or resource failure:

• If service_fail_fast_enabled is set to yes in the package configuration file, Serviceguard will reboot
the node if there is a failure of that specific service.
• If node_fail_fast_enabled is set to yes in the package configuration file, and the package fails,
Serviceguard will halt (reboot) the node on which the package is running.
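
The effect of the two failfast parameters can be summarized as a small decision function. The
following Python sketch only restates the two bullets above. The event labels "service_failure" and
"package_failure" and the function name are invented for this illustration, and the sketch does not
model a service failure that goes on to cause a package failure.

```python
def failfast_response(event, service_fail_fast_enabled=False,
                      node_fail_fast_enabled=False):
    """Return the response to one failure event, given the failfast settings."""
    if event not in ("service_failure", "package_failure"):
        raise ValueError(f"unknown event: {event!r}")
    # service_fail_fast_enabled applies to a failure of that specific service.
    if event == "service_failure" and service_fail_fast_enabled:
        return "reset node"
    # node_fail_fast_enabled applies when the package itself fails.
    if event == "package_failure" and node_fail_fast_enabled:
        return "reset node"
    # Default behavior: halt the package and restart it elsewhere.
    return "halt package and restart it on an alternate node"


print(failfast_response("service_failure", service_fail_fast_enabled=True))
print(failfast_response("package_failure"))
```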

For more information, see Package Configuration Planning and Configuring Packages and Their
Services.

Responses to Package and Generic Resources Failures
In a package that is configured with a generic resource and is running, failure of a resource prompts the
Serviceguard Package Manager to take appropriate action based on the style of the package.
For failover packages, the package is halted on the node where the resource failure occurred and started
on an available alternative node. For multi-node packages, failure of a generic resource causes the
package to be halted only on the node where the failure occurred.

• In case of simple resources, failure of a resource must trigger the monitoring script to set the status of
a resource to 'down' using the cmsetresource command.

• In case of extended resources, the value fetched by the monitoring script can be set using the
cmsetresource command.
The Serviceguard Package Manager evaluates this value against the generic_resource_up_criteria set
for a resource in the packages where it is configured. If the value that is set (current_value) does not
satisfy the generic_resource_up_criteria, then the generic resource is marked as 'down' on that node.

NOTE: If a simple resource is down on a particular node, it is down on that node for all the packages
using it, whereas an extended resource may be up on a node for one package and down for another
package, since its status depends on the generic_resource_up_criteria of each package.
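
The per-package evaluation of an extended resource can be illustrated with a short Python sketch.
It is an illustration only: the criteria strings below (a comparison operator followed by an
integer) are an assumed, simplified syntax, and the package names and values are hypothetical. For
the actual syntax of generic_resource_up_criteria, see Parameters for Configuring Generic Resources.

```python
import operator

# Two-character symbols come first so that ">=" is not mistaken for ">".
OPERATORS = [
    ("<=", operator.le), (">=", operator.ge),
    ("==", operator.eq), ("!=", operator.ne),
    ("<", operator.lt), (">", operator.gt),
]


def resource_is_up(current_value, up_criteria):
    """Return True if current_value satisfies a criteria string such as '>=50'."""
    text = up_criteria.strip()
    for symbol, compare in OPERATORS:
        if text.startswith(symbol):
            return compare(current_value, int(text[len(symbol):]))
    raise ValueError(f"unrecognized criteria: {up_criteria!r}")


# Hypothetical packages that share one extended resource on the same node.
criteria_by_package = {"pkgA": ">=50", "pkgB": "<50"}
current_value = 60  # value most recently set by the monitoring script

for package, criteria in criteria_by_package.items():
    state = "up" if resource_is_up(current_value, criteria) else "down"
    print(package, state)
```

With a current value of 60, the resource is up for pkgA and down for pkgB on the same node, which
is the situation the NOTE describes.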

Additionally, in a running package configured with a generic resource:

• Any failure of a generic resource of evaluation type "before_package_start" configured in a package
will not disable the node switching for the package.
• Any failure of a generic resource of evaluation type "during_package_start" configured in a package
will disable the node switching for the package.

Choosing Switching and Failover Behavior provides advice on choosing appropriate failover behavior.
See Parameters for Configuring Generic Resources.

Service Restarts
You can allow a service to restart locally following a failure. To do this, you indicate a number of restarts
for each service in the package control script. When a service starts, the variable service_restart is set in
the service’s environment. The service, as it executes, can examine this variable to see whether it has
been restarted after a failure, and if so, it can take appropriate action such as cleanup.

Network Communication Failure


An important element in the cluster is the health of the network itself. As it continuously monitors the
cluster, each node listens for heartbeat messages from the other nodes confirming that all nodes are able
to communicate with each other. If a node does not hear these messages within the configured amount of
time, a node timeout occurs, resulting in a cluster re-formation and later, if there are still no heartbeat
messages received, a reboot. See What Happens when a Node Times Out.

Planning and Documenting an HA Cluster
Building a Serviceguard cluster begins with a planning phase in which you gather and record information
about all the hardware and software components of the configuration.
This chapter assists you in the following planning areas:

• General Planning
• Hardware Planning
• Power Supply Planning
• Cluster Lock Planning
• Volume Manager Planning
• Cluster Configuration Planning
• Package Configuration Planning

Blank Planning Worksheets on page 380 contains a set of blank worksheets which you may find useful
as an offline record of important details of the configuration.

NOTE: Planning and installation overlap considerably, so you may not be able to complete the
worksheets before you proceed to the actual configuration. In that case, fill in the missing elements to
document the system as you proceed with the configuration.

Subsequent chapters describe configuration and maintenance tasks in detail.

General Planning
A clear understanding of your high availability objectives will help you quickly define your hardware
requirements and design your system. Use the following questions as a guide for general planning:

1. What applications must continue to be available in the event of a failure?


2. What system resources (processing power, networking, SPU, memory, disk space) are needed to
support these applications?
3. How will these resources be distributed among the nodes in the cluster during normal operation?
4. How will these resources be distributed among the nodes of the cluster in all possible combinations of
failures, especially node failures?
5. How will resources be distributed during routine maintenance of the cluster?
6. What are the networking requirements? Are all networks and subnets available?
7. Have you eliminated all single points of failure? For example:

• network points of failure.


• disk points of failure.
• electrical points of failure.
• application points of failure.

Serviceguard Memory Requirements
Serviceguard requires approximately 15.5 MB of lockable memory.

Planning for Expansion


When you first set up the cluster, you indicate a set of nodes and define a group of packages for the initial
configuration. At a later time, you may wish to add additional nodes and packages, or you may wish to
use additional disk hardware for shared data storage. If you intend to expand your cluster without having
to bring it down, you need to plan the initial configuration carefully. Use the following guidelines:

• Set the Maximum Configured Packages parameter (described later in this chapter under Cluster
Configuration Planning) high enough to accommodate the additional packages you plan to add.
• Networks should be pre-configured into the cluster configuration if they will be needed for packages
you will add later while the cluster is running. See LAN Information.

See Cluster and Package Maintenance for more information about changing the cluster configuration
dynamically, that is, while the cluster is running.

Using Serviceguard with Virtual Machines


This section describes the various configurations for Serviceguard for Linux clusters using physical
machines, VMware virtual machines running on ESX server, RHEV guests, Hyper-V virtual machines
running on a Windows Server, and Kernel-based Virtual Machine (KVM) guests built on the KVM hypervisor
provided with Red Hat Enterprise Linux 6, 7, and SUSE Linux Enterprise Server 12 (SLES 12) to
provide high availability for applications.
Serviceguard for Linux supports using VMware, RHEV, Hyper-V, and KVM guests as cluster nodes. In this
configuration, the virtual machine is a member of a Serviceguard cluster, allowing failover of application
packages between this node and other physical or VM nodes in the cluster.
Running Serviceguard for Linux in the virtual machines provides a significant level of extra protection.
Serviceguard fails over an application when any of several failures occurs, including:

• Failure of the application


• Failure of networking required by the application
• Failure of storage
• An operating system “hang” or the failure of virtual machine itself
• Failure of the physical machine

In addition, it provides the following advantages:

• Minimize both planned and unplanned downtime of VM guests


• Serviceguard for Linux rolling upgrade feature allows for less planned downtime

Rules and Restrictions


Using VMware guests as cluster nodes

• Hewlett Packard Enterprise recommends that you use NPIV when configuring more than one guest
from each host as cluster nodes.
• NPIV is not mandatory for configuring more than one guest from each host as cluster nodes if all
Serviceguard packages configured in the cluster use only dynamically linked storage (DLS).

Using KVM guests as cluster nodes

• Lock LUN is not supported on iSCSI storage device. Hence, Quorum server is the only supported
quorum mechanism that can be used for arbitration.
• Live migration of KVM guests is not supported when the KVM guests are configured as Serviceguard
cluster nodes.

Using Hyper-V guests as cluster nodes

Lock LUN is not supported on iSCSI storage device. Hence, Quorum server is the only supported quorum
mechanism that can be used for arbitration.

Using RHEV guests as cluster nodes

• Only iSCSI storage devices are supported.


• Lock LUN is not supported on iSCSI storage devices. Hence, Quorum server is the only supported
quorum mechanism that can be used for arbitration.

Supported cluster configuration options


Following are the supported cluster configuration options when using VMware, RHEV, Hyper-V, or KVM
guests as cluster nodes:

• Cluster with VMware, RHEV, Hyper-V, or KVM guests from a single host as cluster nodes (cluster-in-a-
box; not recommended)

NOTE: This configuration is not recommended because the host is a single point of failure: failure of
the host brings down all the nodes in the cluster.

• Cluster with VMware, RHEV, Hyper-V, or KVM guests from multiple hosts as cluster nodes
• Cluster with VMware, RHEV, Hyper-V, or KVM guests and physical machines as cluster nodes

NOTE:

• Guests running on different hypervisors (VMware, RHEV, Hyper-V, or KVM) must not be
configured as cluster nodes in the same cluster.
• A cluster that uses VMware guests from a single host as cluster nodes must be avoided in a
Serviceguard-xdc environment. For more information about Serviceguard-xdc support with VMware
virtual machines, see HPE Serviceguard Extended Distance Cluster for Linux A.12.00.40 Deployment
Guide.
• KVM guests cannot be used as cluster nodes in the Serviceguard-xdc environment.

For more information about how to integrate VMware and KVM guests as Serviceguard cluster nodes,
see the following white papers at http://www.hpe.com/info/linux-serviceguard-docs:

• Using HPE Serviceguard for Linux with VMware Virtual Machines
• Using HPE Serviceguard for Linux with Red Hat KVM and RHEV Guests

Serviceguard support for VMware Migrate (vMotion)


The VMware vMotion feature enables the live migration of running virtual machines from one physical
server to another with zero downtime. This ensures continuous service availability and complete
transaction integrity.
Serviceguard Manager B.12.00.50 also introduces the capability to perform vMotion of Serviceguard
cluster nodes (VMs) with a single click. This feature enables you to initiate vMotion of Serviceguard
cluster nodes (VMware VMs) from the Serviceguard Manager GUI. Before initiating vMotion, it also
performs all the required prechecks. This feature simplifies the migration of cluster nodes (VMs) by
automating the detaching and reattaching of nodes before and after vMotion.

NOTE:
When migrating a VM cluster node for any maintenance activities, you might find that more than half of
the cluster nodes are running on one host. In this situation, the host becomes a single point of failure,
because failure of the host would cause the entire cluster to go down. To resolve this problem, you should
reset the configuration to equal node distribution across the hosts as soon as possible.
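
The condition described in this note can be checked with a short script. The following Python sketch
is illustrative only. The function name and the node and host names are hypothetical, and the mapping
of cluster nodes to hosts would have to be obtained from vCenter or from your own records.

```python
from collections import Counter


def overloaded_hosts(node_to_host):
    """Return the hosts that run more than half of the cluster nodes."""
    total_nodes = len(node_to_host)
    nodes_per_host = Counter(node_to_host.values())
    return sorted(
        host for host, count in nodes_per_host.items() if 2 * count > total_nodes
    )


# Hypothetical placement after node2 was migrated to esx1 for maintenance.
placement = {"node1": "esx1", "node2": "esx1", "node3": "esx2"}
print(overloaded_hosts(placement))  # ['esx1']

# Placement after node2 is migrated back and the distribution is even again.
placement["node2"] = "esx2"
placement["node4"] = "esx1"
print(overloaded_hosts(placement))  # []
```

Any host reported by the function is a single point of failure for the cluster until the nodes are
redistributed.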

Prerequisites
vMotion is supported for VMs used as Serviceguard cluster nodes when the following prerequisites
are met:

• The boot image or boot disk of the guests must reside on shared disks; they must be accessible from
both the source and the destination hosts.
• The source and destination hosts must be managed by a common VMware vCenter server instance.
The same vCenter must be configured in the Serviceguard cluster.

◦ To configure the vCenter in Serviceguard cluster using Serviceguard Manager GUI, see Editing a
cluster section in Serviceguard Manager Online Help.

• An SLS environment does not otherwise require vCenter to be configured, but configuring vCenter is
a prerequisite for vMotion. To configure vCenter in the Serviceguard cluster, follow these steps:

◦ Create and populate the Serviceguard Credential Store (SCS) with an entry for the vCenter that
manages the VMware virtual nodes in the existing Serviceguard cluster. For more details, see the
cmvmusermgmt (1m) manpage.
◦ Add the appropriate VCENTER_SERVER parameter in the cluster configuration file. For more
information about this parameter and its description, see Specifying a VCENTER_SERVER or
ESX_HOST.

• When using Statically Linked Storage (SLS), it must be configured with Fiber Channel or iSCSI; all the
configurations must be completed as mentioned in the Shared Storage Configuration for vMotion when
using SLS and NPIV section in HPE Serviceguard for Linux with VMware virtual machines whitepaper
available at http://www.hpe.com/info/linux-serviceguard-docs.
• Ensure that required cluster and package resources (network and storage) are available on the target
hosts.

• HPE recommends using DLS with vMotion. For more information about how DLS works and how
to configure it, see Storage configuration type in a VMware environment on page 70.

Migrating a node using Serviceguard Manager


To migrate a VM that is a Serviceguard cluster node, log in to Serviceguard Manager from any
node that is on the same subnet as the node to be migrated, and initiate vMotion of any of the cluster
nodes in the management view. The nodes can be in the running or halted state. When you log in with
nonroot credentials, you are prompted for the root password.
Follow these steps to migrate a cluster node (VMware VM) configured with vCenter:

1. From the Main menu, select Nodes.


2. Select the node to migrate. The node's overview information is displayed.

Figure 27: Migrate of Serviceguard cluster node


3. Select Actions → Migrate.

• The Migrate overlay lists the discovered destination ESXi hosts in the data center to which the
node can be migrated. Choose the destination host. On successful migration, a notification
appears on the Activity page.

NOTE:

◦ For a nonroot user, an overlay appears for root authentication. On successful
authentication, the Migrate overlay lists the ESXi hosts.
◦ If the destination ESXi hosts are only partially discovered or not discovered at all, you must key in
the destination ESXi host to which the node is to be migrated.

Figure 28: Selecting a destination host for migrate

4. The node migrates from the source host to the selected destination host. The status of the operation
is logged in the Activity sidebar.

Figure 29: Migrate operation Status

NOTE: When an active cluster node is migrated and reattachment of the node after migration fails,
check the package and cluster logs for failures, and analyze and fix the issues. Then select
Actions → Run to restart the node and reattach it to the cluster.

Configuring Serviceguard and VMware HA in a cluster
VMware has a high-availability (HA) clustering product called VMware HA. It can provide some degree of
protection from failures and ensures that the VMs from the failed host are restarted on other ESX or ESXi
hosts. Serviceguard for Linux that runs in the virtual machines provides extra protection. Serviceguard for
Linux fails over an application when any of a large number of failures occurs, including:

• A failure of the application
• A failure of networking required by the application
• Failure of storage
• An OS failure of the virtual machine itself
• Failure of the physical machine

Serviceguard and VMware HA can coexist in the same cluster. In a Statically Linked Storage (SLS)
environment, configure Serviceguard and VMware HA in the cluster. In a Dynamically Linked Storage
(DLS) environment, you must configure the appropriate VCENTER_SERVER parameter in the cluster
configuration file. For more information about this parameter, see Specifying a
VCENTER_SERVER or ESX_HOST.

Serviceguard support for VMware DRS


VMware’s Distributed Resource Scheduler (DRS) manages the allocation of physical resources to a set of
virtual machines deployed in a cluster of hosts, each running VMware's ESXi hypervisor. DRS performs
intelligent load balancing in real time to maximize infrastructure performance while allowing
enforcement of user-defined policies. As a result, DRS identifies the host that is best suited to place
a VM and also balances the VM load across multiple hosts in a DRS cluster.

Importance of MEMBER_TIMEOUT interval in the deployment of HPE Serviceguard in a VMware DRS environment
When using virtual machines of a DRS-enabled VMware cluster in Serviceguard, it is important to
understand the requirement for the MEMBER_TIMEOUT parameter of the Serviceguard cluster for optimal
operation. For more information about the MEMBER_TIMEOUT parameter, see Cluster Configuration
Parameters.
When VMware DRS initiates vMotion, the virtual machine can stall for a few seconds. In the
Serviceguard environment, if the stall time exceeds the MEMBER_TIMEOUT interval, the guest cluster
considers the node down, which can lead to unnecessary cluster re-formations and failover of
applications. For more information about re-formations caused by a low MEMBER_TIMEOUT value, see
Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low. In most
configurations, the default value of 14 seconds is sufficient for vMotion in a DRS environment
(during migration recommendations). However, under system or network constraints, the VM stall time of
a vMotion initiated by DRS may exceed the default value specified for MEMBER_TIMEOUT.
In this condition, MEMBER_TIMEOUT must be increased to a value that allows the migration
to succeed in the VMware DRS environment. For more information about how to change the
MEMBER_TIMEOUT value, see Modifying the MEMBER_TIMEOUT Parameter.
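
The relationship between stall time and MEMBER_TIMEOUT can be illustrated with a short Python sketch. It is not a Serviceguard tool: the function names, the sample stall times, and the 2-second safety margin are assumptions introduced for illustration, and the only value taken from this section is the 14-second default.

```python
DEFAULT_MEMBER_TIMEOUT_S = 14.0  # default value stated in this section


def stall_triggers_reformation(stall_s, member_timeout_s=DEFAULT_MEMBER_TIMEOUT_S):
    """Return True if a VM stall is long enough for the node to be considered down."""
    return stall_s > member_timeout_s


def suggest_member_timeout(observed_stalls_s, margin_s=2.0):
    """Suggest a timeout that covers the worst observed stall plus a margin.

    The margin is a hypothetical safety allowance; the result is never
    lower than the default.
    """
    worst = max(observed_stalls_s, default=0.0)
    return max(DEFAULT_MEMBER_TIMEOUT_S, worst + margin_s)
```

For example, with observed stalls of 2.0 and 15.5 seconds, the sketch suggests 17.5 seconds. A larger value also lengthens the time before a node that has really failed is declared down, so raise the timeout only as far as the migrations require.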

Prerequisites to deploy Serviceguard in a VMware DRS environment

• All hosts that run Serviceguard virtual machines must be managed by a vCenter Server system. For
more information about the vCenter Server system, see Specifying a VCENTER_SERVER or
ESX_HOST.
• All the virtual machines of the Serviceguard cluster must run on hosts that are part of the same DRS
cluster.

NOTE: When you configure a Serviceguard cluster that combines VMs from DRS-enabled hosts
with VMs from non-DRS-enabled hosts, an application can fail over to VMs
running on non-DRS hosts. If this is not the desired behavior, HPE recommends configuring the
application package only with VM nodes that are on DRS-enabled hosts. For more
information about configuring nodes for a package, see configured_node under the failover_policy
section.

• All hosts which are part of VMware DRS cluster must have shared storage.
• All virtual machines which are part of Serviceguard cluster must have shared storage configuration as
defined by Serviceguard.
• The VMware HA and FT features are disabled in the VMware cluster.
• Serviceguard packages are configured with the dynamically linked storage (DLS) disk access mode.
For more information about configuring DLS storage, see Storage configuration type in a VMware
environment.

• (Optional) VMware tools are installed on all virtual machines of the cluster.

For more information about additional requirements for VMware DRS and vMotion, see Serviceguard
support for VMware Migrate (vMotion).
For more information about VMware DRS clusters and VMware's other requirements, see the VMware vSphere
Resource Management Guide.

Configuration settings of DRS feature in a VMware cluster


All the hosts that are running Serviceguard cluster virtual machines can be part of a VMware cluster with
the DRS feature enabled. For more information about how to enable DRS in a VMware cluster environment,
see the VMware documentation.
DRS automation level
When you enable the DRS feature on a VMware cluster, you can select the level of automation that
DRS uses for migrating virtual machines, such as automatic or manual migration. The
virtual machines migrate across multiple hosts in a DRS cluster based on a threshold value. This
threshold is a measure of the imbalance in resource utilization, such as CPU and memory, across hosts.
A VMware DRS cluster supports three levels of automation:

• Manual
• Partially Automated
• Fully Automated

A Serviceguard cluster in a VMware DRS cluster supports all three DRS automation levels. The
DRS automation level specifies whether the system applies the recommendations automatically
(fully automated mode) or whether you apply the recommendations yourself (manual mode) before the
virtual machine is migrated.

For more information about automation level and DRS migration threshold settings, see vSphere
Resource Management Guide from VMware.

Affinity rules in VMware DRS cluster


You can control the placement of virtual machines on hosts within a cluster by using affinity rules. DRS
supports two types of affinity rules.
VM-to-VM affinity rules
VM-to-VM anti-affinity rules define a set of VMs that are to be kept on separate hosts. These rules are
typically used for availability and are mandatory, so DRS does not make any recommendations
that violate them.
During DRS load balancing, all virtual machines could be moved onto a single host.
This results in a cluster-in-a-box deployment model of Serviceguard, which is not
recommended because the host becomes a single point of failure (SPOF). To avoid this issue, define
VM-to-VM anti-affinity rules in the DRS environment to keep the VMs of the Serviceguard cluster on
different hosts. This increases the availability of the applications running on the Serviceguard clustered
VMs.
For reasons such as improving performance when applications depend on each other across VMs, you might
keep a set of virtual machines on a single host. In that case, define VM-to-VM affinity rules in the DRS
environment. When you set VM-to-VM affinity rules, you must ensure that no more than half of the VMs of
the Serviceguard cluster are on a single host.
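
The two placement goals above can be checked against a planned layout with a small Python sketch. The VM names, host names, and function names are hypothetical, and the placement dictionary stands in for whatever inventory data you have; nothing here queries vCenter or DRS.

```python
from collections import Counter


def hosts_with_multiple_vms(placement):
    """Hosts that break a strict anti-affinity goal (more than one cluster VM)."""
    counts = Counter(placement.values())
    return sorted(host for host, n in counts.items() if n > 1)


def hosts_holding_majority(placement):
    """Hosts that hold more than half of the Serviceguard cluster VMs."""
    counts = Counter(placement.values())
    total = len(placement)
    return sorted(host for host, n in counts.items() if 2 * n > total)


# Hypothetical placement: cluster VM name -> ESXi host name
placement = {"sg-node1": "esx-a", "sg-node2": "esx-a", "sg-node3": "esx-b"}
```

With this sample placement, both functions report esx-a: it runs two of the three cluster VMs, which is more than half.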
VM-to-host affinity rules
A VM-to-host affinity rule specifies whether or not the selected virtual machine DRS group can run on the
members of a specific host DRS group. In a Serviceguard environment, you can define these rules so that
two virtual machines of the Serviceguard cluster do not run on the same host of a DRS cluster group,
which prevents a SPOF situation.
For more information about how to create VM-to-VM affinity rules and VM-to-host affinity rules with a host
DRS group and a virtual machine DRS group, see the VMware vSphere Resource Management Guide.

vSphere DPM for a DRS cluster in Serviceguard environment


The vSphere Distributed Power Management (DPM) feature allows a DRS cluster to reduce its power
consumption by powering hosts on and off based on cluster resource utilization.
vSphere DPM monitors the cumulative demand of all virtual machines in the cluster for memory and CPU
resources and compares this to the total available resource capacity of all hosts in the cluster. If excess
capacity is found, vSphere DPM places one or more hosts in standby mode and powers them off after
migrating their virtual machines to other hosts. Conversely, when capacity is deemed to be inadequate,
DRS brings hosts out of standby mode (powers them on) and uses vMotion to migrate virtual machines to
them. When making these calculations, vSphere DPM considers not only current demand, but it also
honors any user-specified virtual machine resource reservations.
Enabling the DPM feature on a VMware DRS cluster in a Serviceguard environment is supported at all
DPM automation levels.

NOTE: The VMware DPM feature is supported only if the standby hosts are part of the VMware DRS cluster.

Enhanced vMotion Compatibility (EVC) with VMware DRS


You can use EVC to ensure vMotion compatibility for the hosts in a cluster. EVC ensures that all hosts in
a cluster present the same CPU feature set to virtual machines, even if the actual CPUs on the hosts
differ. This prevents migrations with vMotion from failing due to incompatible CPUs.

The Serviceguard environment supports EVC in a VMware DRS configuration when the VMs are used as
Serviceguard cluster nodes.

Serviceguard Metrocluster and Continentalcluster support with VMware DRS


If you enable the DRS feature in a VMware cluster whose virtual machines are Serviceguard members of a
Metrocluster or a Continentalcluster, you must configure the VMware DRS cluster with the following
considerations.

• The VMware DRS cluster is configured within the hosts of a single data center site.
• The VM-to-VM affinity rules or VM-to-host affinity rules are configured such that the virtual machines
stay within the data center site.

Limitations of VMware DRS in a Serviceguard cluster

• VMware DRS is supported only with Serviceguard DLS storage configurations that have the vCenter
details (VCENTER_SERVER parameter) in the cluster configuration file. SLS storage configurations are
not supported.
• Storage DRS is not supported.
• VMware DRS with the HA and FT features is not supported in a Serviceguard environment.
• VMware DRS in a Serviceguard cluster is not supported with a lock LUN.
• Configuring a VMware DRS cluster across Metrocluster or Continentalcluster data centers is not
supported.

For a complete description of VMware DRS features and settings, see the VMware documentation.

Using Serviceguard with VMware Site Recovery Manager


Site Recovery Manager (SRM) is a site migration and disaster recovery solution from VMware that
provides automated migration, recovery, and validation of virtual infrastructure across sites. It is integrated
with VMware vCenter Server and VMware vSphere Web Client. With SRM, the virtual infrastructure can
be prepared for testing, migration, and recovery between the protected and recovery vCenter Server sites.

How Site Recovery Manager works


VMware Site Recovery Manager builds, manages, tests, and executes disaster recovery procedures for
virtual infrastructure implementations.

• It uses the storage replication mechanism between the protected site and the recovery site for disaster
recovery of the protected site virtual infrastructure.
• It helps create virtual machine groups that are administered or recovered together.
• It helps create a recovery plan for the virtual machines located at the protected site. Executing the
recovery plan recovers the virtual machines at the recovery site.
• It lets you test the recovery plan at any time at the recovery site to assess the preparedness of the
recovery site.

Configuring HPE Serviceguard with VMware SRM


An HPE Serviceguard cluster can be configured on virtual machines integrated with VMware SRM for
disaster recovery. Serviceguard with SRM-integrated VMs requires data replication to be configured with
storage arrays for recovery of virtual machines and workloads.

By configuring HPE Serviceguard with SRM, you can monitor the applications running on the VMs and
provide failover for the applications instead of just failing over the VMs. Configuring HPE Serviceguard
in an SRM environment protects the virtual machine workloads against failures regardless of
where they operate. It also facilitates the seamless movement of workloads across
sites integrated with Serviceguard for High Availability protection.

Installing and configuring VMware SRM


Install and configure VMware SRM and vCenter Server at both the protected site and the recovery site. For
more information about how to install and configure VMware SRM, see the VMware Site Recovery
Manager Installation and Configuration Guide.

Site Recovery Manager replication technologies


SRM supports two replication technologies: Array-Based Replication (ABR) and vSphere
Replication.
With either replication technology, the virtual machines must be configured for
replication before the sites are protected by Site Recovery Manager.

NOTE: Currently only ABR is supported as the data replication mechanism to configure Serviceguard and
SRM.

Prerequisites to deploy Serviceguard in VMware SRM environment


Ensure that you have completed the following tasks before you deploy Serviceguard in a
VMware SRM environment.

• Install the Serviceguard Enterprise license on all the nodes of the Serviceguard cluster.
• Install and configure VMware SRM and vCenter Server at both the protected and the recovery sites.
• Install an appropriate Storage Replication Adapter (SRA) on the SRM server at the protected and
recovery sites.
• Set up a protection group at the protected site for the virtual machines that will form the Serviceguard
cluster.
• Form a Serviceguard cluster using the protected VMs of a single protection group.
• Configure all the VMs in the protection group with the same priority in the SRM recovery plan. This
ensures that all the VMs in a Serviceguard cluster fail over at the same time.
• Install VMware tools on all the nodes configured in the Serviceguard cluster.
• Add the IP address, FQDN, and short name of both the protected and recovery site vCenter servers to
the /etc/hosts file on all the Serviceguard cluster nodes.
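
As an illustration of the last prerequisite, the following Python sketch formats /etc/hosts lines in the usual order of IP address, FQDN, and short name. The addresses and host names are placeholders from documentation ranges, and the helper name is hypothetical; substitute the values of your own vCenter servers.

```python
def hosts_entry(ip, fqdn):
    """Format one /etc/hosts line: IP address, FQDN, then the short name."""
    short_name = fqdn.split(".", 1)[0]
    return f"{ip}\t{fqdn}\t{short_name}"


# Hypothetical vCenter servers for the protected and recovery sites
vcenters = [
    ("192.0.2.10", "vc-protected.example.com"),
    ("198.51.100.10", "vc-recovery.example.com"),
]

lines = [hosts_entry(ip, fqdn) for ip, fqdn in vcenters]
```

The resulting lines would be appended to /etc/hosts on every cluster node, so that both vCenter servers resolve by FQDN and by short name from either site.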

Scenarios for configuring SG in VMware SRM environment


In relation to the protected site, you can configure the recovery site to use the same or a different IP
address topology, as determined by your infrastructure requirements.
You can use capabilities such as stretched VLANs or relocatable VLANs to have the same IP
addresses in the protected and recovery sites.
You can also have a completely different set of IP addresses at the recovery site for the VMs. Site
Recovery Manager allows the recovery plans to automatically assign valid IP addresses to virtual machines
based on their location of operation. If your infrastructure uses different IP addresses between the sites,
the Serviceguard configuration requires additional steps, which are described in the section
Non-uniform IP addresses between protected and recovery sites on page 93.
Uniform IP addresses between protected and recovery site
Complete the following configuration steps when the IP addresses are the same across the sites.
Network configuration requirements
A VMware SRM environment that is configured with a uniform IP network across the protected and the
recovery sites does not require any special network configuration steps for recovery of
VMs configured with a Serviceguard cluster.
For more information about networking, see Using Serviceguard with Virtual Machines.
Storage configuration requirements
In the Serviceguard environment, you can configure the storage for Statically Linked Storage (SLS) or
Dynamically Linked Storage (DLS). For more information about shared storage configuration in VMware,
see Storage configuration type in a VMware environment.
Configuring Statically Linked Storage type
In the SRM environment, if you have configured the packages for Statically Linked Storage (SLS) through
Raw Device Mapping (RDM), set AUTOSTART_CMCLD to 1 in the $SGAUTOSTART file on the
protected site cluster nodes. This automatically starts the cluster and packages at the recovery site when
SRM initiates the planned migration or disaster recovery. For more information about automatic cluster
startup, see Automatic Cluster Startup or Setting up Autostart Features.
For further configuration details on RDM, refer to the white paper Using HPE Serviceguard for Linux with
VMware virtual machines. For more information about how the RDM configuration is done in a VMware SRM
environment, see the VMware documentation.

NOTE: While configuring RDM, ensure that the RDM mapping file is stored on the datastore that is
created on the RCVG LUN. That is, store the RDM mapping file on a VMFS volume covered by a replicated
LUN. If the mapping file is not on such a VMFS volume, it will not be
available at the recovery site for the recovery VM to use.

Configuring Dynamically Linked Storage type

Prerequisites

• In the SRM environment, if you have configured the packages with DLS through Virtual Machine File
System (VMFS) or with a mix of SLS and DLS configurations, do not set AUTOSTART_CMCLD
to 1 in the $SGAUTOSTART file in the protected site cluster configuration.

• The DLS configuration is supported only with the VCENTER_SERVER option in the cluster configuration. All
the hosts that run Serviceguard virtual machines in the SRM environment must be managed by a vCenter
Server system. For more information about the vCenter Server system, see Specifying a
VCENTER_SERVER or ESX_HOST.
• When a DLS disk configuration is used for application data, the DLS disk must be configured on a
replicated LUN. For more information about storage configuration with DLS in a VMware environment, see
Storage configuration type in a VMware environment.
• When you run the recovery plan, the datastore names used by packages might get renamed with the
prefix snap-xxx. This causes the DLS package start process to fail. For successful recovery of
the packages with a DLS configuration, you must complete the following steps to fix the
datastore name in the protected and the recovery sites.

Procedure

1. In the vSphere web client, click Site Recovery > Sites and select a site.
2. On the Manage tab, click Advanced Settings.
3. Click Storage Provider.
4. Click Edit to modify the storage provider settings.
5. Select the storageProvider.fixRecoveredDatastoreNames check box.
6. Click OK to save the changes.

Configuring a recovery plan


After you successfully configure the recovery plan for the DLS type of storage, you must add a custom
recovery step. A custom recovery step runs commands or presents messages to the user during a
recovery. Site Recovery Manager can run custom steps either on the SRM server or in a VM that is part
of the recovery plan.
In a Serviceguard environment, you must create a custom recovery step on each protected VM of the
Serviceguard cluster. The step adds a post power-on script to successfully start the cluster and packages
at the recovery site.
Adding custom recovery step in the SRM recovery plan

Procedure

1. In the vSphere web client, select Site Recovery > Recovery Plans and select a recovery plan.
2. On the Related Objects tab, click Virtual Machines.
3. Right-click on the virtual machine and select Configure Recovery.
4. On the Recovery Properties tab, select Post-Power on Steps.
5. Click the plus icon to add the custom recovery step.
A dialog box appears.
6. Enter the following command on a single line with no carriage returns.
/bin/bash $SGSBIN/cmsrmconfig -u <username of the recovery site vCenter> -p
<password of the recovery site vCenter> &> <directory/output_file>
Where $SGSBIN refers to /usr/local/cmcluster/bin/ in RHEL and /opt/cmcluster/bin in
SLES virtual machines, and <directory/output_file> refers to the file to which the output of the
script is redirected.

The execution status of the cmsrmconfig script is directed to the syslog. View the syslog for the
status of the script execution.
7. Repeat step 6 for every VM in the protection group.
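
Because the command in step 6 must be entered as one line for every VM, it can help to compose it once and paste the result. The following Python sketch builds that line from the template in step 6. The helper name and the sample credentials and output path are hypothetical, and quoting the values with shlex.quote is an assumption about how the shell in the guest handles them; verify the resulting line before using it in a recovery plan.

```python
import shlex

# $SGSBIN locations as given in step 6
SGSBIN_BY_DISTRO = {
    "rhel": "/usr/local/cmcluster/bin",
    "sles": "/opt/cmcluster/bin",
}


def build_post_power_on_command(distro, username, password, output_file):
    """Compose the single-line post power-on command for one VM."""
    script = f"{SGSBIN_BY_DISTRO[distro]}/cmsrmconfig"
    return (
        f"/bin/bash {script} -u {shlex.quote(username)} "
        f"-p {shlex.quote(password)} &> {shlex.quote(output_file)}"
    )


# Hypothetical values for a SLES guest
command = build_post_power_on_command(
    "sles", "administrator@vsphere.local", "s3cret", "/tmp/cmsrmconfig.out"
)
```

Note that the command carries the vCenter password in clear text, so treat the recovery plan definition and any saved copy of this line accordingly.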

Non-uniform IP addresses between protected and recovery sites


In certain deployment environments, the protected and recovery sites may have non-uniform IP subnet
configurations. When a virtual machine is failed over, SRM automatically changes the network
configuration parameters such as IP address and the default gateway of the virtual network interface
cards in the virtual machine.
When non-uniform IP topology is used at the protected and recovery sites, Serviceguard cluster must be
configured with network details compatible with local (recovery) site IP configuration.

Rules and restrictions for configuring non-uniform IP addresses

• Using DHCP for SRM IP customization on Serviceguard cluster VMs is not supported, because DHCP assigns
IP addresses dynamically to the network interfaces of the VMs. This leads to the failure of cluster and
package operations such as start.
• VMware SRM does not support two IPv6 addresses on a network interface, although
Serviceguard supports this feature. Because of this restriction, a cluster with more than one IPv6
address on a network interface is not supported with VMware SRM.
• Manual IP customization for a bond interface is not supported by SRM. Do not configure a bond
interface in Serviceguard; instead, configure multiple heartbeat networks.

Network configuration requirements


There are multiple IP customization modes in SRM. Serviceguard uses the Manual IP
customization mode to integrate with SRM. In this mode, you can choose the static IP address option and
manually enter the IP address of each network interface of the virtual machine. For more information
about how to use IP customization, see the SRM Administration Guide.
Generating the network map file
When the network IP addresses differ between the protected and recovery sites, appropriate network map
information must be provided so that the SRM recovery plan can assign valid IP addresses for the cluster
heartbeat network and package relocatable IP addresses. Use the cmnwmap command to create the network map
file of the configured cluster.
Run the cmnwmap command with the -c option on any one of the cluster nodes in the protected site to create
the network map file. The command reads the existing cluster configuration details and creates the map
file of all the network resources configured in the cluster. The map file, srm-ip-config, is saved to the
$SGCONF/run/srm/ directory.
If any package uses IP or SUBNET information, the command also captures it in the map file
along with the network information of the cluster. The information can contain any or all of these
information types:

• PACKAGE_NAME

• MONITORED_SUBNET

• PACKAGE_NETMASK_IP

• IP_ADDRESS

This network map file also contains the quorum server information of the protected site.
Update and distribute the network map file
After creating the network map file, you must update the file with the network information that is valid
at the recovery site, such as the heartbeat network, package relocatable IP address, subnet monitoring
details, and quorum server details.

CAUTION: Hewlett Packard Enterprise strongly recommends that you verify that the recovery site values
are correct and valid. Be extremely cautious while entering the values for the recovery site. Any wrong
value will lead to unrecoverable errors and unsuccessful recovery of the cluster at the recovery site.

Use the -d option to distribute the network map file to all the configured cluster nodes. During the copy
operation, if any cluster node is down or unreachable, the command fails to copy the file to
that node. You must manually copy the map file to the node, or run the cmnwmap command with the -d option
again when the node comes up or becomes reachable.

For a description of the cmnwmap command and its options, see the cmnwmap manpage.

NOTE: The recovery site IP network information of srm-ip-config file must match the recovery plan
customization IP of each VM. Also the package IP and monitored subnet information must be in
accordance with the recovery site IP addresses.
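
One consistency check implied by this note is that each package IP address entered for the recovery site falls inside the monitored subnet entered for that site. The Python sketch below shows such a check with the standard ipaddress module. The dictionary is only a stand-in for the recovery site values: the actual layout of the srm-ip-config file is not shown in this section, the NETMASK key is a hypothetical name, and the addresses are placeholders.

```python
import ipaddress


def package_ips_outside_subnet(entry):
    """Return the package IP addresses that are not inside the monitored subnet."""
    network = ipaddress.ip_network(f"{entry['MONITORED_SUBNET']}/{entry['NETMASK']}")
    return [
        ip for ip in entry["IP_ADDRESS"]
        if ipaddress.ip_address(ip) not in network
    ]


# Hypothetical recovery site values for one package
recovery_entry = {
    "PACKAGE_NAME": "pkg1",
    "MONITORED_SUBNET": "192.0.2.0",
    "NETMASK": "255.255.255.0",
    "IP_ADDRESS": ["192.0.2.50", "198.51.100.7"],
}
```

Here the second address would be reported, because it does not belong to the 192.0.2.0 subnet. A check like this catches only typing mistakes within the file; it does not confirm that the values match the IP customization in the SRM recovery plan.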

Storage configuration requirements


For non-uniform IP addresses, do not configure the Autostart feature for either SLS or DLS package
configurations. For DLS package configurations, follow the same procedure as described in the section
Configuring Dynamically Linked Storage type.
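
The Autostart guidance for the uniform and non-uniform cases can be summarized in a small decision function. This Python sketch only restates the rules given in this chapter (AUTOSTART_CMCLD is set to 1 for SLS-only packages with uniform IP addresses, and is not set for DLS, mixed, or non-uniform configurations); the function name and argument shapes are assumptions made for illustration.

```python
def autostart_cmcld_allowed(uniform_ip, storage_types):
    """Return True only when AUTOSTART_CMCLD may be set to 1.

    uniform_ip: True if both sites use the same IP addresses.
    storage_types: the storage types used by the packages, "SLS" and/or "DLS".
    """
    storage = set(storage_types)
    if not storage or not storage <= {"SLS", "DLS"}:
        raise ValueError("storage_types must contain 'SLS' and/or 'DLS'")
    # Autostart applies only to SLS-only packages with uniform IP addresses.
    return bool(uniform_ip) and storage == {"SLS"}
```

In every case where the function returns False, the cluster and packages are started at the recovery site by the custom recovery step rather than by Autostart.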
Configuring a recovery plan
In a Serviceguard environment, after you successfully configure the recovery plan for non-uniform IP
addresses, you must add the custom recovery step for both the SLS and DLS types of storage. The step adds
a post power-on script that modifies the Serviceguard configuration database according to the
recovery site network information in the srm-ip-config map file. The script also starts the cluster and
packages at the recovery site.
To add the custom recovery step, follow the procedure as described under Configuring a recovery plan.

Post protection behavior


After successful recovery of the virtual machines from the protected site to the recovery site, the
recovery site becomes the protected site, but the virtual machines are not protected yet. If the former
protected site is operational, you can reverse the direction of protection to use the former protected
site as the new recovery site.
For more information about how to protect the VMs again, see the SRM Administration Guide.
Once the former protected site becomes the recovery site, you can run the recovery plan again to move the
virtual machines back to it. This operation does not require any modification to the srm-ip-config
file unless the Serviceguard cluster or package network configuration has changed.
After recovery, if the Serviceguard IP or package IP information changes, you must run the
cmnwmap command again to create a new network map file, and then use the -d option to distribute the
new network map file to all virtual machines in the cluster.

NOTE: Any change in the network configuration of the VMs at either site requires that the srm-ip-config
file be recreated, updated, and distributed to all virtual machines in the cluster. This file must be kept
up to date and in sync with the network and quorum server configuration details of the
nodes configured in the Serviceguard cluster.

Quorum Server configuration options for SRM recovery


The quorum server can be configured in multiple ways when a Serviceguard cluster is used in an SRM
environment. As a general requirement, the quorum server must be up and running before the SRM
recovery plan can bring up the Serviceguard cluster nodes. If the quorum server
information differs between the protected and recovery sites, the information for both quorum servers
must be updated in /etc/hosts. The quorum server must be up and running before you start the priority
group of clustered VMs.

NOTE: If the quorum service is unavailable to the Serviceguard cluster VMs being recovered when the
recovery plan is executed, the cluster and package start operations fail.

Multiple ways of configuring the quorum server


Quorum server placed at a third site that is reachable from both the protected and recovery sites

• When the same IP/subnet address is stretched across the sites, no change to the Serviceguard cluster
configuration is required, because the quorum server at the third location is reachable from both sites.
• If the IP/subnet address changes across the sites, the srm-ip-config file must be
updated with quorum server configuration details such as QUORUM_NAME and QUORUM_IP_1, and retained
for any site recovery operation.

SRM site local Quorum Server

• The quorum server network configuration should not be changed for cluster nodes configured with uniform
IP addresses across the protected and recovery sites.
• When non-uniform IP addresses are used between the protected and recovery sites, the
srm-ip-config file must be updated with quorum server configuration details such as QUORUM_NAME and
QUORUM_IP_1, and retained for the site recovery operation.

• Before the Serviceguard cluster nodes are recovered at the recovery site, the quorum server node must be
recovered first by assigning it a higher priority in the recovery operation.
• The quorum server node network details for the protected and recovery sites should be updated
in the /etc/hosts file of all cluster nodes.

• The number of IP addresses and network interfaces used for the quorum server node must be the same
at the protected and recovery sites.

Restrictions

• Serviceguard does not support a cluster that includes virtual machines from multiple priority groups
of an SRM site.
• In a Serviceguard cluster, you cannot configure some network interfaces with uniform IP
addresses and the rest with non-uniform IP addresses across the SRM sites.
All nodes of a cluster must have either uniform or non-uniform IP address configuration between the
protected and recovery sites.
• For non-uniform IP/subnet addresses between the sites, Serviceguard is supported only with manual
IP customization of SRM.
• Serviceguard does not support migrating from one type of network address to the other during
recovery. That is, an IPv4 address cannot be migrated to an IPv6 address, or the other way
around.
• SAP HANA toolkit package configuration should not be included for SRM recovery.
• The Oracle Data Guard toolkit is supported only with a LUN that is part of an RCVG group.
• Storage vMotion with VMware SRM in a Serviceguard environment is not supported.
• When the recovery plan is tested on Serviceguard cluster nodes running at the protected site, the SRM
recovery plan does not bring up the cluster and packages at the recovery site.
• The SRM network map file srm-ip-config must not be created or present under $SGRUN for uniform IP
address configuration between the sites. Keeping this file in a uniform IP configuration causes the
recovery plan to fail.

Summary of Recommendations

• HPE recommends keeping the shutdown action for a clustered VM as “Shutdown guest OS before
power off”.
• HPE recommends configuring storage synchronization between the sites so that the required
configuration files of the Serviceguard cluster are available at the recovery site during recovery plan
execution.
• The type of replication (synchronous or asynchronous) selected for ABR depends on your business
requirements. HPE recommends that the cluster node disks on which srm-ip-config resides be kept in
sync between protected and recovery sites that are configured with non-uniform IP addresses. Failure to
comply with this recommendation might lead to unsuccessful SRM recovery.
• Because bond interface configuration is not supported, HPE recommends configuring the cluster in
an SRM environment with dual heartbeat interfaces.

Hardware Planning
Hardware planning requires examining the physical hardware itself. One useful procedure is to sketch the
hardware configuration in a diagram that shows adapter cards and buses, cabling, disks and peripherals.
You may also find it useful to record the information on the Hardware worksheet (see Hardware Worksheet
on page 380), indicating which device adapters occupy which slots, and to update the details as you create
the cluster configuration. Use one form for each node (server).

SPU Information
SPU information includes the basic characteristics of the server systems you are using in the cluster.
You may want to record the following on the Hardware worksheet (see Hardware Worksheet on page 380):
Server Series Number
Enter the series number, for example, DL980 G7.
Host Name
Enter the name to be used on the system as the host name.
Memory Capacity
Enter the memory in MB.
Number of I/O slots
Indicate the number of slots.

LAN Information
While a minimum of one LAN interface per subnet is required, at least two LAN interfaces are needed to
eliminate single points of network failure.
Hewlett Packard Enterprise recommends that you configure heartbeats on all subnets, including those to
be used for client data.
Collect the following information for each LAN interface:
Subnet Name
The IP address for the subnet. Note that heartbeat IP addresses must be on the same subnet on
each node.

Interface Name
The name of the LAN card as used by this node to access the subnet. This name is shown by
ifconfig after you install the card.
IP Address
The IP address to be used on this interface.
An IPv4 address is a string of four decimal numbers separated by periods, in this form:
nnn.nnn.nnn.nnn
An IPv6 address is a string of eight hexadecimal values separated by colons, in this form:
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx
For more details of IPv6 address format, see IPv6 Network Support.
Kind of LAN Traffic
The purpose of the subnet. Valid types include the following:

• Heartbeat
• Client Traffic

Label the list to show the subnets that belong to a bridged net.
This information is used in creating the subnet groupings and identifying the IP addresses used in the
cluster and package configuration files.
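The IPv4 and IPv6 address forms described for the IP Address field can be checked before the values are copied onto the worksheet. The following is a minimal sketch that uses the Python standard ipaddress module; the function name classify_address is an illustrative assumption and is not part of Serviceguard.

```python
import ipaddress


def classify_address(text):
    """Return 'IPv4', 'IPv6', or None if text is not a valid IP address."""
    try:
        addr = ipaddress.ip_address(text.strip())
    except ValueError:
        # Not a syntactically valid IPv4 or IPv6 address.
        return None
    return "IPv4" if addr.version == 4 else "IPv6"


if __name__ == "__main__":
    # Example values only; substitute the addresses planned for each interface.
    for candidate in ("192.168.10.21", "2001:db8::10", "300.1.1.1"):
        print(candidate, "->", classify_address(candidate))
```

A value that returns None should be corrected before it is entered in the cluster or package configuration files.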

Shared Storage
FibreChannel and iSCSI can be used for clusters of up to 32 nodes.

FibreChannel
FibreChannel cards can be used to connect up to 32 nodes to a disk array containing storage. After
installation of the cards and the appropriate driver, the LUNs configured on the storage unit are presented
to the operating system as device files, which can be used to build LVM volume groups.

NOTE:
Multipath capabilities are supported by FibreChannel HBA device drivers and the Linux Device Mapper.
Check with the storage device documentation for details.

iSCSI
You can use the storage link based on IP to connect up to 32 nodes to a disk array containing storage.
The LUNs configured on the storage unit are presented to the operating system as device files, which can
be used to build LVM volume groups.
You can use the worksheet to record the names of the device files that correspond to each LUN for the
FibreChannel-attached and iSCSI-attached storage units.

Disk I/O Information


You may want to use the Hardware worksheet in Appendix C to record the following information for each
disk connected to each disk device adapter on the node:

Bus Type
Indicate the type of bus. Supported buses are SAS (Serial Attached SCSI) and FibreChannel.
LUN Number
Indicate the number of the LUN as defined in the storage unit.
Slot Number
Indicate the slot number(s) into which the SCSI or FibreChannel interface card(s) are inserted in the
backplane of the computer.
Address
Enter the bus hardware path number. This is the numeric part of the host parameter, which you can
see on the system by using the following command:

cat /proc/scsi/scsi

Disk Device File


Enter the disk device file name for each SCSI disk or LUN.
This information is needed when you create the mirrored disk configuration using LVM. In addition, it is
useful to gather as much information as possible about your disk configuration.
You can obtain information about available disks by using the following commands; your system may
provide other utilities as well.

• ls /dev/sd* (Smart Array cluster storage)

• ls /dev/hd* (non-SCSI/FibreChannel disks)

• ls /dev/sd* (SCSI and FibreChannel disks)

• du
• df

• mount

• vgdisplay -v

• lvdisplay -v

• vxdg list (VxVM)


• vxprint (VxVM)

See the manpages for these commands for information about specific usage. The commands should be
issued from all nodes after installing the hardware and rebooting the system. The information will be
useful when doing LVM and cluster configuration.

Hardware Configuration Worksheet


The hardware configuration worksheet (see Hardware Worksheet on page 380) will help you organize and
record your specific cluster hardware configuration. Make as many copies as you need.

Power Supply Planning


There are two sources of power for your cluster which you will have to consider in your design: line power
and uninterruptible power supplies (UPS). Loss of a power circuit should not bring down the cluster.



Frequently, servers, mass storage devices, and other hardware have two or three separate power
supplies, so they can survive the loss of power to one or more power supplies or power circuits. If a
device has redundant power supplies, connect each power supply to a separate power circuit. This way
the failure of a single power circuit will not cause the complete failure of any critical device in the cluster.
For example, if each device in a cluster has three power supplies, you will need a minimum of three
separate power circuits to eliminate electrical power as a single point of failure for the cluster. In the case
of hardware with only one power supply, no more than half of the nodes should be on a single power
source. If a power source supplies exactly half of the nodes, it must not also supply the cluster lock LUN
or quorum server, or the cluster will not be able to re-form after a failure. See Cluster Lock Planning on
page 100 for more information.
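The rule for hardware with a single power supply can be checked against a planned layout. The following is a minimal sketch, assuming the layout is recorded as a mapping from node name to a power source label; the function name and the labels are hypothetical and only illustrate the rule stated above.

```python
def check_power_layout(node_sources, lock_source=None):
    """Return a list of problems with a single-power-supply layout.

    node_sources maps node name -> power source label (hypothetical labels).
    lock_source is the source that feeds the lock LUN or quorum server, if any.
    """
    total = len(node_sources)
    counts = {}
    for source in node_sources.values():
        counts[source] = counts.get(source, 0) + 1

    problems = []
    for source, count in sorted(counts.items()):
        if 2 * count > total:
            # More than half of the nodes depend on one source.
            problems.append(f"{source} supplies more than half of the nodes")
        elif 2 * count == total and source == lock_source:
            # Exactly half of the nodes share a source with the cluster lock.
            problems.append(f"{source} supplies half of the nodes and the cluster lock")
    return problems


if __name__ == "__main__":
    layout = {"node1": "circuit-A", "node2": "circuit-B"}
    print(check_power_layout(layout, lock_source="circuit-A"))
```

An empty list means the layout satisfies both conditions; the check does not cover devices with redundant power supplies.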
To provide a high degree of availability in the event of power failure, use a separate UPS at least for each
node’s SPU and for the cluster lock disk (if any). If you use a quorum server, or quorum server cluster,
make sure each quorum server node has a power source separate from that of every cluster it serves. If
you use software mirroring, make sure power supplies are not shared among different physical volume
groups; this allows you to set up mirroring between physical disks that are not only on different I/O buses,
but also connected to different power supplies.
To prevent confusion, label each hardware unit and power supply unit clearly with a different unit number.
Indicate on the Power Supply Worksheet the specific hardware units you are using and the power supply
to which they will be connected. Enter the following label information on the worksheet:
Host Name
Enter the host name for each SPU.
Disk Unit
Enter the disk drive unit number for each disk.
Tape Unit
Enter the tape unit number for each backup device.
Other Unit
Enter the number of any other unit.
Power Supply
Enter the power supply unit number of the UPS to which the host or other device is connected.
Be sure to follow UPS, power circuit, and cabinet power limits as well as SPU power limits.

Power Supply Configuration Worksheet


The Power Supply Planning worksheet (see Power Supply Worksheet on page 381) will help you organize
and record your specific power supply configuration. Make as many copies as you need.

Cluster Lock Planning


The purpose of the cluster lock is to ensure that only one new cluster is formed in the event that exactly
half of the previously clustered nodes try to form a new cluster. It is critical that only one new cluster is
formed and that it alone has access to the disks specified in its packages. You can specify a lock LUN or
a quorum server as the cluster lock. For more information about the cluster lock, see Cluster Lock on
page 35.



NOTE:

• You cannot use more than one type of lock in the same cluster.
• An iSCSI storage device does not support configuring a lock LUN.

Cluster Lock Requirements


A one-node cluster does not require a lock. Two-node clusters require the use of a cluster lock, and a lock
is recommended for larger clusters as well. Clusters larger than four nodes can use only a quorum server
as the cluster lock.
For information on configuring lock LUNs and the Quorum Server, see Setting up a Lock LUN on page
178, section Specifying a Lock LUN on page 195, and HPE Serviceguard Quorum Server Version
A.12.00.30 Release Notes at http://www.hpe.com/info/linux-serviceguard-docs (Select HP
Serviceguard Quorum Server Software).

Planning for Expansion


Bear in mind that a cluster with more than 4 nodes cannot use a lock LUN. So if you plan to add enough
nodes to bring the total to more than 4, you should use a quorum server.
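The lock requirements above can be summarized as a small decision function. This is a sketch only, built from the rules stated in this section (a lock is required for two-node clusters, clusters with more than 4 nodes can use only a quorum server, and iSCSI storage does not support a lock LUN); the function name and the storage labels are assumptions for illustration.

```python
def lock_options(node_count, shared_storage="FibreChannel"):
    """Return (lock_required, allowed_lock_types) for a planned cluster."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    # Only two-node clusters strictly require a lock; it is recommended otherwise.
    required = node_count == 2
    allowed = ["quorum server"]
    # A lock LUN is possible only up to 4 nodes and not on iSCSI storage.
    if node_count <= 4 and shared_storage != "iSCSI":
        allowed.insert(0, "lock LUN")
    return required, allowed


if __name__ == "__main__":
    for nodes in (1, 2, 4, 5):
        print(nodes, lock_options(nodes))
```

Because only one type of lock can be used in a cluster, a plan that may grow beyond 4 nodes is better served by the quorum server entry from the start.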

Using a Quorum Server


The Quorum Server is described under Use of the Quorum Server as a Cluster Lock on page 36. See
also Cluster Lock on page 35.
A quorum server:

• Can be used with up to 150 clusters, not exceeding 300 nodes total.
• Can support a cluster with any supported number of nodes.
• Can communicate with the cluster on up to two subnets (a primary and an alternate).

IMPORTANT:
If you plan to use a Quorum Server, make sure you read the HPE Serviceguard Quorum Server
Version A.12.00.30 Release Notes before you proceed. You can find them at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server Software). You should
also consult the Quorum Server white papers at the same location.

Quorum Server Worksheet


You can use the Quorum Server Worksheet (see Quorum Server Worksheet on page 381) to identify a
quorum server for use with one or more clusters. You may want to record the following:
Quorum Server Host
The host name for the quorum server.
IP Address
The IP address(es) by which the quorum server will communicate with the cluster nodes.



Supported Node Names
The name (39 characters or fewer) of each cluster node that will be supported by this quorum server.
These entries will be entered into qs_authfile on the system that is running the quorum server
process.
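The quorum server limits given above (up to 150 clusters, not exceeding 300 nodes in total, and node names of 39 characters or fewer) can be checked against the worksheet entries. The following sketch assumes the entries are held as a mapping from cluster name to node names; that structure and the function name are hypothetical.

```python
MAX_CLUSTERS = 150       # clusters served by one quorum server
MAX_NODES = 300          # total nodes across all served clusters
MAX_NODE_NAME_LEN = 39   # characters per supported node name


def check_quorum_server_plan(clusters):
    """clusters maps cluster name -> list of node names served by one quorum server."""
    problems = []
    if len(clusters) > MAX_CLUSTERS:
        problems.append(f"{len(clusters)} clusters exceeds {MAX_CLUSTERS}")
    total_nodes = sum(len(nodes) for nodes in clusters.values())
    if total_nodes > MAX_NODES:
        problems.append(f"{total_nodes} nodes exceeds {MAX_NODES}")
    for nodes in clusters.values():
        for name in nodes:
            if len(name) > MAX_NODE_NAME_LEN:
                problems.append(f"node name too long: {name}")
    return problems


if __name__ == "__main__":
    print(check_quorum_server_plan({"cluster1": ["node1", "node2"]}))
```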

Configuring Asymmetric nodes in a Disaster Recovery Deployment


With the Smart Quorum server feature introduced in A.12.00.30, you can deploy asymmetric
configurations in a disaster recovery setup, where a different number of nodes is configured in the
production and DR sites.
Without the Smart Quorum feature, when a failure occurs, the Serviceguard clustering algorithm requires at
least 50% of the previously known members to be available before it can approach the quorum server for
a split-brain scenario or a site failure. In an asymmetric configuration, a site failure can mean that
more than 50% of the members go down at the same time, which causes the cluster to go down, and
a split brain means that the majority site always survives.
For example, with the previous quorum algorithm in a symmetric configuration with 2 nodes on a
primary site and 2 nodes on a recovery site, if a split occurs, both sites try to approach the
Quorum server, and the quorum is granted to the first request that reaches the Quorum server. In
case of a site failure, the surviving site still has 50% of the last known membership and can recover
with arbitration from the Quorum server. With the same algorithm, if you have an asymmetric
configuration (2 on primary and 1 on DR), then in a network split the primary site always survives. In case
of a primary site failure, the DR site cannot recover because it has less than 50% of the
previously known cluster membership.
With the Smart Quorum feature, the subclusters on either site always approach the quorum
server, even if they have less than 50% of the nodes available compared to the previously known
membership.
For example, if there are 2 nodes on a primary site and 1 node on a secondary site, and a split occurs in
the network, both sites approach the Quorum server. The Quorum server then looks for the site
with the preferred request (the site running the most critical workload) and grants the quorum to that
site. Again, in case of a primary site failure, the surviving DR site still approaches the quorum
server in spite of having less than 50% of the previously known membership, and survives.
This feature also makes it possible to support an uneven number of nodes in a disaster recovery
deployment. Earlier, with symmetric configurations, the Serviceguard cluster algorithm checked the cluster
membership and went to the quorum server only when the split was equal. For example, in a symmetric
configuration with 2 nodes on a primary site and 2 nodes on a recovery site, if a split occurs,
both sites try to approach the Quorum server, and the quorum is granted to the first request
that reaches the Quorum server. However, with the Smart Quorum feature, if there are 2 nodes on
a primary site and 1 node on a secondary site, and a split occurs in the network, both sites
approach the Quorum server. The Quorum server then looks for the site with the preferred request and grants
the quorum to that site.
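The difference between the previous algorithm and Smart Quorum can be illustrated with a simplified model of what a surviving group of nodes does after a split or site failure. This is a sketch of the behavior described in this section only, not the Serviceguard implementation; the function name and the returned strings are assumptions.

```python
def arbitration_outcome(surviving, previous, smart_quorum=False):
    """Describe what a surviving group of nodes does after a split or failure.

    surviving: nodes still in contact with each other.
    previous: size of the last known cluster membership.
    """
    if smart_quorum:
        # With Smart Quorum, every subcluster approaches the quorum server.
        return "requests quorum from the quorum server"
    if 2 * surviving > previous:
        return "survives as the majority"
    if 2 * surviving == previous:
        # Exactly half: arbitration by the quorum server is needed.
        return "requests quorum from the quorum server"
    return "cannot re-form"


if __name__ == "__main__":
    # DR site of an asymmetric 2+1 cluster after a primary site failure.
    print(arbitration_outcome(1, 3))
    print(arbitration_outcome(1, 3, smart_quorum=True))
```

The model does not cover how the quorum server chooses between competing requests; that depends on which site sends the preferred request.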
The following are the requirements to configure asymmetric nodes in a disaster recovery deployment:

• You must upgrade all the nodes in the cluster and Quorum Server to version A.12.00.30.
• Configure the cluster to use Smart Quorum. For more information on Smart Quorum features, see
Understanding the Smart Quorum on page 324.
• Choose your one preferred workload.

To configure asymmetric nodes in a disaster recovery deployment:



1. Configure the sites in the cluster and associate the cluster nodes to the appropriate sites using
SITE_NAME and SITE parameters. For more information on how to configure sites in a cluster, see
the parameter descriptions under Cluster Configuration Parameters on page 111.
2. Choose the one preferred workload package, which is the most critical. The site running the preferred
workload package is the preferred site, and sends the preferred quorum request.
3. Configure a new site controller package with the failover_policy parameter set to site_preferred.
Set the site parameters in the package configuration file.
cmmakepkg
For example:

sc_site SiteA
sc_site SiteB
You must choose your preferred workload package for the following:

• When preferred workload is running, the site running it will always win the quorum.
• When preferred workload is not running, the site to reach the Quorum Server first wins the quorum.

You will also notice that the required monitor scripts and the generic resource are generated by the site
controller.
For example:
Generic Resource Snapshot:

generic_resource_name sitecontroller_genres
generic_resource_evaluation_type during_package_start
generic_resource_up_criteria >1
For example:
Monitoring service Snapshot:

generic_resource_up_criteria >1
service_name sitecontroller_service
service_cmd "$SGCONF/scripts/sg/sc_mon.sh monitor"
service_restart none
service_fail_fast_enabled no
service_halt_timeout 300
service_halt_on_maintenance no

4. Set the dependency on the most critical package created in step 2. In the following example, the
dependency is set to the most critical package, which is CRITICAL_PKG = UP.
For example:

dependency_name mydep
dependency_condition CRITICAL_PKG = UP
dependency_location same_node

5. Enable the Smart Quorum in the cluster. To use the Smart Quorum feature, you must enable the
QS_SMART_QUORUM parameter in the cluster configuration file. For more information about this
parameter, see Cluster Configuration Parameters on page 111 and the cmquerycl (1m)
manpage.

IMPORTANT: You must have both Serviceguard and Quorum Server version A.12.00.30 or later to
support the Smart Quorum server feature.

Volume Manager Planning


When designing your disk layout using LVM or VxVM, you must consider the following:

• The volume groups that contain high availability applications, services, or data must be on a bus or
buses available to the primary node and all adoptive nodes.
• High availability applications, services, and data should be placed in volume groups that are separate
from non-high availability applications, services, and data.
• You must group high availability applications, services, and data, whose control needs to be
transferred together, on a single volume group or a series of volume groups.
• You must not group two different high availability applications, services, or data, whose control needs
to be transferred independently, on the same volume group.
• Your root disk must not belong to a volume group that can be activated on another node.

Volume Groups and Physical Volume Worksheet


You can organize and record your physical disk configuration by identifying which physical disks, LUNs,
or disk array groups will be used in building each volume group for use with high availability applications.
Use the Volume Group and Physical Volume worksheet Volume Group and Physical Volume
Worksheet on page 382.

VxVM Planning
You can create storage groups using the LVM (Logical Volume Manager, described in the previous
section) or using Veritas VxVM software.
When designing a storage configuration using VxVM disk groups, consider the following:

• High availability applications, services, and data must be placed in separate disk groups from non-high
availability applications, services, and data.
• You must not group two different high availability applications, services, or data, whose control needs
to be transferred independently, onto the same disk group.
• Your root disk can belong to an LVM or VxVM volume group that is not shared among cluster nodes.

Cluster Configuration Planning


A cluster should be designed to provide the quickest possible recovery from failures. The actual time
required to recover from a failure depends on several factors:



• The length of the MEMBER_TIMEOUT; see the description of this parameter under Cluster
Configuration Parameters on page 111 for recommendations.
• The design of the run and halt instructions in the package control script. They should be written for fast
execution.
• The application and database recovery time. They should be designed for the shortest recovery time.

In addition, you must provide consistency across the cluster so that:

• User names are the same on all nodes.


• UIDs are the same on all nodes.
• GIDs are the same on all nodes.
• Applications in the system area are the same on all nodes.
• System time is consistent across the cluster.
• Files that could be used by more than one node, such as /usr or /opt files, must be the same on all
nodes.
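The user name, UID, and GID consistency requirements above lend themselves to an automated comparison. The following is a minimal sketch that assumes the account data has already been collected from each node (for example, from its passwd entries) into a mapping; the collection step, the data layout, and the function name are all hypothetical.

```python
def find_id_mismatches(passwd_by_node):
    """Return user names that differ between nodes.

    passwd_by_node maps node name -> {user name: (uid, gid)}.
    A user is reported if it is missing on some node or if its UID/GID
    pair is not identical on every node.
    """
    all_users = set()
    for users in passwd_by_node.values():
        all_users.update(users)

    mismatched = []
    for user in sorted(all_users):
        # A missing user shows up as None, which also counts as a mismatch.
        seen = {users.get(user) for users in passwd_by_node.values()}
        if len(seen) != 1:
            mismatched.append(user)
    return mismatched


if __name__ == "__main__":
    sample = {
        "node1": {"oracle": (501, 501)},
        "node2": {"oracle": (502, 501)},
    }
    print(find_id_mismatches(sample))
```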

Easy Deployment
Easy deployment is a feature that provides a quick and simple way to create a cluster. Easy deployment
automates the security, shared storage, and networking configuration required by the package and
cluster. Easy deployment also simplifies cluster lock configuration. You can do this by using the
cmquerycl, cmpreparestg, and cmdeploycl commands as described in the following sections.

cmquerycl -N
The cmquerycl command queries the available network configuration and generates a network template.
Example
cmquerycl -n node1 -n node2 -N <network_template_file>
The template file generated from the previous command must be populated with the appropriate IP
addresses and subnet details. This file is then provided as input to the cmdeploycl command (see
section Full Network Probing on page 195) which applies the network configuration during cluster
creation.

cmpreparestg
The cmpreparestg command creates LVM volume groups and VxVM disk groups from available shared disks. It
can also modify existing LVM volume groups and VxVM disk groups that are configured on one or more of
the cluster nodes. For more information, see the cmpreparestg (1m) manpage.
Examples
Create a new LVM volume group “lvmvg” and import it to nodes node1, node2, node3, and node4:
cmpreparestg -l lvmvg -n node1 -n node2 -n node3 -n node4 -p /dev/sdz
Create a new VxVM disk group “cvmdg” and import it to nodes node1, node2, node3, and node4:
cmpreparestg -g cvmdg -n node1 -n node2 -n node3 -n node4 -p /dev/sdz



cmdeploycl
The cmdeploycl command creates the cluster with the previously generated network and storage configuration.
For more information, see the cmdeploycl (1m) manpage.
Example
cmdeploycl -n node1 -n node2 -c node_cluster -L /dev/sdz -N network_template_file
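When the same node list is passed to several easy deployment commands, it can help to assemble the command lines in one place. The following sketch only builds and prints argument lists that use the options shown in the examples in this section (-n, -N, -c, and -L); it does not run any command, and the helper names are hypothetical.

```python
def node_args(nodes):
    """Expand node names into repeated -n arguments."""
    args = []
    for node in nodes:
        args += ["-n", node]
    return args


def easy_deployment_commands(nodes, cluster_name, lock_lun, template_file):
    """Return argument lists for the network probe and cluster creation steps."""
    probe = ["cmquerycl"] + node_args(nodes) + ["-N", template_file]
    deploy = (["cmdeploycl"] + node_args(nodes)
              + ["-c", cluster_name, "-L", lock_lun, "-N", template_file])
    return [probe, deploy]


if __name__ == "__main__":
    # Example values taken from this section; the template file name is a placeholder.
    for argv in easy_deployment_commands(
            ["node1", "node2"], "node_cluster", "/dev/sdz", "network_template_file"):
        print(" ".join(argv))
```

The template file must still be populated with the IP addresses and subnet details between the two steps.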

cmpreparecl
The cmpreparecl script eases the process of setting up the servers participating in the
cluster. It also checks for the availability of ports used by Serviceguard for Linux, starts the xinetd services,
updates specific files, and sets up the firewall. The cmpreparecl script is supported as of Serviceguard
A.11.20.10.

NOTE: After you run the cmpreparecl script, you can start the cluster configuration.

Advantages

• Simple ways to configure the system before you create a cluster.


• Configuration for all the nodes can be done from one of the nodes in the cluster.

Limitations

• All the nodes that are part of the cluster must be known beforehand.

NOTE: After the configuration is complete, you cannot add the nodes.

• Does not set up lock LUN or quorum server.


• Does not ensure that all other network connections between the servers are valid.

Before You Start

IMPORTANT: The nodes that are given as inputs should not already have a cluster configured on them.
Before you start, you should have done the planning and preparation as described in previous
sections. You must also do the following:

• Install Serviceguard on each node that is to be configured into the cluster; see Installing and
Updating Serviceguard.
You must have superuser capability on each node.

• Make sure all the nodes have access to at least one fully configured network.
• Make sure all the subnets used by the prospective nodes are accessible to all the nodes.

Using cmpreparecl to Configure the System


The following example illustrates how to prepare two nodes using the cmpreparecl command:

1. Verify the prerequisites for cluster configuration:
cmpreparecl -n <node1> -n <node2> -p

2. Run the cmpreparecl command with the nodes on which the cluster needs to be configured:
cmpreparecl -n <node1> -n <node2>

3. The cmpreparecl command performs the following actions:

a. Verifies the availability of ports required by Serviceguard. For information about port requirements
on Red Hat Enterprise Linux Server and SUSE Linux Enterprise Server, see the following
documents available at http://www.hpe.com/info/linux-serviceguard-docs:

• HPE Serviceguard for Linux Base edition Release Notes


• HPE Serviceguard for Linux Advanced edition Release Notes
• HPE Serviceguard for Linux Enterprise edition Release Notes

b. Confirms the runlevels of xinetd and sets xinetd to run at boot.

c. Enables the ident protocol daemon. Starts authd on Red Hat Enterprise Linux Server and starts
identd on SUSE Linux Enterprise Server.

d. Restarts the xinetd service.

e. Sets the Serviceguard manual pages paths.


f. Sets AUTOSTART_CMCLD=1. In a SUSE Linux Enterprise Server 11 environment, sets the
RUN_PARALLEL parameter in the /etc/sysconfig/boot file to "NO".

g. Validates the host names and IP addresses of the nodes (and of the quorum server, if specified) and
updates them in the /etc/hosts file.

h. Updates the /etc/lvm/lvm.conf and /etc/lvm/lvm_$(uname -n).conf files to enable
VG Activation Protection.

i. Creates and deploys the firewall rules.
If the firewall is disabled on the system, the rules are stored in /tmp/sg_firewall_rules, and a
log message explains how to run this file to apply the rules.

NOTE: The modified files are backed up in the same directory as the original files with ".original"
extension and the output is logged to the /tmp/cmpreparecl.log file. This log file is a cumulative
log of the configuration done on the node. Each time you run cmpreparecl, logs are appended with
appropriate time stamp.

For more information, and other options, see manpages for cmpreparecl (1m).

Heartbeat Subnet and Cluster Re-formation Time


The speed of cluster re-formation depends on the number of heartbeat subnets.
If the cluster has only a single heartbeat network, and a network card on that network fails, heartbeats will
be lost while the failure is being detected and the IP address is being switched to a standby interface. The
cluster may treat these lost heartbeats as a failure and re-form without one or more nodes. To prevent
this, a minimum MEMBER_TIMEOUT value of 14 seconds is required for clusters with a single heartbeat
network.



If there is more than one heartbeat subnet, and there is a failure on one of them, heartbeats will go
through another, so you can configure a smaller MEMBER_TIMEOUT value.
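The relationship between the number of heartbeat subnets and the MEMBER_TIMEOUT floor can be expressed as a simple planning check. This sketch encodes only the 14-second minimum stated above for a single heartbeat network; the function name is hypothetical, and the timeout is expressed here in seconds for readability.

```python
SINGLE_HEARTBEAT_MIN_SECONDS = 14


def member_timeout_ok(member_timeout_seconds, heartbeat_subnets):
    """Check a planned MEMBER_TIMEOUT against the single-heartbeat minimum."""
    if heartbeat_subnets < 1:
        raise ValueError("at least one heartbeat subnet is required")
    if heartbeat_subnets == 1:
        return member_timeout_seconds >= SINGLE_HEARTBEAT_MIN_SECONDS
    # With more than one heartbeat subnet this section states no floor;
    # see the MEMBER_TIMEOUT parameter description for recommendations.
    return True


if __name__ == "__main__":
    print(member_timeout_ok(10, heartbeat_subnets=1))
    print(member_timeout_ok(10, heartbeat_subnets=2))
```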

NOTE: For heartbeat configuration requirements, see the discussion of the HEARTBEAT_IP parameter
later in this chapter. For more information about managing the speed of cluster re-formation, see the
discussion of the MEMBER_TIMEOUT parameter, and further discussion under What Happens when a
Node Times Out, and, for troubleshooting, Cluster Re-formations Caused by MEMBER_TIMEOUT
Being Set too Low.

About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode


Serviceguard supports three possibilities for resolving the nodes' hostnames (and Quorum Server
hostnames, if any) to network address families:

• IPv4-only
• IPv6-only
• Mixed

IPv4-only means that Serviceguard will try to resolve the hostnames to IPv4 addresses only.

IMPORTANT: You can configure an IPv6 heartbeat, or stationary or relocatable IP address, in any
mode: IPv4-only, IPv6-only, or mixed. You can configure an IPv4 heartbeat, or stationary or
relocatable IP address, in IPv4-only or mixed mode.

IPv6-only means that Serviceguard will try to resolve the hostnames to IPv6 addresses only.
Mixed means that when resolving the hostnames, Serviceguard will try both IPv4 and IPv6 address
families.
You specify the address family the cluster will use in the cluster configuration file (by setting
HOSTNAME_ADDRESS_FAMILY to IPV4, IPV6, or ANY), or by means of the -a option of cmquerycl (1m);
see Specifying the Address Family for the Cluster Hostnames. The default is IPV4. See the
subsections that follow for more information and important rules and restrictions.

What Is IPv4–only Mode?


IPv4 is the default mode: unless you specify IPV6 or ANY (either in the cluster configuration file or via
cmquerycl -a) Serviceguard will always try to resolve the nodes' hostnames (and the Quorum Server's,
if any) to IPv4 addresses, and will not try to resolve them to IPv6 addresses. This means that you must
ensure that each hostname can be resolved to at least one IPv4 address.

NOTE:
This applies only to hostname resolution. You can have IPv6 heartbeat and data LANs no matter what the
HOSTNAME_ADDRESS_FAMILY parameter is set to. (IPv4 heartbeat and data LANs are allowed in IPv4
and mixed mode.)

What Is IPv6-Only Mode?


If you configure IPv6-only mode (HOSTNAME_ADDRESS_FAMILY set to IPV6, or cmquerycl -a
ipv6), then all the hostnames and addresses used by the cluster — including the heartbeat and
stationary and relocatable IP addresses, and Quorum Server addresses if any — must be or resolve to
IPv6 addresses. The single exception to this is each node's IPv4 loopback address, which cannot be
removed from /etc/hosts.

NOTE: How the clients of IPv6-only cluster applications handle hostname resolution is a matter for the
discretion of the system or network administrator; there are no Hewlett Packard Enterprise requirements
or recommendations specific to this case.

In IPv6-only mode, all Serviceguard daemons will normally use IPv6 addresses for communication among
the nodes, although local (intra-node) communication may occur on the IPv4 loopback address.
For more information about IPv6, see IPv6 Network Support.
Rules and Restrictions for IPv6-Only Mode

• Serviceguard does not support VMware SUSE Linux Enterprise Server guests running in IPv6–only
mode as cluster nodes.
• Red Hat 5 and later versions clusters are not supported.

NOTE: This also applies if HOSTNAME_ADDRESS_FAMILY is set to ANY; Red Hat 5 supports only
IPv4-only clusters.

• All addresses used by the cluster must be in each node's /etc/hosts file. In addition, the file must
contain the following entry:
::1 localhost ipv6-localhost ipv6-loopback
For more information and recommendations about hostname resolution, see Configuring Name
Resolution.

• All addresses must be IPv6, apart from the node's IPv4 loopback address, which cannot be removed
from /etc/hosts.

• The node's public LAN address (by which it is known to the outside world) must be the last address
listed in /etc/hosts.
Otherwise there is a possibility of the address being used even when it is not configured into the
cluster.

• You must use $SGCONF/cmclnodelist, not ~/.rhosts or /etc/hosts.equiv, to provide root
access to an unconfigured node.

NOTE: This also applies if HOSTNAME_ADDRESS_FAMILY is set to ANY. See Allowing Root
Access to an Unconfigured Node for more information.

• If you use a Quorum Server, you must make sure that the Quorum Server hostname (and the alternate
Quorum Server address specified by QS_ADDR, if any) resolve to IPv6 addresses, and you must use
Quorum Server version A.12.00.00. See the latest Quorum Server release notes for more information;
you can find them at http://www.hpe.com/info/linux-serviceguard-docs.

NOTE: The Quorum Server itself can be an IPv6–only system; in that case it can serve IPv6–only and
mixed-mode clusters, but not IPv4–only clusters.

• If you use a Quorum Server, and the Quorum Server is on a different subnet from the cluster, you must
use an IPv6-capable router.
• Hostname aliases are not supported for IPv6 addresses, because of operating system limitations.

NOTE: This applies to all IPv6 addresses, whether HOSTNAME_ADDRESS_FAMILY is set to IPV6 or
ANY.

• Cross-subnet configurations are not supported in IPv6-only mode.
• Virtual machines are not supported.
You cannot have a virtual machine that is either a node or a package if
HOSTNAME_ADDRESS_FAMILY is set to ANY or IPV6.
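
Some of the /etc/hosts rules for IPv6-only mode can be checked mechanically. The following is a minimal sketch in Python, not a Serviceguard tool: the function name check_ipv6_only_hosts and the choice to pass the file contents as a string are assumptions made for illustration. It checks only that the required ::1 entry is present and that no IPv4 address other than a loopback address appears; it does not check the ordering rule for the public LAN address.

```python
import ipaddress

# Names the ::1 entry must carry, taken from the rule above.
REQUIRED_LOOPBACK_NAMES = ["localhost", "ipv6-localhost", "ipv6-loopback"]


def check_ipv6_only_hosts(hosts_text):
    """Return a list of problems found in /etc/hosts-style text (empty if none)."""
    problems = []
    has_ipv6_loopback_entry = False
    for number, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            problems.append(f"line {number}: not an IP address: {fields[0]}")
            continue
        names = fields[1:]
        if addr == ipaddress.ip_address("::1"):
            if all(name in names for name in REQUIRED_LOOPBACK_NAMES):
                has_ipv6_loopback_entry = True
        elif addr.version == 4 and not addr.is_loopback:
            # Only the IPv4 loopback address may remain in IPv6-only mode.
            problems.append(f"line {number}: IPv4 address {addr} is not allowed")
    if not has_ipv6_loopback_entry:
        problems.append("missing entry: ::1 localhost ipv6-localhost ipv6-loopback")
    return problems
```

An empty result means only that these two rules are satisfied; the remaining rules in this list still need to be verified by hand.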

Recommendations for IPv6-Only Mode

If you decide to migrate the cluster to IPv6-only mode, you should plan to do so while the cluster is down.

What Is Mixed Mode?


If you configure mixed mode (HOSTNAME_ADDRESS_FAMILY set to ANY, or cmquerycl -a any)
then the addresses used by the cluster, including the heartbeat, and Quorum Server addresses if any, can
be IPv4 or IPv6 addresses. Serviceguard will first try to resolve a node's hostname to an IPv4 address,
then, if that fails, will try IPv6.
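
The resolution order described above (IPv4 first, then IPv6) can be illustrated with a short Python sketch. This is not how Serviceguard is implemented; the function name resolve_any and the injectable lookup argument are assumptions made so that the order can be shown and tested without a live resolver.

```python
import socket


def resolve_any(hostname, lookup=socket.getaddrinfo):
    """Try IPv4 first, then IPv6; return (family_label, address) or None."""
    for family, label in ((socket.AF_INET, "IPv4"), (socket.AF_INET6, "IPv6")):
        try:
            results = lookup(hostname, None, family)
        except socket.gaierror:
            continue  # no address of this family; try the next one
        if results:
            # Each result is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            return label, results[0][4][0]
    return None
```

With the default lookup, the result depends on the name resolution configured on the local system.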
Rules and Restrictions for Mixed Mode

• Red Hat 5 and Red Hat 6 clusters are not supported.

NOTE: This also applies if HOSTNAME_ADDRESS_FAMILY is set to IPV6; Red Hat 5 supports only
IPv4-only clusters.

• The hostname resolution file on each node (for example, /etc/hosts) must contain entries for all the
IPv4 and IPv6 addresses used throughout the cluster, including all STATIONARY_IP and
HEARTBEAT_IP addresses as well as any private addresses. There must be at least one IPv4 address in
this file (in the case of /etc/hosts, the IPv4 loopback address cannot be removed). In addition, the
file must contain the following entry:
::1 localhost ipv6-localhost ipv6-loopback
For more information and recommendations about hostname resolution, see Configuring Name
Resolution.

• You must use $SGCONF/cmclnodelist, not ~/.rhosts or /etc/hosts.equiv, to provide root
access to an unconfigured node.
See Allowing Root Access to an Unconfigured Node for more information.

• Hostname aliases are not supported for IPv6 addresses, because of operating system limitations.

NOTE: This applies to all IPv6 addresses, whether HOSTNAME_ADDRESS_FAMILY is set to IPV6 or
ANY.

• Cross-subnet configurations are not supported.
This also applies if HOSTNAME_ADDRESS_FAMILY is set to IPV6. See Cross-Subnet
Configurations for more information about such configurations.

• Virtual machines are not supported.
You cannot have a virtual machine that is either a node or a package if
HOSTNAME_ADDRESS_FAMILY is set to ANY or IPV6.

Cluster Configuration Parameters
You need to define a set of cluster parameters. These are stored in the binary cluster configuration file,
which is distributed to each node in the cluster. You configure these parameters by editing the cluster
configuration template file created by means of the cmquerycl command, as described under
Configuring the Cluster on page 193.

NOTE: See Reconfiguring a Cluster on page 285 for a summary of changes you can make while the
cluster is running.

The following parameters must be configured:


CLUSTER_NAME
The name of the cluster as it will appear in the output of cmviewcl and other commands, and as it
appears in the cluster configuration file.
The cluster name must not contain any of the following characters: space, slash (/), backslash (\),
and asterisk (*).

NOTE: In addition, the following characters must not be used in the cluster name if you are using the
Quorum Server: at-sign (@), equal-sign (=), or-sign (|), semicolon (;).
These characters are deprecated, meaning that you should not use them, even if you are not using
the Quorum Server.

All other characters are legal. The cluster name can contain up to 39 characters.

CAUTION: Make sure that the cluster name is unique within the subnets configured on the
cluster nodes; under some circumstances Serviceguard may not be able to detect a duplicate
name and unexpected problems may result.
In particular make sure that two clusters with the same name do not use the same quorum
server; this could result in one of the clusters failing to obtain the quorum server’s arbitration
services when it needs them, and thus failing to re-form.
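
The CLUSTER_NAME character and length rules can be sketched as a small Python check. The function name check_cluster_name and the split into errors and warnings are assumptions for illustration; the check covers only the rules stated in this entry and cannot detect duplicate names on the network.

```python
FORBIDDEN = set(" /\\*")  # space, slash, backslash, asterisk
DEPRECATED = set("@=|;")  # not allowed with a Quorum Server; deprecated otherwise
MAX_LENGTH = 39


def check_cluster_name(name):
    """Return (errors, warnings) for a proposed CLUSTER_NAME."""
    errors, warnings = [], []
    if len(name) > MAX_LENGTH:
        errors.append(f"name has {len(name)} characters; the maximum is {MAX_LENGTH}")
    forbidden_found = sorted(set(name) & FORBIDDEN)
    if forbidden_found:
        errors.append(f"forbidden characters: {forbidden_found}")
    deprecated_found = sorted(set(name) & DEPRECATED)
    if deprecated_found:
        warnings.append(f"deprecated characters: {deprecated_found}")
    return errors, warnings
```

If a Quorum Server is in use, the warnings should be treated as errors.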

HOSTNAME_ADDRESS_FAMILY
Specifies the Internet Protocol address family to which Serviceguard will try to resolve cluster node
names and Quorum Server host names. Valid values are IPV4, IPV6, and ANY. The default is IPV4.

• IPV4 means Serviceguard will try to resolve the names to IPv4 addresses only.

• IPV6 means Serviceguard will try to resolve the names to IPv6 addresses only.

• ANY means Serviceguard will try to resolve the names to both IPv4 and IPv6 addresses.

IMPORTANT: See About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed
Mode for important information. See also the latest Serviceguard release notes at
http://www.hpe.com/info/linux-serviceguard-docs.

QS_HOST
The fully-qualified hostname or IP address of a host system outside the current cluster that is
providing quorum server functionality. It must be (or resolve to) an IPv4 address on Red Hat 5. On
SLES 11, it can be (or resolve to) either an IPv4 or an IPv6 address if
HOSTNAME_ADDRESS_FAMILY is set to ANY, but otherwise must match the setting of
HOSTNAME_ADDRESS_FAMILY. This parameter is used only when you employ a quorum server for
tie-breaking services in the cluster. You can also specify an alternate address (QS_ADDR) by which
the cluster nodes can reach the quorum server.
For more information, see Cluster Lock Planning on page 100 and Specifying a Quorum Server.
See also “Configuring Serviceguard to Use the Quorum Server” in the latest version of the HPE
Serviceguard Quorum Server Version A.12.00.30 Release Notes, at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server Software).

IMPORTANT: See also About Hostname Address Families: IPv4-Only, IPv6-Only, and
Mixed Mode for important information about requirements and restrictions in an IPv6–only
cluster.

Can be changed while the cluster is running; see What Happens when You Change the Quorum
Configuration Online for important information.
QS_ADDR
An alternate fully-qualified hostname or IP address for the quorum server. It must be (or resolve to) an
IPv4 address on Red Hat 5 and Red Hat 6. On SLES 11, it can be (or resolve to) either an IPv4 or an
IPv6 address if HOSTNAME_ADDRESS_FAMILY is set to ANY, but otherwise must match the setting
of HOSTNAME_ADDRESS_FAMILY. This parameter is used only if you use a quorum server and
want to specify an address on an alternate subnet by which it can be reached. On SLES 11, the
alternate subnet need not use the same address family as QS_HOST if
HOSTNAME_ADDRESS_FAMILY is set to ANY. For more information, see Cluster Lock Planning
on page 100 and Specifying a Quorum Server.

IMPORTANT: For special instructions that may apply to your version of Serviceguard and the
Quorum Server, see “Configuring Serviceguard to Use the Quorum Server” in the latest version
of the HPE Serviceguard Quorum Server Version A.12.00.30 Release Notes, at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server
Software).

Can be changed while the cluster is running; see What Happens when You Change the Quorum
Configuration Online for important information.
QS_POLLING_INTERVAL
The time (in microseconds) between attempts to contact the quorum server to make sure it is running.
Default is 300,000,000 microseconds (5 minutes). Minimum is 10,000,000 (10 seconds). Maximum is
2,147,483,647 (approximately 35 minutes).
Can be changed while the cluster is running; see What Happens when You Change the Quorum
Configuration Online for important information.
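
Because QS_POLLING_INTERVAL is expressed in microseconds, it is easy to be off by a factor of a thousand or a million. The following Python sketch converts seconds to microseconds and checks the limits stated above; the helper name qs_polling_interval is an assumption, not part of Serviceguard.

```python
MICROSECONDS_PER_SECOND = 1_000_000
QS_POLLING_MIN = 10_000_000       # 10 seconds
QS_POLLING_MAX = 2_147_483_647    # approximately 35 minutes
QS_POLLING_DEFAULT = 300_000_000  # 5 minutes


def qs_polling_interval(seconds):
    """Convert seconds to microseconds and check the documented limits."""
    value = int(seconds * MICROSECONDS_PER_SECOND)
    if not QS_POLLING_MIN <= value <= QS_POLLING_MAX:
        raise ValueError(
            f"{value} microseconds is outside {QS_POLLING_MIN}..{QS_POLLING_MAX}"
        )
    return value
```

For example, qs_polling_interval(300) gives the default value, and a value of 5 seconds is rejected because it is below the minimum.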
QS_TIMEOUT_EXTENSION
You can use the QS_TIMEOUT_EXTENSION to increase the time interval after which the current
connection (or attempt to connect) to the quorum server is deemed to have failed; but do not do so
until you have read the HPE Serviceguard Quorum Server Version A.12.00.30 Release Notes, and in
particular the following sections in that document: “About the QS Polling Interval and Timeout
Extension”, “Network Recommendations”, and “Setting Quorum Server Parameters in the Cluster
Configuration File”.
Can be changed while the cluster is running; see What Happens when You Change the Quorum
Configuration Online for important information.
QS_SMART_QUORUM
This parameter can be set to either ON or OFF. By default, the QS_SMART_QUORUM parameter is
commented out. It can be enabled only in a site-aware cluster (that is, where sites are configured). If the
QS_SMART_QUORUM parameter is enabled (ON), the quorum server decides which site will
survive in the event of a split between the sites, based on the workload status information. In case of a
network partition between the sites, smart quorum grants quorum to the site that is running the critical
workload, and thus avoids unnecessary failover of the application. This feature also supports the
deployment of asymmetric configurations, where the two sites can have unequal numbers of nodes.
QS_ARBITRATION_WAIT
You can use the QS_ARBITRATION_WAIT parameter only if the QS_SMART_QUORUM parameter is
enabled. This is the time (in microseconds) for which the quorum server waits for both sites to send
their quorum grant requests along with workload status. By default, the QS_ARBITRATION_WAIT
parameter is disabled. The default value for QS_ARBITRATION_WAIT is 3 seconds (3,000,000
microseconds). The maximum supported value is 5 minutes (300,000,000 microseconds).
SITE_NAME
The name of a site to which nodes (see NODE_NAME) belong. Can be used only in a site-aware
disaster recovery cluster, which requires Metrocluster (additional Hewlett Packard Enterprise
software); see the documents listed under Cross-Subnet Configurations for more information.
You can define multiple SITE_NAMEs. SITE_NAME entries must precede any NODE_NAME entries.
See also SITE.

IMPORTANT: SITE_NAME entries must be 39 characters or fewer, and are case-sensitive. Duplicate
SITE_NAME entries are not allowed.

NODE_NAME
The hostname of each system that will be a node in the cluster.

CAUTION: Make sure that the node name is unique within the subnets configured on the
cluster nodes; under some circumstances Serviceguard may not be able to detect a duplicate
name and unexpected problems may result.

Do not use the full domain name. For example, enter ftsys9, not ftsys9.cup.hp.com. A cluster
can contain up to 32 nodes.

IMPORTANT: Node names must be 39 characters or less, and are case-sensitive; for each
node, the node_name in the cluster configuration file must exactly match the corresponding
node_name in the package configuration file (see Configuring Packages and Their
Services ) and these in turn must exactly match the hostname portion of the name specified in
the node’s networking configuration. (Using the above example, ftsys9 must appear in
exactly that form in the cluster configuration and package configuration files, and as
ftsys9.cup.hp.com in the DNS database).
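
The NODE_NAME rules can be sketched as a Python check against the node's fully qualified domain name. The function name check_node_name is an assumption for illustration; the check does not consult DNS or the package configuration file, so the FQDN must be supplied by the caller.

```python
def check_node_name(node_name, fqdn):
    """Return a list of problems with NODE_NAME relative to the node's FQDN."""
    problems = []
    if "." in node_name:
        problems.append("use the short hostname, not the full domain name")
    if len(node_name) > 39:
        problems.append("node name is longer than 39 characters")
    # The comparison is case-sensitive, as node names are.
    if node_name != fqdn.split(".", 1)[0]:
        problems.append("node name does not match the hostname portion of the FQDN")
    return problems
```

Using the example above, check_node_name("ftsys9", "ftsys9.cup.hp.com") reports no problems.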

The parameters immediately following NODE_NAME in this list (ESX_HOST,
NETWORK_INTERFACE, HEARTBEAT_IP, STATIONARY_IP, CLUSTER_LOCK_LUN,
CAPACITY_NAME, and CAPACITY_VALUE) apply specifically to the node identified by the preceding
NODE_NAME entry.
ESX_HOST
Specifies the host name, alias, or IP address of the ESXi host. For every VMware virtual machine
that is configured as a Serviceguard node, you can specify the ESX_HOST on which the VM resides.
This is an optional parameter.

CAUTION: Make sure that the ESXi host name is unique within the subnets configured on the
cluster nodes; under certain circumstances Serviceguard may not be able to detect a duplicate
name or IP address and unexpected problems may result.

NOTE: Each NODE_NAME can have only one ESX_HOST. You must not use the ESX_HOST
parameter if the VCENTER_SERVER parameter is specified in the cluster configuration file; the
VCENTER_SERVER and ESX_HOST parameters are mutually exclusive.
Host name resolution for the ESXi hosts used in the cluster configuration must be
populated in the local /etc/hosts file of every cluster node. For more information, see
Configuring Name Resolution.
All ESXi host addresses must be IPv4.

VCENTER_SERVER
Specifies the host name, alias, or IP address of the vCenter server. This parameter is cluster-wide. Only
one vCenter server is supported. If you specify the VCENTER_SERVER parameter, you must not
specify the ESX_HOST parameter for any of the VMware guest nodes configured in the cluster. This is
an optional parameter.

CAUTION: Make sure that the VMware vCenter name is unique within the subnets configured
on the cluster nodes; under certain circumstances Serviceguard may not be able to detect a
duplicate name or IP address and unexpected problems may result.

NOTE: Host name resolution for the VMware vCenter server used in the cluster
configuration must be populated in the local /etc/hosts file of every cluster node. For more
information, see Configuring Name Resolution.
All VMware vCenter host addresses must be IPv4.

CLUSTER_LOCK_LUN
The pathname of the device file to be used for the lock LUN on each node. The pathname can
contain up to 39 characters.
See Setting up a Lock LUN on page 178 and Specifying a Lock LUN on page 195.
Can be changed while the cluster is running; see Updating the Cluster Lock LUN Configuration
Online. See also What Happens when You Change the Quorum Configuration Online for
important information.

NOTE: An iSCSI storage device does not support configuring a lock LUN.

SITE
The name of a site (defined by SITE_NAME) to which the node identified by the preceding
NODE_NAME entry belongs. Can be used only in a site-aware disaster recovery cluster, which
requires Metrocluster (additional Hewlett Packard Enterprise software); see the documents listed
under Cross-Subnet Configurations for more information.
If SITE is used, it must be used for each node in the cluster (that is, all the nodes must be associated
with some defined site, though not necessarily the same one).
If you are using SITEs, you can restrict the output of cmviewcl (1m) to a given site by means of the
-S <sitename> option. In addition, you can configure a site_preferred or
site_preferred_manual failover_policy for a package.

IMPORTANT: SITE entries must be 39 characters or fewer, and are case-sensitive; each SITE entry
must exactly match one of the SITE_NAME entries. Duplicate SITE entries are not
allowed.

NETWORK_INTERFACE
The name of each LAN that will be used for heartbeats or for user data on the node identified by the
preceding NODE_NAME. An example is eth0. See also HEARTBEAT_IP, STATIONARY_IP, and
About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode.

NOTE: Any subnet that is configured in this cluster configuration file as a SUBNET for IP monitoring
purposes, or as a monitored_subnet in a package configuration file (see Package Configuration
Planning) must be specified in the cluster configuration file via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. Similarly, any subnet that is used by a package for relocatable
addresses should be configured into the cluster via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. For more information about relocatable addresses, see
Stationary and Relocatable IP Addresses and Monitored Subnets on page 56 and the
descriptions of the package ip_ parameters.

For information about changing the configuration online, see Changing the Cluster Networking
Configuration while the Cluster Is Running on page 290.
HEARTBEAT_IP
IP notation indicating this node's connection to a subnet that will carry the cluster heartbeat.

NOTE: Any subnet that is configured in this cluster configuration file as a SUBNET for IP monitoring
purposes, or as a monitored_subnet in a package configuration file (see Package Configuration
Planning) must be specified in the cluster configuration file via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. Similarly, any subnet that is used by a package for relocatable
addresses should be configured into the cluster via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. For more information about relocatable addresses, see
Stationary and Relocatable IP Addresses and Monitored Subnets on page 56 and the
descriptions of the package ip_ parameters.

If HOSTNAME_ADDRESS_FAMILY is set to IPV4 or ANY, a heartbeat IP address can be either an
IPv4 or an IPv6 address, with the exceptions noted below. If HOSTNAME_ADDRESS_FAMILY is set
to IPV6, all heartbeat IP addresses must be IPv6 addresses.
For more details of the IPv6 address format, see IPv6 Address Types. Heartbeat IP addresses on a
given subnet must all be of the same type: IPv4, IPv6 site-local, or IPv6 global.
For information about changing the configuration online, see Changing the Cluster Networking
Configuration while the Cluster Is Running on page 290.
Heartbeat configuration requirements:
The cluster needs at least two network interfaces for the heartbeat in all cases, using one of the
following minimum configurations:

• two heartbeat subnets; or

• one heartbeat subnet using bonding (mode 0, mode 1, or mode 4) with two slaves.

You cannot configure more than one heartbeat IP address on an interface; only one HEARTBEAT_IP
is allowed for each NETWORK_INTERFACE.

NOTE: The Serviceguard cmapplyconf, cmcheckconf, and cmquerycl commands check that
these minimum requirements are met, and produce a warning if they are not met at the immediate
network level. If you see this warning, you need to check that the requirements are met in your
overall network configuration.
If you are using virtual machine guests as nodes, you have a valid configuration (and can ignore the
warning) if there is one heartbeat network on the guest, backed by a network using NIC bonding as
in the second bullet above (VMware ESX Server).

Considerations for cross-subnet:


IP addresses for a given heartbeat path are usually on the same subnet on each node, but it is
possible to configure the heartbeat on multiple subnets such that the heartbeat is carried on one
subnet for one set of nodes and another subnet for others, with the subnets joined by a router.
This is called a cross-subnet configuration, and in this case at least two heartbeat paths must be
configured for each cluster node, and each heartbeat subnet on each node must be physically routed
separately to the heartbeat subnet on another node (that is, each heartbeat path must be physically
separate). See Cross-Subnet Configurations.

NOTE: IPv6 heartbeat subnets are not supported in a cross-subnet configuration.

NOTE: The use of a private heartbeat network is not advisable if you plan to use Remote Procedure
Call (RPC) protocols and services. RPC assumes that each network adapter device or I/O card is
connected to a route-able network. An isolated or private heartbeat LAN is not route-able, and could
cause an RPC request-reply, directed to that LAN, to time out without being serviced.
NFS, NIS and NIS+, and CDE are examples of RPC based applications that are frequently used.
Other third party and home-grown applications may also use RPC services through the RPC API
libraries. If necessary, consult with the application vendor.

STATIONARY_IP
This node's IP address on each subnet that does not carry the cluster heartbeat, but is monitored for
packages.

NOTE: Any subnet that is configured in this cluster configuration file as a SUBNET for IP monitoring
purposes, or as a monitored_subnet in a package configuration file (see Package Configuration
Planning) must be specified in the cluster configuration file via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. Similarly, any subnet that is used by a package for relocatable
addresses should be configured into the cluster via NETWORK_INTERFACE and either
STATIONARY_IP or HEARTBEAT_IP. For more information about relocatable addresses, see
Stationary and Relocatable IP Addresses and Monitored Subnets on page 56 and the
descriptions of the package ip_ parameters.

If HOSTNAME_ADDRESS_FAMILY is set to IPV4 or ANY, a stationary IP address can be either an
IPv4 or an IPv6 address, with the exceptions noted below. If HOSTNAME_ADDRESS_FAMILY is set
to IPV6, all the IP addresses used by the cluster must be IPv6 addresses.
If you want to separate application data from heartbeat messages, define one or more monitored non-
heartbeat subnets here. You can identify any number of subnets to be monitored.
A stationary IP address can be either an IPv4 or an IPv6 address. For more information about IPv6
addresses, see IPv6 Address Types.
For information about changing the configuration online, see Changing the Cluster Networking
Configuration while the Cluster Is Running on page 290.

CAPACITY_NAME, CAPACITY_VALUE
Node capacity parameters. Use the CAPACITY_NAME and CAPACITY_VALUE parameters to define
a capacity for this node. Node capacities correspond to package weights; node capacity is checked
against the corresponding package weight to determine if the package can run on that node.
CAPACITY_NAME can be any string that starts and ends with an alphanumeric character, and
otherwise contains only alphanumeric characters, dot (.), dash (-), or underscore (_). Maximum
length is 39 characters. CAPACITY_NAME must be unique in the cluster.
CAPACITY_VALUE specifies a value for the CAPACITY_NAME that precedes it. It must be a floating-
point value between 0 and 1000000. Capacity values are arbitrary as far as Serviceguard is
concerned; they have meaning only in relation to the corresponding package weights.
Capacity definition is optional, but if CAPACITY_NAME is specified, CAPACITY_VALUE must also be
specified; CAPACITY_NAME must come first.
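
The naming and range rules for a capacity can be sketched as a Python check. The function name check_capacity is an assumption, and the pattern treats "alphanumeric" as ASCII letters and digits, which is also an assumption; the uniqueness and ordering rules are not checked.

```python
import re

# Starts and ends with an alphanumeric character; dot, dash, and underscore
# are allowed in between. ASCII-only matching is an assumption of this sketch.
CAPACITY_NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?")


def check_capacity(name, value):
    """Return a list of problems with a CAPACITY_NAME / CAPACITY_VALUE pair."""
    problems = []
    if len(name) > 39 or not CAPACITY_NAME_RE.fullmatch(name):
        problems.append(f"invalid CAPACITY_NAME: {name!r}")
    if not 0 <= float(value) <= 1_000_000:
        problems.append(f"CAPACITY_VALUE out of range: {value}")
    return problems
```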

NOTE: cmapplyconf will fail if any node defines a capacity and any package has
min_package_node as its failover_policy or automatic as its failback_policy.

To specify more than one capacity for a node, repeat these parameters for each capacity. You can
specify a maximum of four capacities per cluster, unless you use the reserved CAPACITY_NAME
package_limit; in that case, you can use only that capacity throughout the cluster.
For all capacities other than package_limit, the default weight for all packages is zero, though you
can specify a different default weight for any capacity other than package_limit; see the entry for
WEIGHT_NAME and WEIGHT_DEFAULT later in this list.
See About Package Weights for more information.
Can be changed while the cluster is running; will trigger a warning if the change would cause a
running package to fail.
MEMBER_TIMEOUT
The amount of time, in microseconds, after which Serviceguard declares that the node has failed and
begins re-forming the cluster without this node.
Default value: 14 seconds (14,000,000 microseconds).
This value leads to a failover time of between approximately 18 and 22 seconds, if you are using a
quorum server, or a Fibre Channel cluster lock, or no cluster lock. Increasing the value to 25 seconds
increases the failover time to between approximately 29 and 39 seconds. The time will increase by
between 5 and 13 seconds if you are using a SCSI cluster lock or dual Fibre Channel cluster
lock.
Maximum supported value: 300 seconds (300,000,000 microseconds).
If you enter a value greater than 60 seconds (60,000,000 microseconds), cmcheckconf and
cmapplyconf will note the fact, as confirmation that you intend to use a large value.
Minimum supported values:

• 3 seconds for a cluster with more than one heartbeat subnet.


• 14 seconds for a cluster that has only one heartbeat LAN.

With the lowest supported value of 3 seconds, a failover time of 4 to 5 seconds can be achieved.

NOTE: The failover estimates provided here apply to the Serviceguard component of failover; that is,
the package is expected to be up and running on the adoptive node in this time, but the application
that the package runs may take more time to start.
For most clusters that use a lock LUN, a minimum MEMBER_TIMEOUT of 14 seconds is
appropriate.
For most clusters that use a MEMBER_TIMEOUT value lower than 14 seconds, a quorum server is
more appropriate than a lock LUN. The cluster will fail if the time it takes to acquire the disk lock
exceeds 0.2 times the MEMBER_TIMEOUT. This means that if you use a disk-based quorum device
(lock LUN), you must be certain that the nodes in the cluster, the connection to the disk, and the disk
itself can respond quickly enough to perform 10 disk writes within 0.2 times the
MEMBER_TIMEOUT.
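
The 0.2 x MEMBER_TIMEOUT limit in the note above translates into a concrete time budget for the lock LUN. The Python sketch below works through the arithmetic; the function name lock_lun_budget is an assumption, and the per-write figure is simply the total budget divided evenly across the 10 writes, which is an averaging assumption rather than a documented per-write limit.

```python
def lock_lun_budget(member_timeout_us, writes=10, fraction=0.2):
    """Return (total_budget_s, average_per_write_s) for acquiring the disk lock."""
    total_s = member_timeout_us / 1_000_000 * fraction
    return total_s, total_s / writes
```

With the default MEMBER_TIMEOUT of 14,000,000 microseconds, the lock must be acquired within about 2.8 seconds, or roughly 0.28 seconds per write on average. At the 3-second minimum the total budget shrinks to about 0.6 seconds, which is why a quorum server is usually the better choice for low values.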

Keep the following guidelines in mind when deciding how to set the value.
Guidelines: You need to decide whether it's more important for your installation to have fewer (but
slower) cluster re-formations, or faster (but possibly more frequent) re-formations:

• To ensure the fastest cluster re-formations, use the minimum value applicable to your cluster. But
keep in mind that this setting will lead to a cluster re-formation, and to the node being removed
from the cluster and rebooted, if a system hang or network load spike prevents the node from
sending a heartbeat signal within the MEMBER_TIMEOUT value. More than one node could be
affected if, for example, a network event such as a broadcast storm caused kernel interrupts to be
turned off on some or all nodes while the packets are being processed, preventing the nodes from
sending and processing heartbeat messages.
See Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low on page 356
for troubleshooting information.

• For fewer re-formations, use a setting in the range of 10 to 25 seconds (10,000,000 to 25,000,000
microseconds), keeping in mind that a value larger than the default will lead to slower re-
formations than the default. A value in this range is appropriate for most installations.

See also What Happens when a Node Times Out on page 77, Cluster Daemon: cmcld on page
28, and the white paper Optimizing Failover Time in a Serviceguard Environment (version A.11.19
and later) at http://www.hpe.com/info/linux-serviceguard-docs.
Can be changed while the cluster is running.
AUTO_START_TIMEOUT
The amount of time a node waits before it stops trying to join a cluster during automatic cluster
startup. All nodes wait this amount of time for other nodes to begin startup before the cluster
completes the operation. The time should be selected based on the slowest boot time in the cluster.
Enter a value equal to the boot time of the slowest booting node minus the boot time of the fastest
booting node plus 600 seconds (ten minutes).
Default is 600,000,000 microseconds.
Can be changed while the cluster is running.
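
The recommended AUTO_START_TIMEOUT value (slowest boot time minus fastest boot time plus 600 seconds) can be computed as follows. This is a sketch only: the function name auto_start_timeout_us is an assumption, and the boot times must be measured on your own nodes.

```python
def auto_start_timeout_us(boot_times_s, margin_s=600):
    """Slowest boot time minus fastest boot time plus a margin, in microseconds."""
    spread_s = max(boot_times_s) - min(boot_times_s)
    return int((spread_s + margin_s) * 1_000_000)
```

For nodes that boot in 120, 210, and 300 seconds, the spread is 180 seconds, so the result is 780,000,000 microseconds. When all nodes boot equally fast, the result equals the default of 600,000,000 microseconds.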
NETWORK_POLLING_INTERVAL
Specifies the interval at which Serviceguard periodically polls all the LAN interfaces (link-level and the
ones configured for IP monitoring).
Default is 2,000,000 microseconds (2 seconds). This means that the network manager will poll each
network interface every 2 seconds, to make sure it can still send and receive information.
The minimum value is 1,000,000 (1 second) and the maximum value supported is 30 seconds.

For example:

• If NETWORK_POLLING_INTERVAL is set to 6,000,000 (6 seconds), polling happens at the
6th second, the 12th second, and so on.
If NETWORK_POLLING_INTERVAL is set to 9,000,000 (9 seconds), polling happens at the
9th second, the 18th second, and so on.

• Serviceguard also uses this parameter to calculate the number of consecutive packets that each
LAN interface can miss or receive before it is marked DOWN or UP.
When an interface is monitored at the IP level and NETWORK_POLLING_INTERVAL is set
to 8 seconds or more, the number of consecutive packets that each LAN interface can
miss or receive before it is marked DOWN or UP is 2.
For example, if NETWORK_POLLING_INTERVAL is set to 10 seconds, failure or recovery
of an interface at the IP level is detected in 10 to 20 seconds.

The following are the failure/recovery detection times for different values of Network Polling Interval
(NPI) for an IP monitored Ethernet interface:

Table 6: Failure Recovery Detection Times for an IP Monitored Ethernet Interface

Values of Network Polling       Failure/Recovery Detection Times
Interval (NPI) (in seconds)     (in seconds)

1                               ~ NPI x 8 - NPI x 9
2                               ~ NPI x 4 - NPI x 5
3                               ~ NPI x 3 - NPI x 4
4 to 8                          ~ NPI x 2 - NPI x 3
>=8                             ~ NPI x 1 - NPI x 2

IMPORTANT: Hewlett Packard Enterprise strongly recommends using the default. Changing
this value can affect how quickly the link-level and IP-level monitors detect a network failure.
See Monitoring LAN Interfaces and Detecting Failure: Link Level on page 60.

Can be changed while the cluster is running.
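
The detection times in Table 6 can be expressed as a small Python lookup. This is a sketch under stated assumptions: the function name ip_detection_window is hypothetical, the input is a whole number of seconds, and an interval of exactly 8 seconds is treated as belonging to the ">=8" row, following the statement above that 8 seconds or more corresponds to 2 consecutive packets.

```python
def ip_detection_window(npi_seconds):
    """Return the approximate (low, high) detection times in seconds per Table 6."""
    if not 1 <= npi_seconds <= 30:
        raise ValueError("NETWORK_POLLING_INTERVAL must be between 1 and 30 seconds")
    if npi_seconds >= 8:
        low, high = 1, 2
    elif npi_seconds >= 4:
        low, high = 2, 3
    elif npi_seconds >= 3:
        low, high = 3, 4
    elif npi_seconds >= 2:
        low, high = 4, 5
    else:
        low, high = 8, 9
    return npi_seconds * low, npi_seconds * high
```

With the default interval of 2 seconds, the window is approximately 8 to 10 seconds; with 10 seconds it is 10 to 20 seconds, matching the example given earlier in this entry.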


CONFIGURED_IO_TIMEOUT_EXTENSION
The number of microseconds by which to increase the time Serviceguard waits after detecting a node
failure, so as to ensure that all pending I/O on the failed node has ceased.
This parameter must be set in the following cases.

• For extended-distance clusters using software mirroring across data centers over links between
iFCP switches; it must be set to the switches' maximum R_A_TOV value.

NOTE: CONFIGURED_IO_TIMEOUT_EXTENSION is supported only with iFCP switches that
allow you to get their R_A_TOV value.

• For switches and routers connecting an NFS server and cluster-node clients that can run
packages using the NFS-mounted file system; see Planning for NFS-mounted File Systems on
page 126.
To set the value for the CONFIGURED_IO_TIMEOUT_EXTENSION, you must first determine the
Maximum Bridge Transit Delay (MBTD) for each switch and router. The value should be in the
vendors' documentation. Set the CONFIGURED_IO_TIMEOUT_EXTENSION to the sum of the
values for the switches and routers. If there is more than one possible path between the NFS
server and the cluster nodes, sum the values for each path and use the largest number.

CAUTION: Serviceguard supports NFS-mounted file systems only over switches and
routers that support MBTD. If you are using NFS-mounted file systems, you must set
CONFIGURED_IO_TIMEOUT_EXTENSION as described here.

For more information about MBTD, see the white paper Support for NFS as a filesystem type with
HPE Serviceguard A.11.20 on HP-UX and Linux available at
http://www.hpe.com/info/linux-serviceguard-docs.

• For clusters in which both of the above conditions apply.
In this case, set the CONFIGURED_IO_TIMEOUT_EXTENSION to the higher of the two values
you get from following the instructions in the preceding two bullets.

Default is 0. The value can range from zero to 2147483647.
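
The calculation described in the cases above can be sketched in Python. The function names and the data layout (a list of paths, each a list of per-device MBTD values) are assumptions for illustration, and all inputs are assumed to have been converted to microseconds already; vendor documentation may state R_A_TOV and MBTD in other units.

```python
def nfs_io_timeout_extension(paths_mbtd_us):
    """Sum the MBTD values along each path and return the largest sum.

    paths_mbtd_us is a list of paths between the NFS server and the cluster
    nodes; each path is a list of MBTD values for its switches and routers.
    """
    return max(sum(path) for path in paths_mbtd_us)


def configured_io_timeout_extension(r_a_tov_us=None, paths_mbtd_us=None):
    """Return the higher of the iFCP and NFS values, or 0 if neither applies."""
    candidates = []
    if r_a_tov_us is not None:
        candidates.append(r_a_tov_us)  # the switches' maximum R_A_TOV value
    if paths_mbtd_us:
        candidates.append(nfs_io_timeout_extension(paths_mbtd_us))
    return max(candidates, default=0)
```

When neither case applies, the function returns 0, which matches the default.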


SUBNET
IP address of a cluster subnet for which IP Monitoring can be turned on or off (see IP_MONITOR).
The subnet must be configured into the cluster, via NETWORK_INTERFACE and either
HEARTBEAT_IP or STATIONARY_IP. All entries for IP_MONITOR and POLLING_TARGET apply to
this subnet until the next SUBNET entry; SUBNET must be the first of each trio.
By default, each of the cluster subnets is listed under SUBNET, and, if at least one gateway is
detected for that subnet, IP_MONITOR is set to ON and POLLING_TARGET entries are populated
with the gateway addresses, enabling target polling; otherwise the subnet is listed with IP_MONITOR
set to OFF.
By default, the IP_MONITOR parameter is set to OFF. If a gateway is detected for the SUBNET in
question and POLLING_TARGET entries are populated with the gateway addresses, setting the
IP_MONITOR parameter to ON enables target polling. For more information, see the description for
POLLING_TARGET.
See Monitoring LAN Interfaces and Detecting Failure: IP Level on page 60 for more information.
Can be changed while the cluster is running; must be removed, with its accompanying IP_MONITOR
and POLLING_TARGET entries, if the subnet in question is removed from the cluster configuration.
IP_MONITOR
Specifies whether or not the subnet specified in the preceding SUBNET entry will be monitored at the
IP layer.
To enable IP monitoring for the subnet, set IP_MONITOR to ON; to disable it, set it to OFF.
By default, the IP_MONITOR parameter is set to OFF. If a gateway is detected for the SUBNET in
question and POLLING_TARGET entries are populated with the gateway addresses, setting the
IP_MONITOR parameter to ON enables target polling. For more information, see the description for
POLLING_TARGET.
Hewlett Packard Enterprise recommends you use target polling because it enables monitoring beyond
the first level of switches, but if you want to use peer polling instead, set IP_MONITOR to ON for this
SUBNET, but do not use POLLING_TARGET (comment out or delete any POLLING_TARGET entries
that are already there).
If a network interface in this subnet fails at the IP level and IP_MONITOR is set to ON, the interface
will be marked down. If it is set to OFF, failures that occur only at the IP-level will not be detected.
Can be changed while the cluster is running; must be removed if the preceding SUBNET entry is
removed.
POLLING_TARGET
The IP address to which polling messages will be sent from all network interfaces on the subnet
specified in the preceding SUBNET entry, if IP_MONITOR is set to ON. This is called target polling.
Each subnet can have multiple polling targets; repeat POLLING_TARGET entries as needed.
If IP_MONITOR is set to ON, but no POLLING_TARGET is specified, polling messages are sent
between network interfaces on the same subnet (peer polling). Hewlett Packard Enterprise
recommends you use target polling; see How the IP Monitor Works on page 61 for more
information.

NOTE: cmquerycl (1m) detects first-level routers in the cluster (by looking for gateways in each
node's routing table) and lists them here as polling targets. If you run cmquerycl with the -w full
option (for full network probing) it will also verify that the gateways will work correctly for monitoring
purposes.

Can be changed while the cluster is running; must be removed if the preceding SUBNET entry is
removed.
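
The ordering of the SUBNET, IP_MONITOR, and POLLING_TARGET entries, and the difference between target polling and peer polling, can be sketched as follows. This is an illustrative helper only: the function names and the addresses used in the example are hypothetical, and the "keyword value" line layout is assumed from the other configuration examples in this section.

```python
from typing import Sequence


def polling_mode(ip_monitor: bool, targets: Sequence[str]) -> str:
    """Classify the monitoring that results from one SUBNET block."""
    if not ip_monitor:
        return "off"  # failures that occur only at the IP level go undetected
    # ON with targets is target polling; ON without targets is peer polling.
    return "target" if targets else "peer"


def subnet_block(subnet: str, ip_monitor: bool, targets: Sequence[str] = ()) -> str:
    """Render one block: SUBNET first, then IP_MONITOR, then any targets."""
    lines = [f"SUBNET {subnet}", f"IP_MONITOR {'ON' if ip_monitor else 'OFF'}"]
    # POLLING_TARGET is repeated once per target address.
    lines.extend(f"POLLING_TARGET {target}" for target in targets)
    return "\n".join(lines)
```

Calling subnet_block("192.168.1.0", True, ["192.168.1.254"]) yields a three-line block that corresponds to target polling; omitting the targets while leaving IP_MONITOR set to ON corresponds to peer polling.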
WEIGHT_NAME, WEIGHT_DEFAULT
Default value for this weight for all packages that can have weight; see Rules and Guidelines on
page 149. WEIGHT_NAME specifies a name for a weight that exactly corresponds to a
CAPACITY_NAME specified earlier in the cluster configuration file. (A package has weight; a node
has capacity.) The rules for forming WEIGHT_NAME are the same as those spelled out for
CAPACITY_NAME earlier in this list.
These parameters are optional, but if they are defined, WEIGHT_DEFAULT must follow
WEIGHT_NAME, and must be set to a floating-point value between 0 and 1000000. If they are not
specified for a given weight, Serviceguard will assume a default value of zero for that weight. In either
case, the default can be overridden for an individual package via the weight_name and weight_value
parameters in the package configuration file.
For more information and examples, see Defining Weights on page 147.

IMPORTANT: CAPACITY_NAME, WEIGHT_NAME, and weight_value must all match exactly.

NOTE: A weight (WEIGHT_NAME, WEIGHT_DEFAULT) has no meaning on a node unless a
corresponding capacity (CAPACITY_NAME, CAPACITY_VALUE) is defined for that node.

For the reserved weight and capacity package_limit, the default weight is always one. This default
cannot be changed in the cluster configuration file, but it can be overridden for an individual package
in the package configuration file.



cmapplyconf will fail if you define a default for a weight but do not specify a capacity of the same
name for at least one node in the cluster. You can define a maximum of four WEIGHT_DEFAULTs per
cluster.
Can be changed while the cluster is running.
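
The constraints on WEIGHT_DEFAULT stated above can be expressed as a simple pre-check. The sketch below is not how cmapplyconf validates the file; the function name and the data structures are assumptions made for illustration.

```python
from typing import Dict, List, Set

MAX_WEIGHT_DEFAULTS = 4          # maximum WEIGHT_DEFAULTs per cluster
MAX_WEIGHT_VALUE = 1000000.0     # upper bound for WEIGHT_DEFAULT


def check_weight_defaults(
    weight_defaults: Dict[str, float],
    capacity_names: Set[str],
) -> List[str]:
    """Return a list of problems; an empty list means no problem was found.

    weight_defaults: WEIGHT_NAME -> WEIGHT_DEFAULT
    capacity_names:  every CAPACITY_NAME defined for at least one node
    """
    problems = []
    if len(weight_defaults) > MAX_WEIGHT_DEFAULTS:
        problems.append("more than four WEIGHT_DEFAULTs are defined")
    for name, value in weight_defaults.items():
        if not 0 <= value <= MAX_WEIGHT_VALUE:
            problems.append(f"{name}: default {value} is outside 0..1000000")
        if name not in capacity_names:
            # The weight name must match a capacity name exactly.
            problems.append(f"{name}: no node defines a capacity of this name")
    return problems
```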
(Access Control Policies)
Specify three things for each policy: USER_NAME, USER_HOST, and USER_ROLE. Policies set in
the configuration file of a cluster and its packages must not be conflicting or redundant. For more
information, see Controlling Access to the Cluster on page 200.
MAX_CONFIGURED_PACKAGES
This parameter sets the maximum number of packages that can be configured in the cluster. The
minimum value is 0, and the maximum value, which is also the default, is 300.
Can be changed while the cluster is running.
ROOT_DISK_MONITOR
Set this parameter to ON or OFF. When the parameter is set to ON, Serviceguard monitors the health of
the root disk on the cluster nodes. By default, the ROOT_DISK_MONITOR parameter is disabled (set to
OFF). For more information, see Root Disk Monitoring.
ROOT_DISK_MONITOR_INTERVAL
The interval, in microseconds, at which Serviceguard checks whether the root disk is healthy on nodes
configured for monitoring. You can set the value of the ROOT_DISK_MONITOR_INTERVAL parameter
between 1000000 (1 second) and 1800000000 (30 minutes).
By default, the ROOT_DISK_MONITOR_INTERVAL parameter is set to 30000000 microseconds (30 seconds).
ROOT_DISK_MONITOR_EXCLUDE_NODES
Specify the names of one or more cluster nodes on which root disk monitoring is to be disabled. If you do
not specify any node names and the ROOT_DISK_MONITOR parameter is set to ON,
root disk monitoring is enabled for all nodes in the cluster.
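
Because the interval is given in microseconds, it is easy to be off by a factor of a thousand or a million. The following sketch converts seconds to microseconds, checks the documented range, and shows how the exclusion list narrows the set of monitored nodes. The function names are hypothetical.

```python
from typing import List, Sequence

MIN_INTERVAL_US = 1_000_000        # documented minimum
MAX_INTERVAL_US = 1_800_000_000    # documented maximum


def root_disk_interval_us(seconds: float) -> int:
    """Convert seconds to a ROOT_DISK_MONITOR_INTERVAL value."""
    microseconds = round(seconds * 1_000_000)
    if not MIN_INTERVAL_US <= microseconds <= MAX_INTERVAL_US:
        raise ValueError(f"{microseconds} microseconds is out of range")
    return microseconds


def monitored_nodes(
    monitor_on: bool,
    cluster_nodes: Sequence[str],
    excluded: Sequence[str] = (),
) -> List[str]:
    """Nodes whose root disk is monitored, given the three parameters."""
    if not monitor_on:
        return []
    return [node for node in cluster_nodes if node not in excluded]
```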
GENERIC_RESOURCE_NAME
Defines the logical name used to identify a generic resource in a cluster. This name corresponds to
the generic resource name used by the cmgetresource(1m) and cmsetresource(1m)
commands. Multiple generic_resource_name entries can be specified in a cluster. The length and
formal restrictions for the name are the same as for cluster_name.
Each name must be unique within a cluster.
You can configure a maximum of 10 cluster generic resources per cluster. Each generic resource is
defined by six parameters:

• GENERIC_RESOURCE_NAME
• GENERIC_RESOURCE_TYPE
• GENERIC_RESOURCE_CMD
• GENERIC_RESOURCE_SCOPE
• GENERIC_RESOURCE_RESTART
• GENERIC_RESOURCE_HALT_TIMEOUT



The following is an example of defining the parameters of a cluster generic resource:

GENERIC_RESOURCE_NAME app_mon
GENERIC_RESOURCE_TYPE simple
GENERIC_RESOURCE_CMD /usr/bin/app_monitor.sh
GENERIC_RESOURCE_SCOPE NODE
GENERIC_RESOURCE_RESTART 35
GENERIC_RESOURCE_HALT_TIMEOUT 60000000
GENERIC_RESOURCE_TYPE
Generic resources can be of two types: simple and extended.
Simple generic resource type

• For a simple resource, the monitoring mechanism is based on the status of the resource.
• The status can be UP, DOWN, or UNKNOWN.
• The default status is UNKNOWN; UP and DOWN can be set using the cmsetresource(1m)
command.

Extended generic resource type

• For an extended resource, the monitoring mechanism is based on the current value of the
resource.
• The default current value is 0.
• Valid values are positive integer values ranging from 1 to 2147483647.

NOTE: You can get or set the status/value of a simple/extended generic resource using the
cmgetresource(1m) and cmsetresource(1m) commands respectively. See Getting and
Setting the Status/Value of a Simple/Extended Cluster Generic Resource on page 211 and the
manpages for more information.
A single cluster can have a combination of simple and extended resources, but a given generic
resource cannot be configured as both a simple resource and an extended resource. It must be either
a simple generic resource or an extended generic resource.

GENERIC_RESOURCE_CMD
The command to be executed to start and stop the monitoring of a cluster generic resource; that is, the
command that runs the monitoring program or function, for example /usr/bin/cpu_monitor.sh.
Only Serviceguard environment variables defined in the /etc/cmcluster.conf file or an absolute
pathname can be used with the generic resource command; neither the PATH variable nor any other
environment variable is passed to the command. The default shell is /bin/sh. For example,

• GENERIC_RESOURCE_CMD $SGCONF/cluster_generic_resource/script.sh

• GENERIC_RESOURCE_CMD /etc/cmcluster/cluster_generic_resource/script.sh

• GENERIC_RESOURCE_CMD /usr/local/cmcluster/conf/generic_resource/script.sh

• GENERIC_RESOURCE_CMD /opt/cmcluster/conf/generic_resource/script.sh

• GENERIC_RESOURCE_CMD $SGSBIN/cmresserviced /dev/sdd1

Take care when defining cluster generic resource run commands. Each run command is executed in the
following sequence:

• The cmruncl/cmrunnode command executes the run command.

• Serviceguard monitors the process ID (PID) of the process the run command creates.
• When the command exits, the Serviceguard generic resource daemon (cmresourced) determines
that a failure has occurred and restarts the generic resource command, subject to the value of the
GENERIC_RESOURCE_RESTART parameter.
• If a generic resource run command is a shell script that runs some other command and then exits,
Serviceguard will consider this normal exit as a failure.

Ensure that each generic resource run command is the name of an actual generic resource and that
its process remains alive until the actual cluster stops.
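
Because any exit of the run command is treated as a failure, a monitoring program has to stay in the foreground and keep polling. The sketch below shows one way to structure such a loop. It is an assumption-laden illustration, not an HPE-supplied monitor: run_monitor, check, and report are hypothetical names, and how the status is actually set (for example, through cmsetresource(1m)) is left to the caller-supplied report callback.

```python
import threading
from typing import Callable


def run_monitor(
    check: Callable[[], bool],
    report: Callable[[str], None],
    stop: threading.Event,
    interval_s: float = 5.0,
) -> int:
    """Poll until `stop` is set; return the number of polls performed.

    check:  returns True while the monitored resource is healthy.
    report: receives "UP" or "DOWN" on every poll.
    stop:   set by the caller (for example, from a SIGTERM handler)
            when the monitor is asked to halt.
    """
    polls = 0
    while not stop.is_set():
        # Report the current health on every pass.
        report("UP" if check() else "DOWN")
        polls += 1
        # Sleep between polls, but wake immediately if a stop is requested.
        stop.wait(interval_s)
    return polls
```

In a real monitor, the main program would typically install a SIGTERM handler that calls stop.set(), so that the loop ends and any cleanup runs when the process is asked to halt.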
GENERIC_RESOURCE_SCOPE
You can set the scope type to node.

The value of a node-scope generic resource is specific to each node and reflects the health of the underlying
monitored resource on that node. The status or value of the monitored resource can therefore differ
between nodes in the cluster.

GENERIC_RESOURCE_RESTART
The number of times Serviceguard will attempt to re-run the GENERIC_RESOURCE_CMD. Valid values
are unlimited, none or any positive integer value. Default is none.
If the value is unlimited, the generic resource command will be restarted an infinite number of times. If
the value is none, the command will not be restarted.
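
The three kinds of GENERIC_RESOURCE_RESTART value can be interpreted as shown in this sketch. The function names are hypothetical, and accepting the keywords case-insensitively is an assumption rather than documented behavior.

```python
from typing import Optional


def parse_restart(value: str = "none") -> Optional[int]:
    """Return the restart limit: 0 for none, None for unlimited, else n."""
    text = value.strip().lower()
    if text == "none":
        return 0
    if text == "unlimited":
        return None
    count = int(text)  # raises ValueError for anything that is not an integer
    if count <= 0:
        raise ValueError("restart count must be a positive integer")
    return count


def may_restart(limit: Optional[int], restarts_so_far: int) -> bool:
    """True if the command may be re-run once more."""
    return limit is None or restarts_so_far < limit
```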
GENERIC_RESOURCE_HALT_TIMEOUT
The length of time, in microseconds, that Serviceguard waits for the command specified under
GENERIC_RESOURCE_CMD to halt before forcing termination of the cluster generic resource process.
In the event of a halt, Serviceguard first sends a SIGTERM signal to terminate the command. If the
command does not halt, Serviceguard waits for the specified GENERIC_RESOURCE_HALT_TIMEOUT,
then sends a SIGKILL signal to force the command to terminate.
The value should be large enough to allow any cleanup required by the generic resource, including all
cleanup processes associated with the command, to complete. The maximum value is 120000000
microseconds (2 minutes).
If GENERIC_RESOURCE_HALT_TIMEOUT is not specified, a zero timeout is assumed, meaning that
Serviceguard does not wait at all before sending the SIGKILL signal to halt the command.
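
The SIGTERM, wait, SIGKILL sequence described above is a common pattern and can be reproduced with the Python standard library. This sketch only illustrates the pattern on a POSIX system; it is not Serviceguard's implementation, and halt_process is a hypothetical name.

```python
import subprocess


def halt_process(proc: subprocess.Popen, halt_timeout_us: int = 0) -> int:
    """Halt `proc` and return its exit status.

    A negative return value is the number of the signal that ended the
    process (POSIX convention used by subprocess).
    """
    proc.terminate()  # sends SIGTERM
    try:
        # Give the command time to clean up; the timeout is in microseconds.
        return proc.wait(timeout=halt_timeout_us / 1_000_000)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL, which cannot be caught or ignored
        return proc.wait()
```

With the default timeout of zero, a command that has not already exited is killed almost immediately, which mirrors the behavior described for an unspecified GENERIC_RESOURCE_HALT_TIMEOUT.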



Cluster Configuration: Next Step
When you are ready to configure the cluster, proceed to Configuring the Cluster on page 193. If you
find it useful to record your configuration ahead of time, use the Cluster Configuration Worksheet on
page 382.

Package Configuration Planning


Planning for packages involves assembling information about each group of highly available services.
The document HPE Serviceguard Developer’s Toolbox User Guide, December 2012 provides a guide for
integrating an application with Serviceguard using a suite of customizable scripts known as the “Serviceguard
Developer’s Toolbox”, intended for use with modular packages only. The “Serviceguard Developer’s
Toolbox” is available free of charge and can be downloaded at http://www.hpe.com/software/sgdtoolbox.

Logical Volume and File System Planning


Use logical volumes in volume groups as the storage infrastructure for package operations on a cluster.
When the package moves from one node to another, it must still be able to access the same data on the
same disk as it did when it was running on the previous node. This is accomplished by activating the
volume group and mounting the file system that resides on it.
In Serviceguard, high availability applications, services, and data are located in volume groups that are on
a shared bus. When a node fails, the volume groups containing the applications, services, and data of the
failed node are deactivated on the failed node and activated on the adoptive node (the node the
packages move to). In order for this to happen, you must configure the volume groups so that they can be
transferred from the failed node to the adoptive node.

NOTE: To prevent an operator from accidentally activating volume groups on other nodes in the cluster,
versions A.11.16.07 and later of Serviceguard for Linux include a type of VG activation protection. This is
based on the “hosttags” feature of LVM2.
This feature is not mandatory, but Hewlett Packard Enterprise strongly recommends you implement it as
you upgrade existing clusters and create new ones. See Enabling Volume Group Activation Protection
on page 185 for instructions. However, if you are using the PR feature, this step is not required.

As part of planning, you need to decide the following:

• What volume groups are needed?
• How much disk space is required, and how should this be allocated in logical volumes?
• What file systems need to be mounted for each package?
• Which nodes need to import which logical volume configurations?
• If a package moves to an adoptive node, what effect will its presence have on performance?
• What hardware/software resources need to be monitored as part of the package? You can then
configure these as generic resources in the package and write appropriate monitoring scripts for
monitoring the resources. Alternatively, monitoring script logic can be included in the cluster configuration
by using a cluster generic resource. For more information, see Using the Cluster Generic Resources
Monitoring Service.



NOTE: Generic resources influence the package based on their status. The actual monitoring of the
resource should be done in a script, which must be configured as a service. The script sets the
status of the resource based on the availability of the resource. See Monitoring Script for Generic
Resources on page 392. If you have configured the generic resource in the cluster, see Monitoring
Script for Cluster Generic Resources.

Create a list by package of volume groups, logical volumes, and file systems. Indicate which nodes need
to have access to common file systems at different times.
Hewlett Packard Enterprise recommends that you use customized logical volume names that are different
from the default logical volume names (lvol1, lvol2, etc.). Choosing logical volume names that
represent the high availability applications that they are associated with (for example, lvoldatabase)
will simplify cluster administration.
To further document your package-related volume groups, logical volumes, and file systems on each
node, you can add commented lines to the /etc/fstab file. The following is an example for a database
application:

# /dev/vg01/lvoldb1 /applic1 ext3 defaults 0 1 # These six entries are
# /dev/vg01/lvoldb2 /applic2 ext3 defaults 0 1 # for information purposes
# /dev/vg01/lvoldb3 raw_tables ignore ignore 0 0 # only. They record the
# /dev/vg01/lvoldb4 /general ext3 defaults 0 2 # logical volumes that
# /dev/vg01/lvoldb5 raw_free ignore ignore 0 0 # exist for Serviceguard's
# /dev/vg01/lvoldb6 raw_free ignore ignore 0 0 # HA package. Do not uncomment.
Create an entry for each logical volume, indicating its use for a file system or for a raw device.

CAUTION: Do not use /etc/fstab to mount file systems that are used by Serviceguard
packages.

For information about creating, exporting, and importing volume groups, see Creating the Logical
Volume Infrastructure on page 181.

Planning for NFS-mounted File Systems


As of Serviceguard A.11.20.00, you can use NFS-mounted (imported) file systems as shared storage in
packages.
The same package can mount more than one NFS-imported file system, and can use both cluster-local
shared storage and NFS imports.

NOTE: Ensure that the NFS module is loaded during boot time for the configurations using NFS file
systems as part of the package configuration.

The following rules and restrictions apply.

• NFS mounts are supported for modular failover packages.


• So that Serviceguard can ensure that all I/O from a node on which a package has failed is flushed
before the package restarts on an adoptive node, all the network switches and routers between the
NFS server and client must support a worst-case timeout, after which packets and frames are
dropped. This timeout is known as the Maximum Bridge Transit Delay (MBTD).



IMPORTANT: Find out the MBTD value for each affected router and switch from the vendors'
documentation; determine all of the possible paths; find the worst case sum of the MBTD values
on these paths; and use the resulting value to set the Serviceguard
CONFIGURED_IO_TIMEOUT_EXTENSION parameter. For instructions, see the discussion of
this parameter under Cluster Configuration Parameters on page 111.
Switches and routers that do not support an MBTD value must not be used in a Serviceguard NFS
configuration. Using them might lead to delayed packets that in turn could lead to data corruption.

• Networking among the Serviceguard nodes must be configured in such a way that a single failure in
the network does not cause a package failure.
• Only NFS client-side locks (local locks) are supported.
Server-side locks are not supported.

• Because exclusive activation is not available for NFS-imported file systems, you must take the
following precautions to ensure that data is not accidentally overwritten.

◦ The server must be configured so that only the cluster nodes have access to the file system.
◦ The NFS file system used by a package must not be imported by any other system, including other
nodes in the cluster.
◦ The nodes should not mount the file system on boot; it should be mounted only as part of the
startup for the package that uses it.
◦ The NFS file system should be used by only one package.
◦ While the package is running, the file system should be used exclusively by the package.
◦ If the package fails, do not attempt to restart it manually until you have verified that the file system
has been unmounted properly.

In addition, you should observe the following guidelines.

• Hewlett Packard Enterprise recommends that you avoid a single point of failure by ensuring that the
NFS server is highly available.

NOTE: If network connectivity to the NFS Server is lost, the applications using the imported file system
may hang and it may not be possible to kill them. If the package attempts to halt at this point, it may
not halt successfully.

• Do not use the automounter; otherwise package startup may fail.


• If storage is directly connected to all the cluster nodes and shared, configure it as a local file system
rather than using NFS.
• An NFS file system should not be mounted on more than one mount point at the same time.
• Access to an NFS file system used by a package should be restricted to the nodes that can run the
package.

For more information, see HPE Serviceguard Toolkit for NFS on Linux User Guide available at
http://www.hpe.com/info/linux-serviceguard-docs. This guide includes instructions for setting up a sample
package that uses an NFS-imported file system.
See also the description of fs_name on page 242, fs_type on page 242, and the other
file system-related package parameters.



Planning for Expansion
You can add packages to a running cluster. This process is described in Cluster and Package
Maintenance on page 255.
When adding packages, be sure not to exceed the value of MAX_CONFIGURED_PACKAGES as defined in the
cluster configuration file (see Cluster Configuration Parameters on page 111). You can modify this
parameter while the cluster is running if you need to.

Choosing Switching and Failover Behavior


To determine the failover behavior of a failover package (see Package Types on page 41), you define the
policy that governs where Serviceguard will automatically start up a package that is not running. In
addition, you define a failback policy that determines whether a package will be automatically returned to
its primary node when that is possible.
The following table describes different types of failover behavior and the settings in the package
configuration file that determine each behavior. See Package Parameter Explanations on page 226for
more information.

Table 7: Package Failover Behavior

Switching behavior: Package switches normally after detection of service or network failure,
generic resource failure, or when a configured dependency is not met. Halt script runs before
switch takes place. (Default)
Parameters in configuration file:
• node_fail_fast_enabled set to no. (Default)
• service_fail_fast_enabled set to no for all services. (Default)
• auto_run set to yes for the package. (Default)

Switching behavior: Package fails over to the node with the fewest active packages.
Parameters in configuration file:
• failover_policy set to min_package_node.

Switching behavior: Package fails over to the node that is next on the list of nodes. (Default)
Parameters in configuration file:
• failover_policy set to configured_node. (Default)

Switching behavior: Package is automatically halted and restarted on its primary node if the
primary node is available and the package is running on a non-primary node.
Parameters in configuration file:
• failback_policy set to automatic.

Switching behavior: Package can be manually returned to its primary node if it is running on a
non-primary node, but this does not happen automatically.
Parameters in configuration file:
• failback_policy set to manual. (Default)
• failover_policy set to configured_node. (Default)

Switching behavior: All packages switch following a system reboot on the node when a specific
service fails. Halt scripts are not run.
Parameters in configuration file:
• service_fail_fast_enabled set to yes for a specific service.
• auto_run set to yes for all packages.

Switching behavior: All packages switch following a system reboot on the node when any service
fails.
Parameters in configuration file:
• service_fail_fast_enabled set to yes for all services.
• auto_run set to yes for all packages.

Switching behavior: All packages switch following a system reset (an immediate halt without a
graceful shutdown) on the node when a specific service fails. Halt scripts are not run.
Parameters in configuration file:
• service_fail_fast_enabled set to yes for a specific service.
• auto_run set to yes for all packages.

Switching behavior: All packages switch following a system reset on the node when any service
fails. An attempt is first made to reboot the system prior to the system reset.
Parameters in configuration file:
• service_fail_fast_enabled set to yes for all services.
• auto_run set to yes for all packages.

Failover packages can also be configured so that IP addresses switch from a failed NIC to a standby NIC
on the same node and the same physical subnet.
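
The two failover_policy settings in Table 7 differ only in how the adoptive node is picked. The following sketch illustrates that difference. It is a simplified model under stated assumptions, not Serviceguard's selection logic: the function name is hypothetical, the failed node is assumed to be absent from up_nodes, and ties under min_package_node are broken by list order.

```python
from typing import Dict, Optional, Sequence, Set


def choose_adoptive_node(
    failover_policy: str,
    node_list: Sequence[str],
    up_nodes: Set[str],
    active_packages: Dict[str, int],
) -> Optional[str]:
    """Pick the node a failover package would move to, or None.

    node_list:       the package's nodes in configured order
    up_nodes:        nodes currently able to run the package
    active_packages: number of active packages per node
    """
    candidates = [node for node in node_list if node in up_nodes]
    if not candidates:
        return None
    if failover_policy == "configured_node":
        # Next available node on the configured list.
        return candidates[0]
    if failover_policy == "min_package_node":
        # Node with the fewest active packages.
        return min(candidates, key=lambda node: active_packages.get(node, 0))
    raise ValueError(f"unknown failover_policy: {failover_policy}")
```

For example, if node1 has failed, node2 runs three packages, and node3 runs one, configured_node selects node2 while min_package_node selects node3.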

Configuring DLS based VMDK (VMFS/RDM) in the Package


You can select a Hypervisor for a cluster on Serviceguard Manager. For information, see the
Serviceguard Manager online help.

1. Create a package configuration file that contains the VMFS module:


cmmakepkg $SGCONF/pkg1/vmfspkg.conf
Package template is created.
This file must be edited before it can be used.

2. Edit the package configuration file and specify the VMFS parameters (as shown in the snippet):
If disk_type is VMFS, the package configuration looks like this:
vmdk_file_name cluster/abc.vmdk
datastore_name sg_datastore
scsi_controller 1:1
disk_type VMFS
If disk_type is RDM, the package configuration looks like this:
vmdk_file_name cluster/xyz.vmdk
datastore_name sg_datastore
scsi_controller 1:2
disk_type RDM

3. After editing the package configuration file, verify the content of the package configuration file:
cmcheckconf -v -P $SGCONF/pkg1/vmfspkg.conf
cmcheckconf: Verification completed with no errors found.
Use the cmapplyconf command to apply the configuration.



4. When verification completes without errors, apply the package configuration file. This adds the
package configuration information to the binary cluster configuration file in the $SGCONF directory and
distributes it to all the cluster nodes.
cmapplyconf -P $SGCONF/pkg1/vmfspkg.conf
Modify the package configuration ([y]/n)? y
Completed the cluster update

5. Verify that the VMFS module parameters are configured:


cmviewcl -v -f line | grep vmdk_file_name
If disk_type is VMFS, the output (snippet) is as follows:
package:vmfs_pkg1|vmdk_file_name:cluster/abc.vmdk|vmdk_file_name=cluster/abc.vmdk
package:vmfs_pkg1|vmdk_file_name:cluster/abc.vmdk|datastore_name=sg_datastore
package:vmfs_pkg1|vmdk_file_name:cluster/abc.vmdk|scsi_controller=1:1
package:vmfs_pkg1|vmdk_file_name:cluster/abc.vmdk|disk_type=VMFS
If disk_type is RDM, the output (snippet) is as follows:
package:vmfs_pkg1|vmdk_file_name:cluster/xyz.vmdk|vmdk_file_name=cluster/xyz.vmdk
package:vmfs_pkg1|vmdk_file_name:cluster/xyz.vmdk|datastore_name=sg_datastore
package:vmfs_pkg1|vmdk_file_name:cluster/xyz.vmdk|scsi_controller=1:2
package:vmfs_pkg1|vmdk_file_name:cluster/xyz.vmdk|disk_type=RDM

6. Start the package. As part of the package start, the VMDK disks will be attached to the node.
cmrunpkg vmfspkg
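
The line-format output shown in step 5 is convenient for scripted checks. The sketch below parses text in that format into a dictionary. The parsing rules are inferred only from the sample output in this section (scopes separated by "|", with a final name=value pair), and parse_line_output is a hypothetical helper, not part of Serviceguard.

```python
from typing import Dict, Tuple


def parse_line_output(text: str) -> Dict[Tuple[str, ...], Dict[str, str]]:
    """Group name=value pairs by their leading scope segments."""
    result: Dict[Tuple[str, ...], Dict[str, str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Everything before the last "|" is scope; the last segment is name=value.
        *scopes, leaf = line.split("|")
        # Split on the first "=" only, so values such as "<= 40" stay intact.
        name, separator, value = leaf.partition("=")
        if not separator:
            continue  # not a name=value line
        result.setdefault(tuple(scopes), {})[name] = value.strip('"')
    return result
```

Applied to the VMFS sample above, the entry keyed by ("package:vmfs_pkg1", "vmdk_file_name:cluster/abc.vmdk") contains datastore_name, scsi_controller, and disk_type, which a script can compare against the values in the package configuration file.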

Online Reconfiguration of VMware VMFS Parameters


Some online operations on VMware VMFS parameters in packages are supported. The
following operation can be performed online:

• Addition of the vmdk_file_name, datastore_name, disk_type, and scsi_controller parameters.

The following operations cannot be performed online:

• Modification of the vmdk_file_name, datastore_name, disk_type, and scsi_controller parameters.

• Deletion of the vmdk_file_name, datastore_name, disk_type, and scsi_controller parameters.

Parameters for Configuring Generic Resources


Serviceguard provides the following parameters for configuring generic resources. Configure each of
these parameters in the package configuration file for each resource the package is dependent on.

• generic_resource_name: defines the logical name used to identify a generic resource in a package.
• generic_resource_evaluation_type: defines when the status of a generic resource is evaluated. This
can be set to during_package_start or before_package_start. If not specified,
during_package_start is used as the default.

◦ during_package_start means the status of the generic resources is evaluated while the package
is starting.
◦ before_package_start means resource monitoring must be started before the package starts,
and all the configured resources must be UP on a given node for the package to be started on that
node.

• generic_resource_up_criteria: defines a criterion to determine the 'up' condition for a generic resource.
It also determines whether a generic resource is a simple resource or an extended resource. This
parameter requires an operator and a value. The operators ==, !=, >, <, >=, and <= are allowed.
Values must be positive integer values ranging from 1 to 2147483647.

The following is an example of how to configure simple and extended resources.


Simple generic resource:
generic_resource_name sfm_disk
generic_resource_evaluation_type before_package_start
Extended generic resource:
generic_resource_name cpu_lan
generic_resource_evaluation_type during_package_start
generic_resource_up_criteria <50
For more information on the generic resource parameters, see Package Parameter Explanations on
page 226.
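
To make the meaning of generic_resource_up_criteria concrete, the following sketch evaluates a criterion such as <50 or <= 40 against a current value. It models only the operator and value rules stated above; is_up is a hypothetical name, and the sketch makes no claim about how Serviceguard itself evaluates the criterion.

```python
import operator
import re

_OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
# Two-character operators are listed first so "<=" is not read as "<".
_CRITERIA = re.compile(r"^\s*(==|!=|>=|<=|>|<)\s*(\d+)\s*$")


def is_up(up_criteria: str, current_value: int) -> bool:
    """Return True if current_value satisfies the up criteria."""
    match = _CRITERIA.match(up_criteria)
    if match is None:
        raise ValueError(f"cannot parse up criteria: {up_criteria!r}")
    symbol, threshold = match.group(1), int(match.group(2))
    if not 1 <= threshold <= 2147483647:
        raise ValueError("value must be between 1 and 2147483647")
    return _OPERATORS[symbol](current_value, threshold)
```

With the extended resource from the example above (generic_resource_up_criteria <50), a current value of 30 satisfies the criterion and a value of 50 does not.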

Configuring a Generic Resource


This section describes the step-by-step procedure to configure generic resources. You can also configure
generic resources from Serviceguard Manager. See the online help for instructions on how to configure
from Serviceguard Manager.
Configuration of generic resources in a package can be done in two ways:

• Use an existing cluster generic resource in the package.


• Add a new generic resource to the package.

1. Create a package configuration file that contains the generic resource module:
cmmakepkg $SGCONF/pkg1/pkg1.conf
Package template is created.
This file must be edited before it can be used.

NOTE: To generate a configuration file adding the generic resource module to an existing package
(enter the command all on one line):
cmmakepkg -i $SGCONF/pkg1/pkg1.conf -m sg/generic_resource

2. Perform this step to use an existing cluster generic resource in the package. If you are adding a new
generic resource to the package, skip this step and proceed to step 3.
For example, suppose you have configured a generic resource sfm_cpu in the cluster and want to use it in the
package. Edit the package configuration file and specify the generic resource parameters (as shown in the
snippet):

generic_resource_name sfm_cpu
generic_resource_evaluation_type before_package_start
generic_resource_up_criteria <= 40
For the step-by-step procedure to add the generic resource to the package, see Using Cluster Generic
Resources in package configuration.

3. Perform this step to add a new generic resource to the package. If you are using an existing cluster generic
resource in the package, skip this step and proceed to step 4.
Edit the package configuration file and specify the generic resource parameters (as shown in the
snippet):

service_name cpu_monitor
service_cmd $SGCONF/generic_resource_monitors/cpu_monitor.sh
service_halt_timeout 10

generic_resource_name sfm_cpu
generic_resource_evaluation_type before_package_start
generic_resource_up_criteria <= 40

NOTE: Generic resources must be configured to use the monitoring script. It is the monitoring script
that contains the logic to monitor the resource and set the status of a generic resource accordingly by
using cmsetresource(1m).
These scripts must be written by end users according to their requirements. The monitoring script
must be configured as a service in the package if the monitoring of the resource is required to be
started and stopped as a part of the package.
This can be achieved by configuring a service_name and a service_cmd, providing the full path
name of the monitoring script as the service_cmd value as shown in the step. The service_name
and generic_resource_name need not be the same. However, using the same name is good practice
because it makes the monitor easier to identify.
Hewlett Packard Enterprise provides a template that describes how a monitoring script can be written.
For more information on monitoring scripts and the template, see Monitoring Script for Generic
Resources on page 392 and Template of a Monitoring Script on page 395.
If generic_resource_up_criteria is specified, the given resource is considered to be an extended
generic resource; otherwise, it is a simple generic resource. For the description of generic resource
parameters, see Package Parameter Explanations on page 226. See also Using the Generic
Resources Monitoring Service on page 47.

4. After editing the package configuration file, verify the content of the package configuration file:
cmcheckconf -v -P $SGCONF/pkg1/pkg1.conf
cmcheckconf: Verification completed with no errors found.
Use the cmapplyconf command to apply the configuration

5. When verification completes without errors, apply the package configuration file. This adds the
package configuration information (along with generic resources) to the binary cluster configuration file
in the $SGCONF directory and distributes it to all the cluster nodes.

cmapplyconf -P $SGCONF/pkg1/pkg1.conf
Modify the package configuration ([y]/n)? y
Completed the cluster update

6. Verify that the generic resources parameters are configured.


cmviewcl -v -p pkg1
UNOWNED_PACKAGES

PACKAGE STATUS STATE AUTO_RUN NODE

pkg1 down halted disabled unowned

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS NODE_NAME NAME
Generic Resource unknown node1 sfm_cpu
Generic Resource unknown node2 sfm_cpu

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled node1
Alternate up enabled node2

Other_Attributes:
ATTRIBUTE_NAME ATTRIBUTE_VALUE
Style modular
Priority no_priority
The cmviewcl -v -f line output (snippet) will be as follows:
cmviewcl -v -f line -p pkg1 | grep generic_resource
generic_resource:sfm_cpu|name=sfm_cpu
generic_resource:sfm_cpu|evaluation_type=before_package_start
generic_resource:sfm_cpu|up_criteria="<= 40"
generic_resource:sfm_cpu|node:node1|status=unknown
generic_resource:sfm_cpu|node:node1|current_value=0
generic_resource:sfm_cpu|node:node2|status=unknown
generic_resource:sfm_cpu|node:node2|current_value=0

NOTE: The default status of a generic resource is UNKNOWN and the default current_value is "0"
unless the status/value of a simple/extended generic resource is set using the cmsetresource
command.

7. Start the package. As part of the package start, the monitoring script starts monitoring the
generic resource and sets its status accordingly. This applies only when the monitoring script is
configured as a service (service_cmd) in the package configuration. If the package instead uses a
cluster generic resource, starting the package does not run any monitoring script, because the
monitor was already started by GENERIC_RESOURCE_CMD when the cluster started.

cmrunpkg pkg1
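
The cmviewcl -v -f line output shown in step 6 is convenient to process in a script. The following Python sketch groups the generic_resource lines into a nested dictionary. The function name parse_generic_resources and the dictionary layout are illustrative assumptions, not part of Serviceguard; the sample text is the snippet from step 6.

```python
def parse_generic_resources(line_output):
    """Group 'generic_resource:<name>|...' lines into a nested dict.

    Hypothetical helper: the result layout is an assumption made for
    illustration, not a Serviceguard interface.
    """
    resources = {}
    for raw in line_output.splitlines():
        line = raw.strip()
        if not line.startswith("generic_resource:"):
            continue
        # Split "path=value" at the first "=" only, so a value such as
        # "<= 40" keeps its own "=" character.
        path, _, value = line.partition("=")
        value = value.strip('"')
        parts = path.split("|")
        name = parts[0].split(":", 1)[1]
        entry = resources.setdefault(name, {"nodes": {}})
        if len(parts) == 2:
            # Resource-wide attribute, for example evaluation_type.
            entry[parts[1]] = value
        elif len(parts) == 3 and parts[1].startswith("node:"):
            # Per-node attribute, for example status on node1.
            node = parts[1].split(":", 1)[1]
            entry["nodes"].setdefault(node, {})[parts[2]] = value
    return resources


SAMPLE = """\
generic_resource:sfm_cpu|name=sfm_cpu
generic_resource:sfm_cpu|evaluation_type=before_package_start
generic_resource:sfm_cpu|up_criteria="<= 40"
generic_resource:sfm_cpu|node:node1|status=unknown
generic_resource:sfm_cpu|node:node1|current_value=0
generic_resource:sfm_cpu|node:node2|status=unknown
generic_resource:sfm_cpu|node:node2|current_value=0
"""

info = parse_generic_resources(SAMPLE)
print(info["sfm_cpu"]["up_criteria"])
print(info["sfm_cpu"]["nodes"]["node1"]["status"])
```

In practice the input would be the captured output of the cmviewcl command rather than a literal string.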

Getting and Setting the Status/Value of a Simple/Extended Generic Resource


You can use the Serviceguard commands cmgetresource(1m) and cmsetresource(1m) to get or
set the status of a simple generic resource or the value of an extended generic resource. These
commands can be used in the monitoring script or executed from the CLI. Only the root user
(UID=0) can execute these commands.
Using Serviceguard Command to Get the Status/Value of a Simple/Extended Generic
Resource
Use the cmgetresource command to get the status of a simple generic resource or the value of an
extended generic resource. For example:
cmgetresource -r sfm_disk
This retrieves the status of the generic resource sfm_disk if it is configured as a simple resource. If
configured as an extended resource, the current value is returned.
Using Serviceguard Command to Set the Status/Value of a Simple/Extended Generic
Resource
Use the cmsetresource command to set the status of a simple generic resource or the value of an
extended generic resource. For example:
cmsetresource -r sfm_disk -s up
This sets the status of the generic resource sfm_disk to up. This is a simple generic resource and only
the status can be set to up/down.
cmsetresource -r sfm_lan 10
This sets the current value of the generic resource sfm_lan to 10. This is an extended generic resource
and only numeric values from 1 to 2147483647 can be set.
See the man pages for more information.
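
A monitoring script written in a language other than shell might wrap these commands. The following Python sketch builds the cmsetresource argument list in the two forms shown above and validates the inputs first. The names build_set_command and run_set_command are hypothetical, and only the command forms that appear in this section are used.

```python
import subprocess

# Upper bound for an extended generic resource value, as stated above.
MAX_VALUE = 2147483647


def build_set_command(resource, status=None, value=None):
    """Return the argv list for cmsetresource (hypothetical helper).

    Pass status ("up" or "down") for a simple generic resource, or
    value (1 to 2147483647) for an extended generic resource.
    """
    if (status is None) == (value is None):
        raise ValueError("specify exactly one of status or value")
    if status is not None:
        if status not in ("up", "down"):
            raise ValueError("status must be 'up' or 'down'")
        return ["cmsetresource", "-r", resource, "-s", status]
    number = int(value)
    if not 1 <= number <= MAX_VALUE:
        raise ValueError("value must be between 1 and %d" % MAX_VALUE)
    return ["cmsetresource", "-r", resource, str(number)]


def run_set_command(resource, status=None, value=None):
    """Run the command; requires root on a Serviceguard cluster node."""
    subprocess.run(build_set_command(resource, status, value), check=True)


print(build_set_command("sfm_disk", status="up"))
print(build_set_command("sfm_lan", value=10))
```

Separating the argument construction from the execution makes the validation testable on a system where Serviceguard is not installed.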

Online Reconfiguration of Generic Resources


Online operations such as addition, deletion, and modification of generic resources in packages are
supported. The following operations can be performed online:

• Addition of a generic resource of generic_resource_evaluation_type set to during_package_start,
whose status is not down.
When adding a generic resource, ensure that the equivalent monitor is available; if it is not, add the
monitor while adding the generic resource.

• Addition of a generic resource of generic_resource_evaluation_type set to before_package_start,
whose status is 'up'.
• Deletion of a generic resource. Please ensure that while deleting a generic resource, the equivalent
monitor is also removed. However, if a common resource is being monitored across multiple
packages, then before removing the monitor ensure that the generic resource being deleted is not
configured in other packages that are also using this monitor.

• Modification of generic_resource_evaluation_type from before_package_start to
during_package_start or vice versa when the resource is 'up'.

• Modification of generic_resource_up_criteria specified for resources of evaluation type
before_package_start or during_package_start, provided the new up criteria does not cause
the resource status to evaluate to 'down' (that is, the current_value of the resource still satisfies the new
up_criteria).

◦ Modification of resource type from a simple resource to an extended resource is allowed only if the
generic_resource_evaluation_type is during_package_start in all the running packages that
currently use the resource.
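
Before changing generic_resource_up_criteria online, it can help to check whether the current value of the resource would still satisfy the new criteria. The following Python sketch does this for a criterion made of a single operator and a number, such as the "<= 40" used earlier. The function name and the set of operators it accepts are assumptions; consult the generic_resource_up_criteria parameter description for the forms Serviceguard actually supports.

```python
import operator

# Operators accepted by this sketch. This set is an assumption made for
# illustration; only "<=" appears in the examples in this section.
OPERATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def satisfies(up_criteria, current_value):
    """Return True if current_value meets a single '<op> <number>' criterion."""
    op_text, number = up_criteria.split()
    return OPERATORS[op_text](current_value, int(number))


# A resource whose current value is 35 stays 'up' under "<= 40",
# but would evaluate to 'down' under a stricter "<= 30".
print(satisfies("<= 40", 35))
print(satisfies("<= 30", 35))
```

If the check returns False for the proposed criteria, the modification would cause the resource to evaluate to 'down' and therefore does not qualify as an online change under the rule above.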

Online Reconfiguration of serviceguard-xdc Modular Package Parameters


Online operations such as addition, deletion, and modification of serviceguard-xdc package parameters in
serviceguard-xdc packages are supported. The following operations can be performed online:

• Modification of xdc/xdc/rpo_target parameter.
• Modification of xdc/xdc/raid_monitor_interval parameter.
• Addition of a new MD device (xdc/xdc/raid_device[]) along with its mirror halves
(xdc/xdc/raid_device_0[] and xdc/xdc/raid_device_1[]).
• Deletion of an existing MD device (xdc/xdc/raid_device[]) and its mirror halves
(xdc/xdc/raid_device_0[] and xdc/xdc/raid_device_1[]).
• Changing one mirror half (xdc/xdc/raid_device_0[] or xdc/xdc/raid_device_1[]) of an existing MD
device at a time.

The following operations cannot be performed online:

• Changing both mirror halves (xdc/xdc/raid_device_0[] and xdc/xdc/raid_device_1[]) of an
existing MD device (xdc/xdc/raid_device[]) at once.
• Changing the service_name attribute of the "raid_monitor" service when the serviceguard-xdc package is
running.
• Adding and deleting an MD device simultaneously.
• Adding an MD device and replacing a mirror half in another existing MD device simultaneously.
• Deleting an MD device and replacing a mirror half in another existing MD device simultaneously.
• Replacing multiple mirror halves simultaneously while the package is running.

About Package Dependencies


A package can have dependencies on other packages, meaning the package will not start on a node
unless the packages it depends on are running on that node.
You can make a package dependent on any other package or packages running on the same cluster
node, subject to the restrictions spelled out in Chapter 6, under dependency_condition on page 232.
Serviceguard adds two new capabilities: you can specify broadly where the package depended on must
be running, and you can specify that it must be down. These capabilities are discussed later in this
section under Extended Dependencies on page 140. You should read the next section, Simple
Dependencies on page 136, first.

Simple Dependencies
A simple dependency occurs when one package requires another to be running on the same node. You
define these conditions by means of the parameters dependency_condition and dependency_location,
using the literal values UP and same_node, respectively. (For detailed configuration information, see the
package parameter definitions starting with dependency_name on page 232. For a discussion of
complex dependencies, see Extended Dependencies on page 140.)
Make a package dependent on another package if the first package cannot (or should not) function
without the services provided by the second. For example, pkg1 might run a real-time web interface to a
database managed by pkg2. In this case it might make sense to make pkg1 dependent on pkg2.
In considering whether or not to create a dependency between packages, use the Rules for Simple
Dependencies on page 136 and Guidelines for Simple Dependencies on page 139 that follow.

Rules for Simple Dependencies


Assume that we want to make pkg1 depend on pkg2.

NOTE: pkg1 can depend on more than one other package, and pkg2 can depend on another package or
packages; we are assuming only two packages in order to make the rules as clear as possible.

• pkg1 will not start on any node unless pkg2 is running on that node.

• pkg1’s package_type and failover_policy constrain the type and characteristics of pkg2, as follows:

◦ If pkg1 is a multi-node package, pkg2 must be a multi-node or system multi-node package. (Note
that system multi-node packages are not supported for general use.)
◦ If pkg1 is a failover package and its failover_policy is min_package_node, pkg2 must be a multi-node
or system multi-node package.
◦ If pkg1 is a failover package and its failover_policy is configured_node, pkg2 must be:

– a multi-node or system multi-node package, or
– a failover package whose failover_policy is configured_node.

• pkg2 cannot be a failover package whose failover_policy is min_package_node.

• pkg2’s node_name list must contain all of the nodes on pkg1’s.

◦ This means that if pkg1 is configured to run on any node in the cluster (*), pkg2 must also be
configured to run on any node.

NOTE: If pkg1 lists all the nodes, rather than using the asterisk (*), pkg2 must also list them.

◦ Preferably the nodes should be listed in the same order if the dependency is between packages
whose failover_policy is configured_node; cmcheckconf and cmapplyconf will warn you if
they are not.

• A package cannot depend on itself, directly or indirectly. That is, not only must pkg1 not specify itself
in the dependency_condition, but pkg1 must not specify a dependency on pkg2 if pkg2 depends on
pkg1, or if pkg2 depends on pkg3 which depends on pkg1, etc.

• If pkg1 is a failover package and pkg2 is a multi-node or system multi-node package, and pkg2 fails,
pkg1 will halt and fail over to the next node on its node_name list on which pkg2 is running (and any
other dependencies, such as resource dependencies or a dependency on a third package, are met).
• In the case of failover packages with a configured_node failover_policy, a set of rules governs
under what circumstances pkg1 can force pkg2 to start on a given node. This is called dragging and
is determined by each package’s priority. See Dragging Rules for Simple Dependencies on page
137.
• If pkg2 fails, Serviceguard will halt pkg1 and any other packages that depend directly or indirectly on
pkg2.
By default, Serviceguard halts packages in dependency order, the dependent package(s) first, then
the package depended on. In our example, pkg1 would be halted first, then pkg2. If there were a third
package, pkg3, that depended on pkg1, pkg3 would be halted first, then pkg1, then pkg2.
If the halt script for any dependent package hangs, by default the package depended on will wait
forever (pkg2 will wait forever for pkg1, and if there is a pkg3 that depends on pkg1, pkg1 will wait
forever for pkg3). You can modify this behavior by means of the successor_halt_timeout parameter.
(The successor of a package depends on that package; in our example, pkg1 is a successor of pkg2;
conversely pkg2 can be referred to as a predecessor of pkg1.)
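
The rule that a package cannot depend on itself, directly or indirectly, amounts to requiring that the dependency graph contain no cycles. The following Python sketch shows one way to check a planned set of dependencies before editing the package configuration files. It is an illustration only: the function name find_cycle and the dictionary representation are assumptions, and cmcheckconf remains the authoritative check.

```python
def find_cycle(depends_on):
    """Return a list of packages forming a dependency cycle, or None.

    depends_on maps each package name to the packages it depends on.
    Hypothetical planning aid using a depth-first search.
    """
    in_progress, done = set(), set()

    def visit(pkg, path):
        in_progress.add(pkg)
        path.append(pkg)
        for dep in depends_on.get(pkg, ()):
            if dep in in_progress:
                # dep is already on the current path: a cycle exists.
                return path[path.index(dep):] + [dep]
            if dep not in done:
                cycle = visit(dep, path)
                if cycle:
                    return cycle
        path.pop()
        in_progress.discard(pkg)
        done.add(pkg)
        return None

    for pkg in list(depends_on):
        if pkg not in done:
            cycle = visit(pkg, [])
            if cycle:
                return cycle
    return None


# pkg1 depends on pkg2, pkg2 on pkg3, and pkg3 on pkg1: not allowed.
print(find_cycle({"pkg1": ["pkg2"], "pkg2": ["pkg3"], "pkg3": ["pkg1"]}))
# A plain chain is acceptable.
print(find_cycle({"pkg1": ["pkg2"], "pkg2": ["pkg3"]}))
```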

Dragging Rules for Simple Dependencies


The priority parameter gives you a way to influence the startup, failover, and failback behavior of a set of
failover packages that have a configured_node failover_policy, when one or more of those packages
depend on another or others.
The broad rule is that a higher-priority package can drag a lower-priority package, forcing it to start on, or
move to, a node that suits the higher-priority package.

NOTE: This applies only when the packages are automatically started (package switching enabled);
cmrunpkg will never force a package to halt.

Keep in mind that you do not have to set priority, even when one or more packages depend on another.
The default value, no_priority, may often result in the behavior you want. For example, if pkg1
depends on pkg2, and priority is set to no_priority for both packages, and other parameters such as
node_name and auto_run are set as recommended in this section, then pkg1 will normally follow pkg2 to
wherever both can run, and this is the common-sense (and may be the most desirable) outcome.
The following examples express the rules as they apply to two failover packages whose failover_policy is
configured_node. Assume pkg1 depends on pkg2, that node1, node2 and node3 are all specified
(in some order) under node_name in the configuration file for each package, and that failback_policy is
set to automatic for each package.

NOTE: Keep the following in mind when reading the examples that follow, and when actually configuring
priorities:

1. auto_run should be set to yes for all the packages involved; the examples assume that it is.

2. Priorities express a ranking order, so a lower number means a higher priority (10 is a higher priority
than 30).
Hewlett Packard Enterprise recommends assigning values in increments of 20 so as to leave gaps in
the sequence; otherwise you may have to shuffle all the existing priorities when assigning priority to a
new package.
no_priority, the default, is treated as a lower priority than any numerical value.

3. All packages with no_priority are by definition of equal priority, and there is no other way to assign
equal priorities; a numerical priority must be unique within the cluster. See priority on page 232 for
more information.
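
The ranking described in the note above can be expressed compactly. The following Python sketch compares two priority settings and reports which package's node order dominates, following the rule in this section that the dependent package dominates only when its priority is strictly higher. The function names rank and dominant_package are hypothetical, and the sketch models only the comparison, not package placement.

```python
NO_PRIORITY = "no_priority"


def rank(priority):
    """Map a priority setting to a sortable number (smaller ranks higher).

    no_priority is treated as lower than any numerical priority.
    """
    return float("inf") if priority == NO_PRIORITY else int(priority)


def dominant_package(dependent, dependent_priority,
                     depended_on, depended_on_priority):
    """Return the name of the package whose node order dominates."""
    if rank(dependent_priority) < rank(depended_on_priority):
        # The dependent package has strictly higher priority.
        return dependent
    # Lower or equal priority: the package depended on dominates.
    return depended_on


# pkg1 depends on pkg2 in each case.
print(dominant_package("pkg1", 10, "pkg2", 30))
print(dominant_package("pkg1", NO_PRIORITY, "pkg2", NO_PRIORITY))
```

In the first call pkg1 has the higher priority (10 ranks above 30), so its node order dominates; in the second, both packages default to no_priority and pkg2 dominates.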

If pkg1 depends on pkg2, and pkg1’s priority is lower than or equal to pkg2’s, pkg2’s node order
dominates. Assuming pkg2’s node order is node1, node2, node3, then:

• On startup:

◦ pkg2 will start on node1, or node2 if node1 is not available or does not at present meet all of its
dependencies, etc.

– pkg1 will start on whatever node pkg2 has started on (no matter where that node appears on
pkg1’s node_name list) provided all of pkg1’s other dependencies are met there.

– If the node where pkg2 has started does not meet all pkg1’s dependencies, pkg1 will not start.

• On failover:

◦ If pkg2 fails on node1, pkg2 will fail over to node2 (or node3 if node2 is not available or does not
currently meet all of its dependencies, etc.)

– pkg1 will fail over to whatever node pkg2 has restarted on (no matter where that node appears
on pkg1’s node_name list) provided all of pkg1’s dependencies are met there.

– If the node where pkg2 has restarted does not meet all pkg1’s dependencies, pkg1 will not
restart.

◦ If pkg1 fails, pkg1 will not fail over.
This is because pkg1 cannot restart on any adoptive node until pkg2 is running there, and pkg2 is
still running on the original node. pkg1 cannot drag pkg2 because it has insufficient priority to do
so.

• On failback:

◦ If both packages have moved from node1 to node2 and node1 becomes available, pkg2 will fail
back to node1 only if pkg2’s priority is higher than pkg1’s:

– If the priorities are equal, neither package will fail back (unless pkg1 is not running; in that case
pkg2 can fail back).

– If pkg2’s priority is higher than pkg1’s, pkg2 will fail back to node1; pkg1 will fail back to
node1 provided all of pkg1’s other dependencies are met there;

– if pkg2 has failed back to node1 and node1 does not meet all of pkg1’s dependencies,
pkg1 will halt.

If pkg1 depends on pkg2, and pkg1’s priority is higher than pkg2’s, pkg1’s node order dominates.
Assuming pkg1’s node order is node1, node2, node3, then:

• On startup:

◦ pkg1 will select node1 to start on.

◦ pkg2 will start on node1, provided it can run there (no matter where node1 appears on pkg2’s
node_name list).

– If pkg2 is already running on another node, it will be dragged to node1, provided it can run
there.

◦ If pkg2 cannot start on node1, then both packages will attempt to start on node2 (and so on).

Note that the nodes will be tried in the order of pkg1’s node_name list, and pkg2 will be dragged to
the first suitable node on that list whether or not it is currently running on another node.

• On failover:

◦ If pkg1 fails on node1, pkg1 will select node2 to fail over to (or node3 if it can run there and
node2 is not available or does not meet all of its dependencies; etc.)

◦ pkg2 will be dragged to whatever node pkg1 has selected, and restart there; then pkg1 will restart
there.

• On failback:

◦ If both packages have moved to node2 and node1 becomes available, pkg1 will fail back to
node1 if both packages can run there;

– otherwise, neither package will fail back.

Guidelines for Simple Dependencies


As you can see from the above Dragging Rules for Simple Dependencies on page 137, if pkg1
depends on pkg2, it can sometimes be a good idea to assign a higher priority to pkg1, because that
provides the best chance for a successful failover (and failback) if pkg1 fails.
But you also need to weigh the relative importance of the packages. If pkg2 runs a database that is
central to your business, you probably want it to run undisturbed, no matter what happens to application
packages that depend on it. In this case, the database package should have the highest priority.

Note that, if no priorities are set, the dragging rules favor a package that is depended on over a package
that depends on it.
Consider assigning a higher priority to a dependent package if it is about equal in real-world importance
to the package it depends on; otherwise assign the higher priority to the more important package, or let
the priorities of both packages default.
You also need to think about what happens when a package fails. If other packages depend on it,
Serviceguard will halt those packages (and any packages that depend on them, etc.) This happens
regardless of the priority of the failed package.
By default the packages are halted in the reverse of the order in which they were started; and if the halt
script for any of the dependent packages hangs, the failed package will wait indefinitely to complete its
own halt process. This provides the best chance for all the dependent packages to halt cleanly, but it may
not be the behavior you want. You can change it by means of the successor_halt_timeout parameter.
If you set successor_halt_timeout to zero, Serviceguard will halt the dependent packages in parallel with
the failed package; if you set it to a positive number, Serviceguard will halt the packages in the reverse of
the start order, but will allow the failed package to halt after the successor_halt_timeout number of
seconds whether or not the dependent packages have completed their halt scripts.
If you decide to create dependencies between packages, it is a good idea to test thoroughly, before
putting the packages into production, to make sure that package startup, halt, failover, and failback
behavior is what you expect.

Extended Dependencies
To the capabilities provided by Simple Dependencies on page 136, extended dependencies add the
following:

• You can specify whether the package depended on must be running or must be down.
You define this condition by means of the dependency_condition, using one of the literals UP or DOWN
(the literals can be upper or lower case). We'll refer to the requirement that another package be down
as an exclusionary dependency; see Rules for Exclusionary Dependencies on page 141.

• You can specify where the dependency_condition must be satisfied: on the same node, a different
node, all nodes, or any node in the cluster.
You define this by means of the dependency_location parameter, using one of the literals same_node,
different_node, all_nodes, or any_node.
different_node and any_node are allowed only if dependency_condition is UP. all_nodes is
allowed only if dependency_condition is DOWN.
See Rules for different_node and any_node Dependencies on page 141.

For more information about the dependency_ parameters, see the definitions starting with
dependency_name, and the cmmakepkg (1m) manpage.

IMPORTANT: If you have not already done so, read the discussion of Simple Dependencies
before you go on.

The interaction of the legal values of dependency_location and dependency_condition creates the
following possibilities:

• Same-node dependency: a package can require that another package be UP on the same node.

This is the case covered in the section on Simple Dependencies.

• Different-node dependency: a package can require that another package be UP on a different node.

• Any-node dependency: a package can require that another package be UP on any node in the cluster.

• Same-node exclusion: a package can require that another package be DOWN on the same node. (But
this does not prevent that package from being UP on another node.)

• All-nodes exclusion: a package can require that another package be DOWN on all nodes in the cluster.
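
The five combinations listed above can be captured in a small lookup table, which is handy when reviewing a planned configuration. The following Python sketch classifies a pair of dependency_condition and dependency_location values and returns None for a combination that this section does not allow. The table and the function name classify_dependency are illustrative assumptions; the values themselves are the literals named in this section.

```python
# Legal (dependency_condition, dependency_location) combinations,
# taken from the list of possibilities in this section.
LEGAL_COMBINATIONS = {
    ("UP", "same_node"): "same-node dependency",
    ("UP", "different_node"): "different-node dependency",
    ("UP", "any_node"): "any-node dependency",
    ("DOWN", "same_node"): "same-node exclusion",
    ("DOWN", "all_nodes"): "all-nodes exclusion",
}


def classify_dependency(condition, location):
    """Return the dependency type, or None if the pair is not allowed.

    The condition literal may be given in upper or lower case.
    """
    return LEGAL_COMBINATIONS.get((condition.upper(), location))


print(classify_dependency("up", "any_node"))
print(classify_dependency("DOWN", "any_node"))
```

The second call returns None because any_node is allowed only when the condition is UP.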

Rules for Exclusionary Dependencies

• All exclusions must be mutual.
That is, if pkg1 requires pkg2 to be DOWN, pkg2 must also require pkg1 to be DOWN.
By creating an exclusionary relationship between any two packages, you ensure that only one of them
can be running at any time — either on a given node ( same-node exclusion) or throughout the cluster
(all-nodes exclusion). A package can have an exclusionary relationship with any number of other
packages, but each such relationship must be mutual.

• Priority (discussed in detail under Dragging Rules for Simple Dependencies) must be set for at least
one of the packages in an exclusionary relationship.
In case of any failover or enabling of global switching, the package starts on the eligible node:

◦ When a higher priority package fails over to a node on which a lower priority package with mutually
exclusive dependency is running, the higher priority package can force the lower priority package
to halt or (in the case of a same-node exclusion) move to another eligible node, if any.
◦ When a lower priority package fails over to another node on which the higher priority package with
mutually exclusive dependency is running, the lower priority package fails to start.

• If you start a package manually using the cmrunpkg command on a node on which another mutually
exclusive package is running, the cmrunpkg command fails with the following message:
Unable to execute command. Package <pkg_1> a same node exclusionary dependency on package <pkg_2>
cmrunpkg: Unable to start some package or package instances

• dependency_location must be either same_node or all_nodes, and must be the same for both
packages.
• Both packages must be failover packages whose failover_policy is configured_node.

Rules for different_node and any_node Dependencies


These rules apply to packages whose dependency_condition is UP and whose dependency_location is
different_node or any_node. For same-node dependencies, see Simple Dependencies; for
exclusionary dependencies, see Rules for Exclusionary Dependencies.

• Both packages must be failover packages whose failover_policy is configured_node.

• The priority of the package depended on must be higher than or equal to the priority of the dependent
package and the priorities of that package's dependents.

◦ For example, if pkg1 has a different_node or any_node dependency on pkg2, pkg2's priority
must be higher than or equal to pkg1's priority and the priority of any package that depends on
pkg1 to be UP. pkg2's node order dominates when Serviceguard is placing the packages.

• A package cannot depend on itself, directly or indirectly.
For example, not only must pkg1 not specify itself in the dependency_condition, but pkg1 must not
specify a dependency on pkg2 if pkg2 depends on pkg1, or if pkg2 depends on pkg3 which
depends on pkg1, etc.

• “Dragging” rules apply. See Dragging Rules for Simple Dependencies.

What Happens When a Package Fails


This discussion applies to packages that have dependents, or are depended on, or both (UP
dependencies only). When such a package fails, Serviceguard does the following:

1. Halts the packages that depend on the failing package, if any.
Serviceguard halts the dependent packages (and any packages that depend on them, etc.) This
happens regardless of the priority of the failed package.

NOTE: Dependent packages are halted even in the case of different_node or any_node
dependency. For example, if pkg1 running on node1 has a different_node or any_node
dependency on pkg2 running on node2, and pkg2 fails over to node3, pkg1 will be halted and
restarted as described below.

By default the packages are halted in the reverse of the order in which they were started; and if the
halt script for any of the dependent packages hangs, the failed package will wait indefinitely to
complete its own halt process. This provides the best chance for all the dependent packages to halt
cleanly, but it may not be the behavior you want. You can change it by means of the
successor_halt_timeout parameter. (A successor is a package that depends on another package.)
If the failed package's successor_halt_timeout is set to zero, Serviceguard will halt the dependent
packages in parallel with the failed package; if it is set to a positive number, Serviceguard will halt the
packages in the reverse of the start order, but will allow the failed package to halt after the
successor_halt_timeout number of seconds whether or not the dependent packages have completed
their halt scripts.

2. Halts the failing package.


After the successor halt timer has expired or the dependent packages have all halted, Serviceguard
starts the halt script of the failing package, regardless of whether the dependents' halts succeeded,
failed, or timed out.

3. Halts packages the failing package depends on, starting with the package this package immediately
depends on. The packages are halted only if:

• these are failover packages, and
• the failing package can “drag” these packages to a node on which they can all run.

Otherwise the failing package halts and the packages it depends on continue to run.

4. Starts the packages the failed package depends on (those halted in step 3, if any).
If the failed package has been able to drag the packages it depends on to the adoptive node,
Serviceguard starts them in the reverse of the order it halted them in the previous step (that is, the
package that does not depend on any other package is started first).

5. Starts the failed package.


6. Starts the packages that depend on the failed package (those halted in step 1).
7. If a package has the all_nodes dependency, and if the package changes to halt_aborted state,
the dependent package does not start. However, if the dependency_location is same_node or
any_node, the dependent package is started, even if the dependent package is in halt_aborted
state.
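
The default halt sequence in steps 1 and 2 can be sketched as a traversal of the dependency graph: every package that depends on the failed package, directly or indirectly, is halted before the package it depends on, and the failed package is halted last. The following Python sketch is a simplified model under stated assumptions: it orders packages by dependency rather than by recorded start order, it ignores successor_halt_timeout, and the function name halt_order is hypothetical.

```python
def halt_order(depends_on, failed):
    """Return the default halt sequence when `failed` fails.

    depends_on maps each package to the packages it depends on.
    Dependents (successors) come first, the failed package last.
    Assumes the dependencies contain no cycles, as the rules require.
    """
    # Invert the mapping: for each package, list its successors.
    successors = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            successors.setdefault(dep, []).append(pkg)

    order, seen = [], set()

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        for succ in successors.get(pkg, []):
            visit(succ)
        # A package is appended only after all of its successors.
        order.append(pkg)

    visit(failed)
    return order


# pkg3 depends on pkg1, which depends on pkg2; pkg2 fails.
print(halt_order({"pkg3": ["pkg1"], "pkg1": ["pkg2"]}, "pkg2"))
```

For the three-package chain, the result is pkg3, then pkg1, then pkg2, matching the order described for simple dependencies.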

For More Information


For more information, see:

• The parameter descriptions for priority and dependency_, and the corresponding comments in the
package configuration template file
• The cmmakepkg (1m) manpage

• The white paper Serviceguard’s Package Dependency Feature, which you can find at
http://www.hpe.com/info/linux-serviceguard-docs

About Package Weights


Package weights and node capacities allow you to restrict the number of packages that can run
concurrently on a given node, or, alternatively, to limit the total package “weight” (in terms of resource
consumption) that a node can bear.
For example, suppose you have a two-node cluster consisting of a large system and a smaller system.
You want all your packages to be able to run on the large system at the same time, but, if the large node
fails, you want only the critical packages to run on the smaller system. Package weights allow you to
configure Serviceguard to enforce this behavior.

Package Weights and Node Capacities


You define a capacity, or capacities, for a node (in the cluster configuration file), and corresponding
weights for packages (in the package configuration file).
Node capacity is consumed by package weights. Serviceguard ensures that the capacity limit you set for
a node is never exceeded by the combined weight of packages running on it; if a node's available
capacity will be exceeded by a package that wants to run on that node, the package will not run there.
This means, for example, that a package cannot fail over to a node if that node does not currently have
available capacity for it, even if the node is otherwise eligible to run the package — unless the package
that wants to run has sufficient priority to force one of the packages that are currently running to move;
see How Package Weights Interact with Package Priorities and Dependencies on page 150.

Configuring Weights and Capacities
You can configure multiple capacities for nodes, and multiple corresponding weights for packages, up to
four capacity/weight pairs per cluster. This allows you considerable flexibility in managing package use of
each node's resources — but it may be more flexibility than you need. For this reason Serviceguard
provides two methods for configuring capacities and weights: a simple method and a comprehensive
method. The subsections that follow explain each of these methods.

Simple Method
Use this method if you simply want to control the number of packages that can run on a given node at any
given time. This method works best if all the packages consume about the same amount of computing
resources.
If you need to make finer distinctions between packages in terms of their resource consumption, use the
Comprehensive Method instead.
To implement the simple method, use the reserved keyword package_limit to define each node's
capacity. In this case, Serviceguard will allow you to define only this single type of capacity, and
corresponding package weight, in this cluster. Defining package weight is optional; for package_limit it
will default to 1 for all packages, unless you change it in the package configuration file.
Example 1
For example, to configure a node to run a maximum of ten packages at any one time, make the following
entry under the node's NODE_NAME entry in the cluster configuration file:
NODE_NAME node1
...
CAPACITY_NAME package_limit
CAPACITY_VALUE 10
Now all packages will be considered equal in terms of their resource consumption, and this node will
never run more than ten packages at one time. (You can change this behavior if you need to by modifying
the weight for some or all packages, as the next example shows.) Next, define the CAPACITY_NAME
and CAPACITY_VALUE parameters for the remaining nodes, setting CAPACITY_NAME to
package_limit in each case. You may want to set CAPACITY_VALUE to different values for different
nodes. A ten-package capacity might represent the most powerful node, for example, while the least
powerful has a capacity of only two or three.

NOTE: Serviceguard does not require you to define a capacity for each node. If you define the
CAPACITY_NAME and CAPACITY_VALUE parameters for some nodes but not for others, the nodes for
which these parameters are not defined are assumed to have limitless capacity; in this case, those nodes
would be able to run any number of eligible packages at any given time.

If some packages consume more resources than others, you can use the weight_name and weight_value
parameters to override the default value (1) for some or all packages. For example, suppose you have
three packages, pkg1, pkg2, and pkg3. pkg2 is about twice as resource-intensive as pkg3, which in turn
is about one-and-a-half times as resource-intensive as pkg1. You could represent this in the package
configuration files as follows:



• For pkg1:
weight_name package_limit
weight_value 2

• For pkg2:
weight_name package_limit
weight_value 6

• For pkg3:
weight_name package_limit
weight_value 3

Now node1, which has a CAPACITY_VALUE of 10 for the reserved CAPACITY_NAME package_limit, can run
any two of the packages at one time, but not all three. If in addition you wanted to ensure that the
larger packages, pkg2 and pkg3, did not run on node1 at the same time, you could raise the
weight_value of one or both so that the combination exceeded 10 (or reduce node1's capacity to 8).
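The arithmetic behind this example can be checked with a short sketch. It is only an illustration of the rule that the combined weight of running packages must not exceed the node's capacity; the function names fits and runnable_sets are hypothetical and are not part of Serviceguard.

```python
from itertools import combinations

# Weights from the example above (pkg1=2, pkg2=6, pkg3=3).
WEIGHTS = {"pkg1": 2, "pkg2": 6, "pkg3": 3}


def fits(packages, capacity, weights=WEIGHTS):
    """Return True if the combined weight does not exceed the node's capacity."""
    return sum(weights[p] for p in packages) <= capacity


def runnable_sets(capacity, weights=WEIGHTS):
    """List every combination of packages that fits within the capacity."""
    names = sorted(weights)
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
        if fits(combo, capacity, weights)
    ]


if __name__ == "__main__":
    # Compare node1's capacity of 10 with the reduced capacity of 8.
    for capacity in (10, 8):
        print(capacity, runnable_sets(capacity))
```

With a capacity of 10 every pair of packages fits but all three together do not; with a capacity of 8 the pair pkg2 and pkg3 (combined weight 9) no longer fits.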
Points to Keep in Mind
The following points apply specifically to the Simple Method. Read them in conjunction with the Rules
and Guidelines, which apply to all weights and capacities.

• If you use the reserved CAPACITY_NAME package_limit, then this is the only type of capacity and
weight you can define in this cluster.
• If you use the reserved CAPACITY_NAME package_limit, the default weight for all packages is 1.
You can override this default in the package configuration file, via the weight_name and weight_value
parameters, as in the example above.
(The default weight remains 1 for any package to which you do not explicitly assign a different weight
in the package configuration file.)

• If you use the reserved CAPACITY_NAME package_limit, weight_name, if used, must also be
package_limit.

• You do not have to define a capacity for every node; if you don't, the node is assumed to have
unlimited capacity and will be able to run any number of eligible packages at the same time.
• If you want to define only a single capacity, but you want the default weight to be zero rather than 1, do
not use the reserved name package_limit. Use another name (for example,
resource_quantity) and follow the Comprehensive Method. This is also a good idea if you think
you may want to use more than one capacity in the future.

To learn more about configuring weights and capacities, see the documents listed under For More
Information.

Comprehensive Method
Use this method if the Simple Method does not meet your needs. (Make sure you have read that section
before you proceed.) The comprehensive method works best if packages consume differing amounts of
computing resources, so that simple one-to-one comparisons between packages are not useful.



IMPORTANT: You cannot combine the two methods. If you use the reserved capacity
package_limit for any node, Serviceguard will not allow you to define any other type of capacity
and weight in this cluster; so you are restricted to the Simple Method in that case.

Defining Capacities
Begin by deciding what capacities you want to define; you can define up to four different capacities for the
cluster.
You may want to choose names that have common-sense meanings, such as “processor”, “memory”, or
“IO”, to identify the capacities, but you do not have to do so. In fact it could be misleading to identify single
resources, such as “processor”, if packages really contend for sets of interacting resources that are hard
to characterize with a single name. In any case, the real-world meanings of the names you assign to node
capacities and package weights are outside the scope of Serviceguard. Serviceguard simply ensures that
for each capacity configured for a node, the combined weight of packages currently running on that node
does not exceed that capacity.
For example, if you define a CAPACITY_NAME and weight_name processor, and a CAPACITY_NAME
and weight_name memory, and a node has a processor capacity of 10 and a memory capacity of 1000,
Serviceguard ensures that the combined processor weight of packages running on the node at any one
time does not exceed 10, and that the combined memory weight does not exceed 1000. But Serviceguard
has no knowledge of the real-world meanings of the names processor and memory; there is no
mapping to actual processor and memory usage and you would get exactly the same results if you used
the names apples and oranges.
For example, suppose you have the following configuration:

• A two-node cluster running four packages. These packages contend for resources we'll simply call A
and B.

• node1 has a capacity of 80 for A and capacity of 50 for B.

• node2 has a capacity of 60 for A and capacity of 70 for B.

• pkg1 uses 60 of the A capacity and 15 of the B capacity.

• pkg2 uses 40 of the A capacity and 15 of the B capacity.

• pkg3 uses an insignificant amount (zero) of the A capacity and 35 of the B capacity.

• pkg4 uses 20 of the A capacity and 40 of the B capacity.

pkg1 and pkg2 together require 100 of the A capacity and 30 of the B capacity. This means pkg1 and
pkg2 cannot run together on either of the nodes. While both nodes have sufficient B capacity to run both
packages at the same time, they do not have sufficient A capacity.
pkg3 and pkg4 together require 20 of the A capacity and 75 of the B capacity. This means pkg3 and
pkg4 cannot run together on either of the nodes. While both nodes have sufficient A capacity to run both
packages at the same time, they do not have sufficient B capacity.
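The reasoning above can be expressed as a small check, sketched here under the assumption that a capacity left undefined on a node is treated as infinite. The dictionaries and the function can_run_together are hypothetical illustrations, not Serviceguard interfaces.

```python
import math

# Capacities and weights from the example configuration above.
NODE_CAPACITY = {
    "node1": {"A": 80, "B": 50},
    "node2": {"A": 60, "B": 70},
}
PACKAGE_WEIGHT = {
    "pkg1": {"A": 60, "B": 15},
    "pkg2": {"A": 40, "B": 15},
    "pkg3": {"A": 0, "B": 35},
    "pkg4": {"A": 20, "B": 40},
}


def can_run_together(node, packages):
    """True if, for every capacity, the combined weight fits on the node."""
    caps = NODE_CAPACITY[node]
    # Collect every weight name used by any of the packages.
    names = set().union(*(PACKAGE_WEIGHT[p] for p in packages))
    for name in names:
        total = sum(PACKAGE_WEIGHT[p].get(name, 0) for p in packages)
        # A capacity that is not defined for the node is treated as infinite.
        if total > caps.get(name, math.inf):
            return False
    return True


if __name__ == "__main__":
    for node in NODE_CAPACITY:
        print(node, can_run_together(node, ["pkg1", "pkg2"]),
              can_run_together(node, ["pkg3", "pkg4"]))
```

Each capacity is checked independently, so a set of packages is rejected as soon as any one combined weight exceeds the matching capacity, even if all the others fit.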

Example 2
To define these capacities, and set limits for individual nodes, make entries such as the following in the
cluster configuration file:
CLUSTER_NAME cluster_23
...
NODE_NAME node1
...



CAPACITY_NAME A
CAPACITY_VALUE 80
CAPACITY_NAME B
CAPACITY_VALUE 50
NODE_NAME node2
CAPACITY_NAME A
CAPACITY_VALUE 60
CAPACITY_NAME B
CAPACITY_VALUE 70
...

NOTE: You do not have to define capacities for every node in the cluster. If any capacity is not defined for
any node, Serviceguard assumes that node has an infinite amount of that capacity. In our example, not
defining capacity A for a given node would automatically mean that node could run pkg1 and pkg2 at the
same time no matter what A weights you assign those packages; not defining capacity B would mean the
node could run pkg3 and pkg4 at the same time; and not defining either one would mean the node could
run all four packages simultaneously.
When you have defined the nodes' capacities, the next step is to configure the package weights; see
Defining Weights.

Defining Weights
Package weights correspond to node capacities, and for any capacity/weight pair, CAPACITY_NAME and
weight_name must be identical.
You define weights for individual packages in the package configuration file, but you can also define a
cluster-wide default value for a given weight, and, if you do, this default will specify the weight of all
packages that do not explicitly override it in their package configuration file.

NOTE:
There is one exception: system multi-node packages cannot have weight, so a cluster-wide default weight
does not apply to them.

Defining Default Weights


To pursue the example begun under Defining Capacities, let's assume that all packages other than
pkg1 and pkg2 use about the same amount of capacity A, and all packages other than pkg3 and pkg4
use about the same amount of capacity B. You can use the WEIGHT_DEFAULT parameter in the cluster
configuration file to set defaults for both weights, as follows.
Example 3
WEIGHT_NAME A
WEIGHT_DEFAULT 20
WEIGHT_NAME B
WEIGHT_DEFAULT 15
This means that any package for which weight A is not defined in its package configuration file will have a
weight A of 20, and any package for which weight B is not defined in its package configuration file will
have a weight B of 15.



Given the capacities we defined in the cluster configuration file (see Defining Capacities), node1 can
run any three packages that use the default for both A and B. This would leave 20 units of spare A
capacity on this node, and 5 units of spare B capacity.

Defining Weights for Individual Packages


For each capacity you define in the cluster configuration file (see Defining Capacities) you have the
following choices when it comes to assigning a corresponding weight to a given package:

1. Configure a cluster-wide default weight and let the package use that default.
2. Configure a cluster-wide default weight but override it for this package in its package configuration file.
3. Do not configure a cluster-wide default weight, but assign a weight to this package in its package
configuration file.
4. Do not configure a cluster-wide default weight and do not assign a weight for this package in its
package configuration file.

NOTE: Option 4 means that the package is “weightless” as far as this particular capacity is concerned,
and can run even on a node on which this capacity is completely consumed by other packages.
(You can make a package “weightless” for a given capacity even if you have defined a cluster-wide
default weight; simply set the corresponding weight to zero in the package configuration file.)

Pursuing the example started under Defining Capacities, we can now use options 1 and 2 to set weights
for pkg1 through pkg4.
Example 4
In pkg1's package configuration file:
weight_name A
weight_value 60
In pkg2's package configuration file:
weight_name A
weight_value 40
In pkg3's package configuration file:
weight_name B
weight_value 35
weight_name A
weight_value 0
In pkg4's package configuration file:
weight_name B
weight_value 40

IMPORTANT: weight_name in the package configuration file must exactly match the corresponding
CAPACITY_NAME in the cluster configuration file. This applies to case as well as spelling:
weight_name a would not match CAPACITY_NAME A.
You cannot define a weight unless the corresponding capacity is defined: cmapplyconf will fail if
you define a weight in the package configuration file and no node in the package's node_name list
has specified a corresponding capacity in the cluster configuration file; or if you define a default
weight in the cluster configuration file and no node in the cluster specifies a capacity of the same
name.
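The package-level part of this rule can be sketched as follows. This is an illustration of the rule only, not how cmapplyconf is implemented; the function undefined_weight_names and its arguments are hypothetical.

```python
def undefined_weight_names(package_weights, node_list, node_capacities):
    """Return the weight names that no node on the package's node list
    backs with a capacity of exactly the same name (case-sensitive)."""
    defined = set()
    for node in node_list:
        defined.update(node_capacities.get(node, {}))
    return [name for name in package_weights if name not in defined]


if __name__ == "__main__":
    capacities = {
        "node1": {"A": 80, "B": 50},
        "node2": {"A": 60, "B": 70},
    }
    nodes = ["node1", "node2"]
    # Matching name: nothing is reported.
    print(undefined_weight_names({"A": 60}, nodes, capacities))
    # Lowercase "a" does not match CAPACITY_NAME A, so it is reported.
    print(undefined_weight_names({"a": 60}, nodes, capacities))
```

An empty result corresponds to a configuration that satisfies the rule; any name returned is one for which applying the configuration would be expected to fail.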



Some points to notice about this example:

• Since we did not configure a B weight for pkg1 or pkg2, these packages have the default B weight
(15) that we set in the cluster configuration file in Example 3. Similarly, pkg4 has the default A weight
(20).
• We have configured pkg3 to have a B weight of 35, but no A weight.

• pkg1 will consume all of node2's A capacity; no other package that has A weight can run on this node
while pkg1 is running there.
But node2 could still run pkg3 while running pkg1, because pkg3 has no A weight, and pkg1 is
consuming only 15 units (the default) of node2's B capacity of 70, leaving 55 available, more than
the 35 that pkg3 needs (assuming no other package that has B weight is already running there).

• Similarly, if any package that has A weight is already running on node2, pkg1 will not be able to start
there (unless pkg1 has sufficient priority to force another package or packages to move). This is true
whenever a package has a weight that exceeds the available amount of the corresponding capacity on
the node.
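The way cluster-wide defaults and per-package settings combine in Examples 3 and 4 can be sketched as a simple lookup: a weight set in the package configuration file wins, otherwise the cluster-wide default applies, otherwise the package is weightless (zero) for that capacity. The dictionaries and function names below are hypothetical and only mirror the values in the examples.

```python
# WEIGHT_DEFAULT values from Example 3 (cluster configuration file).
CLUSTER_DEFAULTS = {"A": 20, "B": 15}

# weight_value settings from Example 4 (package configuration files).
PACKAGE_OVERRIDES = {
    "pkg1": {"A": 60},
    "pkg2": {"A": 40},
    "pkg3": {"B": 35, "A": 0},
    "pkg4": {"B": 40},
}


def effective_weight(package, name):
    """Package setting first, then cluster default, then zero (weightless)."""
    overrides = PACKAGE_OVERRIDES.get(package, {})
    if name in overrides:
        return overrides[name]
    return CLUSTER_DEFAULTS.get(name, 0)


def effective_weights(package):
    """Resolve every weight name known for this package."""
    names = set(CLUSTER_DEFAULTS) | set(PACKAGE_OVERRIDES.get(package, {}))
    return {name: effective_weight(package, name) for name in sorted(names)}


if __name__ == "__main__":
    for pkg in ("pkg1", "pkg2", "pkg3", "pkg4", "some_other_pkg"):
        print(pkg, effective_weights(pkg))
```

The resolved values match the figures given when the example was introduced: pkg1 has A=60 and B=15, pkg2 has A=40 and B=15, pkg3 has A=0 and B=35, and pkg4 has A=20 and B=40.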

Rules and Guidelines


The following rules and guidelines apply to both the Simple Method and the Comprehensive Method of
configuring capacities and weights.

• You can define a maximum of four capacities, and corresponding weights, throughout the cluster.

NOTE: But if you use the reserved CAPACITY_NAME package_limit, you can define only that
single capacity and corresponding weight. See Simple Method.

• Node capacity is defined in the cluster configuration file, via the CAPACITY_NAME and
CAPACITY_VALUE parameters.
• Capacities can be added, changed, and deleted while the cluster is running. This can cause some
packages to be moved, or even halted and not restarted.
• Package weight can be defined in the cluster configuration file, via the WEIGHT_NAME and
WEIGHT_DEFAULT parameters, or in the package configuration file, via the weight_name and
weight_value parameters, or both.
• Weights can be assigned (and WEIGHT_DEFAULTs apply) only to multi-node packages and to
failover packages whose failover_policy is configured_node and whose failback_policy is
manual.

• If you define weight (weight_name and weight_value) for a package, make sure you define the
corresponding capacity (CAPACITY_NAME and CAPACITY_VALUE) in the cluster configuration file
for at least one node on the package's node_name list. Otherwise cmapplyconf will fail when you try
to apply the package.
• Weights (both cluster-wide WEIGHT_DEFAULTs, and weights defined in the package configuration
files) can be changed while the cluster is up and the packages are running. This can cause some
packages to be moved, or even halted and not restarted.

For More Information


For more information about capacities, see the comments under CAPACITY_NAME and
CAPACITY_VALUE in:



• the cluster configuration file
• the cmquerycl (1m) manpage

• the section Cluster Configuration Parameters in this manual.

For more information about weights, see the comments under weight_name and weight_value in:

• the package configuration file


• the cmmakepkg (1m) manpage

• the section Package Parameter Explanations in this manual.

For further discussion and use cases, see the white paper Using Serviceguard’s Node Capacity and
Package Weight Feature at http://www.hpe.com/info/linux-serviceguard-docs.

How Package Weights Interact with Package Priorities and Dependencies


If necessary, Serviceguard will halt a running lower-priority package that has weight to make room for a
higher-priority package that has weight. But a running package that has no priority (that is, its priority is
set to the default, no_priority) will not be halted to make room for a down package that has no priority.
Between two down packages without priority, Serviceguard will decide which package to start if it cannot
start them both because there is not enough node capacity to support their weight.
Example 1

• pkg1 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 10. It
is down and has switching disabled.
• pkg2 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 20. It
is running on node turkey and has switching enabled.

• turkey and griffon can run one package each (package_limit is set to 1).

If you enable switching for pkg1, Serviceguard will halt the lower-priority pkg2 on turkey. It will then
start pkg1 on turkey and restart pkg2 on griffon.
If neither pkg1 nor pkg2 had priority, pkg2 would continue running on turkey and pkg1 would run on
griffon.

Example 2

• pkg1 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 10. It is
running on node turkey and has switching enabled.

• pkg2 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 20. It
is running on node turkey and has switching enabled.

• pkg3 is configured to run on nodes turkey and griffon. It has a weight of 1 and a priority of 30. It
is down and has switching disabled.
• pkg3 has a same_node dependency on pkg2.

• turkey and griffon can run two packages each (package_limit is set to 2).

If you enable switching for pkg3, it will stay down because pkg2, the package it depends on, is running
on node turkey, which is already running two packages (its capacity limit). pkg3 has a lower priority
than pkg2, so it cannot drag it to griffon where they both can run.

Serviceguard Load Sensitive Placement


When you configure load balancing, Serviceguard places the packages in a load sensitive mode. While
starting the packages, Serviceguard places the packages on the nodes to distribute the weights evenly on
all the nodes.
Load Balancing
When capacity and weights are configured in a cluster, packages start or fail over in one of
the following ways:

• To the node that has the least used capacity, if the node capacity is INFINITE.
• To the node that has the highest capacity remaining, if the node capacity is finite.

Parameters
LOAD_BALANCING
You can set the LOAD_BALANCING parameter to ON or OFF in the cluster configuration file. If you enable
the LOAD_BALANCING parameter, load sensitive failover takes place. By default, the LOAD_BALANCING
parameter is set to OFF.
Keyword
You can specify the following keyword for CAPACITY_VALUE in the cluster configuration file:
INFINITE
Sets the CAPACITY_VALUE to INFINITE for a node. This means that the node is eligible to run any
package as far as weight constraints are concerned.

NOTE: This value helps you ensure that, when some nodes fail, all the workloads are balanced and
placed on the remaining nodes without bringing any of those nodes down.

Limitations

• Capacity must be set to either INFINITE or a finite value for all the nodes in the cluster.

NOTE: When the LOAD_BALANCING parameter is set to ON in the cluster:

◦ You can define only ONE capacity per node.


◦ You cannot have a combination of finite and INFINITE values.

• All nodes in the cluster must be configured with only one capacity, and packages can be configured
with the corresponding weight when the LOAD_BALANCING parameter is set to ON.
• The failover policy for a package must always be CONFIGURED_NODE. Other failover policies are not
allowed when the LOAD_BALANCING parameter is set to ON.



• This feature is enabled only when all the nodes in the cluster have the same patch, or a later
version of it, installed. The patch ID is PHSS_43620.
• When the LOAD_BALANCING parameter is set to ON, it does not support the reserved
CAPACITY_NAME package_limit.

Recommendation

If the LOAD_BALANCING parameter is ON, Hewlett Packard Enterprise recommends that you set each
package's weight_value to a value other than the default of zero.

Using Load Sensitive Package Placement


Cluster configuration file
In the cluster configuration file:

1. Set the LOAD_BALANCING parameter to ON.


2. Configure one capacity for each node in the cluster. The capacity value can be either INFINITE
(for all the nodes in the cluster) or a finite value.

Package configuration file


In the package configuration file:

1. Set the failover_policy parameter to CONFIGURED_NODE.

2. Set the failback_policy parameter to MANUAL.

3. Configure a weight corresponding to the capacity defined in the cluster configuration file.

Starting Load Sensitive Package Placement


You can start the package on the most eligible node in one of the following ways:

1. Start the package using the cmrunpkg command:


cmrunpkg -a <pkg_name>...

NOTE: cmrunpkg <pkg_name> always starts the package on the local node only. Hence, the cluster
might become imbalanced when load balancing is configured.

2. When the cluster is down, apply the package configuration, enable autorun for the packages, and then start the cluster:
cmruncl

3. Enable autorun for a package, so that the package starts on the most eligible node:
cmmodpkg -e <pkg_name>...



NOTE:

• The most eligible node is the node which has the least capacity used (when capacity is INFINITE) or
the node with maximum remaining capacity (when capacity is finite).
• In the cmviewcl -v -f line output, the capacity:x|percent_used field is always zero when the
corresponding capacity is infinite.

Examples
CAPACITY_VALUE parameter set to INFINITE
Scenario 1: Starting a package with the cmruncl command

1. Set the LOAD_BALANCING parameter to ON:

Parameters in cluster configuration:


LOAD_BALANCING ON
2. Create a cluster with nodes having CAPACITY_VALUE parameter set to INFINITE:

NODE_NAME test1
CAPACITY_NAME test_capacity
CAPACITY_VALUE INFINITE
3. Repeat step 2 for all the nodes in the cluster.
4. Create packages with weight_name and weight_value in the package configuration file:

weight_name test_capacity
weight_value <value>

For example, consider creating 5 packages with different weight_value as shown:

package:pkg1|weight:test_capacity|name=test_capacity
package:pkg1|weight:test_capacity|value=7
package:pkg2|weight:test_capacity|name=test_capacity
package:pkg2|weight:test_capacity|value=9
package:pkg3|weight:test_capacity|name=test_capacity
package:pkg3|weight:test_capacity|value=10
package:pkg4|weight:test_capacity|name=test_capacity
package:pkg4|weight:test_capacity|value=4
package:pkg5|weight:test_capacity|name=test_capacity
package:pkg5|weight:test_capacity|value=4

5. Run the cluster:


# cmruncl
The figure shows the load distribution across the nodes with CAPACITY_VALUE parameter set to
INFINITE.

Figure 30: Load Distributed across the Nodes with CAPACITY_VALUE set to INFINITE

6. Now, forcefully halt Node 1 using the cmhaltnode -f command to test failover of pkg3, which is
currently running on Node 1. The figure shows the failover of pkg3 to Node 3.

Figure 31: Failover of a Package

When capacity and weights are configured in a cluster, a package fails over to the node that has
the least used capacity. Since Node 3 has the least used capacity (7), pkg3, with a weight of 10,
fails over to Node 3.

7. Now, forcefully halt Node 3 using the cmhaltnode -f command to test failover of pkg1 and pkg3, which
are currently running on Node 3. The figure shows the failover of pkg3 to Node 4 and pkg1 to Node 2.



Figure 32: Failover of Two Packages

When capacity and weights are configured in a cluster, a package fails over to the node that has
the least used capacity. Since Node 4 has the least used capacity (8), pkg3, with a weight of 10,
fails over to Node 4, and pkg1, with a weight of 7, fails over to Node 2. The figure shows pkg1 and
pkg2 running on Node 2 and pkg4, pkg5, and pkg3 running on Node 4.



Figure 33: Load Distributed across the Nodes after Failover
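The placements described in Scenario 1 can be reproduced with a minimal simulation. It assumes that packages are considered heaviest first and that each one goes to the first node with the least used capacity; this ordering is inferred from the results described in this section and is not a statement of how Serviceguard is implemented. All function names are hypothetical.

```python
def least_used(used):
    """Return the first node with the least used capacity."""
    return min(used, key=used.get)  # min() keeps the first of several equal minima


def place_all(weights, used, placement):
    """Place packages heaviest first, each on the least-used node at that moment."""
    for pkg in sorted(weights, key=weights.get, reverse=True):
        node = least_used(used)
        used[node] += weights[pkg]
        placement[pkg] = node


def halt_node(node, weights, used, placement):
    """Remove a node and re-place the packages that were running on it."""
    del used[node]
    moved = {p: weights[p] for p, n in placement.items() if n == node}
    place_all(moved, used, placement)


def simulate():
    weights = {"pkg1": 7, "pkg2": 9, "pkg3": 10, "pkg4": 4, "pkg5": 4}
    used = {"Node 1": 0, "Node 2": 0, "Node 3": 0, "Node 4": 0}
    placement = {}
    place_all(weights, used, placement)             # cluster start
    start = dict(placement)
    halt_node("Node 1", weights, used, placement)   # first node halt
    after_first = dict(placement)
    halt_node("Node 3", weights, used, placement)   # second node halt
    return start, after_first, dict(placement), used


if __name__ == "__main__":
    for state in simulate():
        print(state)
```

Under these assumptions the sketch ends with pkg1 and pkg2 on Node 2 (used capacity 16) and pkg3, pkg4, and pkg5 on Node 4 (used capacity 18), which agrees with the distribution described for Figure 33.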

Scenario 2: Starting packages with the cmrunpkg -a command

1. Create a cluster and packages as described in step 4 of scenario 1.


2. Halt pkg3 and pkg5:
cmhaltpkg pkg3 pkg5

# cmviewcl
test1_cluster    up

NODE         STATUS       STATE
test1        up           running
test2        up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg2         up           running      enabled      test2

NODE         STATUS       STATE
test3        up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg1         up           running      enabled      test3

NODE         STATUS       STATE
test4        up           running

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg4         up           running      enabled      test4

UNOWNED_PACKAGES

    PACKAGE      STATUS       STATE        AUTO_RUN     NODE
    pkg3         down         halted       disabled     unowned
    pkg5         down         halted       disabled     unowned

3. Run the following command:


cmrunpkg -a pkg3 pkg5
First, pkg3 is placed on Node 1, as this is the first node with the least used capacity. Then, pkg5
is placed on Node 4.

Scenario 3: Packages with dependency are placed in load sensitive placement.

1. Set the LOAD_BALANCING parameter to ON:

Parameters in cluster configuration:


LOAD_BALANCING ON
2. Create a cluster with four nodes, and CAPACITY_VALUE parameter for each node set to INFINITE:

NODE_NAME node1
CAPACITY_NAME test_capacity
CAPACITY_VALUE INFINITE
3. Create four packages with the following weight_name and weight_value and dependency
conditions:

package:pkg1|weight:test_capacity|name=test_capacity
package:pkg1|weight:test_capacity|value=1
package:pkg2|weight:test_capacity|name=test_capacity
package:pkg2|weight:test_capacity|value=2
package:pkg3|weight:test_capacity|name=test_capacity
package:pkg3|weight:test_capacity|value=3
package:pkg4|weight:test_capacity|name=test_capacity
package:pkg4|weight:test_capacity|value=4

The dependencies for pkg1 are as follows:

dependency_name: pkg2dep
dependency_condition: pkg2=up
dependency_location: same_node

The dependencies for pkg3 are as follows:

dependency_name: pkg4dep
dependency_condition: pkg4=up
dependency_location: different_node

The packages are sorted according to the cumulative weight of the dependency tree: pkg1 and
pkg2 have a cumulative weight of 3, and pkg3 and pkg4 have a cumulative weight of 7.

4. Run the cmrunpkg -a command:



cmrunpkg -a pkg1 pkg2 pkg3 pkg4
Now, pkg4 is placed on Node 1 as this is the leaf package with weight of 4 and pkg2 is placed on
Node 1 as this has the same node dependency with weight of 2. Based on the increasing weight, pkg3
is placed on Node 3 with weight of 3 and pkg1 is placed on Node 4 as this has different node
dependency with weight of 1.

Scenario 4: Packages with and without dependencies configured for load sensitive placement:

1. Set the LOAD_BALANCING parameter to ON:

Parameters in cluster configuration:


LOAD_BALANCING ON
2. Create a cluster with four nodes, and CAPACITY_VALUE parameter for each node set to INFINITE:

NODE_NAME test1
CAPACITY_NAME test_capacity
CAPACITY_VALUE INFINITE
3. Create eight packages with the following weight_name and weight_value and dependency
conditions:

package:pkg1|weight:test_capacity|name=test_capacity
package:pkg1|weight:test_capacity|value=13
package:pkg2|weight:test_capacity|name=test_capacity
package:pkg2|weight:test_capacity|value=25
package:pkg3|weight:test_capacity|name=test_capacity
package:pkg3|weight:test_capacity|value=4
package:pkg4|weight:test_capacity|name=test_capacity
package:pkg4|weight:test_capacity|value=22
package:pkg5|weight:test_capacity|name=test_capacity
package:pkg5|weight:test_capacity|value=19
package:pkg6|weight:test_capacity|name=test_capacity
package:pkg6|weight:test_capacity|value=11
package:mnpkg1|weight:test_capacity|name=test_capacity
package:mnpkg1|weight:test_capacity|value=50
package:mnpkg2|weight:test_capacity|name=test_capacity
package:mnpkg2|weight:test_capacity|value=3

The dependencies for pkg1 are as follows:

dependency_name: pkg2dep
dependency_condition: pkg2=up
dependency_location: same_node

The dependencies for pkg3 are as follows:

dependency_name: pkg4dep
dependency_condition: pkg4=up
dependency_location: different_node

The dependencies for pkg5 are as follows:

dependency_name: pkg6dep
dependency_condition: pkg6=up
dependency_location: any_node

mnpkg1 is configured to run on:



node_name Node 1
node_name Node 4

All other packages are configured to run on all nodes.


The packages are sorted according to the cumulative weight of the dependency tree, where:

• pkg1 and pkg2 have cumulative weight of 38


• pkg3 and pkg4 have cumulative weight of 26
• pkg5 and pkg6 have cumulative weight of 30
• mnpkg1 has weight of 50
• mnpkg2 has weight of 3

4. Run the cmrunpkg -a command:


cmrunpkg -a pkg1 pkg2 pkg3 pkg4 pkg5 pkg6 mnpkg1 mnpkg2
The leaf packages are sorted based on the complete weight of the dependency tree:

mnpkg1, pkg2, pkg6, pkg4, mnpkg2, pkg5, pkg1, pkg3.

Each package is placed based on the capacity used on each node:

• mnpkg1, which has a weight of 50, is placed on Node 1 and Node 4
• pkg2, which has a weight of 25, is placed on Node 2
• pkg6 and pkg4, which have a combined weight of 33, are placed on Node 3
• pkg5, which has a different node dependency on pkg6, is placed on the least loaded node out of
Node 1, Node 2, and Node 4, since pkg6 is running on Node 3. So, pkg5 is placed on Node 2.

Similarly, the remaining packages are placed according to the dependency condition and load as follows:

On Node 1: mnpkg1, pkg3, and mnpkg2


On Node 2: pkg2, pkg5, pkg1, and mnpkg2
On Node 3: pkg6, pkg4, and mnpkg2
On Node 4: mnpkg1 and mnpkg2

NOTE: In rare scenarios, packages might not start on the expected nodes even though they are sorted
according to weight. This can happen when a package is delayed in starting because its dependency
conditions are not yet satisfied, which might result in a package of lesser weight starting before a
heavier package.
For example, in scenario 4 the load sensitive placement tries to place pkg6 and assigns Node 3. Since it is
unable to start the package, pkg4 is considered for placement since it has different node dependency and
pkg6 is not considered. Therefore, the placement of packages is as follows:

On Node 1: mnpkg1 and mnpkg2


On Node 2: pkg2, pkg3, pkg1, and mnpkg2
On Node 3: pkg5, pkg6, pkg4, and mnpkg2
On Node 4: mnpkg1 and mnpkg2
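The cumulative dependency-tree weights used for sorting in Scenario 4 can be computed with a short sketch. It assumes that each package depends on at most one other package, as in that scenario, and it reproduces only the ordering of the leaf packages; the function names are hypothetical and this is not Serviceguard's implementation.

```python
# Weights and dependencies from Scenario 4 (INFINITE capacity).
WEIGHTS = {
    "pkg1": 13, "pkg2": 25, "pkg3": 4, "pkg4": 22,
    "pkg5": 19, "pkg6": 11, "mnpkg1": 50, "mnpkg2": 3,
}
DEPENDS_ON = {"pkg1": "pkg2", "pkg3": "pkg4", "pkg5": "pkg6"}


def leaf_of(pkg):
    """Follow dependencies until reaching a package that depends on nothing."""
    while pkg in DEPENDS_ON:
        pkg = DEPENDS_ON[pkg]
    return pkg


def cumulative_tree_weights():
    """Sum the weights of all packages that belong to the same dependency tree."""
    totals = {}
    for pkg, weight in WEIGHTS.items():
        leaf = leaf_of(pkg)
        totals[leaf] = totals.get(leaf, 0) + weight
    return totals


def leaves_by_tree_weight():
    """Order the leaf packages by the cumulative weight of their trees, heaviest first."""
    totals = cumulative_tree_weights()
    return sorted(totals, key=totals.get, reverse=True)


if __name__ == "__main__":
    print(cumulative_tree_weights())
    print(leaves_by_tree_weight())
```

The totals are 38 for the pkg1 and pkg2 tree, 26 for pkg3 and pkg4, and 30 for pkg5 and pkg6, so the leaf packages sort as mnpkg1, pkg2, pkg6, pkg4, mnpkg2.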

CAPACITY_VALUE parameter set to finite


Scenario 1: Starting packages with the cmrunpkg -a command



1. Set the LOAD_BALANCING parameter to ON:

Parameters in cluster configuration:


LOAD_BALANCING ON
2. Create a cluster with nodes having CAPACITY_VALUE parameter set to a finite value:

NODE_NAME test1
CAPACITY_NAME test_capacity
CAPACITY_VALUE 11
NODE_NAME test2
CAPACITY_NAME test_capacity
CAPACITY_VALUE 14
NODE_NAME test3
CAPACITY_NAME test_capacity
CAPACITY_VALUE 8
NODE_NAME test4
CAPACITY_NAME test_capacity
CAPACITY_VALUE 7
3. Create packages with weight_name and weight_value in the package configuration file:

weight_name test_capacity
weight_value <value>

For example, consider creating five packages with different weight_value as shown:

package:pkg1|weight:test_capacity|name=test_capacity
package:pkg1|weight:test_capacity|value=4
package:pkg2|weight:test_capacity|name=test_capacity
package:pkg2|weight:test_capacity|value=3
package:pkg3|weight:test_capacity|name=test_capacity
package:pkg3|weight:test_capacity|value=7
package:pkg4|weight:test_capacity|name=test_capacity
package:pkg4|weight:test_capacity|value=10
package:pkg5|weight:test_capacity|name=test_capacity
package:pkg5|weight:test_capacity|value=6

4. Run the cmrunpkg -a command:


cmrunpkg -a pkg1 pkg2 pkg3 pkg4 pkg5
The figure shows the load distribution across the nodes with CAPACITY_VALUE parameter set to
finite.



Figure 34: Load Distributed across the Nodes with CAPACITY_VALUE set to Finite

First, pkg4 is placed on Node 2, as this is the first node with the highest remaining capacity.
Then, pkg3 is placed on Node 1, pkg5 is placed on Node 3, pkg1 is placed on Node 4, and pkg2 is
placed on Node 1.

5. Now, forcefully halt Node 4 using the cmhaltnode -f command to test failover of pkg1, which is
currently running on Node 4. The figure shows the failover of pkg1 to Node 2.

Figure 35: Failover of a Package

When finite capacity is configured in a cluster, a package fails over to the node that has the highest
remaining capacity. Since Node 2 has the highest remaining capacity (4), pkg1, with a weight of
4, fails over to Node 2.



The figure shows pkg2 and pkg3 running on Node 1, pkg4 and pkg1 running on Node 2, and pkg5
running on Node 3.

Figure 36: Load Distributed across the Nodes after Failover
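The finite-capacity scenario can be reproduced with a minimal simulation. It assumes that packages are considered heaviest first and that each one goes to the first node with the highest remaining capacity; this ordering is inferred from the results described in this section and is not a statement of how Serviceguard is implemented. All function names are hypothetical.

```python
def place_all(weights, remaining, placement):
    """Place packages heaviest first on the node with the most remaining capacity."""
    for pkg in sorted(weights, key=weights.get, reverse=True):
        # max() keeps the first of several nodes with equal remaining capacity.
        node = max(remaining, key=remaining.get)
        if remaining[node] < weights[pkg]:
            placement[pkg] = None  # no node has enough room; the package stays down
            continue
        remaining[node] -= weights[pkg]
        placement[pkg] = node


def halt_node(node, weights, remaining, placement):
    """Remove a node and re-place the packages that were running on it."""
    del remaining[node]
    moved = {p: weights[p] for p, n in placement.items() if n == node}
    place_all(moved, remaining, placement)


def simulate():
    weights = {"pkg1": 4, "pkg2": 3, "pkg3": 7, "pkg4": 10, "pkg5": 6}
    remaining = {"Node 1": 11, "Node 2": 14, "Node 3": 8, "Node 4": 7}
    placement = {}
    place_all(weights, remaining, placement)            # initial placement
    start = dict(placement)
    halt_node("Node 4", weights, remaining, placement)  # node halt
    return start, dict(placement), remaining


if __name__ == "__main__":
    for state in simulate():
        print(state)
```

Under these assumptions pkg1 moves to Node 2, which has exactly 4 units of capacity remaining, and the final state has pkg2 and pkg3 on Node 1, pkg4 and pkg1 on Node 2, and pkg5 on Node 3.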

About External Scripts


As of Serviceguard A.11.18, the package configuration template for modular packages explicitly provides
for external scripts. These scripts can be run either:

• On package startup and shutdown, as essentially the first and last functions the package performs.
These scripts are invoked by means of the parameter external_pre_script; or
• During package execution, after volume groups and file systems are activated, and IP addresses are
assigned, and before the service and resource functions are executed; and again, in the reverse order,
on package shutdown. These scripts are invoked by means of the parameter external_script.

NOTE: Only Serviceguard environment variables defined in the /etc/cmcluster.conf file or an absolute
pathname can be used with the external_pre_script and external_script parameters.

The scripts are also run when the package is validated by cmcheckconf and cmapplyconf.
A package can make use of both kinds of script, and can launch more than one of each kind; in that case
the scripts will be executed in the order they are listed in the package configuration file (and in the reverse
order when the package shuts down).
In some cases you can rename or replace an external script while the package that uses it is running; see
Renaming or Replacing an External Script Used by a Running Package on page 295.
Each external script must have three entry points: start, stop, and validate, and should exit with one
of the following values:

• 0 - indicating success.

• 1 - indicating the package will be halted, and should not be restarted, as a result of failure in this script.

• 2 - indicating the package will be restarted on another node, or halted if no other node is available.



NOTE: In the case of the validate entry point, exit values 1 and 2 are treated the same; you can use
either to indicate that validation failed.

The script can make use of a standard set of environment variables (including the package name,
SG_PACKAGE, and the name of the local node, SG_NODE) exported by the package manager or the
master control script that runs the package; and can also call a function to source in a logging function
and other utility functions. One of these functions, sg_source_pkg_env(), provides access to all the
parameters configured for this package, including package-specific environment variables configured via
the pev_ parameter.

NOTE: Some variables, including SG_PACKAGE and SG_NODE, are available only at package run and
halt time, not when the package is validated. You can use SG_PACKAGE_NAME at validation time as a
substitute for SG_PACKAGE.

For more information, see the template in $SGCONF/examples/external_script.template.
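
Because SG_PACKAGE is not set at validation time, a script that logs the package name from all three
entry points needs a fallback. The following is a minimal sketch of one way to do that; the helper name
pkg_name is hypothetical and is not part of the Serviceguard utility functions.

```shell
# Hypothetical helper: print the package name at run, halt, or validation time.
# SG_PACKAGE is available only at run and halt time; SG_PACKAGE_NAME is used
# as the substitute when SG_PACKAGE is unset or empty.
pkg_name() {
    printf '%s\n' "${SG_PACKAGE:-$SG_PACKAGE_NAME}"
}
```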


A sample script follows. It assumes there is another script called monitor.sh, which will be configured
as a Serviceguard service to monitor some application. The monitor.sh script (not included here) uses
a parameter PEV_MONITORING_INTERVAL, defined in the package configuration file, to periodically poll
the application it wants to monitor; for example:
PEV_MONITORING_INTERVAL 60
At validation time, the sample script makes sure the PEV_MONITORING_INTERVAL and the monitoring
service are configured properly; at start and stop time it prints out the interval to the log file.
#!/bin/sh
# Source utility functions.
if [[ -z $SG_UTILS ]]
then
    # Load the Serviceguard location variables (SGCONF and so on).
    . /etc/cmcluster.conf
    SG_UTILS=$SGCONF/scripts/mscripts/utils.sh
fi

if [[ -f ${SG_UTILS} ]]; then
    . ${SG_UTILS}
    if (( $? != 0 ))
    then
        echo "ERROR: Unable to source package utility functions file: ${SG_UTILS}"
        exit 1
    fi
else
    echo "ERROR: Unable to find package utility functions file: ${SG_UTILS}"
    exit 1
fi

# Get the environment for this package through utility function
# sg_source_pkg_env().
sg_source_pkg_env $*

function validate_command
{
    typeset -i ret=0
    typeset -i i=0
    typeset -i found=0
    # check PEV_ attribute is configured and within limits
    if [[ -z $PEV_MONITORING_INTERVAL ]]
    then
        sg_log 0 "ERROR: PEV_MONITORING_INTERVAL attribute not configured!"
        ret=1
    elif (( PEV_MONITORING_INTERVAL < 1 ))
    then
        sg_log 0 "ERROR: PEV_MONITORING_INTERVAL value ($PEV_MONITORING_INTERVAL) not within legal limits!"
        ret=1
    fi
    # check monitoring service we are expecting for this package is configured
    while (( i < ${#SG_SERVICE_NAME[*]} ))
    do
        case ${SG_SERVICE_CMD[i]} in
        *monitor.sh*) # found our script
            found=1
            break
            ;;
        *)
            ;;
        esac
        (( i = i + 1 ))
    done
    if (( found == 0 ))
    then
        sg_log 0 "ERROR: monitoring service not configured!"
        ret=1
    fi
    if (( ret == 1 ))
    then
        sg_log 0 "Script validation for $SG_PACKAGE_NAME failed!"
    fi
    return $ret
}

function start_command
{
    sg_log 5 "start_command"
    # log current PEV_MONITORING_INTERVAL value, PEV_ attribute can be changed
    # while the package is running
    sg_log 0 "PEV_MONITORING_INTERVAL for $SG_PACKAGE_NAME is $PEV_MONITORING_INTERVAL"
    return 0
}

function stop_command
{
    sg_log 5 "stop_command"
    # log current PEV_MONITORING_INTERVAL value, PEV_ attribute can be changed
    # while the package is running
    sg_log 0 "PEV_MONITORING_INTERVAL for $SG_PACKAGE_NAME is $PEV_MONITORING_INTERVAL"
    return 0
}

# Dispatch on the entry point passed as the first argument.
typeset -i exit_val=0
case ${1} in
start)
    start_command $*
    exit_val=$?
    ;;
stop)
    stop_command $*
    exit_val=$?
    ;;
validate)
    validate_command $*
    exit_val=$?
    ;;
*)
    sg_log 0 "Unknown entry point $1"
    ;;
esac
exit $exit_val

Using Serviceguard Commands in an External Script


You can use Serviceguard commands (such as cmmodpkg) in an external script. These commands must
not interact with the package itself (that is, the package that runs the external script) but can interact with
other packages.
If a Serviceguard command interacts with another package, be careful to avoid command loops. For
instance, a command loop might occur under the following circumstances. Suppose a pkg1 script does a
cmmodpkg -d of pkg2, and a pkg2 script does a cmmodpkg -d of pkg1. If both pkg1 and pkg2 start at
the same time, the pkg1 script tries to cmmodpkg pkg2, but that cmmodpkg command has to wait
for pkg2 startup to complete. Meanwhile, the pkg2 script tries to cmmodpkg pkg1, but that command has
to wait for pkg1 startup to complete. Neither command can finish, which results in a command loop.



To avoid this situation, it is a good idea to specify a run_script_timeout and
halt_script_timeout for all packages, especially packages that use Serviceguard commands in their
external scripts. If a timeout is not specified and your package has a command loop as described above,
inconsistent results can occur, including a hung cluster.

NOTE:
cmhalt operations interact with all the packages and should not be used from external scripts.
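
As a rough pre-flight check for the recommendation above, the following sketch tests whether a package
configuration file sets both run_script_timeout and halt_script_timeout to a numeric value. The function
name has_script_timeouts is hypothetical, and the sketch assumes each parameter appears on its own line
as "name value", as in the package configuration examples in this chapter.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1 = path to a package configuration file
# Returns 0 only if both timeout parameters are present with a numeric value.
has_script_timeouts() {
    for param in run_script_timeout halt_script_timeout; do
        grep -Eq "^[[:space:]]*${param}[[:space:]]+[0-9]+[[:space:]]*(#.*)?$" "$1" || return 1
    done
    return 0
}
```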

Determining Why a Package Has Shut Down


You can use an external script to find out why a package has shut down.
Serviceguard sets the environment variable SG_HALT_REASON in the package control script to one of
the following values when the package halts:

• failure - set if the package halts because of the failure of a subnet, resource, or service it depends
on
• user_halt - set if the package is halted by a cmhaltpkg or cmhaltnode command, or by
corresponding actions in Serviceguard Manager
• automatic_halt - set if the package is failed over automatically because of the failure of a package
it depends on, or is failed back to its primary node automatically (failback_policy = automatic)

You can add custom code to the package to interrogate this variable, determine why the package halted,
and take appropriate action. For modular packages, put the code in the package’s external script (see
About External Scripts).
For example, if a database package is being halted by an administrator (SG_HALT_REASON set to
user_halt) you would probably want the custom code to perform an orderly shutdown of the database;
on the other hand, a forced shutdown might be needed if SG_HALT_REASON is set to failure,
indicating that the package is halting abnormally (for example, because of the failure of a service it
depends on).
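
The database example can be sketched as a small function that an external script's stop entry point
might call. The function name shutdown_mode and the labels "orderly" and "forced" are hypothetical;
treating automatic_halt and unrecognized values as orderly is an assumption of this sketch, not
documented Serviceguard behavior.

```shell
# Hypothetical helper for the stop entry point of an external script.
# Prints the kind of shutdown to perform, based on SG_HALT_REASON.
shutdown_mode() {
    case "${SG_HALT_REASON:-}" in
        failure)
            echo "forced"      # package is halting abnormally
            ;;
        user_halt)
            echo "orderly"     # administrator halted the package
            ;;
        *)
            echo "orderly"     # assumption: automatic_halt and unknown values
            ;;
    esac
}
```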
last_halt_failed Flag
cmviewcl -v -f line displays a last_halt_failed flag.

NOTE:
last_halt_failed appears only in the line output of cmviewcl, not the default tabular format; you
must use the -f line option to see it.

The value of last_halt_failed is no if the halt script ran successfully, or has not run since the node
joined the cluster, or has not run since the package was configured to run on the node; otherwise it is
yes.
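
To pick out the affected packages from the line output, you could filter for the flag. The following is a
minimal sketch; the function name failed_halts is hypothetical, and the exact layout of the lines that
carry last_halt_failed (for example, package:pkg1|node:node1|last_halt_failed=yes) is an assumption
based on the name=value form of the line output, so check it against the output on your own cluster.

```shell
# Hypothetical filter, not a Serviceguard command.
# stdin = output of "cmviewcl -v -f line"
# Prints only the lines that report a failed halt script.
failed_halts() {
    grep 'last_halt_failed=yes'
}
```

A typical use would be: cmviewcl -v -f line | failed_halts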

About Cross-Subnet Failover


It is possible to configure a cluster that spans subnets joined by a router, with some nodes using one
subnet and some another. This is known as a cross-subnet configuration; see Cross-Subnet
Configurations. In this context, you can configure packages to fail over from a node on one subnet to a
node on another.
The implications for configuring a package for cross-subnet failover are as follows:

• For modular packages, you must configure two new parameters in the package configuration file to
allow packages to fail over across subnets:



◦ ip_subnet_node - to indicate which nodes a subnet is configured on
◦ monitored_subnet_access - to indicate whether a monitored subnet is configured on all nodes
(FULL) or only some (PARTIAL). (Leaving monitored_subnet_access unconfigured for a monitored
subnet is equivalent to FULL).

• You should not use the wildcard (*) for node_name in the package configuration file, as this could
allow the package to fail over across subnets when a node on the same subnet is eligible; failing over
across subnets can take longer than failing over on the same subnet. List the nodes in order of
preference instead of using the wildcard.
• Deploying applications in this environment requires careful consideration; see Implications for
Application Deployment.
• If a monitored_subnet is configured for PARTIAL monitored_subnet_access in a package’s
configuration file, it must be configured on at least one of the nodes on the node_name list for that
package.
Conversely, if all of the subnets that are being monitored for this package are configured for PARTIAL
access, each node on the node_name list must have at least one of these subnets configured.

• As in other cluster configurations, a package will not start on a node unless the subnets configured
on that node, and specified in the package configuration file as monitored subnets, are up.

Implications for Application Deployment


Because the relocatable IP address will change when a package fails over to a node on another subnet,
you need to make sure of the following:

• The hostname used by the package is correctly remapped to the new relocatable IP address.
• The application that the package runs must be configured so that the clients can reconnect to the
package’s new relocatable IP address.
In the worst case (when the server where the application was running is down), the client may
continue to retry the old IP address until TCP’s tcp_timeout is reached (typically about ten minutes), at
which point it will detect the failure and reset the connection.

For more information, see the white paper Technical Considerations for Creating a Serviceguard Cluster
that Spans Multiple IP Subnets at http://www.hpe.com/info/linux-serviceguard-docs (Select HP
Serviceguard).

Configuring a Package to Fail Over across Subnets: Example


To configure a package to fail over across subnets, you need to make some additional edits to the
package configuration file.
Suppose that you want to configure a package, pkg1, so that it can fail over among all the nodes in a
cluster comprising NodeA, NodeB, NodeC, and NodeD.
NodeA and NodeB use subnet 15.244.65.0, which is not used by NodeC and NodeD; and NodeC and
NodeD use subnet 15.244.56.0, which is not used by NodeA and NodeB. (See Obtaining Cross-
Subnet Information for sample cmquerycl output).



Configuring node_name
First you need to make sure that pkg1 will fail over to a node on another subnet only if it has to. For
example, if it is running on NodeA and needs to fail over, you want it to try NodeB, on the same subnet,
before incurring the cross-subnet overhead of failing over to NodeC or NodeD.
Assuming nodeA is pkg1’s primary node (where it normally starts), create node_name entries in the
package configuration file as follows:
node_name nodeA
node_name nodeB
node_name nodeC
node_name nodeD

Configuring monitored_subnet_access
In order to monitor subnet 15.244.65.0 or 15.244.56.0, depending on where pkg1 is running, you
would configure monitored_subnet and monitored_subnet_access in pkg1’s package configuration file
as follows:
monitored_subnet 15.244.65.0
monitored_subnet_access PARTIAL
monitored_subnet 15.244.56.0
monitored_subnet_access PARTIAL

NOTE:
Configuring monitored_subnet_access as FULL (or not configuring monitored_subnet_access) for either
of these subnets will cause the package configuration to fail, because neither subnet is available on all
the nodes.

Configuring ip_subnet_node
Now you need to specify which subnet is configured on which nodes. In our example, you would do this
by means of entries such as the following in the package configuration file:
ip_subnet 15.244.65.0
ip_subnet_node nodeA
ip_subnet_node nodeB
ip_address 15.244.65.82
ip_address 15.244.65.83
ip_subnet 15.244.56.0
ip_subnet_node nodeC
ip_subnet_node nodeD
ip_address 15.244.56.100
ip_address 15.244.56.101
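
To review which nodes each subnet has been assigned to, you could list the ip_subnet and
ip_subnet_node pairs from the package configuration file. The following is a minimal sketch; the function
name subnet_nodes is hypothetical, and it assumes that each ip_subnet_node entry follows the ip_subnet
entry it belongs to, as in the example above.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1 = path to a package configuration file
# Prints one "<subnet> <node>" pair for each ip_subnet_node entry.
subnet_nodes() {
    awk '
        $1 == "ip_subnet"      { subnet = $2 }
        $1 == "ip_subnet_node" { print subnet, $2 }
    ' "$1"
}
```

Run against the entries shown above, this lists nodeA and nodeB next to 15.244.65.0, and nodeC and
nodeD next to 15.244.56.0.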

Configuring a Package: Next Steps


When you are ready to start configuring a package, proceed to Configuring Packages and Their
Services; start with Choosing Package Modules. (If you find it helpful, you can assemble your package
configuration data ahead of time on a separate worksheet for each package; blank worksheets are in
Blank Planning Worksheets on page 380).

Planning for Changes in Cluster Size


If you intend to add nodes to the cluster online (while it is running), ensure that they are
connected to the same heartbeat subnets and to the same lock disks as the other cluster nodes.
In selecting a cluster lock configuration, be careful to anticipate any potential need for additional cluster
nodes. Remember that while a two-node cluster must use a cluster lock, a cluster of more than four
nodes must not use a lock LUN, but can use a quorum server. So if you will eventually need five nodes,
you should build an initial configuration that uses a quorum server.
If you intend to remove a node from the cluster configuration while the cluster is running, ensure that the
resulting cluster configuration will still conform to the rules for cluster locks described above. See Cluster
Lock Planning on page 100 for more information.
If you are planning to add a node online, and a package will run on the new node, ensure that any
existing cluster-bound volume groups for the package have been imported to the new node. Also, ensure
that the MAX_CONFIGURED_PACKAGES parameter is set high enough to accommodate the total
number of packages you will be using; see Cluster Configuration Parameters on page 111.



Building an HA Cluster Configuration
This chapter and the next take you through the configuration tasks required to set up a Serviceguard
cluster. You carry out these procedures on one node, called the configuration node, and Serviceguard
distributes the resulting binary file to all the nodes in the cluster. In the examples in this chapter, the
configuration node is named ftsys9, and the sample target node is called ftsys10.
This chapter covers the following major topics:

• Preparing Your Systems
• Configuring the Cluster on page 193
• Managing the Running Cluster on page 213

Configuring packages is described in the next chapter.


Use the Serviceguard manpages for each command to obtain full information about syntax and usage.

Preparing Your Systems


Before configuring your cluster, ensure that Serviceguard is installed on all cluster nodes, and that all
nodes have the appropriate security files, kernel configuration and NTP (network time protocol)
configuration.

Installing and Updating Serviceguard


For information about installing and updating Serviceguard, see the following Release Notes at
http://www.hpe.com/info/linux-serviceguard-docs:

• HPE Serviceguard for Linux Base edition 12.00.40 Release Notes
• HPE Serviceguard for Linux Advanced edition 12.00.40 Release Notes
• HPE Serviceguard for Linux Enterprise edition 12.00.40 Release Notes

Understanding the Location of Serviceguard Files


Serviceguard uses a special file, /etc/cmcluster.conf, to define the locations for configuration and
log files within the Linux file system. The different distributions may use different locations. The following
are example locations for a Red Hat distribution:
############################## cmcluster.conf ###########################
#
# Highly Available Cluster file locations
#
# This file must not be edited
#########################################################################
SGROOT=/usr/local/cmcluster # SG root directory
SGCONF=/usr/local/cmcluster/conf # configuration files
SGSBIN=/usr/local/cmcluster/bin # binaries
SGLBIN=/usr/local/cmcluster/bin # binaries
SGLIB=/usr/local/cmcluster/lib # libraries
SGRUN=/usr/local/cmcluster/run # location of core dumps from daemons
SGAUTOSTART=/usr/local/cmcluster/conf/cmcluster.rc # SG Autostart file



The following are example locations for a SUSE distribution:
############################## cmcluster.conf ###########################
#
# Highly Available Cluster file locations
#
# This file must not be edited
#########################################################################
SGROOT=/opt/cmcluster # SG root directory
SGCONF=/opt/cmcluster/conf # configuration files
SGSBIN=/opt/cmcluster/bin # binaries
SGLBIN=/opt/cmcluster/bin # binaries
SGLIB=/opt/cmcluster/lib # libraries
SGRUN=/opt/cmcluster/run # location of core dumps from daemons
SGAUTOSTART=/opt/cmcluster/conf/cmcluster.rc # SG Autostart file
Throughout this document, system filenames are usually given with one of these location prefixes. Thus,
references to $SGCONF/<FileName> can be resolved by supplying the definition of the prefix that is
found in this file. For example, if SGCONF is /usr/local/cmcluster/conf, then the complete
pathname for file $SGCONF/cmclconfig would be /usr/local/cmcluster/conf/cmclconfig.
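
The prefix resolution described above can be sketched in shell by sourcing the file and expanding the
variable. The function name sgconf_path and the SG_CONF_FILE override (which exists here only so the
sketch can be tried against a copy of the file) are hypothetical.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1 = file name relative to $SGCONF
# Sources the locations file in a subshell, so the caller's environment is
# left unchanged, and prints the complete pathname.
sgconf_path() (
    . "${SG_CONF_FILE:-/etc/cmcluster.conf}" || exit 1
    printf '%s/%s\n' "$SGCONF" "$1"
)
```

On a Red Hat system with the example locations shown above, sgconf_path cmclconfig would print
/usr/local/cmcluster/conf/cmclconfig.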

Enabling Serviceguard Command Access


To allow the creation of a Serviceguard configuration, you should complete the following steps on all
cluster nodes before running any Serviceguard commands. Alternatively, you can also use cmpreparecl
to configure the nodes. For more information, see cmpreparecl(1M) .

1. Make sure the root user’s path includes the Serviceguard executables. If the Serviceguard commands
are not accessible, run the following commands:
. /etc/profile.d/serviceguard.sh (for Bourne-type shells)
. /etc/profile.d/serviceguard.csh (for C-type shells)

2. Edit the /etc/man.config file for Red Hat Enterprise Linux Server and /etc/manpath.config
file for SUSE Linux Enterprise Server to include the following:
For Red Hat Enterprise Linux Server:
MANPATH /usr/local/cmcluster/doc/man
For SUSE Linux Enterprise Server:
MANDATORY_MANPATH /opt/cmcluster/doc/man
This will allow use of the Serviceguard man pages.

NOTE: Update the $MANPATH environment variable with /opt/cmcluster/doc/man/.

3. Enable use of Serviceguard variables.


If the Serviceguard variables are not defined on your system, then include the file /etc/
cmcluster.conf in your login profile for user root:
. /etc/cmcluster.conf
You can confirm access to one of the variables as follows:
cd $SGCONF



Configuring Root-Level Access
The subsections that follow explain how to set up root access between the nodes in the prospective
cluster. (When you proceed to configuring the cluster, you will define various levels of non-root access as
well; see Controlling Access to the Cluster).

NOTE: For more information and advice, see the white paper Securing Serviceguard at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard -> White Papers).

Allowing Root Access to an Unconfigured Node


To enable a system to be included in a cluster, you must enable Linux root access to the system by the
root user of every other potential cluster node. The Serviceguard mechanism for doing this is the file
$SGCONF/cmclnodelist. This is sometimes referred to as a “bootstrap” file because Serviceguard
consults it only when configuring a node into a cluster for the first time; it is ignored after that. It does not
exist by default, but you will need to create it.
You may want to add a comment such as the following at the top of the file:
###########################################################
# Do not edit this file!
# Serviceguard uses this file only to authorize access to an
# unconfigured node. Once the node is configured,
# Serviceguard will not consult this file.
###########################################################
The format for entries in cmclnodelist is as follows:
[hostname] [user] [#Comment]
For example:
gryf root #cluster1, node1
sly root #cluster1, node2
bit root #cluster1, node3
This example grants root access to the node on which this cmclnodelist file resides to root users on
the nodes gryf, sly, and bit.
Serviceguard also accepts the use of a “+” in the cmclnodelist file; this indicates that the root user on
any Serviceguard node can configure Serviceguard on this node.
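
If you have several nodes to list, entries in the format shown above can be generated rather than typed.
The following is a minimal sketch; the function name make_nodelist is hypothetical, and the comment text
it writes simply mirrors the example.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1      = cluster name, used only in the comment field
# $2 ...  = hostnames of the prospective cluster nodes
# Prints one "hostname root #comment" entry per node.
make_nodelist() {
    cluster=$1
    shift
    n=1
    for host in "$@"; do
        printf '%s root #%s, node%d\n' "$host" "$cluster" "$n"
        n=$((n + 1))
    done
}
```

For example, make_nodelist cluster1 gryf sly bit prints the three entries shown above; you could redirect
the output to $SGCONF/cmclnodelist on each node.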

IMPORTANT: If $SGCONF/cmclnodelist does not exist, Serviceguard will look at ~/.rhosts.


Hewlett Packard Enterprise strongly recommends that you use cmclnodelist.

NOTE: When you upgrade a cluster from Version A.11.15 or earlier, entries in $SGCONF/cmclnodelist
are automatically updated to Access Control Policies in the cluster configuration file. All non-root user-
hostname pairs are assigned the role of Monitor.

Ensuring that the Root User on Another Node Is Recognized


The Linux root user on any cluster node can configure the cluster. This requires that Serviceguard on one
node be able to recognize the root user on another.
Serviceguard uses the identd daemon to verify user names, and, in the case of a root user, verification
succeeds only if identd returns the username root. Because identd may return the username for the
first match on UID 0, you must check /etc/passwd on each node you intend to configure into the
cluster, and ensure that the entry for the root user comes before any other entry with a UID of 0.
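
The check on /etc/passwd can be sketched as follows. The function name first_uid0_user is hypothetical;
it prints the name of the first entry whose UID field is 0, which should be root.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1 = passwd-format file (defaults to /etc/passwd)
# Prints the user name of the first entry with UID 0.
first_uid0_user() {
    awk -F: '$3 == 0 { print $1; exit }' "${1:-/etc/passwd}"
}
```

If this prints anything other than root, move the root entry ahead of the other UID 0 entries on that node.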
About identd
Hewlett Packard Enterprise strongly recommends that you use identd for user verification, so you
should make sure that each prospective cluster node is configured to run it. identd is usually started
from /etc/init.d/xinetd.
(It is possible to disable identd, though Hewlett Packard Enterprise recommends against doing so. If for
some reason you have to disable identd, see Disabling identd on page 217).
For more information about identd, see the white paper Securing Serviceguard at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard -> White Papers), and the
identd manpage.

Configuring Name Resolution


Serviceguard uses the name resolution services built into Linux.
Serviceguard nodes can communicate over any of the cluster’s shared networks, so the network
resolution service you are using (such as DNS, NIS, or LDAP) must be able to resolve each of their
primary addresses on each of those networks to the primary hostname of the node in question.
In addition, Hewlett Packard Enterprise recommends that you define name resolution in each
node’s /etc/hosts file, rather than rely solely on a service such as DNS. Configure the name service
switch to consult the /etc/hosts file before other services. See Safeguarding against Loss of Name
Resolution Services for instructions.

NOTE: If you are using private IP addresses for communication within the cluster, and these addresses
are not known to DNS (or the name resolution service you use) these addresses must be listed in /etc/
hosts.
For requirements and restrictions that apply to IPv6–only clusters and mixed-mode clusters, see Rules
and Restrictions for IPv6-Only Mode and Rules and Restrictions for Mixed Mode, respectively, and the
latest version of the Serviceguard release notes.

For example, consider a two node cluster (gryf and sly) with two private subnets and a public subnet.
These nodes will be granting access by a non-cluster node (bit) which does not share the private
subnets. The /etc/hosts file on both cluster nodes should contain:
15.145.162.131 gryf.uksr.hp.com gryf
10.8.0.131 gryf.uksr.hp.com gryf
10.8.1.131 gryf.uksr.hp.com gryf
15.145.162.132 sly.uksr.hp.com sly
10.8.0.132 sly.uksr.hp.com sly
10.8.1.132 sly.uksr.hp.com sly
15.145.162.150 bit.uksr.hp.com bit
Keep the following rules in mind when creating entries in a Serviceguard node's /etc/hosts:

1. NODE_NAME in the cluster configuration file must be identical to the hostname which is the first
element of a fully qualified domain name (a name with four elements separated by periods). This
hostname is what is returned by the hostname(1) command. For example, the NODE_NAME should
be gryf rather than gryf.uksr.hp.com. For more information, see the NODE_NAME entry under
Cluster Configuration Parameters on page 111.



NOTE: Since Serviceguard recognizes only the hostname, gryf.uksr.hp.com and
gryf.cup.hp.com cannot be nodes in the same cluster, because Serviceguard identifies them as the
same host gryf.

2. All primary IP addresses configured.

NOTE: Serviceguard recognizes only the hostname (the first element) in a fully qualified domain name (a
name like those in the example above). This means, for example, that gryf.uksr.hp.com and
gryf.cup.hp.com cannot be nodes in the same cluster, as Serviceguard would see them as the same
host gryf.
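
The hostname rule can be illustrated with a small shell sketch that keeps only the first element of a
fully qualified domain name. The function name short_hostname is hypothetical.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1 = hostname or fully qualified domain name
# Prints the first element, which is the form NODE_NAME must take.
short_hostname() {
    printf '%s\n' "${1%%.*}"
}
```

Both short_hostname gryf.uksr.hp.com and short_hostname gryf.cup.hp.com print gryf, which is why
the two hosts cannot be nodes in the same cluster.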

If applications require the use of hostname aliases, the Serviceguard hostname must be one of the
aliases in all the entries for that host. For example, if the two-node cluster in the previous example were
configured to use the alias hostnames alias-node1 and alias-node2, then the entries in /etc/
hosts should look something like this:
15.145.162.131 gryf.uksr.hp.com gryf1 alias-node1
10.8.0.131 gryf.uksr.hp.com gryf2 alias-node1
10.8.1.131 gryf.uksr.hp.com gryf3 alias-node1
15.145.162.132 sly.uksr.hp.com sly1 alias-node2
10.8.0.132 sly.uksr.hp.com sly2 alias-node2
10.8.1.132 sly.uksr.hp.com sly3 alias-node2

IMPORTANT: Serviceguard does not support aliases for IPv6 addresses.


For information about configuring an IPv6–only cluster, or a cluster that uses a combination of IPv6
and IPv4 addresses for the nodes' hostnames, see About Hostname Address Families: IPv4-
Only, IPv6-Only, and Mixed Mode.

Safeguarding against Loss of Name Resolution Services


When you employ any user-level Serviceguard command (including cmviewcl), the command uses the
name service you have configured (such as DNS) to obtain the addresses of all the cluster nodes. If the
name service is not available, the command could hang or return an unexpected networking error
message.

NOTE: If such a hang or error occurs, Serviceguard and all protected applications will continue working
even though the command you issued does not. That is, only the Serviceguard configuration commands
(and corresponding Serviceguard Manager functions) are affected, not the cluster daemon or package
services.

The procedure that follows shows how to create a robust name-resolution configuration that will allow
cluster nodes to continue communicating with one another if a name service fails.

Procedure

1. Edit the /etc/hosts file on all nodes in the cluster. Add name resolution for all heartbeat IP
addresses, and other IP addresses from all the cluster nodes; see Configuring Name Resolution for
discussion and examples.

NOTE: For each cluster node, the public-network IP address must be the first address listed. This
enables other applications to talk to other nodes on public networks.

2. If you are using DNS, make sure your name servers are configured in /etc/resolv.conf, for
example:
domain cup.hp.com
search cup.hp.com
nameserver 15.243.128.51
nameserver 15.243.160.51

3. Edit or create the /etc/nsswitch.conf file on all nodes and add the following text, if it does not
already exist:

• for DNS, enter (on one line):
hosts: files [NOTFOUND=continue UNAVAIL=continue] dns [NOTFOUND=return UNAVAIL=return]

• for NIS, enter (on one line):
hosts: files [NOTFOUND=continue UNAVAIL=continue] nis [NOTFOUND=return UNAVAIL=return]

If a line beginning with the string hosts: already exists, then make sure that the text immediately to
the right of this string is (on one line):
files [NOTFOUND=continue UNAVAIL=continue] dns [NOTFOUND=return UNAVAIL=return]

or

files [NOTFOUND=continue UNAVAIL=continue] nis [NOTFOUND=return UNAVAIL=return]

This step is critical, allowing the cluster nodes to resolve hostnames to IP addresses while DNS, NIS,
or the primary LAN is down.
4. Create a $SGCONF/cmclnodelist file on all nodes that you intend to configure into the cluster, and
allow access by all cluster nodes. See .

NOTE: Hewlett Packard Enterprise recommends that you also make the name service itself highly
available, either by using multiple name servers or by configuring the name service into a Serviceguard
package.
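
To confirm on each node that the name service switch consults /etc/hosts first, you could check the
hosts: line as sketched below. The function name hosts_files_first is hypothetical, and the sketch checks
only that files is the first source listed, not the bracketed actions that follow it.

```shell
# Hypothetical helper, not a Serviceguard command.
# $1 = nsswitch file (defaults to /etc/nsswitch.conf)
# Returns 0 if a hosts: line exists and lists "files" as its first source.
hosts_files_first() {
    awk '
        $1 == "hosts:" { found = 1; ok = ($2 == "files") }
        END { exit !(found && ok) }
    ' "${1:-/etc/nsswitch.conf}"
}
```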

Ensuring Consistency of Kernel Configuration


Make sure that the kernel configurations of all cluster nodes are consistent with the expected behavior of
the cluster during failover. In particular, if you change any kernel parameters on one cluster node, they
may also need to be changed on other cluster nodes that can run the same packages.

Enabling the Network Time Protocol


Hewlett Packard Enterprise strongly recommends that you enable network time protocol (NTP) services
on each node in the cluster. The use of NTP, which runs as a daemon process on each system, ensures
that the system time on all nodes is consistent, resulting in consistent timestamps in log files and
consistent behavior of message services. This ensures that applications running in the cluster are
correctly synchronized. The NTP services daemon, xntpd, should be running on all nodes before you
begin cluster configuration. The NTP configuration file is /etc/ntp.conf.

Channel Bonding
Channel bonding of LAN interfaces is implemented by the use of the bonding driver, which is installed in
the kernel at boot time.



Bonding can be defined in different modes:

• Mode 0, which is used for load balancing, uses all slave devices within the bond in parallel for data
transmission. This can be done when the LAN interface cards are connected to an Ethernet switch,
with the ports on the switch configured as Fast EtherChannel trunks. Two switches should be cabled
together as an HA grouping to allow package failover.
• For high availability, in which one slave serves as a standby for the bond and the other slave transmits
data, install the bonding module in mode 1. This is most appropriate for dedicated heartbeat
connections that are cabled through redundant network hubs or switches that are cabled together.
• Mode 4, which is used for dynamic link aggregation, creates aggregation groups that share the same
speed and duplex settings. It utilizes all slaves in the active aggregator according to the 802.3ad
specification.

NOTE: Hewlett Packard Enterprise recommends that you set the LACP (Link Aggregation Control
Protocol) timer value to short at the switches to enable faster detection of, and recovery from, link
failures.

Implementing Channel Bonding (Red Hat)


This section applies to Red Hat installations. If you are using a SUSE distribution, skip ahead to the next
section.
If the bonding driver is installed, the networking software recognizes bonding definitions that are created
in the /etc/sysconfig/network-scripts directory for each bond. For example, the file named
ifcfg-bond0 defines bond0 as the master bonding unit, and the ifcfg-eth0 and ifcfg-eth1
scripts define each individual interface as a slave.
For more information on network bonding, make sure you have installed the kernel-doc rpm, and
see:
/usr/share/doc/kernel-doc-<version>/Documentation/networking/bonding.txt

NOTE:
Hewlett Packard Enterprise recommends that you do the bonding configuration from the system console,
because you will need to restart networking from the console when the configuration is done.

Sample Configuration
Configure the following files to support LAN redundancy. For a single failover only one bond is needed.

1. Create a bond0 file, ifcfg-bond0.


Create the configuration in the /etc/sysconfig/network-scripts directory. For example, in the
file, ifcfg-bond0, bond0 is defined as the master (for your installation, substitute the appropriate
values for your network instead of 192.168.1.1).
Include the following information in the ifcfg-bond0 file:
DEVICE=bond0
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
For Red Hat 5 and Red Hat 6 only, add the following line to the ifcfg-bond0 file:
BONDING_OPTS='miimon=100 mode=1'

2. Create an ifcfg-ethn file for each interface in the bond. All interfaces should have SLAVE and
MASTER definitions. For example, in a bond that uses eth0 and eth1, edit the ifcfg-eth0 file to
appear as follows:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
Edit the ifcfg-eth1 file to appear as follows:
DEVICE=eth1
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
For Red Hat 5 and Red Hat 6 only, add a line containing the hardware (MAC) address of the interface
to the corresponding ifcfg-ethn slave file, for example:
HWADDR=00:12:79:43:5b:f4

3. Add the following lines to /etc/modprobe.conf:

alias bond0 bonding
options bond0 miimon=100 mode=1

If you have configured a second bonding interface, use MASTER=bond1 for bond1, then add the
following after the first bond (bond0):

options bond1 -o bonding1 miimon=100 mode=1

NOTE: During configuration, make sure that the active slaves for the same bond on each
node are connected to the same hub or switch. You can check this by examining the file
/proc/net/bonding/bond<x>/info on each node. This file shows the active slave for bond x.
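
The three files from steps 1 and 2 can be drafted with a small script before being copied into place. This is a minimal sketch that writes into a scratch directory so the result can be reviewed first. OUT_DIR and the two helper function names are assumptions made for illustration, and the addresses are the example values used above.

```shell
# Write Red Hat style bonding files into a scratch directory for review.
# OUT_DIR and the helper names are illustrative; the addresses are the
# sample values from this section and must be replaced with your own.
OUT_DIR="${OUT_DIR:-./bond-preview}"

write_bond_master() {
    # $1 = bond name, for example bond0
    cat > "$OUT_DIR/ifcfg-$1" <<EOF
DEVICE=$1
IPADDR=192.168.1.1
NETMASK=255.255.255.0
NETWORK=192.168.1.0
BROADCAST=192.168.1.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
EOF
}

write_bond_slave() {
    # $1 = interface name, $2 = bond it belongs to
    cat > "$OUT_DIR/ifcfg-$1" <<EOF
DEVICE=$1
USERCTL=no
ONBOOT=yes
MASTER=$2
SLAVE=yes
BOOTPROTO=none
EOF
}

mkdir -p "$OUT_DIR"
write_bond_master bond0
for nic in eth0 eth1; do
    write_bond_slave "$nic" bond0
done
```

After checking the generated files, copy them to /etc/sysconfig/network-scripts and add any release-specific lines by hand.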

Restarting Networking
Restart the networking subsystem. From the console of either node in the cluster, execute the following
command on a Red Hat system:

/etc/rc.d/init.d/network restart

NOTE: It is better not to restart the network from outside the cluster subnet, as there is a chance the
network could go down before the command can complete.

The command prints "Bringing up" messages as each network interface is started.


If there was an error in any of the bonding configuration files, the network might not function properly. If
this occurs, check each configuration file for errors, then try to restart the network again.

Viewing the Configuration
You can test the configuration and transmit policy with ifconfig. For the configuration created above,
the display should look like this:

/sbin/ifconfig

bond0     Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
          inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
          RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:0

eth0      Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
          inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:10 Base address:0x1080

eth1      Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
          inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:9 Base address:0x1400
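
To check the display without reading it line by line, the slave interfaces can be picked out with awk. The sketch below assumes the older ifconfig output format shown above, in which each interface block starts with a "Link encap" line and the flags line contains the word SLAVE. The function name list_bond_slaves is illustrative.

```shell
# Print the names of interfaces flagged SLAVE in ifconfig-style output.
# Reads the text on standard input, so saved output can be checked too.
list_bond_slaves() {
    awk '
        /Link encap/ { name = $1 }     # first line of an interface block
        / SLAVE /    { print name }    # flags line of an enslaved interface
    '
}

# Typical use on a live system (assumes ifconfig is installed):
#   /sbin/ifconfig | list_bond_slaves
```

For the configuration created above, the expected result is one line for eth0 and one for eth1.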

Implementing Channel Bonding (SUSE)


If you are using a Red Hat distribution, use the procedures described in the previous section. The
following applies only to the SUSE distributions.
First, run yast/yast2 and configure the Ethernet devices as DHCP so that the ifcfg-eth-id-<mac> files
are created.
Next, modify each of the ifcfg-eth-id-<mac> files that you want to bond. These files are located in
/etc/sysconfig/network. Change them from:
BOOTPROTO='dhcp'
MTU=''
REMOTE_IPADDR=''
STARTMODE='onboot'
UNIQUE='gZD2.ZqnB7JKTdX0'
_nm_name='bus-pci-0000:00:0b.0'
to:
BOOTPROTO='none'
STARTMODE='onboot'
UNIQUE='gZD2.ZqnB7JKTdX0'
_nm_name='bus-pci-0000:00:0b.0'

NOTE: Do not change the UNIQUE and _nm_name parameters. You can leave MTU and
REMOTE_IPADDR in the file as long as they are not set.

Next, in /etc/sysconfig/network, edit your ifcfg-bond0 file so it looks like this:
BROADCAST='172.16.0.255'
BOOTPROTO='static'
IPADDR='172.16.0.1'
MTU=''
NETMASK='255.255.255.0'
NETWORK='172.16.0.0'
REMOTE_IPADDR=''
STARTMODE='onboot'
BONDING_MASTER='yes'
BONDING_MODULE_OPTS='miimon=100 mode=1'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth1'
The above example configures bond0 with MII link monitoring every 100 milliseconds (miimon=100) and
active-backup mode (mode=1). Adjust the IPADDR, BROADCAST, NETMASK, and NETWORK parameters to
correspond to your configuration.
As you can see, you are adding the configuration options BONDING_MASTER,
BONDING_MODULE_OPTS, and BONDING_SLAVE. BONDING_MODULE_OPTS are the additional
options you want to pass to the bonding module. You cannot pass max_bonds as an option, and you do
not need to because the ifup script will load the module for each bond needed.
BONDING_SLAVE tells ifup which Ethernet devices to enslave to bond0. So if you wanted to bond four
Ethernet devices you would add:
BONDING_SLAVE2='eth2'
BONDING_SLAVE3='eth3'

NOTE: Use ifconfig to find the relationship between eth IDs and the MAC addresses.
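
When a bond has several members, the BONDING_SLAVEn lines can be generated rather than typed. The following sketch numbers the devices from 0 in the order given. The function name bonding_slave_lines is an assumption for illustration.

```shell
# Emit one BONDING_SLAVEn line per device, numbering from 0.
bonding_slave_lines() {
    n=0
    for dev in "$@"; do
        printf "BONDING_SLAVE%d='%s'\n" "$n" "$dev"
        n=$((n + 1))
    done
}

bonding_slave_lines eth0 eth1 eth2 eth3
```

The output can be pasted into the ifcfg-bond0 file in place of the hand-written BONDING_SLAVE entries.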

For more information on network bonding, see
/usr/src/linux<kernel_version>/Documentation/networking/bonding.txt.

Restarting Networking
Restart the networking subsystem. From the console of any node in the cluster, execute the following
command on a SUSE system:
/etc/init.d/network restart

NOTE:
It is better not to restart the network from outside the cluster subnet, as there is a chance the network
could go down before the command can complete.

If there is an error in any of the bonding configuration files, the network may not function properly. If this
occurs, check each configuration file for errors, then try to start the network again.

Setting up a Lock LUN


Serviceguard supports the usage of either a partitioned disk or a whole LUN as a lock LUN. This section
describes how to create a lock LUN on a partitioned disk and on a whole LUN.

NOTE:

• An iSCSI storage device does not support configuring a lock LUN.
• A storage device of type Dynamically linked storage configuration does not support configuring a lock
LUN. For a description of Dynamically linked storage configuration, see Table 4: Storage
configuration type in a VMware Environment on page 71.

Creating a Lock LUN on a Partitioned Disk


The lock LUN can be created on a partition of one cylinder of at least 100K defined (via the fdisk
command) as type Linux (83).
You will need the pathnames for the lock LUN as it is seen on each cluster node. On one node, use the
fdisk command to define a partition of 1 cylinder, type 83, on this LUN:

fdisk <Lock LUN Device File>

Respond to the prompts as shown in the following table to set up the lock LUN partition:

Table 8: Changing Linux Partition Types

   Prompt                         Response   Action Performed
1. Command (m for help):          n          Create new partition
2. Partition number (1-4):        1          Partition affected
3. Hex code (L to list codes):    83         Set partition type to Linux (the default)
4. Command (m for help):          1          Define first partition
5. Command (m for help):          1          Set size to 1 cylinder
6. Command (m for help):          p          Display partition data
7. Command (m for help):          w          Write data to the partition table

The following example of the fdisk dialog shows the disk on the device file /dev/sdc being given one
partition of type Linux (83):

fdisk /dev/sdc

Command (m for help): n


Partition number (1-4): 1
HEX code (type L to list codes): 83
Command (m for help): 1
Command (m for help): 1
Command (m for help): p
Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes
Device Boot Start End Blocks Id System
/dev/sdc 1 1 1008 83 Linux
Command (m for help): w

The partition table has been altered!
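
The 100K minimum can be checked against the geometry fdisk reports. The sketch below is a worked calculation under the assumption that "100K" means 100 kilobytes of 1024 bytes each. The function names are illustrative.

```shell
# Size in KB of a partition spanning a number of cylinders, given the
# "Units = cylinders of <sectors> * <bytes>" values printed by fdisk.
cylinders_to_kb() {
    # $1 = cylinders, $2 = sectors per cylinder, $3 = bytes per sector
    echo $(( $1 * $2 * $3 / 1024 ))
}

meets_lock_lun_minimum() {
    # $1 = size in KB; succeeds when the size is at least 100 KB
    [ "$1" -ge 100 ]
}

size_kb=$(cylinders_to_kb 1 2048 512)
if meets_lock_lun_minimum "$size_kb"; then
    echo "1 cylinder = ${size_kb} KB: large enough for a lock LUN"
fi
```

With the geometry in the example (2048 * 512 bytes per cylinder), one cylinder is 1024 KB. The Blocks column in the example reports 1008, which is also above the minimum.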

NOTE: Follow these rules:

• Do not try to use LVM to configure the lock LUN.


• The partition type must be 83.
• Do not create any filesystem on the partition used for the lock LUN.
• Do not use md to configure multiple paths to the lock LUN.

• Hewlett Packard Enterprise recommends that you configure the same multipath device name for
all the LUNs on the various nodes of a cluster, to make administration of the cluster easier.

Rules and Restrictions for Lock LUN


On Red Hat Enterprise Linux 7, Serviceguard supports only alias names that end with an alphabetic
character for creating a lock LUN.
On Red Hat Enterprise Linux 7, Serviceguard supports only mapper devices with user-friendly names for
creating a lock LUN. For information about how to set up user-friendly mapper device names, see Red Hat
Enterprise Linux 7 DM Multipath Configuration and Administration, available at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/pdf/DM_Multipath/Red_Hat_Enterprise_Linux-7-DM_Multipath-en-US.pdf.
To transfer the disk partition format to other nodes in the cluster use the command:

sfdisk -R <device>

where <device> corresponds to the same physical device as on the first node. For example,
if /dev/sdc is the device name on the other nodes use the command:

sfdisk -R /dev/sdc

You can check the partition table by using the command:

fdisk -l /dev/sdc

NOTE: fdisk may not be available for SUSE on all platforms. In this case, using YAST2 to set up the
partitions is acceptable.

Creating a Lock LUN on a Whole LUN


The lock LUN can be created on a whole LUN of at least 100K, provided that the patches listed below
are installed.
On Red Hat Enterprise Linux servers, you must install the following patches on Serviceguard Linux
Version A.11.20.00:

• SGLX_00339 for Red Hat Enterprise Linux 5 (x86_64 architecture)


• SGLX_00340 for Red Hat Enterprise Linux 6 (x86_64 architecture)

Patches can be downloaded from Hewlett Packard Enterprise Support Center at
http://www.hpe.com/info/hpesc.

NOTE: On SUSE Linux Enterprise Server, the patches are not required as this feature is supported on
Serviceguard Linux Version A.11.20.10 main release.

Support for Lock LUN Devices


The following table describes the support for lock LUN devices on udev and device mapper:

Device selected as lock LUN                  Support
udev device                                  Supported, but the same udev rules must be used across all
                                             nodes in the cluster for the whole LUN or the partitioned LUN.
/dev/disk/by-id, /dev/disk/by-path,          Not supported on a whole LUN or a partitioned LUN.
or /dev/disk/by-uuid device
/dev/dm-xx                                   Not supported on a whole LUN or a partitioned LUN.
/dev/mpath/mpathX                            Supported on a whole LUN and a partitioned LUN.
/dev/mapper/mpathX (user-friendly names)     Supported on a whole LUN and a partitioned LUN.
/dev/mapper/xxx (aliases)                    Supported on a whole LUN and a partitioned LUN.
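
The support rules for lock LUN device names can be expressed as a simple pattern check. The following is a sketch only: the function name lock_lun_support is illustrative, and treating every other path as a udev device is an assumption, not something Serviceguard itself reports.

```shell
# Classify a candidate lock LUN device path according to the support
# rules in this section.
lock_lun_support() {
    case "$1" in
        /dev/disk/by-id/*|/dev/disk/by-path/*|/dev/disk/by-uuid/*)
            echo "not supported" ;;
        /dev/dm-*)
            echo "not supported" ;;
        /dev/mpath/*|/dev/mapper/*)
            echo "supported" ;;
        *)
            # Assumption: anything else is a udev-named device.
            echo "supported if the same udev rules are used on all nodes" ;;
    esac
}

lock_lun_support /dev/mapper/mpatha
lock_lun_support /dev/dm-3
```

A check like this can be run before cluster configuration to catch an unsupported device name early.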

Setting Up and Running the Quorum Server


If you will be using a quorum server rather than a lock LUN, the Quorum Server software must be
installed on a system other than the nodes on which your cluster will be running, and must be running
during cluster configuration.
For detailed discussion, recommendations, and instructions for installing, updating, configuring, and
running the Quorum Server, see the HPE Serviceguard Quorum Server Version A.12.00.30 Release
Notes at http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server
Software). See also the discussion of the QS_HOST and QS_ADDR parameters under Cluster
Configuration Parameters on page 111.

Creating the Logical Volume Infrastructure


Serviceguard makes use of shared disk storage. This is set up to provide high availability by using
redundant data storage and redundant paths to the shared devices. Storage for a Serviceguard package
is logically composed of LVM Volume Groups that are activated on a node as part of starting a package
on that node. Storage is generally configured on logical units (LUNs).
Disk storage for Serviceguard packages is built on shared disks that are cabled to multiple cluster nodes.
These are separate from the private Linux root disks, which include the boot partition and root file
systems. To provide space for application data on shared disks, create disk partitions using the fdisk
command, and build logical volumes with LVM.
You can build a cluster (next section) before or after defining volume groups for shared data storage. If
you create the cluster first, information about storage can be added to the cluster and package
configuration files after the volume groups are created.

See Volume Managers for Data Storage on page 68 for an overview of volume management in
Serviceguard for Linux. The sections that follow explain how to do the following tasks:

• Displaying Disk Information on page 183


• Creating Partitions on page 183
• Enabling Volume Group Activation Protection on page 185
• Building Volume Groups: Example for Smart Array Cluster Storage (MSA 2000 Series) on page
186
• Building Volume Groups and Logical Volumes on page 187
• Testing the Shared Configuration on page 188
• Storing Volume Group Configuration Data on page 189
• Setting up Disk Monitoring on page 190

CAUTION: The minor numbers used by the LVM volume groups must be the same on all cluster
nodes. This means that if there are any non-shared volume groups in the cluster, create the same
number of them on all nodes, and create them before you define the shared storage. If possible,
avoid using private volume groups, especially LVM boot volumes. Minor numbers increment with
each logical volume, and mismatched numbers of logical volumes between nodes can cause a
failure of LVM (and boot, if you are using an LVM boot volume).

NOTE: Except as noted in the sections that follow, you perform the LVM configuration of shared storage
on only one node. The disk partitions will be visible on other nodes as soon as you reboot those nodes.
After you’ve distributed the LVM configuration to all the cluster nodes, you will be able to use LVM
commands to switch volume groups between nodes. (To avoid data corruption, a given volume group
must be active on only one node at a time).

Before you configure volume groups on Red Hat Enterprise Linux 7 or later, or on SUSE Linux Enterprise
Server 12 or later, you must disable the lvm2 metadata daemon so that up-to-date volume group data is
seen on all nodes in the cluster. For more information about how to disable the LVM metadata daemon on
RHEL 7 and SLES 12, see the following:

• Red Hat Enterprise Linux 7 Logical Volume Manager Administration, available at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/index.html
• Storage Administration Guide, SUSE Linux Enterprise Server 12, available at
https://www.suse.com/documentation/sles-12/pdfdoc/stor_admin/stor_admin.pdf

NOTE: Enabling the LVM metadata daemon on any Red Hat Enterprise Linux or SUSE Linux Enterprise
Server versions may result in volume group activation problems. Hence this daemon must be disabled by
default on all versions.

Limitation

• Serviceguard supports multipathing only with Device Mapper (DM) multipath. The cmpreparestg
command might fail to create the LVM volume group if a device has multiple paths, no mapper device is
configured, and only one of the paths is provided.
For example, suppose a device has two paths, /dev/sdx and /dev/sdy. If no Device Mapper device has
been configured and one of the paths is used with cmpreparestg, the command might fail to create the
LVM volume group. If instead the mapper device /dev/mapper/mpathx is configured for both paths,
/dev/sdx and /dev/sdy, and the mapper device is used with cmpreparestg, the command creates the
LVM volume group.

• On Red Hat Enterprise Linux 7, Serviceguard supports only mapper devices with user-friendly names
for creating an LVM volume group. For information about how to set up user-friendly mapper device
names, see Red Hat Enterprise Linux 7 DM Multipath Configuration and Administration, available at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/pdf/DM_Multipath/Red_Hat_Enterprise_Linux-7-DM_Multipath-en-US.pdf.

Displaying Disk Information


To display a list of configured disks, use the following command:

fdisk -l

You will see output such as the following:


Disk /dev/sda: 64 heads, 32 sectors, 8678 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1   *         1      1001   1025008   83  Linux
/dev/sda2          1002      8678   7861248    5  Extended
/dev/sda5          1002      4002   3073008   83  Linux
/dev/sda6          4003      5003   1025008   82  Linux swap
/dev/sda7          5004      8678   3763184   83  Linux

Disk /dev/sdb: 64 heads, 32 sectors, 8678 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System

Disk /dev/sdc: 255 heads, 63 sectors, 1106 cylinders
Units = cylinders of 16065 * 512 bytes

Disk /dev/sdd: 255 heads, 63 sectors, 1106 cylinders
Units = cylinders of 16065 * 512 bytes

In this example, the disk described by device file /dev/sda has already been partitioned for Linux, into
partitions named /dev/sda1 - /dev/sda7. The second internal device /dev/sdb and the two external
devices /dev/sdc and /dev/sdd have not been partitioned.
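
On a system with many disks, the unpartitioned devices can be picked out of the fdisk -l listing automatically. The sketch below assumes the output layout shown above: a "Disk /dev/..." line for each disk, followed by one line per partition beginning with the partition's device file. The function name unpartitioned_disks is illustrative.

```shell
# List disks that appear in "fdisk -l" style output without any
# partition lines. Reads the listing on standard input.
unpartitioned_disks() {
    awk '
        $1 == "Disk" && $2 ~ /^\/dev\// {
            disk = $2
            sub(/:$/, "", disk)          # "/dev/sda:" becomes "/dev/sda"
            order[++n] = disk
            parts[disk] = 0
        }
        $1 ~ /^\/dev\// && disk != "" { parts[disk]++ }
        END {
            for (i = 1; i <= n; i++)
                if (parts[order[i]] == 0) print order[i]
        }
    '
}

# Typical use (requires root privileges to read the disks):
#   fdisk -l | unpartitioned_disks
```

For the listing above, the function would print /dev/sdb, /dev/sdc, and /dev/sdd.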

NOTE: fdisk may not be available for SUSE on all platforms. In this case, using YAST2 to set up the
partitions is acceptable.

Creating Partitions
You must define a partition on each disk device (individual disk or LUN in an array) that you want to use
for your shared storage. Use the fdisk command for this.
The following steps create the new partition:

1. Run fdisk, specifying your device file name in place of <DeviceName>:

# fdisk <DeviceName>

Respond to the prompts as shown in the following table, to define a partition:

   Prompt                                        Response   Action Performed
1. Command (m for help):                         n          Create a new partition
2. Command action                                p          Create a primary partition
      e extended
      p primary partition (1-4)
3. Partition number (1-4):                       1          Create partition 1
4. First cylinder (1-nn, default 1):             Enter      Accept the default starting cylinder
5. Last cylinder or +size or +sizeM or +sizeK    Enter      Accept the default, which is the last
   (1-nn, default nn):                                      cylinder number
6. Command (m for help):                         p          Display partition data
7. Command (m for help):                         w          Write data to the partition table

The following example of the fdisk dialog shows that the disk on the device file /dev/sdc is
configured as one partition, and appears as follows:

fdisk /dev/sdc
Command (m for help): n
Command action
e extended
p primary partition (1-4) p
Partition number (1-4): 1
First cylinder (1-4067, default 1): Enter
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-4067, default 4067): Enter
Using default value 4067

Command (m for help): p


Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes

Device Boot Start End Blocks Id System


/dev/sdc 1 4067 4164592 83 Linux

Command (m for help): w


The partition table has been altered!

2. Respond to the prompts as shown in the following table to set a partition type:

   Prompt                         Response   Action Performed
1. Command (m for help):          t          Set the partition type
2. Partition number (1-4):        1          Partition affected
3. Hex code (L to list codes):    8e         Set partition type to Linux LVM
4. Command (m for help):          p          Display partition data
5. Command (m for help):          w          Write data to the partition table

The following example of the fdisk dialog shows the partition on the device file /dev/sdc being set
to type Linux LVM (8e):

fdisk /dev/sdc
Command (m for help): t
Partition number (1-4): 1
HEX code (type L to list codes): 8e

Command (m for help): p


Disk /dev/sdc: 64 heads, 32 sectors, 4067 cylinders
Units = cylinders of 2048 * 512 bytes

Device Boot Start End Blocks Id System


/dev/sdc 1 4067 4164592 8e Linux LVM

Command (m for help): w


The partition table has been altered!

3. Repeat this process for each device file that you will use for shared storage.

fdisk /dev/sdd

fdisk /dev/sdf

fdisk /dev/sdg
4. If you will be creating volume groups for internal storage, make sure to create those partitions as well,
and create those volume groups before you define the shared storage.

fdisk /dev/sdb

NOTE: fdisk may not be available for SUSE on all platforms. In this case, using YAST2 to set up the
partitions is acceptable.

Enabling Volume Group Activation Protection


As of Serviceguard for Linux A.11.16.07, you can enable activation protection for logical volume groups,
preventing the volume group from being activated by more than one node at the same time. Activation
protection, if used, must be enabled on each cluster node.
Follow these steps to enable activation protection for volume groups on Red Hat and SUSE systems:

IMPORTANT: Perform this procedure on each node.

Procedure

1. Edit /etc/lvm/lvm.conf and add the following line:


tags { hosttags = 1 }

2. Uncomment the line in /etc/lvm/lvm.conf that begins # volume_list =, and edit it to include
all of the node's "private" volume groups (those not shared with the other cluster nodes), including the
root volume group.

For example if the root volume group is vg00 and the node also uses vg01 and vg02 as private
volume groups, the line should look like this:

volume_list = [ "vg00", "vg01", "vg02" ]

3. Create the file /etc/lvm/lvm_$(uname -n).conf

4. Add the following line to the file you created in step 3:


activation { volume_list=["@node"] }

where node is the value of uname -n.

5. Run vgscan:
vgscan

NOTE: At this point, the setup for volume-group activation protection is complete. Serviceguard adds a
tag matching the uname -n value of the owning node to each volume group defined for a package
when the package runs and deletes the tag when the package halts. The command vgs -o +tags
vgname will display any tags that are set for a volume group.
The sections that follow take you through the process of configuring volume groups and logical
volumes, and distributing the shared configuration. When you have finished that process, use the
procedure under Testing the Shared Configuration on page 188 to verify that the setup has been
done correctly.
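
Because the procedure must be repeated on every node, it can help to generate the lines for steps 2 through 4 rather than type them. The following is a minimal sketch that only prints the text to add. The function names are assumptions for illustration, and editing /etc/lvm/lvm.conf itself is left to you.

```shell
# Build the volume_list line for /etc/lvm/lvm.conf from the node's
# private volume groups (step 2).
private_volume_list() {
    list=""
    for vg in "$@"; do
        list="${list:+$list, }\"$vg\""
    done
    echo "volume_list = [ $list ]"
}

# Name of the per-host file (step 3) and the line that goes in it (step 4).
host_lvm_conf() {
    echo "/etc/lvm/lvm_$(uname -n).conf"
}

host_activation_line() {
    echo "activation { volume_list=[\"@$(uname -n)\"] }"
}

private_volume_list vg00 vg01 vg02
host_lvm_conf
host_activation_line
```

Run on each node, this prints the host-specific file name and activation line for that node, since both are derived from uname -n.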

Building Volume Groups: Example for Smart Array Cluster Storage (MSA 2000 Series)

NOTE: For information about setting up and configuring the MSA 2000 for use with Serviceguard, see
HPE Serviceguard for Linux Version A.11.19 or later Deployment Guide at
http://www.hpe.com/info/linux-serviceguard-docs.

Use Logical Volume Manager (LVM) on your system to create volume groups that can be activated by
Serviceguard packages. This section provides an example of creating volume groups on LUNs created
on MSA 2000 Series storage. For more information on LVM, see the Logical Volume Manager How To,
which you can find at http://tldp.org/HOWTO/HOWTO-INDEX/howtos.html.
Before you start, partition your LUNs and label them with a partition type of 8e (Linux LVM). Use the
t command within fdisk to change the type from the default of 83 (Linux).
Do the following on one node:

NOTE: You can create a single logical volume or multiple logical volumes using cmpreparestg (1m). If
you use cmpreparestg, you can skip the following steps.

1. Update the LVM configuration and create the /etc/lvmtab file. You can omit this step if you have
previously created volume groups on this node.

vgscan

NOTE: The files /etc/lvmtab and /etc/lvmtab.d may not exist on some distributions. In that
case, ignore references to these files.

2. Create LVM physical volumes on each LUN. For example:

pvcreate -f /dev/sda1

pvcreate -f /dev/sdb1

pvcreate -f /dev/sdc1
3. Check whether there are already volume groups defined on this node. Be sure to give each volume
group a unique name.

vgdisplay
4. Create separate volume groups for each Serviceguard package you will define. In the following
example, we add the LUNs /dev/sda1 and /dev/sdb1 to volume group vgpkgA, and /dev/sdc1
to vgpkgB:

vgcreate --addtag $(uname -n) /dev/vgpkgA /dev/sda1 /dev/sdb1

vgcreate --addtag $(uname -n) /dev/vgpkgB /dev/sdc1

NOTE: Use the --addtag option only if you are implementing volume-group activation protection.
Remember that volume-group activation protection, if used, must be implemented on each node.

Building Volume Groups and Logical Volumes

Procedure

1. Use Logical Volume Manager (LVM) to create volume groups that can be activated by Serviceguard
packages.
For an example showing volume-group creation on LUNs, see Building Volume Groups: Example
for Smart Array Cluster Storage (MSA 2000 Series) on page 186. (For Fibre Channel storage you
would use device-file names such as those used in the section Creating Partitions on page 183).
2. On Linux distributions that support it, enable activation protection for volume groups. See Enabling
Volume Group Activation Protection on page 185.
3. To store data on these volume groups you must create logical volumes. The following creates a 500
Megabyte logical volume named /dev/vgpkgA/lvol1 and a one Gigabyte logical volume
named /dev/vgpkgA/lvol2 in volume group vgpkgA. The -n option sets each name explicitly, because
the names LVM assigns by default may differ:

lvcreate -L 500M -n lvol1 vgpkgA

lvcreate -L 1G -n lvol2 vgpkgA
4. Create a file system on one of these logical volumes, and mount it in a newly created directory:

NOTE: You can create file systems using the cmpreparestg (1m) command. If you use
cmpreparestg, you can skip this step.

mke2fs -j /dev/vgpkgA/lvol1

mkdir /extra

mount -t ext3 /dev/vgpkgA/lvol1 /extra

NOTE: For information about supported filesystem types, see the fs_type discussion on.

5. To test that the file system /extra was created correctly and with high availability, you can create a
file on it, and read it.

echo "Test of LVM" >> /extra/LVM-test.conf

cat /extra/LVM-test.conf

NOTE: Be careful if you use YAST or YAST2 to configure volume groups, as that may cause all volume
groups on that system to be activated. After running YAST or YAST2, check to make sure that volume
groups for Serviceguard packages not currently running have not been activated, and use LVM
commands to deactivate any that have. For example, use the command vgchange -a n /dev/sgvg00
to deactivate the volume group sgvg00.

Testing the Shared Configuration


When you have finished the shared volume group configuration, you can test that the storage is correctly
sharable as follows:

Procedure

1. On ftsys9, activate the volume group, mount the file system that was built on it, write a file in the
shared file system and look at the result:
vgchange --addtag $(uname -n) vgpkgB

NOTE: If you are using the volume-group activation protection feature of Serviceguard for Linux, you
must use vgchange --addtag to add a tag when you manually activate a volume group. Similarly,
you must remove the tag when you deactivate a volume group that will be used in a package (as
shown at the end of each step).
Use vgchange --addtag and vgchange --deltag only if you are implementing volume-group
activation protection. Remember that volume-group activation protection, if used, must be
implemented on each node.
Serviceguard adds a tag matching the uname -n value of the owning node to each volume group
defined for a package when the package runs; the tag is deleted when the package is halted. The
command vgs -o +tags vgname will display any tags that are set for a volume group.

vgchange -a y vgpkgB
mount /dev/vgpkgB/lvol1 /extra
echo 'Written by' `hostname` 'on' `date` > /extra/datestamp
cat /extra/datestamp
You should see something like the following, showing the date stamp you just wrote:
Written by ftsys9.mydomain on Mon Jan 22 14:23:44 PST 2006
Now unmount the file system and deactivate the volume group:
umount /extra
vgchange -a n vgpkgB
vgchange --deltag $(uname -n) vgpkgB

2. On ftsys10, activate the volume group, mount the file system, write a date stamp on to the shared
file, and then look at the content of the file:
vgchange --addtag $(uname -n) vgpkgB
vgchange -a y vgpkgB
mount /dev/vgpkgB/lvol1 /extra
echo 'Written by' `hostname` 'on' `date` >> /extra/datestamp
cat /extra/datestamp

You should see something like the following, including the date stamp written by the other node:
Written by ftsys9.mydomain on Mon Jan 22 14:23:44 PST 2006
Written by ftsys10.mydomain on Mon Jan 22 14:25:27 PST 2006

Now unmount the file system, deactivate the volume group, and remove the tag you added at the start
of this step:
umount /extra
vgchange -a n vgpkgB
vgchange --deltag $(uname -n) vgpkgB

NOTE: The volume-group activation protection feature of Serviceguard for Linux requires that you add
the tag as shown at the beginning of the above steps when you manually activate a volume group.
Similarly, you must remove the tag when you deactivate a volume group that will be used in a package
(as shown at the end of each step). As of Serviceguard for Linux A.11.16.07, a tag matching the
uname -n value of the owning node is automatically added to each volume group defined for a
package when the package runs; the tag is deleted when the package is halted. The command
vgs -o +tags vgname will display any tags that are set for a volume group.
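
The activate, mount, unmount, and deactivate cycle is the same on each node, so it can be wrapped in a function. The sketch below prints the commands by default instead of running them. The names run, test_shared_vg, and DRY_RUN are assumptions for illustration, and the step that writes and reads the date stamp is left out for brevity.

```shell
# Print (DRY_RUN=1, the default here) or execute the manual test cycle
# for one volume group. The commands are those used in the steps above.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

test_shared_vg() {
    # $1 = volume group, $2 = logical volume, $3 = mount point
    node=$(uname -n)
    run vgchange --addtag "$node" "$1" &&
    run vgchange -a y "$1" &&
    run mount "/dev/$1/$2" "$3" &&
    run umount "$3" &&
    run vgchange -a n "$1" &&
    run vgchange --deltag "$node" "$1"
}

test_shared_vg vgpkgB lvol1 /extra
```

With DRY_RUN=0 the commands are executed and the chain stops at the first failure. In that case, check whether the tag still needs to be removed by hand.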

Storing Volume Group Configuration Data


When you create volume groups, LVM creates a backup copy of the volume group configuration on the
configuration node. In addition, you should create a backup of configuration data on all other nodes where
the volume group might be activated by using the vgcfgbackup command:

vgcfgbackup vgpkgA vgpkgB

If a disk in a volume group must be replaced, you can restore the old disk’s metadata on the new disk by
using the vgcfgrestore command. See “Replacing Disks” in the “Troubleshooting” chapter.

Preventing Boot-Time vgscan and Ensuring Serviceguard Volume Groups Are Deactivated
By default, Linux will perform LVM startup actions whenever the system is rebooted. These include a
vgscan (on some Linux distributions) and volume group activation. This can cause problems for volumes
used in a Serviceguard environment (for example, a volume group for a Serviceguard package that is not
currently running may be activated). To prevent such problems, proceed as follows on the various Linux
versions.

NOTE:
You do not need to perform these actions if you have implemented volume-group activation protection as
described under Enabling Volume Group Activation Protection on page 185.

SUSE Linux Enterprise Server


Prevent a vgscan at boot time by removing the /etc/rc.d/boot.d/S07boot.lvm file from all cluster
nodes.

NOTE:
Be careful if you use YAST or YAST2 to configure volume groups, as that may cause all volume groups to
be activated. After running YAST or YAST2, check that volume groups for Serviceguard packages not
currently running have not been activated, and use LVM commands to deactivate any that have. For
example, use the command vgchange -a n /dev/sgvg00 to deactivate the volume group sgvg00.

Red Hat



It is not necessary to prevent vgscan on Red Hat.
To deactivate any volume groups that will be under Serviceguard control, add vgchange commands to
the end of /etc/rc.d/rc.sysinit; for example, if volume groups sgvg00 and sgvg01 are under
Serviceguard control, add the following lines to the end of the file:
vgchange -a n /dev/sgvg00
vgchange -a n /dev/sgvg01
The vgchange commands activate the volume groups temporarily, then deactivate them; this is expected
behavior.
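If many volume groups are under Serviceguard control, the lines to append can be generated rather than typed by hand. The following is a minimal sketch only: the function name sg_deactivate_lines is hypothetical and not part of Serviceguard, and it merely prints the lines so that you can review them before adding them to /etc/rc.d/rc.sysinit.

```shell
#!/bin/sh
# Hypothetical helper (not part of Serviceguard): print one
# "vgchange -a n" line for each volume group name given as an argument.
sg_deactivate_lines() {
    for vg in "$@"; do
        printf 'vgchange -a n /dev/%s\n' "$vg"
    done
}

# Print the lines for the two volume groups used in the example above.
sg_deactivate_lines sgvg00 sgvg01
```

Printing rather than editing the file directly keeps the change to the boot script a deliberate, reviewable step.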

Setting up Disk Monitoring


Serviceguard for Linux includes a Disk Monitor which you can use to detect problems in disk connectivity.
This lets you fail a package over from one node to another in the event of a disk link failure.
See Creating a Disk Monitor Configuration on page 254 for instructions on configuring disk
monitoring.

Creating a Storage Infrastructure with VxVM


In addition to configuring the cluster, you create the appropriate logical volume infrastructure to provide
access to data from different nodes. This is done with Logical Volume Manager (LVM) or Veritas Volume
Manager (VxVM). You can also use a mixture of volume types, depending on your needs. LVM and VxVM
configuration are done before cluster configuration.
For more information about how to migrate from LVM data storage to VxVM data storage, see the
following documents at https://sort.veritas.com/documents/doc_details/sfha/6.0.1/Linux/ProductGuides/:

• Veritas Storage Foundation and High Availability Solutions, Solutions Guide


• Veritas Storage Foundation and High Availability Installation Guide
• Veritas Storage Foundation Cluster File System High Availability Administrator's Guide

Converting Disks from LVM to VxVM


You can use the vxvmconvert(1m) utility to convert LVM volume groups into VxVM disk groups. Before
you do this, you must not deactivate the volume group or any logical volumes, and no file system on the
LVM volume group may be mounted. Before you start, be sure to back up each volume group’s
configuration with the vgcfgbackup command and to back up the data in the volume group. For more
information about how to migrate from LVM data storage to VxVM data storage, see the following
documents at https://sort.veritas.com/documents/doc_details/sfha/6.0.1/Linux/ProductGuides/:

• Veritas Storage Foundation and High Availability Solutions, Solutions Guide


• Veritas Storage Foundation and High Availability Installation Guide
• Veritas Storage Foundation Cluster File System High Availability Administrator's Guide

Initializing Disks for VxVM


You need to initialize the physical disks that are employed in VxVM disk groups. To initialize a disk, log on
to one node in the cluster, then use the vxdiskadm program to initialize multiple disks, or use the
vxdisksetup command to initialize one disk at a time, as in the following example:
/usr/lib/vxvm/bin/vxdisksetup -i sda



Initializing Disks Previously Used by LVM
If a physical disk has been previously used with LVM, you must use the pvremove command to delete
the LVM header data from all the disks in the volume group.

NOTE:
These commands make the disk and its data unusable by LVM and allow it to be initialized by VxVM.
(The commands should only be used if you have previously used the disk with LVM and do not want to
save the data on it.)

You can remove LVM header data from the disk as in the following example (note that all data on the disk
are erased):
pvremove /dev/sda
Then, use the vxdiskadm program to initialize multiple disks for VxVM or use the vxdisksetup
command to initialize one disk at a time, as in the following example:
/usr/lib/vxvm/bin/vxdisksetup -i sda

Creating Disk Groups

NOTE: You can use cmpreparestg (1m) to create a VxVM disk group. If you use cmpreparestg, you
do not need to perform the procedures that follow, but it is a good idea to read them so that you
understand what cmpreparestg does for you.

Use vxdiskadm or the vxdg command to create disk groups, as in the following example:
vxdg init logdata sda
Verify the configuration with the following command:
vxdg list
NAME STATE ID
logdata enabled 972078742.1084.node1

Propagation of Disk Groups in VxVM


A VxVM disk group can be created on any node, whether the cluster is up or not. You must validate the
disk group by trying to import it on each node.

Creating Logical Volumes

NOTE: You can create a single logical volume or multiple logical volumes using cmpreparestg (1m). If
you use cmpreparestg, you can skip this step, but it is a good idea to read it so that you understand
what cmpreparestg does for you.

Use the vxassist command to create logical volumes. The following is an example:
vxassist -g logdata make log_files 1024m
This command creates a 1024 MB volume named log_files in a disk group named logdata. The volume
can be referenced with the block device file /dev/vx/dsk/logdata/log_files or the raw (character)
device file /dev/vx/rdsk/logdata/log_files. Verify the configuration with the following command:
vxprint -g logdata



The output of this command is shown in the following example:
TY  NAME        ASSOC   KSTATE   LENGTH   PLOFFS  STATE   TUTIL0  PUTIL0
v   logdata     fsgen   ENABLED  1024000          ACTIVE
pl  logdata-01  system  ENABLED  1024000          ACTIVE

NOTE: The specific commands for creating mirrored and multi-path storage using VxVM are described in
the Veritas Volume Manager Reference Guide.

Creating File Systems


If your installation uses file systems, create them next.

NOTE:
You can create file systems by means of the cmpreparestg (1m) command. If you use
cmpreparestg, you can skip the following steps, but it is a good idea to read them so that you
understand what cmpreparestg does for you.

Use the following commands to create a file system for mounting on the logical volume just created:

1. Create the file system on the newly created volume:


mkfs -t vxfs /dev/vx/rdsk/logdata/log_files

2. Create a directory to mount the volume:


mkdir /logs

3. Mount the volume:


mount /dev/vx/dsk/logdata/log_files /logs

4. Check to ensure the file system is present, then unmount the file system:
umount /logs
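The four steps above can be collected into a small script. This is a sketch under stated assumptions, not a supported tool: the names run and create_vxfs are hypothetical, the use of df for the presence check in step 4 is an assumption (the manual does not name a command for it), and mkdir -p is used so that an existing mount point is not an error. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: by default each command is printed, not executed.
# Set DRY_RUN=0 to run the commands for real (requires root, VxVM and VxFS).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: create_vxfs DISK_GROUP VOLUME MOUNT_POINT
create_vxfs() {
    dg=$1
    vol=$2
    mnt=$3
    run mkfs -t vxfs "/dev/vx/rdsk/$dg/$vol"   # step 1: create the file system
    run mkdir -p "$mnt"                        # step 2: create the mount point
    run mount "/dev/vx/dsk/$dg/$vol" "$mnt"    # step 3: mount the volume
    run df "$mnt"                              # step 4: check (df is an assumption)
    run umount "$mnt"                          # step 4: unmount the file system
}

create_vxfs logdata log_files /logs
```

The dry-run default lets you compare the generated commands with the manual steps before anything touches the disk group.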

Monitoring VxVM Disks


The Serviceguard VxVM Volume Monitor provides a means for effective and persistent monitoring of
VxVM volumes. The Volume Monitor supports Veritas Volume Manager version 6.0 and later. You can
configure the Volume Monitor (cmresserviced) to run as a service in a package that requires the
monitored volume or volumes. When a monitored volume fails or becomes inaccessible, the service exits,
causing the package to fail on the current node. (The package’s failover behavior depends on its
configured settings, as with any other failover package.)
For example, the following service_cmd monitors one volume at the default log level 0, with a default
polling interval of 60 seconds, and prints all log messages to the console:
service_name Volume_mon
service_cmd $SGSBIN/cmresserviced /dev/vx/dsk/dg_dd2/lvol2
service_restart none
service_fail_fast_enabled yes
service_halt_timeout 300
For more information, see the cmresserviced (1m) manpage. For more information about configuring
package services, see the service_name parameter descriptions.



Deporting Disk Groups
After creating the disk groups that are to be used by Serviceguard packages, use the following command
with each disk group to allow the disk group to be deported by the package control script on other cluster
nodes:
vxdg deport <DiskGroupName>
where <DiskGroupName> is the name of a disk group that will be activated by the control script.
When all disk groups have been deported, you must issue the following command on all cluster nodes to
allow them to access the disk groups:
vxdctl enable

Re-Importing Disk Groups


After disk groups are deported, they are not available for use on the node until they are imported again,
either by a package control script or with a vxdg import command. If you need to import a disk group
manually for maintenance or other purposes, import it, start up all its logical volumes, and mount file
systems as in the following example:
vxdg import dg_01
vxvol -g dg_01 startall
mount /dev/vx/dsk/dg_01/myvol /mountpoint

Package Startup Time with VxVM


With VxVM, each disk group is imported by the package control script that uses the disk group. This
means that cluster startup time is not affected, but individual package startup time might be increased
because VxVM imports the disk group at the time of package start up.

Clearimport at System Reboot Time


At system reboot time, the cmcluster RC script does a vxdisk clearimport on all disks formerly
imported by the system, provided they have the noautoimport flag set, and provided they are not currently
imported by another running node. The clearimport clears the host ID on the disk group, to allow any
node that is connected to the disk group to import it when the package moves from one node to another.
Using the clearimport at reboot time allows Serviceguard to clean up following a node failure, for
example, a system crash during a power failure. Disks that were imported at the time of the failure still
have the node’s ID written on them, and this ID must be cleared before the rebooting node or any other
node can import them with a package control script.
Note that the clearimport is done for disks previously imported with noautoimport set on any system
that has Serviceguard installed, whether it is configured in a cluster or not.

Configuring the Cluster


This section describes how to define the basic cluster configuration. This must be done on a system that
is not part of a Serviceguard cluster (that is, on which Serviceguard is installed but not configured). You
can do this in Serviceguard Manager from any node, or from the command line as described below.
Use the cmquerycl command to specify a set of nodes to be included in the cluster and to generate a
template for the cluster configuration file.

IMPORTANT: See NODE_NAME under Cluster Configuration Parameters on page 111 for
important information about restrictions on the node name.

Here is an example of the command (enter it all on one line):

cmquerycl -v -C $SGCONF/clust1.conf -n ftsys9 -n ftsys10
This creates a template file, by default /usr/local/cmcluster/clust1.conf (for Red Hat
Enterprise Linux) or /opt/cmcluster/clust1.conf (for SUSE Linux Enterprise Server). In this
output file, keywords are separated from definitions by white space. Comments are permitted, and must
be preceded by a pound sign (#) in the far left column.
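Because the file format is this simple, it is easy to list the parameters that a template defines. The following sketch is illustrative only: the function name list_cluster_params is hypothetical, and the sample file content is made up for the demonstration rather than copied from real cmquerycl output.

```shell
#!/bin/sh
# Sketch: print "KEYWORD=definition" for each entry in a cluster
# configuration file, skipping blank lines and comment lines that have
# a pound sign (#) in the far left column.
list_cluster_params() {
    awk '
        /^#/    { next }              # comment line
        NF == 0 { next }              # blank line
        {
            key = $1
            $1 = ""                   # drop the keyword, keep the definition
            sub(/^[ \t]+/, "")        # trim the white space that separated them
            print key "=" $0
        }
    ' "$1"
}

# Demonstration with illustrative content (not real cmquerycl output).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# Illustrative sample only.
NODE_NAME               ftsys9
NODE_NAME               ftsys10

HOSTNAME_ADDRESS_FAMILY IPV4
EOF
list_cluster_params "$tmp"
rm -f "$tmp"
```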

NOTE: Hewlett Packard Enterprise strongly recommends that you modify the file so as to send heartbeat
over all possible networks.

The manpage for the cmquerycl command further explains the parameters that appear in this file. Many
are also described in Planning and Documenting an HA Cluster. Modify your $SGCONF/clust1.conf
file as needed.

cmquerycl Options

Speeding up the Process


In a larger or more complex cluster with many nodes, networks or disks, the cmquerycl command may
take several minutes to complete. To speed up the configuration process, you can direct the command to
return selected information only by using the -k and -w options:
-k eliminates some disk probing, and does not return information about potential cluster lock volume
groups and lock physical volumes.
-w local lets you specify local network probing, in which LAN connectivity is verified between interfaces
within each node only. This is the default when you use cmquerycl with the -C option.
(Do not use -w local if you need to discover nodes and subnets for a cross-subnet configuration; see
Full Network Probing).
-w none skips network querying. If you have recently checked the networks, this option will save time.

Specifying the Address Family for the Cluster Hostnames


You can use the -a option to tell Serviceguard to resolve cluster node names (as well as Quorum Server
hostnames, if any) to IPv4 addresses only (-a ipv4), IPv6 addresses only (-a ipv6), or both (-a any).
You can also configure the address family by means of the HOSTNAME_ADDRESS_FAMILY parameter in the
cluster configuration file.

IMPORTANT: See About Hostname Address Families: IPv4-Only, IPv6-Only, and Mixed Mode
for a full discussion, including important restrictions for IPv6-only and mixed modes.

If you use the -a option, Serviceguard will ignore the value of the HOSTNAME_ADDRESS_FAMILY
parameter in the existing cluster configuration, if any, and attempt to resolve the cluster and Quorum
Server hostnames as specified by the -a option:

• If you specify -a ipv4, each of the hostnames must resolve to at least one IPv4 address; otherwise
the command will fail.
• Similarly, if you specify -a ipv6, each of the hostnames must resolve to at least one IPv6 address;
otherwise the command will fail.
• If you specify -a any, Serviceguard will attempt to resolve each hostname to an IPv4 address, then, if
that fails, to an IPv6 address.

If you do not use the -a option:

• If a cluster is already configured, Serviceguard will use the value configured for
HOSTNAME_ADDRESS_FAMILY, which defaults to IPv4.
• If no cluster is configured, and Serviceguard finds at least one IPv4 address that corresponds to the local
node's hostname (that is, the node on which you are running cmquerycl), Serviceguard will attempt
to resolve all hostnames to IPv4 addresses. If no IPv4 address is found for a given hostname,
Serviceguard will look for an IPv6 address. (This is the same behavior as if you had specified -a
any).

Specifying the Address Family for the Heartbeat


To tell Serviceguard to use only IPv4, or only IPv6, addresses for the heartbeat, use the -h option. For
example, to use only IPv6 addresses:
cmquerycl -v -h ipv6 -C $SGCONF/clust1.conf -n ftsys9 -n ftsys10

• -h ipv4 tells Serviceguard to discover and configure only IPv4 subnets. If it does not find any eligible
subnets, the command will fail.
• -h ipv6 tells Serviceguard to discover and configure only IPv6 subnets. If it does not find any eligible
subnets, the command will fail.
• If you don't use the -h option, Serviceguard will choose the best available configuration to meet
minimum requirements, preferring an IPv4 LAN over IPv6 where both are available. The resulting
configuration could be IPv4 only, IPv6 only, or a mix of both. You can override Serviceguard's default
choices by means of the HEARTBEAT_IP parameter, discussed under Cluster Configuration
Parameters on page 111; that discussion also spells out the heartbeat requirements.
• The -h and -c options are mutually exclusive.

Specifying the Cluster Lock


You can use the cmquerycl command line to specify a cluster lock LUN (-L lock_lun_device) or quorum
server (-q quorum_server [qs_ip2]). For more details, see the cmquerycl (1m) manpage.
For more information, see the Specifying a Lock LUN and Specifying a Quorum Server sections.

Full Network Probing


-w full lets you specify full network probing, in which actual connectivity is verified among all LAN
interfaces on all nodes in the cluster, whether or not they are all on the same subnet.

NOTE: This option must be used to discover actual or potential nodes and subnets in a cross-subnet
configuration. See Obtaining Cross-Subnet Information. It will also validate IP Monitor polling targets;
see Monitoring LAN Interfaces and Detecting Failure: IP Level, and POLLING_TARGET under
Cluster Configuration Parameters on page 111.

Specifying a Lock LUN


A cluster lock LUN or quorum server is required for two-node clusters. If you will be using a lock LUN, be
sure to specify the -L lock_lun_device option with the cmquerycl command. If the name of the
device is the same on all nodes, enter the option before the node names, as in the following example (all
on one line):
cmquerycl -v -L /dev/sda1 -n lp01 -n lp02 -C $SGCONF/lpcluster.conf
If the name of the device is different on the different nodes, specify each device file following each node
name, as in the following example (all on one line):

cmquerycl -v -n node1 -L /dev/sda1 -n node2 -L /dev/sda2 -C $SGCONF/lpcluster.conf
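The placement rule for the -L option in the two cases above can be sketched as a small command builder. This is an illustration only: lock_lun_cmd and its NODE:DEVICE argument format are hypothetical, and the function prints the cmquerycl command line instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print a cmquerycl command line for a lock LUN.
# Usage: lock_lun_cmd CONF_FILE NODE:DEVICE [NODE:DEVICE ...]
lock_lun_cmd() {
    conf=$1
    shift
    first_dev=${1#*:}          # device part of the first NODE:DEVICE pair
    same=1
    for pair in "$@"; do
        if [ "${pair#*:}" != "$first_dev" ]; then
            same=0
        fi
    done

    cmd="cmquerycl -v"
    # Same device name on all nodes: one -L option before the node names.
    if [ "$same" = 1 ]; then
        cmd="$cmd -L $first_dev"
    fi
    for pair in "$@"; do
        cmd="$cmd -n ${pair%%:*}"
        # Different device names: one -L option after each node name.
        if [ "$same" = 0 ]; then
            cmd="$cmd -L ${pair#*:}"
        fi
    done
    echo "$cmd -C $conf"
}

# Single quotes keep $SGCONF literal so the printed line matches the manual.
lock_lun_cmd '$SGCONF/lpcluster.conf' lp01:/dev/sda1 lp02:/dev/sda1
lock_lun_cmd '$SGCONF/lpcluster.conf' node1:/dev/sda1 node2:/dev/sda2
```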

NOTE:
An iSCSI storage device does not support configuring a lock LUN.

Specifying a VCENTER_SERVER or ESX_HOST


You can configure either the VCENTER_SERVER or the ESX_HOST parameter in the cluster configuration file.
Both parameters are optional and are mutually exclusive. For more information about these parameters,
see Cluster Configuration Parameters on page 111 and the cmquerycl (1m) manpage.
To configure a package with VMFS storage, you must specify the VCENTER_SERVER or ESX_HOST
parameter in the cluster configuration file. For information about VMFS storage, see Storage
configuration type in a VMware environment.
You can add, modify, and delete the VCENTER_SERVER or ESX_HOST parameter while the cluster is
running.

NOTE: Before you remove the configured VCENTER_SERVER or ESX_HOST parameter, you must delete from
the cluster all packages that have a dynamically linked storage configuration.

Serviceguard connects to the vCenter Server or ESXi host configured in the cluster to attach and detach the
VMFS disks configured in a package, which ensures that the VMware virtual machines have exclusive access
to those disks. When attach and detach instructions are issued, Serviceguard must authenticate the session
using the vCenter Server or ESXi host login credentials. These credentials must be stored in the
Serviceguard Credential Store (SCS). To create and manage the SCS, use the cmvmusermgmt utility, which
captures and stores vCenter Server and ESXi host user credentials on the virtual machines that are
configured to be part of the cluster. The credentials must already exist on the vCenter Server or ESXi
hosts before they are used in a Serviceguard cluster.
If the cluster is already configured and the SCS is to be created or updated, you can run the
cmvmusermgmt utility on any cluster node. The changes to the SCS are automatically distributed to all the
configured cluster nodes. However, if any cluster node is down or unreachable when the SCS is updated,
you must synchronize the SCS on that node using the sync option once the node is reachable.
If the cluster has yet to be created and the SCS already exists on the future cluster nodes, you must run
the cluster creation operation (cmapplyconf) on the node where the cmvmusermgmt utility was used to
create the SCS. This automatically distributes the SCS to all the cluster nodes.
The SCS must be created, on the node where the cluster configuration is applied, before you apply a
cluster configuration that specifies VCENTER_SERVER or ESX_HOST.
Serviceguard validates the details provided in the SCS file, and the file is copied to all the nodes
configured in the cluster.



NOTE:

• The user account provided for the ESXi host or vCenter Server must have administrative privileges or
must be the root user. If the user is created on vCenter, the same user with the same privileges must
also be created on ESXi. If you do not want to use the administrator or root user account,
create a role with the privileges required for VMwareDisks resource functionality and assign this role to
the user. The role assigned to the user account must have the following privileges:

◦ Low level file operations on datastore


◦ Browse datastore on datastore
◦ Add existing disk on virtual machine
◦ Change resource on virtual machine
◦ Remove disk on virtual machine
◦ Add or remove device on virtual machine

If required, you can add additional privileges. For more information about how to create a role and add
a user to the created role, see the VMware product documentation.

• If the vCenter or ESXi host password expires, you must run the cmvmusermgmt command to update the
SCS file.
• Hewlett Packard Enterprise recommends that you run the cmcheckconf command periodically to ensure
that the SCS file contents are verified.
• You must use either the IP address or the hostname of a VMware vCenter Server or ESXi host with the
cmvmusermgmt command.

Specifying a Quorum Server


IMPORTANT: The following are standard instructions. For special instructions that may apply to
your version of Serviceguard and the Quorum Server, see “Configuring Serviceguard to Use the
Quorum Server” in the latest version of the HPE Serviceguard Quorum Server Version A.12.00.30
Release Notes, at http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard
Quorum Server Software).

A cluster lock LUN or quorum server is required for two-node clusters. To obtain a cluster configuration
file that includes Quorum Server parameters, use the -q option of the cmquerycl command, specifying
a Quorum Server hostname or IP address, for example (all on one line):
cmquerycl -q <QS_Host> -n ftsys9 -n ftsys10 -C <ClusterName>.conf
To specify an alternate hostname or IP address by which the Quorum Server can be reached, use a
command such as (all on one line):
cmquerycl -q <QS_Host> <QS_Addr> -n ftsys9 -n ftsys10 -C <ClusterName>.conf
Enter the QS_HOST (IPv4 or IPv6 on SLES 11; IPv4 only on Red Hat 5 and Red Hat 6), the optional
QS_ADDR (IPv4 or IPv6 on SLES 11; IPv4 only on Red Hat 5 and Red Hat 6), the
QS_POLLING_INTERVAL, and optionally a QS_TIMEOUT_EXTENSION; also check the
HOSTNAME_ADDRESS_FAMILY setting, which defaults to IPv4. See the parameter descriptions under
Cluster Configuration Parameters on page 111.



For important information, see also About Hostname Address Families: IPv4-Only, IPv6-Only, and
Mixed Mode; and What Happens when You Change the Quorum Configuration Online.

Obtaining Cross-Subnet Information


As of Serviceguard A.11.18, it is possible to configure multiple IPv4 subnets, joined by a router,
both for the cluster heartbeat and for data, with some nodes using one subnet and some another. See
Cross-Subnet Configurations for rules and definitions.
You must use the -w full option to cmquerycl to discover the available subnets.
For example, assume that you are planning to configure four nodes, NodeA, NodeB, NodeC, and NodeD,
into a cluster that uses the subnets 15.13.164.0, 15.13.172.0, 15.13.165.0, 15.13.182.0,
15.244.65.0, and 15.244.56.0.
The following command
cmquerycl -w full -n nodeA -n nodeB -n nodeC -n nodeD
will produce output such as the following:
Node Names:    nodeA
               nodeB
               nodeC
               nodeD

Bridged networks (full probing performed):

1       lan3    (nodeA)
        lan4    (nodeA)
        lan3    (nodeB)
        lan4    (nodeB)
2       lan1    (nodeA)
        lan1    (nodeB)
3       lan2    (nodeA)
        lan2    (nodeB)
4       lan3    (nodeC)
        lan4    (nodeC)
        lan3    (nodeD)
        lan4    (nodeD)
5       lan1    (nodeC)
        lan1    (nodeD)
6       lan2    (nodeC)
        lan2    (nodeD)

IP subnets:

IPv4:

15.13.164.0     lan1    (nodeA)
                lan1    (nodeB)
15.13.172.0     lan1    (nodeC)
                lan1    (nodeD)
15.13.165.0     lan2    (nodeA)
                lan2    (nodeB)
15.13.182.0     lan2    (nodeC)
                lan2    (nodeD)
15.244.65.0     lan3    (nodeA)
                lan3    (nodeB)
15.244.56.0     lan4    (nodeC)
                lan4    (nodeD)

IPv6:

3ffe:1111::/64  lan3    (nodeA)
                lan3    (nodeB)
3ffe:2222::/64  lan3    (nodeC)
                lan3    (nodeD)

Possible Heartbeat IPs:

15.13.164.0     15.13.164.1     (nodeA)
                15.13.164.2     (nodeB)
15.13.172.0     15.13.172.158   (nodeC)
                15.13.172.159   (nodeD)
15.13.165.0     15.13.165.1     (nodeA)
                15.13.165.2     (nodeB)
15.13.182.0     15.13.182.158   (nodeC)
                15.13.182.159   (nodeD)

Route connectivity (full probing performed):

1       15.13.164.0
        15.13.172.0
2       15.13.165.0
        15.13.182.0
3       15.244.65.0
4       15.244.56.0
In the Route connectivity section, the numbers on the left (1-4) identify which subnets are routed to
each other (for example, 15.13.164.0 and 15.13.172.0).
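To see at a glance which subnets are routed to each other, the Route connectivity lines can be grouped by their leading number. The following is a sketch that assumes the layout shown in the example above (a group number followed by a subnet, with further subnets of the same group on lines of their own); route_groups is a hypothetical name, and only the lines of that one section should be passed to it.

```shell
#!/bin/sh
# Sketch: read "Route connectivity" lines on standard input and print one
# line per route group, listing the subnets that are routed to each other.
route_groups() {
    awk '
        # A line such as "1 15.13.164.0" starts a new group.
        $1 ~ /^[0-9]+$/ && NF == 2 {
            group = $1
            subnets[group] = $2
            order[++n] = group
            next
        }
        # A line with a single field adds a subnet to the current group.
        NF == 1 && group != "" {
            subnets[group] = subnets[group] " " $1
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i] ": " subnets[order[i]]
        }
    '
}

route_groups <<'EOF'
1 15.13.164.0
  15.13.172.0
2 15.13.165.0
  15.13.182.0
3 15.244.65.0
4 15.244.56.0
EOF
```

A group that lists only one subnet, such as groups 3 and 4 here, is not routed to any other subnet.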

IMPORTANT: Note that in this example subnet 15.244.65.0, used by NodeA and NodeB, is not
routed to 15.244.56.0, used by NodeC and NodeD.
But subnets 15.13.164.0 and 15.13.165.0, used by NodeA and NodeB, are routed respectively
to subnets 15.13.172.0 and 15.13.182.0, used by NodeC and NodeD. At least one such
routing among all the nodes must exist for cmquerycl to succeed.

For information about configuring the heartbeat in a cross-subnet configuration, see the HEARTBEAT_IP
parameter discussion under Cluster Configuration Parameters on page 111.



Identifying Heartbeat Subnets
The cluster configuration file includes entries for IP addresses on the heartbeat subnet. Hewlett Packard
Enterprise recommends that you use a dedicated heartbeat subnet, and configure heartbeat on other
subnets as well, including the data subnet.
The heartbeat can be on an IPv4 or an IPv6 subnet.
The heartbeat can comprise multiple IPv4 subnets joined by a router. In this case at least two heartbeat
paths must be configured for each cluster node. See also the discussion of HEARTBEAT_IP, and Cross-
Subnet Configurations.

Specifying Maximum Number of Configured Packages


This value must be equal to or greater than the number of packages currently configured in the cluster.
The count includes all types of packages: failover, multi-node, and system multi-node. The maximum
number of packages per cluster is 300. The default is the maximum.

NOTE:
Remember to tune kernel parameters on each node to ensure that they are set high enough for the
largest number of packages that will ever run concurrently on that node.

Modifying the MEMBER_TIMEOUT Parameter


The cmquerycl command supplies a default value of 14 seconds for the MEMBER_TIMEOUT
parameter. Changing this value will directly affect the cluster’s re-formation and failover times. You may
need to increase the value if you are experiencing cluster node failures as a result of heavy system load
or heavy network traffic; or you may need to decrease it if cluster re-formations are taking a long time.
You can change MEMBER_TIMEOUT while the cluster is running.
For more information about node timeouts, see What Happens when a Node Times Out and the
MEMBER_TIMEOUT parameter discussions under Cluster Configuration Parameters on page 111, and
Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low.

Configuring Root Disk Monitoring parameter


Serviceguard version A.12.20.00 monitors for root disk failures. Set the parameters
ROOT_DISK_MONITOR, ROOT_DISK_MONITOR_INTERVAL, and
ROOT_DISK_MONITOR_EXCLUDE_NODES in the cluster configuration file to monitor the root disk.
For more information about configuring the Root Disk Monitoring parameters, see Cluster Configuration
Parameters.

Controlling Access to the Cluster


Serviceguard access-control policies define cluster users’ administrative or monitoring capabilities.

A Note about Terminology


Although you will also sometimes see the term role-based access (RBA) in the output of Serviceguard
commands, the preferred set of terms, always used in this manual, is as follows:

• Access-control policies - the set of rules defining user access to the cluster.

◦ Access-control policy - one of these rules, comprising the three parameters USER_NAME,
USER_HOST, USER_ROLE. See Setting up Access-Control Policies.

• Access roles - the set of roles that can be defined for cluster users (Monitor, Package Admin, Full
Admin).

◦ Access role - one of these roles (for example, Monitor).

How Access Roles Work


Serviceguard daemons grant access to Serviceguard commands by matching the command user’s
hostname and username against the access control policies you define. Each user can execute only the
commands allowed by his or her role.
The following diagram shows the access roles and their capabilities. The innermost circle is the most trusted;
the outermost the least. Each role can perform its own functions and the functions in all of the circles
outside it. For example, Serviceguard Root can perform its own functions plus all the functions of Full
Admin, Package Admin and Monitor; Full Admin can perform its own functions plus the functions of
Package Admin and Monitor; and so on.
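The nesting described above can be modeled as a simple ranking. The following sketch is purely illustrative and is not how Serviceguard implements access control: the function names and numeric ranks are hypothetical, and ROOT is used here only as a label for Serviceguard root access, which is not assigned through an access-control policy.

```shell
#!/bin/sh
# Sketch: model the nested circles as numeric ranks (higher = more trusted).
# The rank values are illustrative and not part of Serviceguard.
role_rank() {
    case "$1" in
        MONITOR)       echo 1 ;;
        PACKAGE_ADMIN) echo 2 ;;
        FULL_ADMIN)    echo 3 ;;
        ROOT)          echo 4 ;;
        *)             echo 0 ;;   # unknown role
    esac
}

# Succeeds if role $1 can perform the functions of role $2.
role_includes() {
    have=$(role_rank "$1")
    need=$(role_rank "$2")
    [ "$have" -gt 0 ] && [ "$need" -gt 0 ] && [ "$have" -ge "$need" ]
}

if role_includes FULL_ADMIN PACKAGE_ADMIN; then
    echo "Full Admin includes Package Admin"
fi
if ! role_includes MONITOR PACKAGE_ADMIN; then
    echo "Monitor does not include Package Admin"
fi
```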



Figure 37: Access Roles

Levels of Access
Serviceguard recognizes two levels of access, root and non-root:

• Root access: Full capabilities; only role allowed to configure the cluster.
As Access Roles shows, users with root access have complete control over the configuration of the
cluster and its packages. This is the only role allowed to use the cmcheckconf, cmapplyconf,
cmdeleteconf, and cmmodnet -a commands.
In order to exercise this Serviceguard role, you must log in as the root user (superuser) on a node in
the cluster you want to administer. Conversely, the root user on any node in the cluster always has full
Serviceguard root access privileges for that cluster; no additional Serviceguard configuration is
needed to grant these privileges.

IMPORTANT: Users on systems outside the cluster can gain Serviceguard root access
privileges to configure the cluster only via a secure connection (rsh or ssh).

• Non-root access: Other users can be assigned one of four roles:



◦ Full Admin: Allowed to perform cluster administration, package administration, and cluster and
package view operations.
These users can administer the cluster, but cannot configure or create a cluster. Full Admin
includes the privileges of the Package Admin role.

◦ (all-packages) Package Admin: Allowed to perform package administration, and use cluster and
package view commands.
These users can run and halt any package in the cluster, and change its switching behavior, but
cannot configure or create packages. Unlike single-package Package Admin, this role is defined in
the cluster configuration file. Package Admin includes the cluster-wide privileges of the Monitor
role.

◦ (single-package) Package Admin: Allowed to perform package administration for a specified
package, and use cluster and package view commands.
These users can run and halt a specified package, and change its switching behavior, but cannot
configure or create packages. This is the only access role defined in the package configuration file;
the others are defined in the cluster configuration file. Single-package Package Admin also
includes the cluster-wide privileges of the Monitor role.

◦ Monitor: Allowed to perform cluster and package view operations.


These users have read-only access to the cluster and its packages.

IMPORTANT: A remote user (one who is not logged in to a node in the cluster, and is not
connecting via rsh or ssh) can have only Monitor access to the cluster.
(Full Admin and Package Admin can be configured for such a user, but this usage is deprecated.
As of Serviceguard A.11.18 configuring Full Admin or Package Admin for remote users gives
them Monitor capabilities. See Setting up Access-Control Policies for more information.)

Setting up Access-Control Policies


The root user on each cluster node is automatically granted the Serviceguard root access role on all
nodes. (See Configuring Root-Level Access for more information.) Access-control policies define non-
root roles for other cluster users.

NOTE: For more information and advice, see the white paper Securing Serviceguard at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard -> White Papers).

Define access-control policies for a cluster in the cluster configuration file; see Cluster Configuration
Parameters on page 111. To define access control for a specific package, use user_host and related
parameters in the package configuration file. You can define up to 200 access policies for each cluster. A
root user can create or modify access control policies while the cluster is running.

NOTE: Once nodes are configured into a cluster, the access-control policies you set in the cluster and
package configuration files govern cluster-wide security; changes to the “bootstrap” cmclnodelist file
are ignored (see Allowing Root Access to an Unconfigured Node).

Access control policies are defined by three parameters in the configuration file:



• Each USER_NAME can consist either of the literal ANY_USER, or a maximum of 8 login names from
the /etc/passwd file on USER_HOST. The names must be separated by spaces or tabs, for
example:
# Policy 1:
USER_NAME john fred patrick
USER_HOST bit
USER_ROLE PACKAGE_ADMIN

• USER_HOST is the node where USER_NAME will issue Serviceguard commands.

NOTE: The commands must be issued on USER_HOST but can take effect on other nodes; for
example, patrick can use bit’s command line to start a package on gryf (assuming bit and
gryf are in the same cluster).

Choose one of these three values for USER_HOST:

◦ ANY_SERVICEGUARD_NODE - any node on which Serviceguard is configured, and which is on a
subnet with which nodes in this cluster can communicate (as reported by cmquerycl -w full).

NOTE: If you set USER_HOST to ANY_SERVICEGUARD_NODE, set USER_ROLE to MONITOR;
users connecting from outside the cluster cannot have any higher privileges (unless they are
connecting via rsh or ssh; this is treated as a local connection).
Depending on your network configuration, ANY_SERVICEGUARD_NODE can provide wide-ranging
read-only access to the cluster.

◦ CLUSTER_MEMBER_NODE - any node in the cluster

◦ A specific node name - Use the hostname portion (the first part) of a fully-qualified domain name
that can be resolved by the name service you are using; it should also be in each node’s
/etc/hosts. Do not use an IP address or the fully-qualified domain name. If there are multiple
hostnames (aliases) for an IP address, one of those must match USER_HOST. See Configuring
Name Resolution for more information.

• USER_ROLE must be one of these three values:

◦ MONITOR

◦ FULL_ADMIN

◦ PACKAGE_ADMIN

MONITOR and FULL_ADMIN can be set only in the cluster configuration file and they apply to the entire
cluster. PACKAGE_ADMIN can be set in the cluster configuration file or a package configuration file. If it
is set in the cluster configuration file, PACKAGE_ADMIN applies to all configured packages; if it is set in
a package configuration file, it applies to that package only. These roles are not exclusive; for
example, more than one user can have the PACKAGE_ADMIN role for the same package.

NOTE: You do not have to halt the cluster or package to configure or modify access control policies.



Here is an example of an access control policy:
USER_NAME john
USER_HOST bit
USER_ROLE PACKAGE_ADMIN
If this policy is defined in the cluster configuration file, it grants user john the PACKAGE_ADMIN role for
any package on node bit. User john also has the MONITOR role for the entire cluster, because
PACKAGE_ADMIN includes MONITOR. If the policy is defined in the package configuration file for
PackageA, then user john on node bit has the PACKAGE_ADMIN role only for PackageA.
Plan the cluster’s roles and validate them as soon as possible. If your organization’s security policies
allow it, you may find it easiest to create group logins. For example, you could create a MONITOR role for
user operator1 from CLUSTER_MEMBER_NODE (that is, from any node in the cluster). Then you could
give this login name and password to everyone who will need to monitor your clusters.
Role Conflicts
Do not configure different roles for the same user and host; Serviceguard treats this as a conflict and will
fail with an error when applying the configuration. “Wildcards”, such as ANY_USER and
ANY_SERVICEGUARD_NODE, are an exception: it is acceptable for ANY_USER and john to be given
different roles.

IMPORTANT: Wildcards do not degrade higher-level roles that have been granted to individual
members of the class specified by the wildcard. For example, you might set up the following policy
to allow root users on remote systems access to the cluster:
USER_NAME root
USER_HOST ANY_SERVICEGUARD_NODE
USER_ROLE MONITOR
This does not reduce the access level of users who are logged in as root on nodes in this cluster;
they will always have full Serviceguard root-access capabilities.

Consider what would happen if these entries were in the cluster configuration file:
# Policy 1:
USER_NAME john
USER_HOST bit
USER_ROLE PACKAGE_ADMIN

# Policy 2:
USER_NAME john
USER_HOST bit
USER_ROLE MONITOR

# Policy 3:
USER_NAME ANY_USER
USER_HOST ANY_SERVICEGUARD_NODE
USER_ROLE MONITOR
In the above example, the configuration would fail because user john is assigned two different roles
for the same host, bit. (In any case, Policy 2 is unnecessary, because PACKAGE_ADMIN includes the
role of MONITOR.)
Policy 3 does not conflict with any other policies, even though the wildcard ANY_USER includes the
individual user john.
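
The conflict rule described above can be pre-checked before running cmcheckconf. The following is a minimal sketch, not part of Serviceguard: find_role_conflicts is a hypothetical helper name, and it assumes each policy lists USER_NAME, USER_HOST, and USER_ROLE in that order. It compares literal user/host pairs only and skips entries that use the wildcards ANY_USER or ANY_SERVICEGUARD_NODE. cmcheckconf remains the authoritative check.

```shell
#!/bin/sh
# find_role_conflicts (hypothetical helper): read access-control policy text
# on stdin and print "user@host: ROLE1 vs ROLE2" for every literal user/host
# pair that is assigned two different roles.
find_role_conflicts() {
    awk '
        $1 ~ /^#/ { next }                      # skip comment lines
        $1 == "USER_NAME" {                     # one or more login names
            n = NF - 1
            for (i = 2; i <= NF; i++) names[i - 1] = $i
        }
        $1 == "USER_HOST" { host = $2 }
        $1 == "USER_ROLE" {
            for (i = 1; i <= n; i++) {
                user = names[i]
                # Wildcard entries are exempt from the conflict rule.
                if (user == "ANY_USER" || host == "ANY_SERVICEGUARD_NODE") continue
                key = user "@" host
                if ((key in role) && role[key] != $2)
                    print key ": " role[key] " vs " $2
                else
                    role[key] = $2
            }
            n = 0
        }
    '
}

# Example use on a cluster node (path is the one used elsewhere in this chapter):
#   find_role_conflicts < "$SGCONF/cluster.conf"
```

Run against the three policies shown above, this sketch reports only the john/bit pair; Policy 3 is ignored because it uses wildcards.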



NOTE: Check spelling especially carefully when typing wildcards, such as ANY_USER and
ANY_SERVICEGUARD_NODE. If they are misspelled, Serviceguard will assume they are specific users or
nodes.

Package versus Cluster Roles


Package configuration will fail if there is any conflict in roles between the package configuration and the
cluster configuration, so it is a good idea to have the cluster configuration file in front of you when you
create roles for a package; use cmgetconf to get a listing of the cluster configuration file.
If a role is configured for a username/hostname in the cluster configuration file, do not specify a role for
the same username/hostname in the package configuration file. Note also that there is no point in
assigning a package administration role to a user who is root on any node in the cluster; this user already
has complete control over the administration of the cluster and its packages.

Configuring Cluster Generic Resources


This section describes the step-by-step procedure to configure cluster generic resources. You can also
configure cluster generic resources from Serviceguard Manager. See the online help for instructions on
how to configure from Serviceguard Manager.

Procedure

1. Create a cluster configuration file that contains the generic resource parameters.
cmquerycl -v -C $SGCONF/cluster.conf -n node1 -n node2 -q <quorum_server>

2. Edit the cluster configuration file and specify the generic resource parameters.
GENERIC_RESOURCE_NAME cpu_monitor
GENERIC_RESOURCE_TYPE extended
GENERIC_RESOURCE_CMD "$SGCONF/generic_resource_monitors/cpu_monitor.sh 20"
GENERIC_RESOURCE_SCOPE node
GENERIC_RESOURCE_RESTART 25
GENERIC_RESOURCE_HALT_TIMEOUT 60000000

NOTE: Cluster generic resources must be configured to use a monitoring script via the
GENERIC_RESOURCE_CMD parameter. The monitoring script contains the logic to monitor the
resource and to set the status of the generic resource accordingly by using cmsetresource(1m).
End users must write these scripts according to their own requirements. Configure the monitoring
script as the GENERIC_RESOURCE_CMD in the cluster if monitoring of the resource must start and
stop along with the cluster.
Specify the full path name of the monitoring script as the GENERIC_RESOURCE_CMD value, as
shown in this step.
Hewlett Packard Enterprise provides a template that describes how a monitoring script can be written.
For more information on monitoring scripts and the template, see Monitoring Script for Cluster
Generic Resources. For the description of cluster generic resources parameters, see Cluster
Configuration Parameters and Using the Cluster Generic Resources Monitoring Service.

3. After editing the cluster configuration file, verify the content of the cluster configuration file.
cmcheckconf -v -C $SGCONF/cluster.conf

4. When verification completes without errors, apply the cluster configuration file. This adds the cluster
configuration information (along with cluster generic resources) to the binary cluster configuration file
in the $SGCONF directory and distributes it to all the cluster nodes.
cmapplyconf -C $SGCONF/cluster.conf



5. Verify that the cluster generic resources parameters are configured.
CLUSTER        STATUS
sg_cluster     down

NODE           STATUS       STATE
sgltt2         down         unknown

Quorum_Server_Status:
NAME           STATUS       STATE        ADDRESS
qs_node        unknown      unknown      10.149.2.5

Network_Parameters:
INTERFACE      STATUS       NAME
PRIMARY        unknown      eth0
PRIMARY        unknown      eth3
PRIMARY        unknown      eth4
PRIMARY        unknown      bond0

Cluster Generic Resources:
NAME           SCOPE   TYPE       STATUS /   COMMAND   CURRENT-   MAX-CONFIGURED
                                  VALUE      STATUS    RESTARTS   RESTARTS
cpu_monitor    node    Extended   0          unknown   0          25

NODE           STATUS       STATE
sgltt4         down         unknown

Quorum_Server_Status:
NAME           STATUS       STATE        ADDRESS
qs_node        unknown      unknown      10.149.2.5

Network_Parameters:
INTERFACE      STATUS       NAME
PRIMARY        unknown      eth0
PRIMARY        unknown      eth3
PRIMARY        unknown      eth4
PRIMARY        unknown      bond0

Cluster Generic Resources:
NAME           SCOPE   TYPE       STATUS /   COMMAND   CURRENT-   MAX-CONFIGURED
                                  VALUE      STATUS    RESTARTS   RESTARTS
cpu_monitor    node    Extended   0          unknown   0          25

6. The cmviewcl -v -f line output (snippet) will be as follows:

cmviewcl -v -f line | grep generic_resource
generic_resource:cpu_monitor|name=cpu_monitor
generic_resource:cpu_monitor|type=extended
generic_resource:cpu_monitor|define=cluster
generic_resource:cpu_monitor|scope=node
generic_resource:cpu_monitor|command="$SGCONF/generic_resource_monitors/cpu_monitor.sh 20"
generic_resource:cpu_monitor|max_restarts_allowed=25
generic_resource:cpu_monitor|halt_timeout=60000000
generic_resource:cpu_monitor|node:sgltt2|name=sgltt2
generic_resource:cpu_monitor|node:sgltt2|current_value=0
generic_resource:cpu_monitor|node:sgltt2|cmd_status=unknown
generic_resource:cpu_monitor|node:sgltt2|consumed_restart_count=0
generic_resource:cpu_monitor|node:sgltt4|name=sgltt4
generic_resource:cpu_monitor|node:sgltt4|current_value=0
generic_resource:cpu_monitor|node:sgltt4|cmd_status=unknown
generic_resource:cpu_monitor|node:sgltt4|consumed_restart_count=0



NOTE: The default command status of a cluster generic resource is UNKNOWN and the default
current_value is "0", unless the status/value of a simple/extended generic resource is set using the
cmsetresource command.

7. Start the cluster. As part of the cluster start, the monitoring script will start the monitoring of the generic
resource and set the status accordingly.
cmruncl
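
The monitoring script named in GENERIC_RESOURCE_CMD in step 2 could look roughly like the sketch below. The template supplied by Hewlett Packard Enterprise remains the reference; this sketch only illustrates the measure-then-cmsetresource pattern. Several details are assumptions that do not come from this manual: the function names cpu_busy_pct and monitor_loop are hypothetical, the script argument (20 in the example) is treated as a polling interval in seconds, and CPU load is derived from the aggregate cpu line of /proc/stat. The value is raised to at least 1 because extended resource values are documented as ranging from 1 to 2147483647.

```shell
#!/bin/sh
# cpu_busy_pct (hypothetical helper): given two aggregate "cpu ..." lines from
# /proc/stat (older sample first), print the busy percentage between them.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            idle = $5 + $6                      # idle + iowait
            if (NR == 1) { t0 = total; i0 = idle } else { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (dt - (i1 - i0)) * 100 / dt
        }'
}

# monitor_loop (hypothetical): publish the measured value every $1 seconds.
# Treating the script argument as an interval is an assumption.
monitor_loop() {
    interval=${1:-20}
    prev=$(grep '^cpu ' /proc/stat)
    while sleep "$interval"; do
        cur=$(grep '^cpu ' /proc/stat)
        pct=$(cpu_busy_pct "$prev" "$cur")
        if [ "$pct" -lt 1 ]; then pct=1; fi     # documented range starts at 1
        cmsetresource -r cpu_monitor "$pct"
        prev=$cur
    done
}

# In a real monitoring script the loop would be started here:
#   monitor_loop "$1"
```

Keeping the arithmetic in its own function makes it possible to test the calculation on any Linux host without a running cluster.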

Using Cluster Generic Resources in package configuration


This section describes the step-by-step procedure to configure cluster generic resources in a package
configuration. You can also configure cluster generic resources in a package from Serviceguard
Manager. See the online help for instructions on how to configure an existing cluster generic resource
in a package from Serviceguard Manager.

Prerequisites
Configure the cluster generic resource as described in the section Configuring Cluster Generic
Resources.

Procedure

1. Create a package configuration file that contains the generic resource module.
cmmakepkg $SGCONF/pkg1/pkg1.conf
Package template is created.
2. Edit the file before you use it.
3. Optional: To generate a configuration file by adding the generic resource module to an existing
package, enter the entire command in one line.
cmmakepkg -i $SGCONF/pkg1/pkg1.conf -m sg/generic_resource

4. Edit the package configuration file and specify the generic resource parameters.
generic_resource_name cpu_monitor
generic_resource_evaluation_type before_package_start
generic_resource_up_criteria <=30

NOTE: When you use a cluster generic resource in a package, do not define corresponding services
for that generic resource in the package configuration. The service functionality is already defined
in the cluster generic resource by the GENERIC_RESOURCE_CMD, GENERIC_RESOURCE_RESTART, and
GENERIC_RESOURCE_HALT_TIMEOUT parameters.

CAUTION: Using a cluster generic resource in a package and also adding service functionality for the
same generic resource in that package will result in unexpected problems.

5. After editing the package configuration file, verify the content of the package configuration file.
cmcheckconf -v -P $SGCONF/pkg1/pkg1.conf

6. When verification completes without errors, apply the package configuration file. This adds the
package configuration information (along with generic resources) to the binary cluster configuration file
in the $SGCONF directory and distributes it to all the cluster nodes.
cmapplyconf -P $SGCONF/pkg1/pkg1.conf



Enter Y when prompted to confirm the modification.
7. Verify that the generic resources parameters are configured.
cmviewcl -v -p pkg1
UNOWNED_PACKAGES

PACKAGE        STATUS       STATE        AUTO_RUN     NODE
pkg1           down         halted       enabled      unowned

Policy_Parameters:
POLICY_NAME        CONFIGURED_VALUE
Failover           configured_node
Failback           manual

Script_Parameters:
ITEM               STATUS       NODE_NAME    NAME
Generic Resource   unknown      sgltt2       cpu_monitor
Generic Resource   unknown      sgltt4       cpu_monitor

Node_Switching_Parameters:
NODE_TYPE      STATUS       SWITCHING    NAME
Primary        down                      sgltt2
Alternate      down                      sgltt4

Other_Attributes:
ATTRIBUTE_NAME     ATTRIBUTE_VALUE
Style              modular
Priority           no_priority

The cmviewcl -v -f line output (snippet) will be as follows:

cmviewcl -v -f line -p pkg1 | grep generic_resource


generic_resource:cpu_monitor|name=cpu_monitor
generic_resource:cpu_monitor|evaluation_type=before_package_start
generic_resource:cpu_monitor|up_criteria="<=30"
generic_resource:cpu_monitor|node:sgltt2|status=unknown
generic_resource:cpu_monitor|node:sgltt2|current_value=0
generic_resource:cpu_monitor|node:sgltt4|status=unknown
generic_resource:cpu_monitor|node:sgltt4|current_value=0

8. Start the cluster. As part of the cluster start, the monitoring script will start the monitoring of the generic
resource and set the status accordingly. Also the cluster start will bring up the package if the
up_criteria is met.
cmruncl
cmruncl: Validating network configuration...
cmruncl: Network validation complete
Cluster successfully formed.
Check the syslog files on all nodes in the cluster to verify that no warnings occurred during startup.

cmviewcl -v

CLUSTER        STATUS
sg_cluster     up

NODE           STATUS       STATE
sgltt2         up           running

Quorum_Server_Status:
NAME           STATUS       STATE        ADDRESS
qs_node        up           running      10.149.2.5

Network_Parameters:
INTERFACE      STATUS       NAME
PRIMARY        up           eth0
PRIMARY        up           eth3
PRIMARY        up           eth4
PRIMARY        up           bond0

Cluster Generic Resources:
NAME           SCOPE   TYPE       STATUS /   COMMAND   CURRENT-   MAX-CONFIGURED
                                  VALUE      STATUS    RESTARTS   RESTARTS
cpu_monitor    node    Extended   1          up        0          25

PACKAGE        STATUS       STATE        AUTO_RUN     NODE
pkg1           up           running      enabled      sgltt2

Policy_Parameters:
POLICY_NAME        CONFIGURED_VALUE
Failover           configured_node
Failback           manual

Script_Parameters:
ITEM               STATUS   MAX_RESTARTS   RESTARTS   NAME
Generic Resource   up                                 cpu_monitor

Node_Switching_Parameters:
NODE_TYPE      STATUS       SWITCHING    NAME
Primary        up           enabled      sgltt2 (current)
Alternate      up           enabled      sgltt4

Other_Attributes:
ATTRIBUTE_NAME     ATTRIBUTE_VALUE
Style              modular
Priority           no_priority

NODE           STATUS       STATE
sgltt4         up           running

Quorum_Server_Status:
NAME           STATUS       STATE        ADDRESS
qs_node        up           running      10.149.2.5

Network_Parameters:
INTERFACE      STATUS       NAME
PRIMARY        up           eth0
PRIMARY        up           eth3
PRIMARY        up           eth4
PRIMARY        up           bond0

Cluster Generic Resources:
NAME           SCOPE   TYPE       STATUS /   COMMAND   CURRENT-   MAX-CONFIGURED
                                  VALUE      STATUS    RESTARTS   RESTARTS
cpu_monitor    node    Extended   1          up        0          25

The cmviewcl -v -f line output of running cluster with cluster generic resource configured in package
(snippet) will be as follows:

cmviewcl -v -f line | grep generic_resource

package:pkg1|generic_resource:cpu_monitor|name=cpu_monitor
package:pkg1|generic_resource:cpu_monitor|evaluation_type=before_package_start
package:pkg1|generic_resource:cpu_monitor|up_criteria="<=30"
package:pkg1|generic_resource:cpu_monitor|node:sgltt2|status=up
package:pkg1|generic_resource:cpu_monitor|node:sgltt2|current_value=1
package:pkg1|generic_resource:cpu_monitor|node:sgltt4|status=up
package:pkg1|generic_resource:cpu_monitor|node:sgltt4|current_value=1
package:pkg1|module_name:sg/generic_resource|module_name=sg/generic_resource
package:pkg1|module_name:sg/generic_resource|module_version=1
generic_resource:cpu_monitor|name=cpu_monitor
generic_resource:cpu_monitor|type=extended
generic_resource:cpu_monitor|define=cluster
generic_resource:cpu_monitor|scope=node
generic_resource:cpu_monitor|command="$SGCONF/generic_resource_monitors/cpu_monitor.sh 20"
generic_resource:cpu_monitor|max_restarts_allowed=25
generic_resource:cpu_monitor|halt_timeout=60000000
generic_resource:cpu_monitor|node:sgltt2|name=sgltt2
generic_resource:cpu_monitor|node:sgltt2|current_value=1
generic_resource:cpu_monitor|node:sgltt2|cmd_status=up
generic_resource:cpu_monitor|node:sgltt2|pid=27870
generic_resource:cpu_monitor|node:sgltt2|consumed_restart_count=0
generic_resource:cpu_monitor|node:sgltt4|name=sgltt4
generic_resource:cpu_monitor|node:sgltt4|current_value=1
generic_resource:cpu_monitor|node:sgltt4|cmd_status=up
generic_resource:cpu_monitor|node:sgltt4|pid=23163
generic_resource:cpu_monitor|node:sgltt4|consumed_restart_count=0
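
The line-format output shown above is convenient for scripting. As a rough illustration, the sketch below extracts a resource's cluster-level current_value for one node and compares it with a criteria string such as the generic_resource_up_criteria value used in step 4. The helper names gr_value and meets_criteria are hypothetical, and meets_criteria handles only a single comparison operator followed by a number; it is not Serviceguard's own evaluator.

```shell
#!/bin/sh
# gr_value NODE RESOURCE (hypothetical helper): read `cmviewcl -v -f line`
# output on stdin and print the cluster-level current_value of RESOURCE on NODE.
# Lines prefixed with "package:<name>|" are ignored because the key must match
# the whole left-hand side.
gr_value() {
    awk -F= -v key="generic_resource:$2|node:$1|current_value" \
        '$1 == key { print $2 }'
}

# meets_criteria VALUE CRITERIA (hypothetical helper): exit 0 if VALUE
# satisfies CRITERIA (for example "<=30"), 1 if not, 2 if CRITERIA does not
# start with a supported operator.
meets_criteria() {
    awk -v v="$1" -v c="$2" 'BEGIN {
        if (match(c, /^(<=|>=|==|!=|<|>)/) == 0) exit 2
        op = substr(c, 1, RLENGTH)
        n = substr(c, RLENGTH + 1) + 0
        v += 0
        ok = (op == "<=" && v <= n) || (op == ">=" && v >= n) ||
             (op == "==" && v == n) || (op == "!=" && v != n) ||
             (op == "<" && v < n) || (op == ">" && v > n)
        exit !ok
    }'
}

# Example use on a cluster node:
#   val=$(cmviewcl -v -f line | gr_value sgltt2 cpu_monitor)
#   meets_criteria "$val" '<=30' && echo "cpu_monitor is within its up criteria"
```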

Getting and Setting the Status/Value of a Simple/Extended Cluster Generic Resource


You can use the Serviceguard commands cmgetresource(1m) and cmsetresource(1m),
respectively, to get or set the status of a simple generic resource or the value of an extended generic
resource. These commands can also be used in the monitoring script or executed from the CLI. You must
be a root user (UID=0) to execute these commands. Non-root users cannot run these commands.

Serviceguard command to get the status of a simple or value of extended cluster generic resource
Use the cmgetresource command to get the status of a simple generic resource or the value of an
extended generic resource. For example:
cmgetresource -r cpu_monitor
This retrieves the status of the generic resource cpu_monitor if it is configured as a simple resource. If
configured as an extended resource, the current value is returned.

Serviceguard command to set the status of a simple or value of extended cluster generic resource
Use the cmsetresource command to set the status of a simple generic resource or the value of an
extended generic resource. For example:
cmsetresource -r disk_status -s up
This sets the status of the generic resource disk_status to up. This is a simple generic resource and only
the status can be set to up or down.
cmsetresource -r cpu_monitor 10
This sets the current value of the generic resource cpu_monitor to 10. This is an extended generic
resource and only numeric values from 1 to 2147483647 can be set.
See the man pages for more information.
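
Because only values from 1 to 2147483647 are accepted for an extended resource, a script that publishes values may want to validate them before calling cmsetresource. The following is a minimal sketch; is_valid_extended_value and set_extended_value are hypothetical helper names, and the cmsetresource invocation uses the same form as the example above.

```shell
#!/bin/sh
# is_valid_extended_value VALUE (hypothetical helper): succeed only if VALUE
# is a decimal integer in the documented range 1..2147483647.
is_valid_extended_value() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;        # empty, or contains a non-digit
    esac
    # Check the length first so very long digit strings are rejected
    # before any numeric comparison.
    [ "${#1}" -le 10 ] && [ "$1" -ge 1 ] && [ "$1" -le 2147483647 ]
}

# set_extended_value RESOURCE VALUE (hypothetical helper): call cmsetresource
# only when VALUE passes the range check.
set_extended_value() {
    if is_valid_extended_value "$2"; then
        cmsetresource -r "$1" "$2"
    else
        echo "refusing to set $1: '$2' is outside 1..2147483647" >&2
        return 1
    fi
}

# Example use (as root on a cluster node):
#   set_extended_value cpu_monitor 10
```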

Online reconfiguration of cluster generic resources


Online operations such as addition, deletion, and modification of generic resources in a cluster are
supported. The following operations can be performed online:

• Addition of a new cluster generic resource to a running cluster.


• Deletion of a cluster generic resource. Ensure that the generic resource being deleted is not
configured in any packages. If it is configured, you must first remove the generic resources from all the
configured packages.
• When the cluster is up and running, modification of cluster generic resources is allowed for the
following parameters only.

◦ GENERIC_RESOURCE_RESTART

◦ GENERIC_RESOURCE_HALT_TIMEOUT

• Modification of GENERIC_RESOURCE_NAME is equivalent to removal of the generic resource and addition
of a new generic resource. Hence, when you rename a resource you can modify all of its parameter values.
• Modification of GENERIC_RESOURCE_TYPE from a simple resource to an extended resource or
conversely from extended to simple resource is not allowed when the cluster is running.
• Modification of GENERIC_RESOURCE_CMD is not allowed when the cluster is running.

Offline reconfiguration of cluster generic resources


Offline operations such as addition, deletion, and modification of generic resources in a cluster are
supported. The following operations can be performed offline:

• Addition of a new cluster generic resource to a halted cluster.

• Deletion of a cluster generic resource. Ensure that the generic resource being deleted is not
configured in any packages. If it is configured, you must first remove the generic resource from all the
configured packages. Until the generic resource is unconfigured from all the packages, deletion is not
supported even in a halted cluster.
• When the cluster is halted, modification of the cluster generic resource parameters
GENERIC_RESOURCE_RESTART, GENERIC_RESOURCE_CMD, and
GENERIC_RESOURCE_HALT_TIMEOUT is supported.
• Modification of GENERIC_RESOURCE_NAME is equivalent to removal of the generic resource and addition
of a new generic resource. If the generic resource is configured in any package, you cannot rename
(and thereby remove) it, even though the cluster is offline.
• Modification of GENERIC_RESOURCE_TYPE from a simple resource to an extended resource or
conversely is not allowed when any of the packages is configured to use the cluster generic resource.
However if none of the packages are configured, modification of type is allowed in cluster
configuration.
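
The online and offline rules above can be summarized as a small lookup. The sketch below is only a reading aid that encodes the rules as stated in this section; can_modify is a hypothetical helper name, and cmcheckconf remains the authoritative check for any proposed change. Renaming (GENERIC_RESOURCE_NAME) is not covered because it is treated as a deletion plus an addition.

```shell
#!/bin/sh
# can_modify STATE PARAM [IN_USE] (hypothetical helper)
#   STATE  : online (cluster running) or offline (cluster halted)
#   PARAM  : a GENERIC_RESOURCE_* parameter name
#   IN_USE : yes if any package is configured to use the resource (default no)
# Exit status: 0 = modification allowed, 1 = not allowed,
#              2 = not covered by the rules encoded here.
can_modify() {
    case "$1:$2" in
        online:GENERIC_RESOURCE_RESTART|online:GENERIC_RESOURCE_HALT_TIMEOUT)
            return 0 ;;
        online:GENERIC_RESOURCE_TYPE|online:GENERIC_RESOURCE_CMD)
            return 1 ;;
        offline:GENERIC_RESOURCE_RESTART|offline:GENERIC_RESOURCE_CMD|offline:GENERIC_RESOURCE_HALT_TIMEOUT)
            return 0 ;;
        offline:GENERIC_RESOURCE_TYPE)
            # Allowed only if no package uses the resource.
            [ "${3:-no}" != yes ] ;;
        *)
            return 2 ;;
    esac
}

# Example:
#   can_modify online GENERIC_RESOURCE_CMD || echo "halt the cluster first"
```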

Verifying the Cluster Configuration


If you have edited a cluster configuration template file, use the following command to verify the content of
the file:
cmcheckconf -v -C $SGCONF/clust1.conf
This command checks the following:

• Network addresses and connections.


• Quorum server connection.
• All lock LUN device names on all nodes refer to the same physical disk area.
• One and only one lock LUN device is specified per node.
• A quorum server or lock LUN is configured, but not both.
• Uniqueness of names.
• Existence and permission of scripts specified in the command line.
• All nodes specified are in the same heartbeat subnet.
• Correct configuration filename.
• All nodes can be accessed.



• No more than one CLUSTER_NAME, MEMBER_TIMEOUT, and AUTO_START_TIMEOUT are
specified.
• The value for package run and halt script timeouts does not exceed the maximum.
• The value for AUTO_START_TIMEOUT variables is greater than zero.
• Heartbeat network minimum requirement. See HEARTBEAT_IP under Cluster Configuration
Parameters on page 111.
• At least one NODE_NAME is specified.
• Each node is connected to each heartbeat network.
• All heartbeat networks are of the same type of LAN.
• The network interface device files specified are valid LAN device files.
• Other configuration parameters for the cluster and packages are valid.

If the cluster is online, the cmcheckconf command also verifies that all the conditions for the specific
change in configuration have been met.

Cluster Lock Configuration Messages


The cmquerycl, cmcheckconf and cmapplyconf commands will return errors if the cluster lock is not
correctly configured. If there is no cluster lock in a cluster with two nodes, the following message is
displayed in the cluster configuration file:
# Warning: Neither a quorum server nor a lock lun was specificed.
# A Quorum Server or a lock lun is required for clusters of only two nodes.
If you attempt to configure both a quorum server and a lock LUN, the following message appears on
standard output when issuing the cmcheckconf or cmapplyconf command:
Duplicate cluster lock, line 55. Quorum Server already specified.

Distributing the Binary Configuration File


After specifying all cluster parameters, use the cmapplyconf command to apply the configuration. This
action distributes the binary configuration file to all the nodes in the cluster. Hewlett Packard Enterprise
recommends doing this separately before you configure packages (described in the next chapter). In this
way, you can verify the quorum server, heartbeat networks, and other cluster-level operations by using
the cmviewcl command on the running cluster. Before distributing the configuration, ensure that your
security files permit copying among the cluster nodes. See Configuring Root-Level Access.
The following command distributes the binary configuration file:
cmapplyconf -v -C $SGCONF/clust1.conf
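
Verification and distribution are often run back to back. The sketch below chains the two commands shown in this chapter so that the configuration is applied only if verification succeeds. check_then_apply is a hypothetical helper name, and the sketch assumes that cmcheckconf exits with a non-zero status when it finds errors.

```shell
#!/bin/sh
# check_then_apply FILE (hypothetical helper): verify a cluster configuration
# file and apply it only if verification succeeds.
check_then_apply() {
    conf=$1
    if cmcheckconf -v -C "$conf"; then
        cmapplyconf -v -C "$conf"
    else
        echo "cmcheckconf reported errors; $conf was not applied" >&2
        return 1
    fi
}

# Example use (as root on a cluster node):
#   check_then_apply "$SGCONF/clust1.conf"
```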

Managing the Running Cluster


This section describes some approaches to routine management of the cluster. For more information, see
Cluster and Package Maintenance. You can manage the cluster from Serviceguard Manager, or by
means of Serviceguard commands as described below.



NOTE: You can use the cmdeploycl (1m) command to create the cluster and start it automatically after
its creation. Internally, cmdeploycl (1m) calls cmquerystg (1m) to configure the cluster lock LUN,
cmpreparecl (1m) to perform all the prerequisites, and then cmquerycl, cmapplyconf, and cmruncl.
If you use cmdeploycl (1m), you do not need to perform the procedures that follow, but it is a good
idea to read them so that you understand what cmdeploycl (1m) does for you.

Checking Cluster Operation with Serviceguard Commands


• cmviewcl checks the status of the cluster and many of its components. A non-root user with the role
of Monitor can run this command from a cluster node or see status information in Serviceguard
Manager.
• cmrunnode is used to start a node. A non-root user with the role of Full Admin can run this command
from a cluster node or through Serviceguard Manager.
• cmhaltnode is used to manually stop a running node. (This command is also used by
shutdown(1m)). A non-root user with the role of Full Admin can run this command from a cluster
node or through Serviceguard Manager.
• cmruncl is used to manually start a stopped cluster. A non-root user with Full Admin access can run
this command from a cluster node, or through Serviceguard Manager.
• cmhaltcl is used to manually stop a cluster. A non-root user with Full Admin access can run this
command from a cluster node or through Serviceguard Manager.

You can use these commands to test cluster operation, as in the following:

1. If the cluster is not already running, start it:


cmruncl -v
By default, cmruncl will check the networks. Serviceguard will compare the actual network configuration
with the network information in the cluster configuration. If you do not need this validation, use
cmruncl -v -w none instead, to turn off validation and save time.

2. When the cluster has started, make sure that cluster components are operating correctly:
cmviewcl -v
Make sure that all nodes and networks are functioning as expected. For more information, refer to the
chapter on “Cluster and Package Maintenance.”

3. Verify that nodes leave and enter the cluster as expected using the following steps:

• Halt the node. You can use Serviceguard Manager or the cmhaltnode command.

• Check the cluster membership to verify that the node has left the cluster. You can use the
Serviceguard Manager main page or the cmviewcl command.

• Start the node. You can use Serviceguard Manager or the cmrunnode command.

• Verify that the node has returned to operation. You can use Serviceguard Manager or the
cmviewcl command again.

4. Bring down the cluster. You can use Serviceguard Manager or the cmhaltcl -v -f command.



See the manpages for more information about these commands. See Troubleshooting Your Cluster for
more information about cluster testing.

Setting up Autostart Features


Automatic startup is the process in which each node individually joins a cluster; Serviceguard provides a
startup script to control the startup process. If a cluster already exists, the node attempts to join it; if no
cluster is running, the node attempts to form a cluster consisting of all configured nodes. Automatic
cluster start is the preferred way to start a cluster. No action is required by the system administrator.
There are three cases:

• The cluster is not running on any node, all cluster nodes are reachable, and all are
attempting to start up. In this case, the node attempts to form a cluster consisting of all configured
nodes.
• The cluster is already running on at least one node. In this case, the node attempts to join that cluster.
• Neither is true: the cluster is not running on any node, and not all the nodes are reachable and trying
to start. In this case, the node will attempt to start for the AUTO_START_TIMEOUT period. If neither of
these things becomes true in that time, startup will fail.

To enable automatic cluster start, set the flag AUTOSTART_CMCLD to 1 in the $SGAUTOSTART file
($SGCONF/cmcluster.rc) on each node in the cluster; the nodes will then join the cluster at boot time.
Here is an example of the $SGAUTOSTART file:
SGAUTOSTART=/usr/local/cmcluster/conf/cmcluster.rc
#*************************** CMCLUSTER *************************

# Highly Available Cluster configuration


#
# @(#) $Revision: 82.2 $
#
#
# AUTOSTART_CMCLD
#
# Automatic startup is the process in which each node individually
# joins a cluster. If a cluster already exists, the node attempts
# to join it; if no cluster is running, the node attempts to form
# a cluster consisting of all configured nodes. Automatic cluster
# start is the preferred way to start a cluster. No action is
# required by the system administrator. If set to 1, the node will
# attempt to join/form its CM cluster automatically as described
# above. If set to 0, the node will not attempt to join its CM
# cluster.

AUTOSTART_CMCLD=1

NOTE: The /sbin/init.d/cmcluster file may call files that Serviceguard stores in $SGCONF/rc.
(See Understanding the Location of Serviceguard Files on page 169 for information about
Serviceguard directories on different Linux distributions.) This directory is for Serviceguard use only! Do
not move, delete, modify, or add files in this directory.
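
To confirm the autostart setting on every node, you can read the flag back from the $SGAUTOSTART file. The sketch below is one hedged way to do that. autostart_enabled is a hypothetical helper name, and the sketch assumes the file uses plain shell-style assignments as in the example above; when the flag is assigned more than once, it takes the last uncommented assignment.

```shell
#!/bin/sh
# autostart_enabled FILE (hypothetical helper): succeed if FILE sets
# AUTOSTART_CMCLD=1. Comment lines that merely mention the flag are ignored.
autostart_enabled() {
    val=$(sed -n 's/^[[:space:]]*AUTOSTART_CMCLD=\([0-9]*\).*/\1/p' "$1" | tail -n 1)
    [ "$val" = "1" ]
}

# Example use on a cluster node:
#   if autostart_enabled "$SGCONF/cmcluster.rc"; then
#       echo "node will try to join or form the cluster at boot"
#   fi
```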



NODE_TOC_BEHAVIOR
The NODE_TOC_BEHAVIOR parameter determines how the node behaves when the safety timer expires. It
can be set to reboot or panic. The default is reboot, which reboots the node when the safety timer
expires.
To verify the deadman settings, do the following:
#cat /proc/deadman/info
Deadman Enabled:No
Deadman Mode:reboot
CONFIG_HZ:1000
#
If the value is set to panic, Linux crash dumps need to be configured to capture the dump.

NOTE:
Without a crash dump, it is not possible to determine the cause of a reset. To capture a crash
dump, NODE_TOC_BEHAVIOR must be set to panic.

To configure the NODE_TOC_BEHAVIOR parameter:

1. Edit the $SGCONF/cmcluster.rc configuration file.

2. Change the NODE_TOC_BEHAVIOR parameter to reboot or panic.

3. Verify the new configuration changes as follows:

a. Reboot the system to load the new deadman mode value


Or

b. Ensure that cluster services are halted on the node. Then, unload the deadman module and restart
the SGSafetyTimer service:
#rmmod deadman
#service SGSafetyTimer restart
This loads deadman with the new NODE_TOC_BEHAVIOR value, which you can verify as described
previously.
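
The verification step can also be scripted. The sketch below reads text in the format shown for /proc/deadman/info and prints the configured mode, which can then be compared with the NODE_TOC_BEHAVIOR value you set. deadman_mode is a hypothetical helper name, and the field label is taken from the sample output above.

```shell
#!/bin/sh
# deadman_mode (hypothetical helper): read /proc/deadman/info-style text on
# stdin and print the value of the "Deadman Mode" field.
deadman_mode() {
    awk -F: '$1 == "Deadman Mode" {
        gsub(/^ +| +$/, "", $2)     # tolerate spaces around the value
        print $2
    }'
}

# Example use on a cluster node:
#   [ "$(deadman_mode < /proc/deadman/info)" = "panic" ] ||
#       echo "deadman mode is not panic; crash dumps will not be captured"
```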

Changing the System Message


You may find it useful to modify the system's login message to include a statement such as the following:
This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.
You may want to include a list of all cluster nodes in this message, together with additional cluster-specific
information.
The /etc/motd file may be customized to include cluster-related information.

Managing a Single-Node Cluster


The number of nodes you will need for your cluster depends on the processing requirements of the
applications you want to protect.

In a single-node cluster, a quorum server is not required, since there is no other node in the cluster. The
output from the cmquerycl command omits the quorum server information area if there is only one
node.
You still need to have redundant networks, but you do not need to specify any heartbeat LANs, since
there is no other node to send heartbeats to. In the cluster configuration file, specify all LANs that you
want Serviceguard to monitor. For LANs that already have IP addresses, specify them with the
STATIONARY_IP parameter, rather than the HEARTBEAT_IP parameter.

Single-Node Operation
Single-node operation occurs in a single-node cluster, or in a multi-node cluster in which all but one node
has failed, or in which you have shut down all but one node, which will probably have applications
running. As long as the Serviceguard daemon cmcld is active, other nodes can rejoin the cluster at a
later time.
If the cmcld daemon fails during single-node operation, it will leave the single node up and your
applications running. (This is different from the failure of cmcld in a multi-node cluster, which causes the
node to halt with a reboot, and packages to be switched to adoptive nodes.)
It is not necessary to halt the single node in this case, since the applications are still running, and no other
node is currently available for package switching.

CAUTION:
Do not try to restart Serviceguard in this case; data corruption might occur if another node were to
attempt to start up a new instance of an application that is still running on the single node. Instead,
choose an appropriate time to shut down and reboot the node. This will allow the applications to
shut down and Serviceguard to restart the cluster after the reboot.

Disabling identd
Ignore this section unless you have a particular need to disable identd.
You can configure Serviceguard not to use identd.

CAUTION: This is not recommended. Consult the white paper Securing Serviceguard at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard -> White Papers) for
more information.

If you must disable identd, do the following on each node after installing Serviceguard but before each
node rejoins the cluster (for example, before issuing a cmrunnode or cmruncl).
For Red Hat and SUSE:

1. Change the value of the server_args parameter in the file /etc/xinetd.d/hacl-cfg from -c to
-c -i.

2. Restart xinetd:
#systemctl restart xinetd.service

Deleting the Cluster Configuration


You can delete a cluster configuration by means of the cmdeleteconf command. The command
prompts for a verification before deleting the files unless you use the -f option. You can delete the
configuration only when the cluster is down. The action removes the binary configuration file from all the
nodes in the cluster and resets all cluster-aware volume groups to be no longer cluster-aware.



NOTE:
The cmdeleteconf command removes only the cluster binary file $SGCONF/cmclconfig. It does not
remove any other files from the $SGCONF directory.

Although the cluster must be halted, all nodes in the cluster should be powered up and accessible before
you use the cmdeleteconf command. If a node is powered down, power it up and allow it to boot. If a
node is inaccessible, you will see a list of inaccessible nodes and the following message:
Checking current status
cmdeleteconf: Unable to reach node lptest1.
WARNING: Once the unreachable node is up, cmdeleteconf
should be executed on the node to remove the configuration.

Delete cluster lpcluster anyway (y/[n])?


Reply Yes to remove the configuration. Later, if the inaccessible node becomes available, run
cmdeleteconf on that node to remove the configuration file.



Configuring Packages and Their Services
Serviceguard packages group together applications and the services and resources they depend on.
The typical Serviceguard package is a failover package that starts on one node but can be moved (“failed
over”) to another if necessary. For more information, see What is Serviceguard for Linux?, How the
Package Manager Works, and Package Configuration Planning.
You can also create multi-node packages, which run on more than one node at the same time.
System multi-node packages, which run on all the nodes in the cluster, are supported only for applications
supplied by Hewlett Packard Enterprise.
Creating or modifying a package requires the following broad steps, each of which is described in the
sections that follow:

1. Decide on the package’s major characteristics and choose the modules you need to include
(Choosing Package Modules).
2. Generate the package configuration file (Generating the Package Configuration File).
3. Edit the configuration file (Editing the Configuration File).
4. Verify and apply the package configuration (Verifying and Applying the Package Configuration).
5. Add the package to the cluster (Adding the Package to the Cluster).

Choosing Package Modules


IMPORTANT: Before you start, you need to do the package-planning tasks described under
Package Configuration Planning.

To choose the right package modules, you need to decide the following things about the package you are
creating:

• What type of package it is; see Types of Package: Failover, Multi-Node, System Multi-Node.
• Which parameters need to be specified for the package (beyond those included in the base type,
which is normally failover, multi-node, or system-multi-node). See Package Modules and
Parameters.

When you have made these decisions, you are ready to generate the package configuration file; see
Generating the Package Configuration File.

Types of Package: Failover, Multi-Node, System Multi-Node


There are three types of packages:

• Failover packages. This is the most common type of package. Failover packages run on one node at a
time. If there is a failure, Serviceguard (or a user) can halt them, and then start them up on another
node selected from the package’s configuration list; see node_name.

To generate a package configuration file that creates a failover package, include -m sg/failover
on the cmmakepkg command line. See Generating the Package Configuration File.

• Multi-node packages. These packages run simultaneously on more than one node in the cluster.
Failures of package components such as applications, services, generic resources, or subnets will
cause the package to be halted only on the node on which the failure occurred.
Relocatable IP addresses cannot be assigned to multi-node packages.
To generate a package configuration file that creates a multi-node package, include
-m sg/multi_node on the cmmakepkg command line. See Generating the Package Configuration File.

• System multi-node packages. System multi-node packages are supported only for applications
supplied by Hewlett Packard Enterprise.

NOTE: The following parameters cannot be configured for multi-node packages:

• failover_policy
• failback_policy
• ip_subnet
• ip_address

Volume groups configured for packages of this type must be activated in shared mode.
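
The restriction in the NOTE above can be checked mechanically before a configuration file is applied. The following Python sketch assumes the package parameters have already been read into a dictionary; the function name and the dictionary layout are hypothetical and are not part of Serviceguard.

```python
# Parameters the NOTE above lists as not configurable for multi-node packages.
MULTI_NODE_DISALLOWED = {"failover_policy", "failback_policy", "ip_subnet", "ip_address"}


def disallowed_for_multi_node(params):
    """Return, sorted, the configured parameters that a multi-node package cannot use.

    `params` is a hypothetical dict of parameter name -> value. Names are
    compared in lower case because parameter names are case-insensitive.
    """
    lowered = {name.lower(): value for name, value in params.items()}
    if str(lowered.get("package_type", "")).lower() != "multi_node":
        return []  # the restriction applies to multi-node packages only
    return sorted(name for name in lowered if name in MULTI_NODE_DISALLOWED)


pkg = {"package_type": "multi_node", "auto_run": "yes", "ip_subnet": "192.0.2.0"}
print(disallowed_for_multi_node(pkg))
```

This is only a pre-check sketch; cmcheckconf remains the authoritative validation of a package configuration.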

For more information about types of packages and how they work, see How the Package Manager
Works. For information on planning a package, see Package Configuration Planning.
When you have decided on the type of package you want to create, the next step is to decide what
additional package-configuration modules you need to include; see Package Modules and Parameters.

Differences between Failover and Multi-Node Packages


Note the following important differences in behavior between multi-node and failover packages:

• If a multi-node package has auto_run disabled (set to no in the package configuration file) it will not
start when the cluster is started. You can use cmmodpkg to enable package switching and start the
package for the first time. But if you then halt the multi-node package via cmhaltpkg, it can be
restarted only by means of cmrunpkg, not cmmodpkg.

• If a multi-node package is halted via cmhaltpkg, package switching is not disabled. This means that
the halted package will start to run on a rebooted node, if it is configured to run on that node and its
dependencies are met.
• When a multi-node package is started the first time (either at cluster startup, or subsequently if
auto_run is set to no, and package switching is then enabled) any dependent package will start on its
primary node. But if a multi-node package is halted along with its dependent packages, and the multi-
node package is then restarted, dependent packages which have had package switching re-enabled
will start on the first eligible node on which an instance of the multi-node package comes up; this may
not be the dependent packages’ primary node.
To ensure that dependent failover packages restart on their primary node if the multi-node packages
they depend on need to be restarted, make sure the dependent packages’ package switching is not
re-enabled before the multi-node packages are restarted. You can then either restart the dependent
failover packages with cmrunpkg, specifying the node you want them to start on, or enable package
switching for these packages after the multi-node package startup is complete.

Package Modules and Parameters


The table that follows shows the package modules and the configuration parameters each module
includes. Read this section in conjunction with the discussion under Package Configuration Planning.
Use this information, and the parameter explanations that follow (Package Parameter Explanations), to
decide which modules (if any) you need to add to the failover, multi-node, or system multi-node module,
to create your package.
You can use cmmakepkg -l (letter “l”) to see a list of all available modules, including non-Serviceguard
modules such as those supplied in the HPE Toolkits.

NOTE: If you are going to create a complex package that contains many modules, you may want to skip
the process of selecting modules, and simply create a configuration file that contains all the modules:
cmmakepkg -m sg/all $SGCONF/pkg_sg_complex
(The output will be written to $SGCONF/pkg_sg_complex.)

Base Package Modules


At least one base module (or default or all, which include the base module) must be specified on the
cmmakepkg command line. Parameters marked with an asterisk (*) are new or changed as of
Serviceguard A.11.18, A.11.19, A.11.20.00, A.11.20.10, A.11.20.20, or A.12.00.X. See the Package
Parameter Explanations for more information.



Table 9: Base Modules

Module name: failover
Parameters: package_name *, module_name *, module_version *, package_type, package_description *,
  node_name, auto_run, node_fail_fast_enabled, run_script_timeout, halt_script_timeout,
  successor_halt_timeout *, script_log_file, operation_sequence *, log_level *, failover_policy,
  failback_policy, priority
Comments: Base module. Use as primary building block for failover packages. Cannot be used if
  package_type is multi_node or system_multi_node.

Module name: multi_node
Parameters: package_name *, module_name *, module_version *, package_type, node_name, auto_run,
  node_fail_fast_enabled, run_script_timeout, halt_script_timeout, successor_halt_timeout *,
  script_log_file, operation_sequence *, log_level *, priority *
Comments: Base module. Use as primary building block for multi-node packages. Cannot be used if
  package_type is failover or system_multi_node.

Module name: system_multi_node
Parameters: package_name *, module_name *, module_version *, package_type, node_name, auto_run,
  node_fail_fast_enabled, run_script_timeout, halt_script_timeout, successor_halt_timeout *,
  script_log_file *, operation_sequence *, log_level *, priority *
Comments: Base module. Primary building block for system multi-node packages. System multi-node
  packages are supported only for applications supplied by Hewlett Packard Enterprise.

Optional Package Modules


Add optional modules to a base module if you need to configure the functions in question. Parameters
marked with an asterisk (*) are new or changed as of Serviceguard A.11.18, A.11.19, A.11.20.00,
A.11.20.10, A.11.20.20, or A.12.00.X. See the Package Parameter Explanations for more information.

Table 10: Optional Modules

Module name: dependency
Parameters: dependency_name *, dependency_condition, dependency_location
Comments: Add to a base module to create a package that depends on one or more other packages.

Module name: weight
Parameters: weight_name *, weight_value *
Comments: Add to a base module to create a package that has weight that will be counted against a
  node's capacity.

Module name: monitor_subnet
Parameters: monitored_subnet *, monitored_subnet_access *
Comments: Add to a base module to configure subnet monitoring for the package.

Module name: package_ip
Parameters: ip_subnet *, ip_subnet_node *, ip_address *
Comments: Add to the failover module to assign relocatable IP addresses to a failover package.

Module name: service
Parameters: service_name *, service_cmd (S), service_restart *, service_fail_fast_enabled,
  service_halt_on_maintenance, service_halt_timeout
Comments: Add to a base module to create a package that runs an application or service.

Module name: generic_resource
Parameters: generic_resource_name, generic_resource_evaluation_type, generic_resource_up_criteria
Comments: Add to a base module to create a package with generic resources that can be used to
  monitor critical resources through custom monitors by configuring them as user-defined services
  or cluster generic resource.

Module name: vmfs
Parameters: vmdk_file_name *, datastore_name *, scsi_controller *, disk_type *
Comments: Add to a base module if you want to use VMware Virtual Machine File System (VMware VMFS).

Module name: volume_group
Parameters: vgchange_cmd *, vg (S)
Comments: Add to a base module if the package needs to mount file systems on LVM volumes.

Module name: filesystem
Parameters: concurrent_fsck_operations, fs_mount_retry_count, fs_umount_retry_count *, fs_name *,
  fs_directory *, fs_type (S), fs_mount_opt, fs_umount_opt, fs_fsck_opt
Comments: Add to a base module to configure filesystem options for the package.

Module name: pev
Parameters: pev_ *
Comments: Add to a base module to configure environment variables to be passed to an external
  script.

Module name: external_pre
Parameters: external_pre_script *
Comments: Add to a base module to specify additional programs to be run before volume groups are
  activated while the package is starting and after they are deactivated while the package is
  halting.

Module name: external
Parameters: external_script *
Comments: Add to a base module to specify additional programs to be run during package start and
  halt time.

Module name: acp
Parameters: user_name, user_host, user_role
Comments: Add to a base module to configure Access Control Policies for the package.

Module name: all
Parameters: all parameters
Comments: Use if you are creating a complex package that requires most or all of the optional
  parameters; or if you want to see the specifications and comments for all available parameters.

Module name: multi_node_all
Parameters: all parameters that can be used by a multi-node package; includes multi_node,
  dependency, monitor_subnet, service, volume_group, filesystem, pev, external_pre, external, and
  acp modules.
Comments: Use if you are creating a multi-node package that requires most or all of the optional
  parameters that are available for this type of package.

Module name: default
Parameters: (all parameters)
Comments: A symbolic link to the all module; used if a base module is not specified on the
  cmmakepkg command line; see cmmakepkg Examples.

Module name: pr_cntl
Comments: Add to a base module to enable the Persistent Reservation in a package.

Module name: xdc/xdc
Comments: Use if you are configuring serviceguard-xdc packages that require host-based mirroring
  in an Extended Distance Cluster (serviceguard-xdc) environment. For information about xdc/xdc
  module attributes, see HPE Serviceguard Extended Distance Cluster for Linux A.12.00.40
  Deployment Guide.
  NOTE: The xdc/xdc module is compatible only with sg/failover and is not compatible with
  sg/multi_node and sg/system_multinode.

NOTE: The default form for parameter names in the modular package configuration file is lower case.
There are no compatibility issues; Serviceguard is case-insensitive as far as the parameter names are
concerned.

Package Parameter Explanations


Brief descriptions of the package configuration parameters follow.



NOTE: For more information, see the comments in the editable configuration file output by the
cmmakepkg command, and the cmmakepkg (1m) manpage.
If you are going to browse these explanations deciding which parameters you need, you may want to
generate and print out a configuration file that has the comments for all of the parameters; you can create
such a file as follows:
cmmakepkg -m sg/all $SGCONF/sg-all
or simply
cmmakepkg $SGCONF/sg-all
This creates a file $SGCONF/sg-all that contains all the parameters and comments. (See
Understanding the Location of Serviceguard Files on page 169 for the location of $SGCONF on your
version of Linux.)
More detailed instructions for running cmmakepkg are in the next section, Generating the Package
Configuration File.
See also Package Configuration Planning.

package_name
Any name, up to a maximum of 39 characters, that:

• starts and ends with an alphanumeric character


• otherwise contains only alphanumeric characters or dot (.), dash (-), or underscore (_)

• is unique among package names in this cluster

IMPORTANT: Restrictions on package names in previous Serviceguard releases were less
stringent. Packages whose names do not conform to the above rules will continue to run, but if
you reconfigure them, you will need to change the name; cmcheckconf and cmapplyconf will
enforce the new rules.
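
The naming rules above translate into a short check. The following Python sketch is illustrative only: it assumes "alphanumeric" means ASCII letters and digits, it does not check uniqueness within the cluster, and the function name is hypothetical.

```python
import re

# First and last characters alphanumeric; dot, dash, and underscore allowed in between.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?")


def is_valid_package_name(name, max_len=39):
    """Return True if `name` satisfies the length and character rules listed above."""
    return len(name) <= max_len and _NAME_RE.fullmatch(name) is not None


for candidate in ("pkg1", "web.pkg-01", "-pkg", "pkg_"):
    print(candidate, is_valid_package_name(candidate))
```

Because dependency_name and weight_name follow the same formal rules as package_name, the same check could be reused for them.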

module_name
The module name. Do not change it. Used in the form of a relative path (for example, sg/failover) as
a parameter to cmmakepkg to specify the modules to be used in configuring the package. (The files reside in
the $SGCONF/modules directory; see Understanding the Location of Serviceguard Files on page 169
for the location of $SGCONF on your version of Linux.)
New for modular packages.

module_version
The module version. Do not change it.
New for modular packages.

package_type
The type can be failover, multi_node, or system_multi_node. You can configure only failover or
multi-node packages; see Types of Package: Failover, Multi-Node, System Multi-Node.
Packages of one type cannot include the base module for another; for example, if package_type is
failover, the package cannot include the multi_node or system_multi_node module.

package_description
The application that the package runs. This is a descriptive parameter that can be set to any value you
choose, up to a maximum of 80 characters. Default value is Serviceguard Package.

node_name
The node on which this package can run, or a list of nodes in order of priority, or an asterisk (*) to indicate
all nodes. The default is *. For system multi-node packages, you must specify node_name *.
If you use a list, specify each node on a new line, preceded by the literal node_name, for example:
node_name <node1>
node_name <node2>
node_name <node3>

The order in which you specify the node names is important. First list the primary node name (the node
where you normally want the package to start), then the first adoptive node name (the best candidate for
failover), then the second adoptive node name, followed by additional node names in order of preference.
In case of a failover, control of the package will be transferred to the next adoptive node name listed in
the package configuration file, or (if that node is not available or cannot run the package at that time) to
the next node in the list, and so on.
If a package is configured with a site_preferred or site_preferred_manual failover policy and if
you want to modify the default NODE_NAME, ensure that the NODE_NAME entries are grouped by sites.
For example, in the following configuration, a package with the site_preferred policy can have
NODE_NAME entries in the order node2, node1, node4, node3, but not in the order node2, node3,
node1, node4.

SITE_NAME A
NODE STATUS STATE
node1 up running
node2 up running

SITE_NAME B
NODE STATUS STATE
node3 up running
node4 up running
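
The grouping rule illustrated by the two sites above can be expressed as a small check. The following Python sketch assumes you supply the node-to-site mapping yourself (for example, from cmviewcl output); the function name and the mapping are hypothetical.

```python
def nodes_grouped_by_site(node_order, site_of):
    """Return True if the nodes of each site appear consecutively in `node_order`."""
    seen_sites = []
    for node in node_order:
        site = site_of[node]
        if seen_sites and seen_sites[-1] == site:
            continue  # still inside the same site's group
        if site in seen_sites:
            return False  # this site reappears after another site intervened
        seen_sites.append(site)
    return True


# Mapping taken from the SITE_NAME A / SITE_NAME B listing above.
site_of = {"node1": "A", "node2": "A", "node3": "B", "node4": "B"}
print(nodes_grouped_by_site(["node2", "node1", "node4", "node3"], site_of))
print(nodes_grouped_by_site(["node2", "node3", "node1", "node4"], site_of))
```

The first order keeps each site's nodes together and is accepted; the second interleaves the sites and is rejected, matching the example in the text.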

IMPORTANT: See Cluster Configuration Parameters for important information about node
names.
See About Cross-Subnet Failover for considerations when configuring cross-subnet packages,
which are further explained under Cross-Subnet Configurations.

auto_run
Can be set to yes or no. The default is yes.
For failover packages, yes allows Serviceguard to start the package (on the first available node listed
under node_name) on cluster start-up, and to automatically restart it on an adoptive node if it fails. no
prevents Serviceguard from automatically starting the package, and from restarting it on another node.
This is also referred to as package switching, and can be enabled or disabled while the package is
running, by means of the cmmodpkg command.
auto_run should be set to yes if the package depends on another package, or is depended on; see
About Package Dependencies.

For system multi-node packages, auto_run must be set to yes. In the case of a multi-node package,
setting auto_run to yes allows an instance to start on a new node joining the cluster; no means it will not.

node_fail_fast_enabled
Can be set to yes or no. The default is no.
yes means the node on which the package is running will be halted (rebooted) if the package fails; no
means Serviceguard will not halt the system.
If this parameter is set to yes and one of the following events occurs, Serviceguard will halt (reboot)
the node on which the control script fails:

• A package subnet fails and no backup network is available


• A generic resource fails
• Serviceguard is unable to execute the halt function
• The start or halt function times out

NOTE: If the package halt function fails with “exit 1”, Serviceguard does not halt the node, but sets
no_restart for the package, which disables package switching, setting auto_run to no and thereby
preventing the package from starting on any adoptive node.

Setting node_fail_fast_enabled to yes prevents Serviceguard from repeatedly trying (and failing) to start
the package on the same node.
Setting node_fail_fast_enabled to yes ensures that the package can fail over to another node even if the
package cannot halt successfully. Be careful when using node_fail_fast_enabled, as it will cause all
packages on the node to halt abruptly. For more information, see Responses to Failures and
Responses to Package and Service Failures.
For system multi-node packages, node_fail_fast_enabled must be set to yes.

run_script_timeout
The amount of time, in seconds, allowed for the package to start; or no_timeout. The default is
no_timeout. The maximum is 4294.
If the package does not complete its startup in the time specified by run_script_timeout, Serviceguard will
terminate it and prevent it from switching to another node. In this case, if node_fail_fast_enabled is set
to yes, the node will be halted (rebooted).
If no timeout is specified (no_timeout), Serviceguard will wait indefinitely for the package to start.
If a timeout occurs:

• Switching will be disabled.


• The current node will be disabled from running the package.

NOTE: If no_timeout is specified, and the script hangs, or takes a very long time to complete, during
the validation step (cmcheckconf (1m)), cmcheckconf will wait 20 minutes to allow the validation to
complete before giving up.

halt_script_timeout
The amount of time, in seconds, allowed for the package to halt; or no_timeout. The default is
no_timeout. The maximum is 4294.
If the package’s halt process does not complete in the time specified by halt_script_timeout, Serviceguard
will terminate the package and prevent it from switching to another node. In this case, if
node_fail_fast_enabled is set to yes, the node will be halted (rebooted).
If a halt_script_timeout is specified, it should be greater than the sum of all the values set for
service_halt_timeout for this package.
If a timeout occurs:

• Switching will be disabled.


• The current node will be disabled from running the package.

If a halt-script timeout occurs, you may need to perform manual cleanup. See Troubleshooting Your
Cluster.
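
The guidance that halt_script_timeout should exceed the sum of the package's service_halt_timeout values is easy to verify with a few lines of code. The following Python sketch is an illustration under stated assumptions: timeouts are passed as integers in seconds or the string no_timeout, and the function name is hypothetical.

```python
MAX_TIMEOUT = 4294  # stated maximum, in seconds, for halt_script_timeout


def check_halt_script_timeout(halt_script_timeout, service_halt_timeouts):
    """Return a list of problem descriptions; an empty list means no problem was found."""
    if halt_script_timeout == "no_timeout":
        return []  # the guidance applies only when a timeout is specified
    problems = []
    if halt_script_timeout > MAX_TIMEOUT:
        problems.append(f"halt_script_timeout exceeds the maximum of {MAX_TIMEOUT}")
    total = sum(service_halt_timeouts)
    if halt_script_timeout <= total:
        problems.append(
            f"halt_script_timeout ({halt_script_timeout}) should be greater than "
            f"the sum of service_halt_timeout values ({total})"
        )
    return problems


print(check_halt_script_timeout(30, [10, 20]))
```

In this example the services may take up to 30 seconds in total to halt, so a halt_script_timeout of 30 leaves no margin and is reported.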

successor_halt_timeout
Specifies how long, in seconds, Serviceguard will wait for packages that depend on this package to halt,
before halting this package. Can be 0 through 4294, or no_timeout. The default is no_timeout.

• no_timeout means that Serviceguard will wait indefinitely for the dependent packages to halt.

• 0 means Serviceguard will not wait for the dependent packages to halt before halting this package.

This parameter is new as of A.11.18. See also About Package Dependencies.

script_log_file
The full pathname of the package’s log file. The default is $SGRUN/log/<package_name>.log. (See
Understanding the Location of Serviceguard Files on page 169 for more information about
Serviceguard pathnames.) See also log_level.

operation_sequence
Defines the order in which the scripts defined by the package’s component modules will start up. See the
package configuration file for details.
This parameter is not configurable; do not change the entries in the configuration file.
New for modular packages.

log_level
Determines the amount of information printed to stdout when the package is validated, and to the
script_log_file when the package is started and halted. Valid values are 0 through 5, but you should
normally use only the first two (0 or 1); the remainder (2 through 5) are intended for use by Hewlett
Packard Enterprise Support.

• 0 - informative messages

• 1 - informative messages with slightly more detail

• 2 - messages showing logic flow

• 3 - messages showing detailed data structure information

• 4 - detailed debugging information

• 5 - function call flow

New for modular packages.

failover_policy
Specifies how Serviceguard decides where to start the package, or restart it if it fails. Can be set to
configured_node, min_package_node, site_preferred, or site_preferred_manual. The
default is configured_node.

• configured_node means Serviceguard will attempt to start the package on the first available node in
the list you provide under node_name.
• min_package_node means Serviceguard will start the package on whichever node in the
node_name list has the fewest packages running at the time.
• site_preferred means Serviceguard will try all the eligible nodes on the local SITE before failing
the package over to a node on another SITE. This policy can be configured only in a Metrocluster with
site aware failover configuration; see the documents listed under Cross-Subnet Configurations for
more information.
• site_preferred_manual means Serviceguard will try to fail the package over to a node on the
local SITE. If there are no eligible nodes on the local SITE, the package will halt with global switching
enabled. You can then restart the package locally, when a local node is available, or start it on another
SITE. This policy can be configured only in a Metrocluster with site aware failover configuration; see
the documents listed under Cross-Subnet Configurations for more information.

NOTE:

• For a site_preferred or site_preferred_manual failover_policy to be effective, define the
policy in packages that run, or are configured to run, on a cluster that has more than one site
configured, or nodes at more than one site.
• When a site_preferred or site_preferred_manual failover_policy is defined in a package, the
cmrunpkg -a option cannot be used to run the package.

This parameter can be set for failover packages only. If this package will depend on another package or
vice versa, see also About Package Dependencies.
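
The difference between the configured_node and min_package_node policies can be seen in a small simulation. The following Python sketch is not how Serviceguard is implemented; the function name and inputs are hypothetical, the site-aware policies are left out, and breaking ties by node_name order under min_package_node is an assumption that the text above does not state.

```python
def choose_node(policy, node_names, available, running_count):
    """Pick a node for the package, or return None if no listed node is available.

    node_names    -- nodes in the order given by the node_name entries
    available     -- set of nodes currently able to run the package
    running_count -- dict of node -> number of packages running there now
    """
    candidates = [node for node in node_names if node in available]
    if not candidates:
        return None
    if policy == "configured_node":
        return candidates[0]  # first available node in the list
    if policy == "min_package_node":
        # Fewest running packages; min() keeps list order on ties (an assumption).
        return min(candidates, key=lambda node: running_count.get(node, 0))
    raise ValueError(f"policy not covered by this sketch: {policy}")


nodes = ["node1", "node2", "node3"]
up = {"node2", "node3"}
counts = {"node2": 3, "node3": 1}
print(choose_node("configured_node", nodes, up, counts))
print(choose_node("min_package_node", nodes, up, counts))
```

With node1 down, configured_node selects node2 because it is next in the list, while min_package_node selects node3 because it is running fewer packages.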

failback_policy
Specifies whether or not Serviceguard will automatically move a package that is not running on its
primary node (the first node on its node_name list) when the primary node is once again available. Can
be set to automatic or manual. The default is manual.

• manual means the package will continue to run on the current node.

• automatic means Serviceguard will move the package to the primary node as soon as that node
becomes available, unless doing so would also force a package with a higher priority to move.

CAUTION: When the failback_policy is automatic and you set the NODE_NAME to '*', if you add,
delete, or rename a node in the cluster, the primary node for the package might change, resulting in
the automatic failover of that package.

NOTE: When the failover_policy is site_preferred or site_preferred_manual, failback_policy
cannot be set to automatic.

This parameter can be set for failover packages only. If this package will depend on another package or
vice versa, see also About Package Dependencies.

priority
Assigns a priority to a failover package whose failover_policy is configured_node. Valid values are 1
through 3000, or no_priority. The default is no_priority. See also the dependency_ parameter
descriptions, beginning with dependency_name.
priority can be used to satisfy dependencies when a package starts, or needs to fail over or fail back: a
package with a higher priority than the packages it depends on can force those packages to start or
restart on the node it chooses, so that its dependencies are met.
If you assign a priority, it must be unique in this cluster. A lower number indicates a higher priority, and a
numerical priority is higher than no_priority. Hewlett Packard Enterprise recommends assigning
values in increments of 20 so as to leave gaps in the sequence; otherwise you may have to shuffle all the
existing priorities when assigning priority to a new package.

IMPORTANT: Because priority is a matter of ranking, a lower number indicates a higher priority (20
is a higher priority than 40). A numerical priority is higher than no_priority.

This parameter is new as of A.11.18. See also About Package Dependencies.
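
Because a lower number means a higher priority, and any number outranks no_priority, comparisons are easy to get backwards. The following Python sketch shows one way to rank packages and to detect duplicate values; the function names and the package dictionary are hypothetical.

```python
NO_PRIORITY = "no_priority"


def rank_key(priority):
    """Return a sort key; a smaller key means a higher priority."""
    if priority == NO_PRIORITY:
        return (1, 0)  # any numerical priority outranks no_priority
    if not 1 <= priority <= 3000:
        raise ValueError(f"priority must be 1 through 3000 or no_priority: {priority!r}")
    return (0, priority)  # lower number, higher priority


def duplicate_priorities(priorities):
    """Return, sorted, the numerical values assigned to more than one package."""
    seen, dupes = set(), set()
    for value in priorities.values():
        if value == NO_PRIORITY:
            continue  # the default is not an assigned priority
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)


packages = {"pkgA": 40, "pkgB": 20, "pkgC": NO_PRIORITY}
print(sorted(packages, key=lambda name: rank_key(packages[name])))
print(duplicate_priorities(packages))
```

Sorting with rank_key lists pkgB (20) ahead of pkgA (40), with pkgC (no_priority) last.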

dependency_name
A unique identifier for a particular dependency (see dependency_condition) that must be met in order
for this package to run (or keep running). It must be unique among this package's dependency_names.
The length and formal restrictions for the name are the same as for package_name.

IMPORTANT: Restrictions on dependency names in previous Serviceguard releases were less
stringent. Packages that specify dependency_names that do not conform to the above rules will
continue to run, but if you reconfigure them, you will need to change the dependency_name;
cmcheckconf and cmapplyconf will enforce the new rules.

Configure this parameter, along with dependency_condition and dependency_location, and optionally
priority, if this package depends on another package; for example, if this package depends on a
package named pkg2:
dependency_name pkg2dep
dependency_condition pkg2 = UP
dependency_location same_node
For more information about package dependencies, see About Package Dependencies.

dependency_condition
The condition that must be met for this dependency to be satisfied. As of Serviceguard A.11.18, the only
condition that can be set is that another package must be running.
The syntax is: <package_name> = UP, where <package_name> is the name of the package depended
on. The type and characteristics of the current package (the one we are configuring) impose the following
restrictions on the type of package it can depend on:

• If the current package is a multi-node package, <package_name> must identify a multi-node or
system multi-node package.
• If the current package is a failover package and its failover_policy is min_package_node,
<package_name> must identify a multi-node or system multi-node package.
• If the current package is a failover package and its failover_policy is configured_node,
<package_name> must identify a multi-node or system multi-node package, or a failover package
whose failover_policy is configured_node.

See also About Package Dependencies.
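As a rough illustration, the three restrictions above can be written as a single check. This is a sketch, not Serviceguard code; the function name and the type labels "failover", "multi_node", and "system_multi_node" are assumptions made for the example. Cases that the rules above do not mention return None rather than a guess.

```python
# Labels used here for package types are assumptions made for this sketch.
MULTI_NODE_TYPES = {"multi_node", "system_multi_node"}

def dependency_allowed(current_type, current_policy, target_type, target_policy=None):
    """Return True or False per the rules above, or None if they do not cover the case."""
    if current_type == "multi_node":
        return target_type in MULTI_NODE_TYPES
    if current_type == "failover" and current_policy == "min_package_node":
        return target_type in MULTI_NODE_TYPES
    if current_type == "failover" and current_policy == "configured_node":
        return target_type in MULTI_NODE_TYPES or (
            target_type == "failover" and target_policy == "configured_node"
        )
    return None  # combination not described in the list above
```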

dependency_location
Specifies where the dependency_condition must be met. The only legal value is same_node.

weight_name, weight_value
These parameters specify a weight for a package; this weight is compared to a node's available capacity
(defined by the CAPACITY_NAME and CAPACITY_VALUE parameters in the cluster configuration file) to
determine whether the package can run there.
Both parameters are optional, but if weight_value is specified, weight_name must also be specified, and
must come first. You can define up to four weights, corresponding to four different capacities, per cluster.
To specify more than one weight for this package, repeat weight_name and weight_value.

NOTE: If weight_name is package_limit, however, you can use only that one weight and capacity
throughout the cluster. package_limit is a reserved value, which, if used, must be entered exactly in
that form. It provides the simplest way of managing weights and capacities; see Simple Method for more
information.

The rules for forming weight_name are the same as those for forming package_name. weight_name
must exactly match the corresponding CAPACITY_NAME.
weight_value is an unsigned floating-point value between 0 and 1000000 with at most three digits after
the decimal point.
You can use these parameters to override the cluster-wide default package weight that corresponds to a
given node capacity. You can define that cluster-wide default package weight by means of the
WEIGHT_NAME and WEIGHT_DEFAULT parameters in the cluster configuration file (explicit default). If
you do not define an explicit default (that is, if you define a CAPACITY_NAME in the cluster configuration
file with no corresponding WEIGHT_NAME and WEIGHT_DEFAULT), the default weight is assumed to
be zero (implicit default). Configuring weight_name and weight_value here in the package configuration
file overrides the cluster-wide default (implicit or explicit), and assigns a particular weight to this package.
For more information, see About Package Weights. See also the discussion of the relevant parameters
under Cluster Configuration Parameters, in the cmmakepkg (1m) and cmquerycl (1m) manpages,
and in the cluster configuration and package configuration template files.
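The format rule for weight_value and the override order described above can be sketched as follows. This is an illustration under stated assumptions, not Serviceguard's implementation: the function names are hypothetical, the dictionaries stand in for values read from the package and cluster configuration files, and "processor" is an example capacity name only.

```python
import re

# Unsigned number with at most three digits after the decimal point.
_WEIGHT_RE = re.compile(r"[0-9]+(?:\.[0-9]{1,3})?")

def valid_weight_value(text):
    """Check the documented range (0 to 1000000) and precision."""
    return bool(_WEIGHT_RE.fullmatch(text)) and float(text) <= 1000000

def effective_weight(capacity_name, package_weights, cluster_defaults):
    """Package setting wins; else the explicit cluster default; else the implicit zero."""
    if capacity_name in package_weights:
        return package_weights[capacity_name]
    return cluster_defaults.get(capacity_name, 0.0)
```

With these helpers, effective_weight("processor", {}, {}) returns 0.0 (the implicit default), while a weight_value set in the package file takes precedence over any WEIGHT_DEFAULT.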

monitored_subnet
The LAN subnet that is to be monitored for this package. You can specify multiple subnets; use a
separate line for each.
If you specify a subnet as a monitored_subnet, the package will not run on any node not reachable via
that subnet. This normally means that if the subnet is not up, the package will not run. (For cross-subnet
configurations, in which a subnet may be configured on some nodes and not on others, see
monitored_subnet_access below, ip_subnet_node, and About Cross-Subnet Failover.)

Typically you would monitor the ip_subnet, specifying it here as well as in the ip_subnet parameter, but
you may want to monitor other subnets as well; you can specify any subnet that is configured into the
cluster (via the STATIONARY_IP parameter in the cluster configuration file). See Stationary and
Relocatable IP Addresses and Monitored Subnets for more information.
If any monitored_subnet fails, Serviceguard will switch the package to any other node specified by
node_name that can communicate on all the monitored_subnets defined for this package.
See the comments in the configuration file for more information and examples.
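The selection rule for a monitored_subnet failure can be pictured as a simple filter. The sketch below is illustrative only; failover_candidates is a hypothetical helper, and the reachable mapping is an assumed stand-in for Serviceguard's own knowledge of which node can communicate on which subnet.

```python
def failover_candidates(node_names, monitored_subnets, reachable):
    """Return the nodes, in node_name order, that can communicate on every monitored subnet.

    reachable maps a node name to the set of subnets it can currently use.
    """
    required = set(monitored_subnets)
    return [node for node in node_names if required <= reachable.get(node, set())]
```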

monitored_subnet_access
In cross-subnet configurations, specifies whether each monitored_subnet is accessible on all nodes in
the package’s node_name list, or only some. Valid values are PARTIAL, meaning that at
least one of the nodes has access to the subnet, but not all; and FULL, meaning that all nodes have
access to the subnet. The default is FULL, and it is in effect if monitored_subnet_access is not specified.
See also ip_subnet_node and About Cross-Subnet Failover.
New for modular packages.
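To make the distinction between FULL and PARTIAL concrete, the following sketch derives the value from a map of which subnets are configured on which nodes. It is an illustration only; subnet_access and the configured mapping are hypothetical, and a result of None means no node in the list has the subnet at all.

```python
def subnet_access(subnet, node_names, configured):
    """Return "FULL", "PARTIAL", or None for one subnet across the package's nodes.

    configured maps a node name to the set of subnets configured on that node.
    """
    nodes_with_subnet = [n for n in node_names if subnet in configured.get(n, set())]
    if node_names and len(nodes_with_subnet) == len(node_names):
        return "FULL"      # every node has access to the subnet
    if nodes_with_subnet:
        return "PARTIAL"   # at least one node has access, but not all
    return None
```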

ip_subnet
Specifies an IP subnet used by the package.

CAUTION: Hewlett Packard Enterprise recommends that this subnet be configured into the cluster.
You do this in the cluster configuration file by specifying a HEARTBEAT_IP or STATIONARY_IP
under a NETWORK_INTERFACE on the same subnet, for each node in this package's
NODE_NAME list. For example, an entry such as the following in the cluster configuration file
configures subnet 192.10.25.0 (lan1) on node ftsys9:
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.10.25.18
See Cluster Configuration Parameters for more information.
If the subnet is not configured into the cluster, Serviceguard cannot manage or monitor it, and in fact
cannot guarantee that it is available on all nodes in the package's node_name list.
Such a subnet is referred to as an external subnet, and relocatable addresses on that subnet are
known as external addresses. If you use an external subnet, you risk the following consequences:

• If the subnet fails, the package will not fail over to an alternate node.
• Even if the subnet remains intact, if the package needs to fail over because of some other type
of failure, it could fail to start on an adoptive node because the subnet is not available on that
node.

For each subnet used, specify the subnet address on one line and, on the following lines, the relocatable
IP addresses that the package uses on that subnet. These will be configured when the package starts
and unconfigured when it halts.
For example, if this package uses subnet 192.10.25.0 and the relocatable IP addresses 192.10.25.12 and
192.10.25.13, enter:
ip_subnet 192.10.25.0
ip_address 192.10.25.12
ip_address 192.10.25.13
If you want the subnet to be monitored, specify it in the monitored_subnet parameter as well.

In a cross-subnet configuration, you also need to specify which nodes the subnet is configured on; see
ip_subnet_node below. See also monitored_subnet_access and About Cross-Subnet Failover.
This parameter can be set for failover packages only.

ip_subnet_node
In a cross-subnet configuration, specifies which nodes an ip_subnet is configured on. If no
ip_subnet_nodes are listed under an ip_subnet, it is assumed to be configured on all nodes in this
package’s node_name list.
Can be added or deleted while the package is running, with these restrictions:

• The package must not be running on the node that is being added or deleted.
• The node must not be the first to be added to, or the last deleted from, the list of ip_subnet_nodes for
this ip_subnet.

See also monitored_subnet_access and About Cross-Subnet Failover.
New for modular packages.

ip_address
A relocatable IP address on a specified ip_subnet.
For more information about relocatable IP addresses, see Stationary and Relocatable IP Addresses
and Monitored Subnets.
This parameter can be set for failover packages only.

service_name
A service is a program or function which Serviceguard monitors as long as the package is up. service_name
identifies this function and is used by the cmrunserv and cmhaltserv commands. You can configure a
maximum of 30 services per package and 900 services per cluster.
The length and formal restrictions for the name are the same as for package_name. service_name must
be unique among all packages in the cluster.

IMPORTANT: Restrictions on service names in previous Serviceguard releases were less stringent.
Packages that specify services whose names do not conform to the above rules will continue to run,
but if you reconfigure them, you will need to change the name; cmcheckconf and cmapplyconf
will enforce the new rules.

Each service is defined by six parameters: service_name, service_cmd, service_restart,
service_fail_fast_enabled, service_halt_on_maintenance, and service_halt_timeout. See the descriptions
that follow.
The following is an example of a fully defined service:
service_name volume_mon
service_cmd "$SGSBIN/cmresserviced /dev/vx/dsk/dg_dd2/lvol2"
service_restart none
service_fail_fast_enabled yes
service_halt_on_maintenance yes
service_halt_timeout 300
See the package configuration template file for more examples.

service_cmd
The command that runs the program or function for this service_name, for example,
/usr/bin/X11/xclock -display 15.244.58.208:0
Only Serviceguard environment variables defined in the /etc/cmcluster.conf file or an absolute
pathname can be used with service_cmd; neither the PATH variable nor any other environment
variable is passed to the command. The default shell is /bin/sh. For example,
service_cmd $SGCONF/pkg1/script.sh
service_cmd /etc/cmcluster/pkg1/script.sh
service_cmd /usr/local/cmcluster/conf/pkg1/script.sh
service_cmd /opt/cmcluster/conf/pkg1/script.sh
service_cmd $SGSBIN/cmresserviced /dev/sdd1

NOTE: Be careful when defining service run commands. Each run command is executed in the following
way:

• The cmrunserv command executes the run command.
• Serviceguard monitors the process ID (PID) of the process the run command creates.
• When the command exits, Serviceguard determines that a failure has occurred and takes appropriate
action, which may include transferring the package to an adoptive node.
• If a run command is a shell script that runs some other command and then exits, Serviceguard will
consider this normal exit as a failure.

Make sure that each run command is the name of an actual service and that its process remains alive
until the actual service stops. One way to manage this is to configure a package such that the service is
actually a monitoring program that checks the health of the application that constitutes the main function
of the package, and exits if it finds the application has failed. The application itself can be started by an
external_script.
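The monitoring-program approach described above can be sketched as a loop that stays alive while the application is healthy and returns only when it is not. This is a minimal illustration, not a Serviceguard-supplied monitor; the monitor function, its parameters, and the is_healthy callback are hypothetical, and a real check would probe the actual application.

```python
import time

def monitor(is_healthy, interval_seconds=5.0, sleep=time.sleep):
    """Stay in the loop while is_healthy() is true; return 1 once it reports a failure.

    Because the process stays alive while the application is healthy, the
    service is treated as running; returning (and exiting) signals a failure.
    """
    while is_healthy():
        sleep(interval_seconds)
    return 1
```

A script used as the service_cmd would pass the return value to sys.exit(), so that the process ends, and Serviceguard reacts, only when the health check fails.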

service_restart
The number of times Serviceguard will attempt to re-run the service_cmd. Valid values are unlimited,
none or any positive integer value. Default is none.
If the value is unlimited, the service will be restarted an infinite number of times. If the value is none,
the service will not be restarted.

service_fail_fast_enabled
Specifies whether or not Serviceguard will halt the node (reboot) on which the package is running if the
service identified by service_name fails. Valid values are yes and no. Default is no, meaning that failure
of this service will not cause the node to halt.

service_halt_on_maintenance
If the service_halt_on_maintenance parameter is set to yes and the package is put into maintenance mode,
Serviceguard halts the service. Serviceguard then automatically restarts the failed services when the
package is taken out of maintenance mode.

service_halt_timeout
The length of time, in seconds, Serviceguard will wait for the service to halt before forcing termination of
the service’s process. The maximum value is 4294.
The value should be large enough to allow any cleanup required by the service to complete.
If no value is specified, a zero timeout will be assumed, meaning that Serviceguard will not wait at all
before terminating the process.

generic_resource_name
Defines the logical name used to identify a generic resource in a package. This name corresponds to the
generic resource name used by the cmgetresource(1m) and cmsetresource(1m) commands.
Multiple generic_resource_name entries can be specified in a package.
The length and formal restrictions for the name are the same as for package_name.
Each name must be unique within a package, but a single resource can be specified across multiple
packages.
You can configure a maximum of 100 generic resources per cluster.
Each generic resource is defined by three parameters:

• generic_resource_name
• generic_resource_evaluation_type
• generic_resource_up_criteria

See the descriptions that follow.
The following is an example of defining generic resource parameters:
generic_resource_name cpu_monitor
generic_resource_evaluation_type during_package_start
generic_resource_up_criteria <50
See the package configuration file for more examples.

generic_resource_evaluation_type
Defines when the status of a generic resource is evaluated.
Valid values are during_package_start and before_package_start. The default is
during_package_start.
Resources that become available only while the package is starting must be configured with
generic_resource_evaluation_type set to during_package_start.
Monitoring for these generic resources can be started and stopped as a part of the package, and the
monitoring script can be configured as a service. This can be achieved by configuring a service_name
and a service_cmd containing the full path name of the monitoring executable/script. The monitoring of
the generic resource starts only when the monitoring scripts are started and not at the start of the
package.
For information on monitoring scripts, see Monitoring Script for Generic Resources.
If there is a common generic resource that needs to be monitored as a part of multiple packages, then the
monitoring of the generic resources can be started and stopped as a part of the cluster. The monitoring
script can be configured as a generic resource command. This can be achieved by configuring a
cluster_generic_resource and a generic_resource_cmd containing the full path name of the
monitoring script. The monitoring of the generic resource starts only when the monitoring scripts are
started as part of cluster start up.
For information on how to configure cluster generic resources, see Configuring Cluster Generic
Resources; for monitoring scripts, see Monitoring Script for Cluster Generic Resources.

generic_resource_up_criteria
Defines a criterion to determine whether the status of a generic resource identified by
generic_resource_name is up.
This attribute requires a logical operator and a value. The operators ==, !=, >, <, >=, and <= are allowed.
Values must be positive integer values ranging from 1 to 2147483647.

NOTE: Operators other than the ones mentioned above are not supported. This attribute does not accept
more than one up criterion. For example, >> 10, << 100 are not valid.
Though values ranging from 1 to 2147483647 can be entered with the operators listed above, the
following four conditions cannot be set:
< 1, > 2147483647, >= 1, and <= 2147483647
This is because:

• If you specify generic_resource_up_criteria < 1 or > 2147483647, no value that you can enter
satisfies the up criterion, so the resource can never be 'up'.
• Similarly, if you specify generic_resource_up_criteria >= 1 or <= 2147483647, the criterion is
always met, so the status is always 'up'. No value that you can enter fails the criterion and brings
the resource status to 'down'.

• generic_resource_up_criteria is an optional attribute. It determines whether a given generic resource
is a simple generic resource or an extended generic resource.
It is not specified for a simple resource, but is required for an extended resource.
• A single package can contain both simple and extended resources.
• A given resource cannot be configured as a simple generic resource in one package and as an
extended generic resource in another package. It must be either simple or extended in all packages.
• A single package can have a combination of generic resources of evaluation type
before_package_start and during_package_start.
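The rules for generic_resource_up_criteria lend themselves to a small checker. The sketch below is illustrative and is not how Serviceguard itself parses the attribute; the function names are hypothetical, and accepting optional whitespace around the operator is an assumption based on the two spellings (<50 and < 1) that appear in this section.

```python
import operator
import re

MAX_VALUE = 2147483647
_CRITERIA_RE = re.compile(r"\s*(==|!=|>=|<=|>|<)\s*([0-9]+)\s*")
_OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
        "<": operator.lt, ">=": operator.ge, "<=": operator.le}
# The four combinations that can never change the resource status.
_FORBIDDEN = {("<", 1), (">", MAX_VALUE), (">=", 1), ("<=", MAX_VALUE)}

def parse_up_criteria(text):
    """Return (operator, value) for a legal criterion, or None if it is not allowed."""
    match = _CRITERIA_RE.fullmatch(text)
    if match is None:
        return None                      # unknown operator or more than one criterion
    op, value = match.group(1), int(match.group(2))
    if not 1 <= value <= MAX_VALUE or (op, value) in _FORBIDDEN:
        return None
    return op, value

def is_up(criteria, current_value):
    """Evaluate the resource's current value against a legal criterion."""
    parsed = parse_up_criteria(criteria)
    if parsed is None:
        raise ValueError(f"invalid up criteria: {criteria!r}")
    op, threshold = parsed
    return _OPS[op](current_value, threshold)
```

With the criterion <50 from the earlier example, is_up("<50", 30) is true and is_up("<50", 70) is false.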

vmdk_file_name
Specifies the VMDK file name that represents the VMFS disk to be configured in the package. The
vmdk_file_name must be preceded by the name of the directory in which the VMDK file resides.
For example,
<directory_name_where_vmdk_file_resides>/<vmdk_file_name>
Legal values for vmdk_file_name:

• Any string that starts and ends with an alphanumeric character
• Contains only alphanumeric characters or dot (.), dash (-), or underscore (_)
• Must not contain space, tab, or any other special characters
• Maximum length is 80 characters

NOTE:

• Ensure that the VMDK file is already created on the ESXi host to which the VMware cluster nodes
are mapped.
• If you change the vmdk_file_name parameter in a host that is already configured as part of the
package, unexpected problems might result.

datastore_name
Specifies the name of the VMware datastore where the VMDK file resides. You must always configure
datastore_name on a shared disk.
Legal values for datastore_name:

• Any string that starts and ends with an alphanumeric character
• Contains only alphanumeric characters or dot (.), dash (-), or underscore (_)
• Must not contain space, tab, or any other special characters
• Maximum length is 80 characters

NOTE:

• Ensure that the datastore is already created on the ESXi host on which the VMware cluster nodes
are available.
• If you change the datastore_name parameter in a host that is already configured as part of the
package, unexpected problems might result.
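The legal-value rules for datastore_name can be expressed as one pattern plus a length check. This is a sketch for illustration, not the validation that cmcheckconf performs; valid_datastore_name is a hypothetical helper, and treating "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re

# First and last characters alphanumeric; dot, dash, and underscore allowed in between.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9._-]*[A-Za-z0-9])?")

def valid_datastore_name(name):
    """Apply the character rules and the 80-character limit listed above."""
    return len(name) <= 80 and bool(_NAME_RE.fullmatch(name))
```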

scsi_controller
Specifies the SCSI controller device information that describes the SCSI bus number and the slot on
which the VMDK file will be connected. For example, if the xyz.vmdk file uses SCSI controller 1 and
slot 1, then the scsi_controller value will be 1:1.
Legal values for scsi_controller:
A numeric value pair of the form X:Y, where X is the SCSI controller (0-3) and Y is the slot number
(0-6, 8-15) as used in the VMDK configuration on the host.

NOTE:
Ensure that the SCSI controller is already created on the VMware cluster nodes on which the package
will be configured. The scsi_controller values that are added as part of the package configuration file
must be available on all VMware cluster nodes.
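A short sketch of the X:Y rule may help when checking values before editing the package configuration file. It is illustrative only; parse_scsi_controller is a hypothetical helper and not a Serviceguard command.

```python
import re

_SCSI_RE = re.compile(r"([0-9]+):([0-9]+)")

def parse_scsi_controller(text):
    """Return (controller, slot) for a legal X:Y value, or None otherwise."""
    match = _SCSI_RE.fullmatch(text)
    if match is None:
        return None
    controller, slot = int(match.group(1)), int(match.group(2))
    if not 0 <= controller <= 3:
        return None                      # controller must be 0-3
    if not 0 <= slot <= 15 or slot == 7:
        return None                      # slot must be 0-6 or 8-15
    return controller, slot
```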

disk_type
Specifies the type of virtual disk to be added to the package. Legal values for disk_type can be RDM or
VMFS.

NOTE:
The disk type is selected when the VMDK file is created. The VMDK can be of type Virtual
disk or RDM. If the VMDK is a Virtual disk, specify VMFS in the package configuration. If the
VMDK is an RDM disk, specify RDM in the package configuration.

vgchange_cmd
Specifies the method of activation for each Logical Volume Manager (LVM) volume group identified by a
vg entry.
The default is vgchange -a y.

vxvol_cmd
Specifies the method of recovery for mirrored VxVM volumes.
If recovery is found to be necessary during package startup, by default the script will pause until the
recovery is complete. To change this behavior, comment out the line
vxvol_cmd "vxvol -g \${DiskGroup} startall"
in the configuration file and uncomment the line
vxvol_cmd "vxvol -g \${DiskGroup} -o bg startall"
This allows package startup to continue while mirror re-synchronization is in progress.

vg
Specifies an LVM volume group (one per vg, each on a new line) on which a file system (see fs_type)
needs to be mounted. A corresponding vgchange_cmd (see above) specifies how the volume group is to
be activated. The package script generates the necessary filesystem commands on the basis of the fs_
parameters (see File system parameters).

vxvm_dg
Specifies a VxVM disk group (one per vxvm_dg, each on a new line) on which a file system needs to be
mounted. See the comments in the package configuration file and Creating a Storage Infrastructure
with VxVM on page 190, for more information.

vxvm_dg_retry
Specifies whether to retry the import of a VxVM disk group, using vxdisk scandisks to check for any
missing disks that might have caused the import to fail.
Legal values are yes and no. yes means vxdisk scandisks will be run in the event of an import
failure. The default is no.

IMPORTANT:
vxdisk scandisks can take a long time in the case of a large IO subsystem.

deactivation_retry_count
Specifies the number of times the package shutdown script will repeat an attempt to deactivate a VxVM
disk group, and the minimum number of times it will attempt to deactivate an LVM volume group, before
failing. Legal value is zero or any greater number. The default is 2.

kill_processes_accessing_raw_devices
Specifies whether or not to kill processes that are using raw devices (for example, database applications)
when the package shuts down. Default is no. See the comments in the package configuration file for
more information.

File system parameters


A package can activate one or more storage groups on startup, and mount logical volumes to file
systems. At halt time, the package script unmounts the file systems and deactivates each storage group.
All storage groups must be accessible on each target node.
For each file system (fs_name) you specify in the package configuration file, you must identify a logical
volume, the mount point, the mount, umount and fsck options, and the type of the file system; for
example:
fs_name /dev/vg01/lvol1
fs_directory /pkg01aa
fs_mount_opt "-o rw"
fs_umount_opt ""
fs_fsck_opt ""
fs_type "ext3"
A logical volume must be built on an LVM volume group. Logical volumes can be entered in any order.
For an NFS-imported file system, see the discussion under fs_name and fs_server.
The parameter explanations that follow provide more detail.

concurrent_fsck_operations
The number of concurrent fsck operations allowed on file systems being mounted during package
startup.
Legal value is any number greater than zero. The default is 1.
If the package needs to run fsck on a large number of file systems, you can improve performance by
carefully tuning this parameter during testing (increase it a little at a time and monitor performance each
time).

fs_mount_retry_count
The number of mount retries for each file system. Legal value is zero or any greater number. The default
is zero.
If the mount point is busy at package startup and fs_mount_retry_count is set to zero, package startup
will fail.
If the mount point is busy and fs_mount_retry_count is greater than zero, the startup script will attempt to
kill the user process responsible for the busy mount point (fuser -ku) and then try to mount the file
system again. It will do this the number of times specified by fs_mount_retry_count.
If the mount still fails after the number of attempts specified by fs_mount_retry_count, package startup will
fail.
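The retry behavior described above can be modeled in a few lines. This is a simplified sketch and not the package startup script itself; the function and callback names are hypothetical, try_mount stands in for the mount attempt, and free_mount_point stands in for the fuser -ku step.

```python
def mount_with_retries(try_mount, free_mount_point, retry_count):
    """Return True if the file system mounts, False if package startup would fail.

    try_mount() returns True on success and False when the mount point is busy.
    """
    if try_mount():
        return True
    for _ in range(retry_count):         # zero retries means fail immediately
        free_mount_point()               # kill the processes holding the mount point
        if try_mount():
            return True
    return False
```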

fs_umount_retry_count
The number of umount retries for each file system.

Legal value is 1 or any greater number. The default is 1. Operates in the same way as
fs_mount_retry_count.

fs_name
This parameter, in conjunction with fs_directory, fs_type, fs_mount_opt, fs_umount_opt, and fs_fsck_opt,
specifies a filesystem that is to be mounted by the package.
fs_name must specify the block device file for a logical volume.
For an NFS-imported file system, the additional parameters required are fs_server, fs_directory, fs_type,
and fs_mount_opt; see fs_server for an example.

CAUTION: Before configuring an NFS-imported file system into a package, make sure you have
read and understood the rules and guidelines under Planning for NFS-mounted File Systems,
and configured the cluster parameter CONFIGURED_IO_TIMEOUT_EXTENSION, described under
Cluster Configuration Parameters on page 111.

File systems are mounted in the order you specify in the package configuration file, and unmounted in the
reverse order.
See File system parameters and the comments in the FILESYSTEMS section of the configuration file for
more information and examples. See also Volume Manager Planning, and the mount manpage.

NOTE: For filesystem types (see fs_type), a volume group must be defined in this file (using vg; see vg)
for each logical volume specified by an fs_name entry.

fs_server
The name or IP address (IPv4 or IPv6) of the NFS server for an NFS-imported file system. In this case,
you must also set fs_type to nfs, and fs_mount_opt to -o llock on HP-UX or -o local_lock=all on Linux.
fs_name specifies the directory to be imported from fs_server, and fs_directory specifies the local mount
point.
For example:
fs_name /var/opt/nfs/share1
fs_server wagon
fs_directory /nfs/mnt/share1
fs_type nfs
#fs_mount_opt -o local_lock=all
#fs_umount_opt
#fs_fsck_opt

NOTE:
fs_umount_opt is optional and fs_fsck_opt is not used for an NFS-imported file system. (Both are left
commented out in this example.)

fs_directory
The root of the file system specified by fs_name.
See the mount manpage and the comments in the configuration file for more information.

fs_type
The type of the file system specified by fs_name.
For an NFS-imported file system, this must be set to nfs. See the example under fs_server.

File System Types, Commands, and Platforms lists the supported file system types, commands, and
platforms.

Table 11: File System Types, Commands, and Platforms

File system type    fsck command    Supported platform
ext3                e2fsck          Red Hat Enterprise Linux 5
                                    Red Hat Enterprise Linux 6
                                    SUSE Linux Enterprise Server 11
ext4                e4fsck          Red Hat Enterprise Linux 5 (1)
                                    Red Hat Enterprise Linux 6
XFS                 xfs_repair      Red Hat Enterprise Linux 6
                                    SUSE Linux Enterprise Server 11
Btrfs (2)           btrfsck         Red Hat Enterprise Linux 6 and later
                                    SUSE Linux Enterprise Server 11 SP2
VxFS file system    fsck            Red Hat Enterprise Linux 5
                                    Red Hat Enterprise Linux 6
                                    SUSE Linux Enterprise Server 11 SP2 and SP3

(1) This is supported from SGLX_00354.tar.shar patch and later.
(2) Btrfs is a new copy-on-write (CoW) filesystem for Linux aimed at implementing advanced features
while focusing on fault tolerance, repair, and administration.

WARNING: ext4 file system has a delayed allocation mechanism. Hence, the behavior of writing
files to disk is different from ext3. Unlike ext3, the ext4 file system does not write data to disk on
committing the transaction, so it takes longer for the data to be written to the disk. Your program
must use data integrity calls such as fsync() to ensure that data is written to the disk.
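As an illustration of the data integrity call mentioned in the warning, the sketch below writes a file and forces it to disk before returning. The helper name write_durably is hypothetical; os.fsync is the standard Python wrapper for fsync().

```python
import os

def write_durably(path, data):
    """Write bytes to path and force them to disk before returning."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()                   # push Python's buffer to the operating system
        os.fsync(handle.fileno())        # ask the OS to commit the data to disk
```

Calling flush() alone only moves the data into the operating system's cache; with delayed allocation, it is the fsync() call that ensures the data has reached the disk.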

See the comments in the package configuration file template for more information.

fs_mount_opt
The mount options for the file system specified by fs_name. See the comments in the configuration file
for more information.

fs_umount_opt
The umount options for the file system specified by fs_name. See the comments in the configuration file
for more information.

fs_fsck_opt
The fsck options for the file system specified by fs_name (see fs_type).

NOTE: When using the XFS file system, you must use the xfs_repair command options instead of fsck
command options for this parameter.

For more information, see the fsck and xfs_repair manpages, and the comments in the configuration
file.

pv
Physical volume on which persistent reservations (PR) will be made if the device supports it.

IMPORTANT:

This parameter is for use only by Hewlett Packard Enterprise partners, who should follow the
instructions in the package configuration file.

For information about Serviceguard's implementation of PR, see About Persistent Reservations.

pev_
Specifies a package environment variable that can be passed to external_pre_script, external_script, or
both, by means of the cmgetpkgenv command.
The variable name must be in the form pev_<variable_name> and contain only alphanumeric
characters and underscores. The letters pev (upper-case or lower-case) followed by the underscore (_)
are required.
The variable name and value can each consist of a maximum of MAXPATHLEN characters (4096 on
Linux systems).
You can define more than one variable. See About External Scripts, as well as the comments in the
configuration file, for more information.
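The naming rules for package environment variables can be checked with a short pattern. This sketch is for illustration and is not part of Serviceguard; valid_pev is a hypothetical helper, requiring at least one character after the prefix is an assumption, and the variable name in the usage note is an example only.

```python
import re

MAXPATHLEN = 4096  # value stated above for Linux systems

# "pev" in upper or lower case, an underscore, then letters, digits, or underscores.
_PEV_RE = re.compile(r"[Pp][Ee][Vv]_[A-Za-z0-9_]+")

def valid_pev(name, value):
    """Check the name format and the length limit for both name and value."""
    return (
        len(name) <= MAXPATHLEN
        and len(value) <= MAXPATHLEN
        and bool(_PEV_RE.fullmatch(name))
    )
```

For example, valid_pev("pev_monitoring_interval", "30") is true, while a name such as pev-interval is rejected because of the dash.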

external_pre_script
The full pathname of an external script to be executed before volume groups and disk groups are
activated during package startup, and after they have been deactivated during package shutdown; that is,
effectively the first step in package startup and last step in package shutdown. New for modular
packages. For restrictions, see service_cmd.
If more than one external_pre_script is specified, the scripts will be executed on package startup in the
order they are entered into the package configuration file, and in the reverse order during package
shutdown.
See About External Scripts, as well as the comments in the configuration file, for more information and
examples.

external_script
The full pathname of an external script. For restrictions, see service_cmd. This script is often the means
of launching and halting the application that constitutes the main function of the package. New for
modular packages.
The script is executed on package startup after volume groups and file systems are activated and IP
addresses are assigned, but before services are started; and during package shutdown after services are
halted but before IP addresses are removed and volume groups and file systems deactivated.
If more than one external_script is specified, the scripts will be executed on package startup in the order
they are entered into this file, and in the reverse order during package shutdown.
See About External Scripts, as well as the comments in the configuration file, for more information and
examples. See also service_cmd.
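The ordering rules for external_pre_script and external_script can be summarized as two step lists. This is a sketch of the sequence as described in this section, not an excerpt from the package control script; the function names are hypothetical, and the relative order of storage activation and IP address assignment is an assumption, since the text above states only that both happen before external_script runs.

```python
def startup_steps(external_pre_scripts, external_scripts):
    """Order of events on package startup; scripts run in configuration-file order."""
    return (
        [f"start {script}" for script in external_pre_scripts]
        + ["activate volume groups and file systems", "assign IP addresses"]
        + [f"start {script}" for script in external_scripts]
        + ["start services"]
    )

def shutdown_steps(external_pre_scripts, external_scripts):
    """Order of events on package shutdown; scripts run in reverse order."""
    return (
        ["halt services"]
        + [f"stop {script}" for script in reversed(external_scripts)]
        + ["remove IP addresses", "deactivate volume groups and file systems"]
        + [f"stop {script}" for script in reversed(external_pre_scripts)]
    )
```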

user_host
The system from which a user specified by user_name can execute package-administration
commands.
Legal values are any_serviceguard_node, or cluster_member_node, or a specific cluster node. If
you specify a specific node it must be the official hostname (the hostname portion, and only the
hostname portion, of the fully qualified domain name). As with user_name, be careful to spell the
keywords exactly as given.

user_name
Specifies the name of a user who has permission to administer this package. See also user_host
and user_role; these three parameters together define the access control policy for this
package (see Controlling Access to the Cluster). These parameters must be defined in this order:
user_name, user_host, user_role.
Legal values for user_name are any_user or a maximum of eight login names from /etc/passwd on
user_host.

NOTE: Be careful to spell any_user exactly as given; otherwise Serviceguard will interpret it as a user
name.

Note that the only user_role that can be granted in the package configuration file is package_admin for
this particular package; you grant other roles in the cluster configuration file. See Setting up Access-
Control Policies for further discussion and examples.

user_role
Must be package_admin, allowing the user access to the cmrunpkg, cmhaltpkg, and cmmodpkg
commands (and the equivalent functions in Serviceguard Manager) and to the monitor role for the
cluster. See Controlling Access to the Cluster for more information.

Generating the Package Configuration File


When you have chosen the configuration modules your package needs (see Choosing Package
Modules), you are ready to generate a package configuration file that contains those modules. This file
will consist of a base module (failover, multi-node or system multi-node) plus the modules that contain the
additional parameters you have decided to include.

Before You Start


Before you start building a package, create a subdirectory for it in the $SGCONF directory, for example:
mkdir $SGCONF/pkg1
(See Understanding the Location of Serviceguard Files on page 169 for information about
Serviceguard pathnames.)

cmmakepkg Examples
The cmmakepkg command generates a package configuration file. Some examples follow; see the
cmmakepkg (1m) manpage for complete information. All the examples create an editable configuration
file pkg1.conf in the $SGCONF/pkg1 directory.

NOTE: If you do not include a base module (or default or all) on the cmmakepkg command line,
cmmakepkg will ignore the modules you specify and generate a default configuration file containing all the
parameters.
For a complex package, or if you are not yet sure which parameters you will need to set, the default may
be the best choice; see the first example below.
You can use the -v option with cmmakepkg to control how much information is displayed online or
included in the configuration file. Valid values are 0, 1 and 2. -v 0 removes all comments; -v 1 includes a
brief heading for each parameter; -v 2 provides a full description of each parameter. The default is level
2.

• To generate a configuration file that contains all the optional modules:


cmmakepkg $SGCONF/pkg1/pkg1.conf

• To create a generic failover package (that could be applied without editing):


cmmakepkg -n pkg1 -m sg/failover $SGCONF/pkg1/pkg1.conf

• To generate a configuration file for a failover package that uses relocatable IP addresses and runs an
application that requires file systems to be mounted at run time (enter the command all on one line):
cmmakepkg -m sg/failover -m sg/package_ip -m sg/service -m
sg/filesystem -m sg/volume_group $SGCONF/pkg1/pkg1.conf

• To generate a configuration file adding the generic resources module to an existing package
(enter the command all on one line):
cmmakepkg -i $SGCONF/pkg1/pkg1.conf -m sg/generic_resource

• To generate a configuration file for a failover package that runs an application that requires another
package to be up (enter the command all on one line):
cmmakepkg -m sg/failover -m sg/dependency -m sg/service
$SGCONF/pkg1/pkg1.conf

• To generate a configuration file adding the services module to an existing package (enter the
command all on one line):
cmmakepkg -i $SGCONF/pkg1/pkg1.conf -m sg/service
$SGCONF/pkg1/pkg1_v2.conf

NOTE: You can add more than one module at a time.

• To generate a configuration file adding the Persistent Reservation module to an existing package:
cmmakepkg -i $SGCONF/pkg1/pkg1.conf -m sg/pr_cntl

• To create a serviceguard-xdc package in a serviceguard-xdc environment:


cmmakepkg -m sg/all -m xdc/xdc pkg_xdc.conf
cmcheckconf -P pkg_xdc.conf
cmapplyconf -P pkg_xdc.conf
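
The rule in the NOTE above (modules named without a base module are ignored) can be checked before the command is run. The following is a minimal sketch, not part of Serviceguard: the helper name build_cmmakepkg is hypothetical, and the base module names sg/multi_node and sg/system_multi_node are assumed by analogy with sg/failover, so check them against the cmmakepkg (1m) manpage.

```python
# Modules that can serve as the base of a package configuration.
# sg/failover and sg/all appear in this chapter; the two multi-node names
# are assumptions and should be checked against cmmakepkg (1m).
BASE_MODULES = {"sg/failover", "sg/multi_node", "sg/system_multi_node", "sg/all"}


def build_cmmakepkg(output_file, modules=(), input_file=None, verbosity=None):
    """Return the cmmakepkg argument list for the given modules.

    Raises ValueError if modules are named without a base module and no
    existing configuration file is supplied with -i.
    """
    args = ["cmmakepkg"]
    if verbosity is not None:
        if verbosity not in (0, 1, 2):
            raise ValueError("verbosity must be 0, 1 or 2")
        args += ["-v", str(verbosity)]
    if input_file is not None:
        # Adding modules to an existing package: the base module is already
        # in the input file, so none is required on the command line.
        args += ["-i", input_file]
    elif modules and not BASE_MODULES.intersection(modules):
        raise ValueError(
            "no base module given; cmmakepkg would ignore " + ", ".join(modules)
        )
    for module in modules:
        args += ["-m", module]
    args.append(output_file)
    return args


# Example: a failover package that also runs a service.
print(" ".join(build_cmmakepkg("pkg1.conf", ["sg/failover", "sg/service"])))
```

The returned list can be passed to subprocess.run on a cluster node; building it separately makes the base-module check easy to test without Serviceguard installed.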

Next Step
The next step is to edit the configuration file you have generated; see Editing the Configuration File.



Editing the Configuration File
When you have generated the configuration file that contains the modules your package needs (see
Generating the Package Configuration File), you need to edit the file to set the package parameters to
the values that will make the package function as you intend.
It is a good idea to configure complex failover packages in stages, as follows:

1. Configure volume groups and mount points only.


2. Check and apply the configuration; see Verifying and Applying the Package Configuration.
3. Run the package and ensure that it can be moved from node to node.

NOTE: cmcheckconf and cmapplyconf check for missing mount points, volume groups, etc.

4. Halt the package.


5. Configure package IP addresses and application services.
6. Run the package and ensure that applications run as expected and that the package fails over
correctly when services are disrupted. See Testing the Package Manager.

Use the following bullet points as a checklist, referring to the Package Parameter Explanations, and the
comments in the configuration file itself, for detailed specifications for each parameter.

NOTE: Optional parameters are commented out in the configuration file (with a # at the beginning of the
line). In some cases these parameters have default values that will take effect unless you uncomment the
parameter (remove the #) and enter a valid value different from the default. Read the surrounding
comments in the file, and the explanations in this chapter, to make sure you understand the implications
both of accepting and of changing a given default.
In all cases, be careful to uncomment each parameter you intend to use and assign it the value you want
it to have.
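
Because commented-out parameters fall back to their defaults, it can help to list exactly which parameters are active in a file before you check and apply it. The following is a minimal sketch under the assumption that each active line has the form "parameter_name value", separated by whitespace; the function name active_parameters is hypothetical and the parser ignores any other syntax the file may allow.

```python
def active_parameters(text):
    """Return (name, value) pairs for the uncommented parameter lines."""
    params = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and lines commented out with a leading #.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        params.append((parts[0], value))
    return params


sample = """\
package_name pkg1
# auto_run yes
node_name node1
node_name node2
#failover_policy configured_node
"""

for name, value in active_parameters(sample):
    print(name, value)
```

In the sample, auto_run and failover_policy are commented out, so only package_name and the two node_name entries are reported as active.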

• package_name. Enter a unique name for this package. Note that there are stricter formal requirements
for the name as of A.11.18.
• package_type. Enter failover or multi_node. (system_multi_node is reserved for special-purpose
packages supplied by HP.) Note that there are restrictions if another package depends on this
package; see About Package Dependencies. See Types of Package: Failover, Multi-Node, System
Multi-Node for more information.
• node_name. Enter the name of each cluster node on which this package can run, with a separate
entry on a separate line for each node.
• auto_run. For failover packages, enter yes to allow Serviceguard to start the package on the first
available node specified by node_name, and to automatically restart it later if it fails. Enter no to keep
Serviceguard from automatically starting the package.
• node_fail_fast_enabled. Enter yes to cause the node to be halted (system halt) if the package fails;
otherwise enter no.

• run_script_timeout and halt_script_timeout. Enter the number of seconds Serviceguard should wait for
package startup or shutdown, respectively, to complete; or leave the default, no_timeout. See
run_script_timeout.
• successor_halt_timeout. Used if other packages depend on this package; see About Package
Dependencies.



• script_log_file.
• log_level.
• failover_policy. Enter configured_node or min_package_node. (This parameter can be set for
failover packages only.)
• failback_policy. Enter automatic or manual. (This parameter can be set for failover packages only.)

• If this package will depend on another package or packages, enter values for dependency_name,
dependency_condition, dependency_location, and optionally priority.
See About Package Dependencies for more information.

NOTE: The package(s) this package depends on must already be part of the cluster configuration by
the time you validate this package (via cmcheckconf; see Verifying and Applying the Package
Configuration); otherwise validation will fail.

• To configure package weights, use the weight_name and weight_value parameters. See About
Package Weights for more information.
• Use monitored_subnet to specify a subnet to be monitored for this package. If there are multiple
subnets, repeat the parameter as many times as needed, on a new line each time.
In a cross-subnet configuration, configure the additional monitored_subnet_access parameter for each
monitored_subnet as necessary; see About Cross-Subnet Failover for more information.

• If your package will use relocatable IP addresses, enter the ip_subnet and ip_address values. See
the parameter descriptions for rules and restrictions.
In a cross-subnet configuration, configure the additional ip_subnet_node parameter for each ip_subnet
as necessary; see About Cross-Subnet Failover for more information.

• For each service the package will run:

◦ enter the service_name (for example, a daemon or long-running process)


◦ enter the service_cmd (for example, the command that starts the process)
◦ enter values for service_fail_fast_enabled and service_halt_timeout if you need to change them
from their defaults.
◦ enter a value for service_restart if you want the package to restart the service if it exits. (A value of unlimited can
be useful if you want the service to execute in a loop, rather than exit and halt the package.)

Include a service entry for disk monitoring if the package depends on monitored disks. Use entries
similar to the following:
service_name="cmresserviced_Pkg1"
service_cmd="$SGSBIN/cmresserviced /dev/sdd1"
service_restart=""
See Creating a Disk Monitor Configuration for more information.

• To monitor a crucial resource as part of a package using generic resources, enter values for the
following parameters:



◦ generic_resource_name to identify the generic resource in a package.
◦ generic_resource_evaluation_type to define whether the status of the generic resource must be
evaluated during or before the package is started.
◦ generic_resource_up_criteria to determine the status of a generic resource based on the specified
criterion.
See Configuring a Generic Resource for more information.

• If the package needs to activate LVM volume groups, configure vgchange_cmd, or leave the default.
• If the package needs to mount LVM volumes to file systems (see fs_type), use the vg parameters to
specify the names of the volume groups to be activated, and select the appropriate vgchange_cmd.
Use the fs_ parameters (see fs_name) to specify the characteristics of file systems and how and where to
mount them. See the comments in the FILESYSTEMS section of the configuration file for more
information and examples.
Enter each volume group on a separate line, for example:
vg vg01
vg vg02

• If your package mounts a large number of file systems, consider increasing the value of the following
parameter:

◦ concurrent_fsck_operations - specifies the number of parallel fsck operations that will be allowed
at package startup.

• Specify the filesystem mount and unmount retry options.


• You can use the pev_ parameter to specify a variable to be passed to external scripts. Make sure the
variable name begins with the upper-case or lower-case letters pev and an underscore (_). You can
specify more than one variable. See About External Scripts, and the comments in the configuration
file, for more information.
• If you want the package to run an external “pre-script” during startup and shutdown, use the
external_pre_script parameter (see external_pre_script) to specify the full pathname of the script, for
example, $SGCONF/pkg1/pre_script1.

• If the package will run an external script, use the external_script parameter (see external_script) to
specify the full pathname of the script, for example, $SGCONF/pkg1/script1.
See About External Scripts, and the comments in the configuration file, for more information.

• Configure the Access Control Policy for up to eight specific users or any_user.
The only user role you can configure in the package configuration file is package_admin for the
package in question. Cluster-wide roles are defined in the cluster configuration file. See Setting up
Access-Control Policies for more information.

• If you are using mirrored VxVM disks, use vxvol_cmd to specify the mirror recovery option to be used
by vxvol.

• You can specify a deactivation_retry_count for LVM and VxVM volume groups. See
deactivation_retry_count.

Adding or Removing a Module from an Existing Package


To add a module to an existing package, use the cmmakepkg command to generate a new configuration
file that adds the module to the existing configuration. Then set the parameters of the new module and
re-apply the package configuration.
For example, to add an external_script module to an existing package, say pkg1:

1. Obtain a copy of the package configuration file:


cmgetconf -p pkg1 pkg1.conf

2. Generate a new configuration file adding the external_script module to the existing package
pkg1:
cmmakepkg -i pkg1.conf -m sg/external_script pkg1_v2.conf

3. Edit the package configuration file and specify the external_script parameter.

4. Re-apply the package configuration:


cmapplyconf -P pkg1_v2.conf

To remove a module from an existing package, use the cmmakepkg command to generate a new
configuration file excluding the module that you want to remove. Then, copy the remaining package
attributes from the old configuration file to the new configuration file and re-apply the package
configuration.

Verifying and Applying the Package Configuration


Serviceguard checks the configuration you enter and reports any errors.
Use a command such as the following to verify the content of the package configuration file you have
created:
cmcheckconf -v -P $SGCONF/pkg1/pkg1.conf
Errors are displayed on the standard output. If necessary, re-edit the file to correct any errors, then run
cmcheckconf again until it completes without errors.
The following items are checked:

• Package name is valid, and at least one node_name entry is included.


• There are no duplicate parameter entries (except as permitted for multiple volume groups, etc.).
• Values for all parameters are within permitted ranges.
• Configured resources are available on cluster nodes.
• File systems and volume groups are valid.



NOTE: Ensure that the filesystem is clean before adding any filesystem to the package configuration
file. To do so, follow these steps:

1. Activate the volume group on which the filesystem is created:

vgchange --addtag <Fully qualified domain name of the node> <volume_group name>
vgchange -a y <volume_group name>

2. Run the filesystem-specific fsck command for ext2, ext3, and ext4 filesystems:
fsck_command <fs_name>
where fsck_command depends on the filesystem type.
If the fsck command reports that the filesystem is clean, you can add the filesystem to the package
configuration file. Otherwise, do not add the filesystem to the package configuration file. For
information about filesystem-specific fsck commands, see File System Types, Commands, and
Platforms.

3. If the filesystem type is xfs or btrfs, mount the filesystem:

◦ If the mount is successful, you can use the filesystem; unmount it with umount.

◦ If the mount is not successful, run the xfs_repair command for an xfs filesystem or the btrfsck
command for a btrfs filesystem. If the xfs_repair or btrfsck command succeeds, you can use
the filesystem in the package configuration file.

4. Deactivate the volume group:

vgchange -a n <volume_group name>

5. Remove the tag from the volume group:

vgchange --deltag <Fully qualified domain name of the node> <volume_group name>
• Services are executable.


• Any package that this package depends on is already part of the cluster configuration.

For more information, see the manpage for cmcheckconf (1m) and Verifying Cluster Analytics
Daemon.
When cmcheckconf has completed without errors, apply the package configuration, for example:
cmapplyconf -P $SGCONF/pkg1/pkg1.conf
This adds the package configuration information to the binary cluster configuration file in the $SGCONF
directory and distributes it to all the cluster nodes.
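
The check-then-apply sequence described above can be scripted so that cmapplyconf runs only after cmcheckconf completes without errors. The following is a minimal sketch: the function name check_and_apply and the injectable run parameter are hypothetical, and the sketch assumes that both commands signal failure with a non-zero exit status.

```python
import subprocess


def check_and_apply(conf_path, run=subprocess.run):
    """Verify a package configuration file, then apply it if the check passes.

    Returns True only if both commands exit with status 0 (assumed to
    indicate success). The run argument exists so the logic can be
    exercised without Serviceguard installed.
    """
    check = run(["cmcheckconf", "-v", "-P", conf_path])
    if check.returncode != 0:
        # Do not apply a configuration that failed verification.
        return False
    applied = run(["cmapplyconf", "-P", conf_path])
    return applied.returncode == 0
```

On a cluster node, check_and_apply("/path/to/pkg1.conf") would run the two commands in order; errors reported by cmcheckconf appear on the standard output as described above.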

NOTE: For modular packages, you now need to distribute any external scripts identified by the
external_pre_script and external_script parameters.



Alert Notification for Serviceguard Environment
Alert notification extends Serviceguard by sending email to configured addresses when any of a
predefined set of events occurs. You configure the addresses by editing the email_id
parameter in the package configuration file. This feature applies to the Oracle, NFS, Enterprise
Database Postgres Plus Advanced Server (EDB PPAS), and Sybase toolkits, and to serviceguard-xdc
packages.

NOTE: This feature is supported only for modular-style packages.

Oracle and NFS Toolkits Environment


For information about alert notification in Oracle and NFS toolkit environments, see the following
documents at http://www.hpe.com/info/linux-serviceguard-docs:

• HPE Serviceguard Toolkit for Oracle version A.05.01.10 on Linux User Guide
• HPE Serviceguard Toolkit for NFS version A.03.03.10 on Linux User Guide

serviceguard-xdc Environment
By default, the email_id parameter is present but commented out in the package configuration file for
serviceguard-xdc packages.
The email_id parameter must be used to provide email addresses of the serviceguard-xdc alert
notification recipients. Each email_id parameter can have one of the following values:

• A complete email address


• An alias
• A distribution list

You can also include multiple recipients by repeating the email_id parameter.
The serviceguard-xdc package can send an alert email:

• When a mirror half of the MD device becomes inaccessible.

• When the raid_monitor service cannot add back a mirror half of the MD device after the mirror half
becomes accessible.
• When a mirror half of the LVM RAID1 logical volume device becomes inaccessible.
• When the number of mirrored volumes of an LVM RAID1 logical volume is greater than two or equal to one.
• When a mirror half of the VxVM volume becomes inaccessible.
• When the number of plexes for a mirrored VxVM volume is greater than two or equal to one.

For example, consider the following scenario:


Suppose the xdcpkg package is running on node1 and the MD device configured in the package is
/dev/md0, whose mirror halves are /dev/hpdev/my_disk1 and /dev/hpdev/my_disk2, and for some
reason /dev/hpdev/my_disk2 becomes inaccessible. If the email_id specified in the package
configuration file is sgusername@example.com, the following email notification is sent to
that address:

Date: Tue, 9 Oct 2012 23:18:01 -0700


From: root <root@node1.hp.com>
Message-Id: <201210100618.q9A6I1d9023167@node1.hp.com>



To: sgusername@example.com
Subject: Serviceguard Alert: Package xdcpkg has lost access to my_disk2 of md0 on node1

Hi,

There seems to be an issue in the package xdcpkg in your Serviceguard cluster.


For more information, check the package and system logs of node1.hp.com.

Time of failure : Tue Oct 9 23:18:01 PDT 2012


Cluster Name : node1_cluster
Node name : node1.hp.com
Location of package log: /usr/local/cmcluster/run/log/xdcpkg.log

The mirror half /dev/hpdev/my_disk2 of MD device /dev/md0, which is configured in package xdcpkg, is not
accessible from node node1. Please rectify the issue.
Thanks.

Adding the Package to the Cluster


You can add the new package to the cluster while the cluster is running, subject to the value of
max_configured_packages in the cluster configuration file. See Adding a Package to a Running Cluster
on page 296.

How Control Scripts Manage VxVM Disk Groups


VxVM disk groups are outside the control of the Serviceguard cluster. The package control script uses
standard VxVM commands to import and deport these disk groups. (For more information on importing
and deporting disk groups, see the import and deport options in the vxdg man page.)
The control script imports disk groups using the vxdg command with the -tfC options. The -t option
specifies that the disk is imported with the noautoimport flag, which means that the disk will not be
automatically re-imported at boot time. Since disk groups included in the package control script are only
imported by Serviceguard packages, they should not be auto-imported.
The -f option allows the disk group to be imported even if one or more disks (a mirror, for example) is not
currently available. The -C option clears any existing host ID that might be written on the disk from a prior
activation by another node in the cluster. If the disk had been in use on another node which has gone
down with a TOC, then its host ID may still be written on the disk, and this needs to be cleared so the new
node’s ID can be written to the disk. Note that the host ID is not cleared on import if it is set
and matches a node that is not in a failed state. This prevents accidental import of a disk group
on multiple nodes, which could result in data corruption.

CAUTION: Although Serviceguard uses the -C option within the package control script framework,
this option should not normally be used from the command line. Troubleshooting Your Cluster
shows some situations where you might need to use -C from the command line.

The following example shows the command with the same options that are used by the control script:
# vxdg -tfC import dg_01
This command takes over ownership of all the disks in disk group dg_01, even if the disks currently
have a different host ID written on them. The command writes the current node’s host ID on all disks in disk
group dg_01 and sets the noautoimport flag for the disks. This flag prevents a disk group from being
automatically re-imported by a node following a reboot. If a node in the cluster fails, the host ID is still
written on each disk in the disk group. However, if the node is part of a Serviceguard cluster then on
reboot the host ID will be cleared by the owning node from all disks which have the noautoimport flag
set, even if the disk group is not under Serviceguard control. This allows all cluster nodes, which have
access to the disk group, to be able to import the disks as part of the cluster operation.
The control script also uses the vxvol startall command to start up the logical volumes in each disk
group that is imported.



Creating a Disk Monitor Configuration
Serviceguard provides disk monitoring for the shared storage that is activated by packages in the cluster.
The monitor daemon on each node tracks the status of all the disks on that node that you have
configured for monitoring.
The configuration must be done separately for each node in the cluster, because each node monitors only
the group of disks that can be activated on that node, and that depends on which packages are allowed
to run on the node.
To set up monitoring, include a monitoring service in each package that uses disks you want to track.
Remember that service names must be unique across the cluster; you can use the package name in
combination with the string cmresserviced. The following shows an entry in the package configuration
file for pkg1:
service_name cmresserviced_pkg1
service_fail_fast_enabled yes
service_halt_timeout 300
service_cmd "$SGSBIN/cmresserviced /dev/sdd1 /dev/sde1"
service_restart none

CAUTION: Because of a limitation in LVM, service_fail_fast_enabled must be set to yes, forcing the
package to fail over to another node if it loses its storage.

NOTE:

• The service_cmd entry must include the cmresserviced command. It is also important to set
service_restart to none.

• Hewlett Packard Enterprise recommends that if you use the cmresserviced command to monitor
VMFS disks, you configure it to monitor the logical volume path, the volume group name, or the
persistent device names created using UDEV, because these are persistent.
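
Because the monitoring entry follows a fixed pattern, it can be generated per package. The following is a minimal sketch that reproduces the pkg1 entry shown above; the function name disk_monitor_service is hypothetical, and the default halt timeout of 300 is taken from that example rather than from any documented default.

```python
def disk_monitor_service(package_name, devices, halt_timeout=300):
    """Return the service entry lines for a cmresserviced disk monitor."""
    if not devices:
        raise ValueError("at least one device is required")
    command = "$SGSBIN/cmresserviced " + " ".join(devices)
    return "\n".join([
        # Service names must be unique across the cluster, so the package
        # name is combined with the string cmresserviced.
        f"service_name cmresserviced_{package_name}",
        # Set to yes, as required by the CAUTION above.
        "service_fail_fast_enabled yes",
        f"service_halt_timeout {halt_timeout}",
        f'service_cmd "{command}"',
        # Set to none, as required by the NOTE above.
        "service_restart none",
    ])


print(disk_monitor_service("pkg1", ["/dev/sdd1", "/dev/sde1"]))
```

The generated text is meant to be pasted into the package configuration file and reviewed there; it does not replace cmcheckconf.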



Cluster and Package Maintenance
This chapter describes the cmviewcl command, then shows how to start and halt a cluster or an
individual node, how to perform permanent reconfiguration, and how to start, halt, move, and modify
packages during routine maintenance of the cluster. Topics are as follows:

• Reviewing Cluster and Package Status


• Managing the Cluster and Nodes
• Managing Packages and Services
• Reconfiguring a Cluster on page 285
• Reconfiguring a Package
• Responding to Cluster Events
• Single-Node Operation
• Removing Serviceguard from a System

Reviewing Cluster and Package Status


You can check the status using Serviceguard Manager, or from a cluster node’s command line.

Reviewing Cluster and Package Status with the cmviewcl Command


Information about cluster status is stored in the status database, which is maintained on each individual
node in the cluster. You can display information contained in this database by means of the cmviewcl
command:
cmviewcl -v
You can use the cmviewcl command without root access; in clusters running Serviceguard version
A.11.16 or later, grant access by assigning the Monitor role to the users in question. In earlier versions,
allow access by adding <nodename> <nonrootuser> to the cmclnodelist file.
cmviewcl -v displays information about all the nodes and packages in a running cluster, together with
the settings of parameters that determine failover behavior.

TIP: Some commands take longer to complete in large configurations. In particular, you can expect
Serviceguard’s CPU usage to increase during cmviewcl -v as the number of packages and
services increases.

See the manpage for a detailed description of other cmviewcl options.

Viewing Package Dependencies


The cmviewcl -v command output lists dependencies throughout the cluster. For a specific package’s
dependencies, use the -p <pkgname> option.

Cluster Status
The status of a cluster, as shown by cmviewcl, can be one of the following:



• up - At least one node has a running cluster daemon, and reconfiguration is not taking place.

• down - No cluster daemons are running on any cluster node.

• starting - The cluster is in the process of determining its active membership. At least one cluster
daemon is running.
• unknown - The node on which the cmviewcl command is issued cannot communicate with other
nodes in the cluster.

Node Status and State


The status of a node is either up (active as a member of the cluster) or down (inactive in the cluster),
depending on whether its cluster daemon is running or not. Note that a node might be down from the
cluster perspective, but still up and running Linux.
A node may also be in one of the following states:

• Failed. A node never sees itself in this state. Other active members of the cluster will see a node in
this state if the node is no longer active in the cluster, but is not shut down.
• Reforming. A node is in this state when the cluster is re-forming. The node is currently running the
protocols which ensure that all nodes agree to the new membership of an active cluster. If agreement
is reached, the status database is updated to reflect the new cluster membership.
• Running. A node in this state has completed all required activity for the last re-formation and is
operating normally.
• Halted. A node never sees itself in this state. Other nodes will see it in this state after the node has
gracefully left the active cluster, for instance with a cmhaltnode command.

• Unknown. A node never sees itself in this state. Other nodes assign a node this state if it has never
been an active cluster member.

Package Status and State


The status of a package can be one of the following:

• up - The package master control script is active.

• down - The package master control script is not active.

• start_wait - A cmrunpkg command is in progress for this package. The package is waiting for
packages it depends on (predecessors) to start before it can start.
• starting - The package is starting. The package master control script is running.

• halting - A cmhaltpkg command is in progress for this package and the halt script is running.

• halt_wait - A cmhaltpkg command is in progress for this package. The package is waiting to be
halted, but the halt script cannot start because the package is waiting for packages that depend on it
(successors) to halt. The parameter description for successor_halt_timeout provides more
information.
• failing - The package is halting because it, or a package it depends on, has failed.

• fail_wait - The package is waiting to be halted because the package or a package it depends on
has failed, but must wait for a package that depends on it to halt before it can halt.



• relocate_wait - The package’s halt script has completed or Serviceguard is still trying to place the
package.
• reconfiguring - The node where this package is running is adjusting the package configuration to
reflect the latest changes that have been applied.
• reconfigure_wait - The node where this package is running is waiting to adjust the package
configuration to reflect the latest changes that have been applied.
• detached - A package is said to be detached from the cluster or node where it was running when the
cluster or node is halted with the -d option. Serviceguard no longer monitors this package. The last
known status of the package before it was detached from the cluster was up.
• unknown - Serviceguard could not determine the status at the time cmviewcl was run.

A system multi-node package is up when it is running on all the active cluster nodes. A multi-node
package is up if it is running on any of its configured nodes.
A system multi-node package can have a status of changing, meaning the package is in transition on
one or more active nodes.
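
The two rules above for when a multi-node package counts as up can be written as a small predicate. This is an illustrative sketch of the stated rules only, not of how Serviceguard computes status internally; the function name and its arguments are hypothetical.

```python
def package_is_up(package_type, running_on, active_nodes=(), configured_nodes=()):
    """Apply the 'up' rules stated above for multi-node package types."""
    running = set(running_on)
    if package_type == "system_multi_node":
        # Up only when running on every active cluster node.
        return bool(active_nodes) and set(active_nodes) <= running
    if package_type == "multi_node":
        # Up when running on at least one of its configured nodes.
        return bool(running & set(configured_nodes))
    raise ValueError("rule not defined here for package type: " + package_type)
```

For example, a multi-node package configured on node1 and node2 and running only on node2 is up, whereas a system multi-node package in the same situation is not up if both nodes are active.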
The state of a package can be one of the following:

• starting - The package is starting. The package master control script is running.

• start_wait - A cmrunpkg command is in progress for this package. The package is waiting for
packages it depends on (predecessors) to start before it can start.
• running - Services are active and being monitored.

• halting - A cmhaltpkg command is in progress for this package and the halt script is running.

• halt_wait - A cmhaltpkg command is in progress for this package. The package is waiting to be
halted, but the halt script cannot start because the package is waiting for packages that depend on it
(successors) to halt. The parameter description for successor_halt_timeout provides more
information.
• halted - The package is down and halted.

• halt_aborted - The package was aborted during its normal halt sequence. For details, see the
cmhaltpkg (1m) manpage.

• failing - The package is halting because it, or a package it depends on, has failed.

• fail_wait - The package is waiting to be halted because the package or a package it depends on
has failed, but must wait for a package that depends on it to halt before it can halt.
• failed - The package is down and failed.

• relocate_wait - The package’s halt script has completed or Serviceguard is still trying to place the
package.
• maintenance - The package is in maintenance mode; see Maintaining a Package: Maintenance
Mode.
• detached - A package is said to be detached from the cluster when the cluster or node on which it
was running was halted with the -d option. All package components are up and running when a
package is detached. Serviceguard does not monitor packages in the detached state.
• reconfiguring - The node where this package is running is adjusting the package configuration to
reflect the latest changes that have been applied.
• reconfigure_wait - The node where this package is running is waiting to adjust the package
configuration to reflect the latest changes that have been applied.
• unknown - Serviceguard could not determine the state at the time cmviewcl was run.

The following states are possible only for multi-node packages:

• blocked - The package has never run on this node, either because a dependency has not been met,
or because auto_run is set to no.
• changing - The package is in a transient state, different from the status shown, on some nodes. For
example, a status of starting with a state of changing would mean that the package was starting
on at least one node, but in some other, transitory condition (for example, failing) on at least one
other node.

Package Switching Attributes


cmviewcl shows the following package switching information:

• AUTO_RUN: Can be enabled or disabled. For failover packages, enabled means that the package
starts when the cluster starts, and Serviceguard can switch the package to another node in the event
of failure.
For system multi-node packages, enabled means an instance of the package can start on a new
node joining the cluster (disabled means it will not).

• Switching Enabled for a Node: For failover packages, enabled means that the package can switch to
the specified node. disabled means that the package cannot switch to the specified node until the
node is enabled to run the package via the cmmodpkg command.
Every failover package is marked enabled or disabled for each node that is either a primary or
adoptive node for the package.
For multi-node packages, node switching disabled means the package cannot start on that node.

Service Status
Services have only status, as follows:

• Up. The service is being monitored.

• Down. The service is not running. It may not have started, or it may have halted or failed.

• Unknown. Serviceguard cannot determine the status.

Generic resource status for cluster and package


Generic resources for cluster and package have the following status:

• Up. The generic resource is up.

• Down. The generic resource is down.

• Unknown. Resource monitoring has not yet set the status of the resource.



Network Status
The network interfaces have only status, as follows:

• Up.

• Down.

• Unknown. Serviceguard cannot determine whether the interface is up or down.

Failover and Failback Policies


Failover packages can be configured with one of two values for the failover_policy parameter, as
displayed in the output of cmviewcl -v:

• configured_node. The package fails over to the next node in the node_name list in the package
configuration file.
• min_package_node. The package fails over to the node in the cluster with the fewest running
packages on it.

Failover packages can also be configured with one of two values for the failback_policy parameter, and
these are also displayed in the output of cmviewcl -v:

• automatic: Following a failover, a package returns to its primary node when the primary node
becomes available again.
• manual: Following a failover, a package will run on the adoptive node until moved back to its original
node by a system administrator.
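
The idea behind min_package_node can be sketched in a few lines of shell. The sketch below reads cmviewcl output on standard input, counts the packages with status up on each candidate node, and prints the node with the fewest. It assumes the column layout shown in the examples in this chapter. The function name fewest_packages_node is hypothetical, and breaking ties by the order in which nodes are listed is an assumption of this sketch, not documented Serviceguard behavior. Serviceguard makes the real decision itself; this only mirrors the counting.

```shell
# Sketch only: mirrors the counting behind min_package_node.
# Usage: cmviewcl | fewest_packages_node node1 node2 ...
fewest_packages_node() {
    awk -v nodes="$*" '
        BEGIN {
            # Candidate nodes start with a count of zero.
            n = split(nodes, cand, " ")
            for (i = 1; i <= n; i++) count[cand[i]] = 0
        }
        NF == 0 { next }                                   # skip blank lines
        $1 == "PACKAGE" && $2 == "STATUS" { in_pkgs = 1; next }
        in_pkgs && NF == 5 {
            # Package row: PACKAGE STATUS STATE AUTO_RUN NODE
            if ($2 == "up" && ($5 in count)) count[$5]++
            next
        }
        { in_pkgs = 0 }                                    # any other row ends the package list
        END {
            best = ""
            # Assumption: ties go to the node listed first.
            for (i = 1; i <= n; i++)
                if (best == "" || count[cand[i]] < count[best]) best = cand[i]
            if (best != "") print best, count[best]
        }
    '
}
```

For example, cmviewcl | fewest_packages_node ftsys9 ftsys10 prints the chosen node name followed by its count of running packages.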

Examples of Cluster and Package States


The following sample output from the cmviewcl -v command shows status for the cluster in the sample
configuration.

Normal Running Status


Everything is running normally; both nodes in the cluster are running, and the packages are in their
primary locations.

CLUSTER STATUS
example up
NODE STATUS STATE
ftsys9 up running
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1
Cluster Generic Resources:
NAME SCOPE TYPE STATUS / COMMAND CURRENT- MAX-CONFIGURED
VALUE STATUS RESTARTS RESTARTS
cpu_monitor node Extended 1 up 0 25
PACKAGE STATUS STATE AUTO_RUN NODE
pkg1 up running enabled ftsys9
Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual
Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service1
Service up 0 0 sfm_disk_monitor
Subnet up 0 0 15.13.168.0
Generic Resource up sfm_disk
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys9 (current)
Alternate up enabled ftsys10
NODE STATUS STATE
ftsys10 up running
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1
NAME SCOPE TYPE STATUS / COMMAND CURRENT- MAX-CONFIGURED
VALUE STATUS RESTARTS RESTARTS
cpu_monitor node Extended 1 up 0 25
PACKAGE STATUS STATE AUTO_RUN NODE
pkg2 up running enabled ftsys10
Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual
Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service2
Service up 0 0 sfm_disk_monitor1
Subnet up 0 0 15.13.168.0
Generic Resource up sfm_disk1
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys10 (current)
Alternate up enabled ftsys9

NOTE: The Script_Parameters section of the PACKAGE output of cmviewcl shows the Subnet
status only for the node that the package is running on. In a cross-subnet configuration, in which the
package may be able to fail over to a node on another subnet, that other subnet is not shown (see Cross-
Subnet Configurations).

Quorum Server Status


If the cluster is using a quorum server for tie-breaking services, the display shows the server name, state
and status following the entry for each node, as in the following excerpt from the output of cmviewcl -v:
CLUSTER STATUS
example up
NODE STATUS STATE
ftsys9 up running
Quorum Server Status:
NAME STATUS STATE
lp-qs up running
...
NODE STATUS STATE
ftsys10 up running
Quorum Server Status:
NAME STATUS STATE
lp-qs up running

Status After Halting a Package


After we halt pkg2 with the cmhaltpkg command, the output of cmviewcl -v is as follows:

CLUSTER STATUS
example up
NODE STATUS STATE
ftsys9 up running
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1
PACKAGE STATUS STATE AUTO_RUN NODE
pkg1 up running enabled ftsys9
Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual
Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service1
Service up 0 0 sfm_disk_monitor
Subnet up 0 0 15.13.168.0
Generic Resource up sfm_disk
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys9 (current)
Alternate up enabled ftsys10
NODE STATUS STATE
ftsys10 up running
Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1
UNOWNED_PACKAGES
PACKAGE STATUS STATE AUTO_RUN NODE
pkg2 down unowned disabled unowned
Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual
Script_Parameters:
ITEM STATUS NODE_NAME NAME
Service down service2
Generic Resource up ftsys9 sfm_disk1
Subnet up 15.13.168.0
Generic Resource up ftsys10 sfm_disk1
Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys10
Alternate up enabled ftsys9



pkg2 now has the status down, and it is shown as unowned, with package switching disabled. Note that
switching is enabled for both nodes, however. This means that once global switching is re-enabled for the
package, it will attempt to start up on the primary node.

NOTE: If you halt pkg2 with the cmhaltpkg command, and the package contains non-native
Serviceguard modules that failed during the normal halt process, then the package is moved to the
partially_down status and halt_aborted state. The command exits at this point. For more
information, see Handling Failures During Package Halt.
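
A monitoring script might scan cmviewcl output for packages whose status is not up, such as pkg2 in the example above. The following is a minimal sketch, assuming the five-column package rows shown in this chapter (PACKAGE STATUS STATE AUTO_RUN NODE); the function name list_packages_not_up is hypothetical.

```shell
# Sketch only: print name, status and state of every package that is not up.
# Usage: cmviewcl | list_packages_not_up
list_packages_not_up() {
    awk '
        NF == 0 { next }                                   # skip blank lines
        $1 == "PACKAGE" && $2 == "STATUS" { in_pkgs = 1; next }
        in_pkgs && NF == 5 {
            if ($2 != "up") print $1, $2, $3
            next
        }
        { in_pkgs = 0 }                                    # any other row ends the package list
    '
}
```

The function prints nothing when every package is up, so a calling script can treat empty output as the normal case.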

Status After Moving the Package to Another Node


If we use the following command:
cmrunpkg -n ftsys9 pkg2
the output of the cmviewcl -v command is as follows:
CLUSTER STATUS
example up

NODE STATUS STATE
ftsys9 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1

PACKAGE STATUS STATE AUTO_RUN NODE
pkg1 up running enabled ftsys9

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service1
Service up 0 0 sfm_disk_monitor
Subnet up 0 0 15.13.168.0
Generic Resource up sfm_disk

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys9 (current)
Alternate up enabled ftsys10

PACKAGE STATUS STATE AUTO_RUN NODE
pkg2 up running disabled ftsys9

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover configured_node
Failback manual

Script_Parameters:
ITEM STATUS MAX_RESTARTS RESTARTS NAME
Service up 0 0 service2
Service up 0 0 sfm_disk_monitor
Subnet up 0 0 15.13.168.0
Generic Resource up sfm_disk

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled ftsys10
Alternate up enabled ftsys9 (current)

NODE STATUS STATE
ftsys10 up running

Network_Parameters:
INTERFACE STATUS NAME
PRIMARY up eth0
PRIMARY up eth1

Status After Package Switching is Enabled


The following command changes package status back to Auto Run Enabled:
cmmodpkg -e pkg2
The output of the cmviewcl command is now as follows:
CLUSTER STATUS
example up
NODE STATUS STATE
ftsys9 up running
PACKAGE STATUS STATE AUTO_RUN NODE
pkg1 up running enabled ftsys9
pkg2 up running enabled ftsys9
NODE STATUS STATE
ftsys10 up running
Both packages are now running on ftsys9 and pkg2 is enabled for switching. ftsys10 is running the
daemon and no packages are running on ftsys10.

Status After Halting a Node


After halting ftsys10, with the following command:
cmhaltnode ftsys10
the output of cmviewcl is as follows on ftsys9:
CLUSTER STATUS
example up
NODE STATUS STATE
ftsys9 up running
PACKAGE STATUS STATE AUTO_RUN NODE
pkg1 up running enabled ftsys9
pkg2 up running enabled ftsys9
NODE STATUS STATE
ftsys10 down halted
This output can be seen on both ftsys9 and ftsys10.

Viewing Information about Unowned Packages


The following example shows packages that are currently unowned, that is, not running on any configured
node.
UNOWNED_PACKAGES

PACKAGE STATUS STATE AUTO_RUN NODE


PKG3 down halted enabled unowned

Policy_Parameters:
POLICY_NAME CONFIGURED_VALUE
Failover min_package_node
Failback automatic

Script_Parameters:
ITEM STATUS NODE_NAME NAME
Subnet up manx 192.8.15.0
Generic Resource unknown manx sfm_disk
Subnet up burmese 192.8.15.0
Generic Resource unknown burmese sfm_disk
Subnet up tabby 192.8.15.0
Generic Resource unknown tabby sfm_disk
Subnet up persian 192.8.15.0
Generic Resource unknown persian sfm_disk

Node_Switching_Parameters:
NODE_TYPE STATUS SWITCHING NAME
Primary up enabled manx
Alternate up enabled burmese
Alternate up enabled tabby
Alternate up enabled persian

Checking the Cluster Configuration and Components


Serviceguard provides tools that allow you to verify the cluster configuration and the state of its
components. In earlier releases, the cmcheckconf command was used to verify the cluster or package
configuration. For more information, see Verifying the Cluster Configuration.
Starting with Serviceguard A.11.20.20, the cmcheckconf command can be used at any time with the -v or
-v 2 option to verify the state of the cluster and package components that are already applied.
Starting with Serviceguard A.12.10.00, you can use the cmcheckconf command with the -V option to limit
verification to the cluster, to all packages or a subset of packages, or to the cluster together with all
or a subset of packages.
For more information, see the cmcheckconf (1m) manpage.



NOTE:

• You can consider setting up a cron (1m) job to run the cmcheckconf command regularly.

• The cmapplyconf command performs the same verification as the cmcheckconf command.

• The extent of logging can be controlled using the verbosity and log levels. The higher the verbosity
level, the more is logged. For example, to verify the cluster configuration and package files using
the -v (verbosity) option, use:

◦ cmcheckconf -v OR cmcheckconf -v 1 - This command displays error and warning messages (if any)
for the type of checks being performed, such as storage, network, and so on. It also displays the
status of each check as OK or FAIL.
◦ cmcheckconf -v 2 - This command displays error and warning messages along with informational
and success messages.

You can use the -V option to control the extent of verification.


The cmcheckconf -V all option verifies and displays the error and warning messages along with the
informational message about the checks performed on the cluster and all the packages configured in the
cluster.
The cmcheckconf -V cluster option verifies and displays the error and warning messages along
with the informational message about the checks that are performed on the cluster. When the -p
pkg_name_reference_file option is used along with cluster, the packages listed in
pkg_name_reference_file are verified in addition to the cluster.
The cmcheckconf -V package option verifies and displays the error and warning messages along with the
informational message about the checks that are performed on all the packages configured in the cluster.
When the -p pkg_name_reference_file option is used along with package, the checks are performed
only on the packages listed in pkg_name_reference_file.
For more information, see the cmcheckconf (1m) manpage.
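
The combinations of -V and -p described above can be summarized in a small helper that builds, but does not run, the corresponding command line. This is a sketch only: the function name build_verify_cmd is hypothetical, and it accepts -p only with the cluster and package scopes because those are the only combinations described here. Check the cmcheckconf (1m) manpage before relying on it.

```shell
# Sketch only: print the cmcheckconf command line for a verification scope.
# Usage: build_verify_cmd all|cluster|package [pkg_name_reference_file]
build_verify_cmd() {
    scope=$1
    pkg_file=${2:-}
    case $scope in
        all)
            # Cluster and every configured package.
            echo "cmcheckconf -V all"
            ;;
        cluster|package)
            if [ -n "$pkg_file" ]; then
                # Restrict package checks to the names in the reference file.
                echo "cmcheckconf -V $scope -p $pkg_file"
            else
                echo "cmcheckconf -V $scope"
            fi
            ;;
        *)
            echo "unknown scope: $scope" >&2
            return 1
            ;;
    esac
}
```

For example, build_verify_cmd cluster /tmp/pkg_names prints the command that verifies the cluster plus the packages named in /tmp/pkg_names (a hypothetical file path).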

Verifying Cluster and Package Components


The table describes how to verify each cluster and package component, the command or tool to use and
its description.



Table 12: Verifying Cluster and Package Components

Component (Context): Volume groups (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m). See also Verifying the Cluster
Configuration.
Description: Verifies the following:
• existence
• availability across all the nodes where the package is configured to run.
• same physical volumes across all the nodes where the package is configured to run.
• same volume group across all the nodes where the package is configured to run.
NOTE: The volume group verifications are ignored in serviceguard-xdc and Metrocluster environments.

Component (Context): Volume group activation protection (cluster)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: Verifies whether the volume group activation protection is enabled in the lvm.conf file.
For more information, see Enabling Volume Group Activation Protection.

Component (Context): LVM physical volumes (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: Verifies the consistency of the volume groups and physical volumes of the volume group
across all the nodes where the package is configured to run.

Component (Context): Quorum Server (cluster)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: These commands verify that the quorum server, if used, is running and all nodes are
authorized to access it; and, if more than one IP address is specified, that the quorum server is
reachable from all nodes through both the IP addresses.
If you enable the QS_SMART_QUORUM parameter, it verifies whether the cluster is configured with a
generic resource named sitecontroller_genres.

Component (Context): Lock LUN (cluster)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: These commands verify that all the cluster nodes are configured to use the same device as
lock LUN and that the lock LUN device file is a block device file.

Component (Context): File consistency (cluster)
Tool or Command; More Information: cmcheckconf (1m), cmcompare (1m).
IMPORTANT: See the manpage for differences in return codes from cmcheckconf without options versus
cmcheckconf -C
Description: To verify file consistency across all the nodes in the cluster:
1. Customize the $SGCONF/cmclfiles2check file.
2. Distribute it to all the nodes using the cmsync (1m) command.
3. Run the cmcheckconf, or cmcheckconf -C, or cmcheckconf -v {1|2} command.
For a subset of nodes, or to verify only specific characteristics such as ownership, content, and so
on, use the cmcompare (1m) command.

Component (Context): NTP server configuration (cluster)
Tool or Command; More Information: cmcheckconf (1m)
Description: The cmcheckconf -v {1|2} command verifies that NTP (Network Time Protocol) service is
enabled on each node in the cluster to ensure that the system time on all nodes is consistent,
resulting in consistent timestamps in log files and consistent behavior of message services.

Component (Context): Mount points (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: These commands verify that the mount-point directories specified in the package
configuration file exist on all nodes that can run the package.

Component (Context): Service commands (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: These commands verify that files specified by service commands exist and are executable.
Service commands whose paths are nested within an unmounted shared file system are not checked.

Component (Context): Package IP addresses (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)

Component (Context): File systems (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: For LVM only, commands verify that file systems are on the logical volumes identified by
the fs_name parameter.

Component (Context): External scripts and pre-scripts (modular package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: A non-zero return value from any script causes the commands to fail.

Component (Context): NFS server connectivity (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: If the package configuration file contains an NFS file system, it validates the following:
• Connectivity to the NFS server from all the package nodes.
• Export of share by the NFS server.
• The status of the NFS daemons on the NFS server.
NOTE: For the NFS file system mount to be successful, the NFS daemon must be running on the NFS
server.

Component (Context): VxVM disk groups (package)
Tool or Command; More Information: cmcheckconf (1m), cmapplyconf (1m)
Description: Commands check that each node has a working physical connection to the disks.

Setting up Periodic Cluster Verification


You can use cron (1m) to run cluster verification at a fixed time interval. Specify the commands to run
in a crontab file (For more information, see the crontab (1) manpage).

NOTE: The job must run on one of the nodes in the cluster. The crontab -e command is used to edit
the crontab file. This must be run as the root user, because only the root user can run cluster
verification. The cron (1m) command sets the job’s user and group IDs to those of the user who
submitted the job.

For example, the following script runs cluster verification and sends an email to admin@xyzcorp.com
when verification fails.
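#!/bin/sh
# Run cluster verification and keep the output for the notification mail.
cmcheckconf -v >/tmp/cmcheckconf.output
if [ $? -ne 0 ]
then
mailx -s "Cluster verification failed" admin@xyzcorp.com 2>&1 </tmp/cmcheckconf.output
fi

The following variant uses the -V all option to verify the cluster and all the packages:
#!/bin/sh
# Verify the cluster and all configured packages.
cmcheckconf -V all >/tmp/cmcheckconf.output
if [ $? -ne 0 ]
then
mailx -s "Cluster verification failed" admin@xyzcorp.com 2>&1 </tmp/cmcheckconf.output
fi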



To run this script from cron, use the crontab -e command and create an entry in the crontab file.
For example, the following entry runs the script at 8:00 a.m. and 8:00 p.m. every day:
0 8,20 * * * verification.sh
For more information, see the crontab (1) manpage.

Limitations
Serviceguard does not check for the following conditions:

• Proper configuration of Access Control Policies. For more information about Access Control Policies,
see Controlling Access to the Cluster.
• File systems configured to mount automatically on boot (that is, Serviceguard does not check
/etc/fstab).
• Uniqueness of volume group major and minor numbers.
• Proper functioning of redundant storage paths.
• Consistency of Kernel parameters and driver configurations across nodes.
• Mount point overlaps (such that one file system is obscured when another is mounted).
• Unreachable DNS server.
• Consistency of settings in .rhosts.

• Nested mount points.

Managing the Cluster and Nodes


This section describes the following tasks:

• Starting the Cluster When all Nodes are Down
• Adding Previously Configured Nodes to a Running Cluster
• Removing Nodes from Participation in a Running Cluster
• Halting the Entire Cluster
• Automatically Restarting the Cluster
• Halting a Node or the Cluster while Keeping Packages Running

In Serviceguard A.11.16 and later, these tasks can be performed by non-root users with the appropriate
privileges. See Controlling Access to the Cluster for more information about configuring access.
You can use Serviceguard Manager or the Serviceguard command line to start or stop the cluster, or to
add or halt nodes. Starting the cluster means running the cluster daemon on one or more of the nodes in
a cluster. You use different Serviceguard commands to start the cluster depending on whether all nodes
are currently down (that is, no cluster daemons are running), or whether you are starting the cluster
daemon on an individual node.
Note the distinction that is made in this chapter between adding an already configured node to the cluster
and adding a new node to the cluster configuration. An already configured node is one that is already
entered in the cluster configuration file; a new node is added to the cluster by modifying the cluster
configuration file.

NOTE: Manually starting or halting the cluster or individual nodes does not require access to the quorum
server, if one is configured. The quorum server is only used when tie-breaking is needed following a
cluster partition.

Starting the Cluster When all Nodes are Down


You can use Serviceguard Manager, or the cmruncl command as described in this section, to start the
cluster when all cluster nodes are down. Particular command options can be used to start the cluster
under specific circumstances.
The -v option produces the most informative output. The following starts all nodes configured in the
cluster and checks network connectivity among them:
cmruncl -v
The above command performs a full check of LAN connectivity among all the nodes of the cluster. Using
the -w none option allows the cluster to start more quickly but does not test connectivity. The
following starts all nodes configured in the cluster without a connectivity check:
cmruncl -v -w none
The -n option specifies a particular group of nodes. Without this option, all nodes will be started. The
following example starts up the locally configured cluster only on ftsys9 and ftsys10. (This form of the
command should only be used when you are sure that the cluster is not already running on any node.)
cmruncl -v -n ftsys9 -n ftsys10

CAUTION: Serviceguard cannot guarantee data integrity if you try to start a cluster with the
cmruncl -n command while a subset of the cluster's nodes are already running a cluster. If the
network connection is down between nodes, using cmruncl -n might result in a second cluster
forming, and this second cluster might start up the same applications that are already running on
the other cluster. The result could be two applications overwriting each other's data on the disks.

Adding Previously Configured Nodes to a Running Cluster


You can use Serviceguard Manager, or Serviceguard commands as shown, to bring a configured node up
within a running cluster.
Use the cmrunnode command to add one or more nodes to an already running cluster. Any node you
add must already be a part of the cluster configuration. The following example adds node ftsys8 to the
cluster that was just started with only nodes ftsys9 and ftsys10. The -v (verbose) option prints out all
the messages:
cmrunnode -v ftsys8
By default, cmrunnode will do network validation, making sure the actual network setup matches the
configured network setup. This is the recommended method. If you have recently checked the network
and find the check takes a very long time, you can use the -w none option to bypass the validation.
Since the node's cluster is already running, the node joins the cluster and packages may be started,
depending on the package configuration (see node_name). If the node does not find its cluster running,
or the node is not part of the cluster configuration, the command fails.

Removing Nodes from Participation in a Running Cluster


You can use Serviceguard Manager, or Serviceguard commands as shown below, to remove nodes from
operation in a cluster. This operation removes the node from cluster operation by halting the cluster
daemon, but it does not modify the cluster configuration. To remove a node from the cluster configuration
permanently, you must recreate the cluster configuration file. See the next section.



Halting a node is a convenient way of bringing it down for system maintenance while keeping its
packages available on other nodes. After maintenance, the package can be returned to its primary node.
See Moving a Failover Package .
To return a node to the cluster, use cmrunnode.

NOTE: Hewlett Packard Enterprise recommends that you remove a node from participation in the cluster
(by running cmhaltnode as shown below, or Halt Node in Serviceguard Manager) before running the
Linux shutdown command, especially in cases in which a packaged application might have trouble
during shutdown and not halt cleanly.

Using Serviceguard Commands to Remove a Node from Participation in a Running Cluster
Use the cmhaltnode command to halt one or more nodes in a cluster. The cluster daemon on the
specified node stops, and the node is removed from active participation in the cluster.
To halt a node with a running package, use the -f option. If a package was running that can be switched
to an adoptive node, the switch takes place and the package starts on the adoptive node. For example,
the following command causes the Serviceguard daemon running on node ftsys9 in the sample
configuration to halt and the package running on ftsys9 to move to ftsys10:
cmhaltnode -f -v ftsys9
This halts any packages running on the node ftsys9 by executing the halt instructions in each
package's master control script. ftsys9 is halted and the packages start on the adoptive node,
ftsys10.
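
Before halting a node with -f, it can be useful to see which packages are currently running on it and will therefore be halted or failed over. The following is a minimal sketch that filters cmviewcl output by the NODE column, assuming the five-column package rows shown in this chapter; the function name packages_on_node is hypothetical.

```shell
# Sketch only: list the packages that cmviewcl reports on a given node.
# Usage: cmviewcl | packages_on_node ftsys9
packages_on_node() {
    awk -v node="$1" '
        NF == 0 { next }                                   # skip blank lines
        $1 == "PACKAGE" && $2 == "STATUS" { in_pkgs = 1; next }
        in_pkgs && NF == 5 {
            # Package row: PACKAGE STATUS STATE AUTO_RUN NODE
            if ($5 == node) print $1
            next
        }
        { in_pkgs = 0 }                                    # any other row ends the package list
    '
}
```

Each package name is printed on its own line, so the result can be reviewed or passed to other commands before cmhaltnode is run.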

Halting the Entire Cluster


You can use Serviceguard Manager, or Serviceguard commands as shown below, to halt a running
cluster.
The cmhaltcl command can be used to halt the entire cluster. This command causes all nodes in a
configured cluster to halt their Serviceguard daemons. You can use the -f option to force the cluster to
halt even when packages are running. This command can be issued from any running node. Example:
cmhaltcl -f -v
This halts all the cluster nodes.

Automatically Restarting the Cluster


You can configure your cluster to automatically restart after an event, such as a long-term power failure,
which brought down all nodes in the cluster. This is done by setting AUTOSTART_CMCLD to 1 in the
$SGAUTOSTART file (see Understanding the Location of Serviceguard Files).
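
A quick way to confirm the setting on a node is to read it back from the file. The sketch below assumes that the $SGAUTOSTART file contains shell-style NAME=value lines (an assumption of this example) and treats the last uncommented AUTOSTART_CMCLD assignment as the effective one; the function name autostart_enabled is hypothetical.

```shell
# Sketch only: succeed if AUTOSTART_CMCLD is set to 1 in the given file.
# Usage: autostart_enabled "$SGAUTOSTART" && echo "autostart is on"
autostart_enabled() {
    # Assumption: the file holds NAME=value lines; the last assignment wins.
    value=$(sed -n 's/^[[:space:]]*AUTOSTART_CMCLD=\([^[:space:]#]*\).*/\1/p' "$1" | tail -n 1)
    [ "$value" = "1" ]
}
```

Commented-out assignments are ignored because the pattern only matches lines that begin with the variable name.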

Halting a Node or the Cluster while Keeping Packages Running
There may be circumstances where you want to do maintenance that involves halting a node, or the
entire cluster, without halting or failing over the affected packages. Such maintenance might consist of
anything short of rebooting the node or nodes, but a likely case is networking changes that will disrupt the
heartbeat.
New command options in Serviceguard A.11.20.00 (collectively known as Live Application Detach (LAD))
allow you to do this kind of maintenance while keeping the packages running. The packages are no
longer monitored by Serviceguard, but the applications continue to run. Packages in this state are called
detached packages.
When you have done the necessary maintenance, you can restart the node or cluster, and normal
monitoring will resume on the packages.

NOTE: Keep in mind that the purpose of the LAD capabilities is to allow you to do maintenance on one or
more nodes, or the entire cluster. If you want to do maintenance on individual packages, or on elements
of the cluster configuration that affect only one package, or a few packages, you should probably use
package maintenance mode; see Maintaining a Package: Maintenance Mode.

What You Can Do


• Halt a node (cmhaltnode (1m) with the -d option) without causing its running packages to halt or
fail over.
Until you restart the node (cmrunnode (1m)) these packages remain detached and are not
monitored by Serviceguard.

• Halt the cluster (cmhaltcl (1m) with the -d option) without causing its running packages to halt.
Until you restart the cluster (cmruncl (1m)) these packages remain detached and are not being
monitored by Serviceguard.

• Halt a detached package, including instances of detached multi-node packages.


• Restart normal package monitoring by restarting the node (cmrunnode) or the cluster (cmruncl).

• You can forcefully halt a detached node (cmhaltnode (1m)) with the -f option.

Rules and Restrictions


The following rules and restrictions apply.

• All the nodes in the cluster must be running Serviceguard A.11.20.10 or later.
• All the configured cluster nodes must be reachable by an available network.
• You must be the root user (superuser) to halt or start a node or cluster with Live Application Detach,
and to halt a detached package.
• Extended Distance Cluster (serviceguard-xdc) supports LAD for modular failover packages. For more
information, see “Creating a serviceguard-xdc Modular Package” in chapter 5 of HPE Serviceguard
Extended Distance Cluster for Linux A.12.00.40 Deployment Guide at
http://www.hpe.com/info/linux-serviceguard-docs.
• Live Application Detach is supported only with modular failover packages and modular multi-node
packages.

◦ You cannot use Live Application Detach if system multi-node packages are configured in the
cluster.
See Configuring Packages and Their Services for more information about package
types.



• You cannot detach a package that is in maintenance mode, and you cannot place a package into
maintenance mode if any of its dependent packages are detached. Also, you cannot put a detached
package in maintenance mode.
For more information about maintenance mode, see Maintaining a Package: Maintenance Mode.
For more information about dependencies, see About Package Dependencies.

• You cannot make configuration changes to a package or a cluster in which any packages are
detached.
cmapplyconf (1m) will fail.

• You cannot halt detached packages while the cluster is down.


If you have halted a node and detached its packages, you can log in as superuser on any other node
still running in the cluster and halt any of the detached packages. But if you have halted the cluster,
you must restart it, re-attaching the packages, before you can halt any of the packages.

• cmeval (1m) does not support Live Application Detach.


See Previewing the Effect of Cluster Changes for more information about cmeval.

• In preview mode (-t) cmrunnode and cmruncl can provide only a partial assessment of the effect of
re-attaching packages.
The assessment may not accurately predict the placement of packages that depend on the packages
that will be re-attached. For more information about preview mode, see Previewing the Effect of
Cluster Changes.

• cmmodpkg -e -t is not supported for a detached package.

• You cannot run a package that has been detached.


This could come up if you detect that a package has failed while detached (and hence not being
monitored by Serviceguard). Before you could restart the package on another node, you would need
to run cmhaltpkg (1m) to halt the package on the node where it is detached.

• You cannot halt a package that is in a transitory state such as STARTING or HALTING.

• A package that is in a DETACHED or MAINTENANCE state cannot be moved to a halt_aborted state
or vice versa.
For more information, see Handling Failures During Package Halt.

Additional Points To Note


Keep the following points in mind:

• When packages are detached, they continue to run, but without high availability protection.
Serviceguard does not detect failures of components of detached packages, and packages are not
failed over.



IMPORTANT: This means that you will need to detect any errors that occur while the package is
detached, and take corrective action by running cmhaltpkg to halt the detached package and
cmrunpkg (1m) to restart the package on another node.

• When you restart a node or cluster whose packages have been detached, the packages are re-
attached; that is, Serviceguard begins monitoring them again.
At this point, Serviceguard checks the health of the packages that were detached and takes any
necessary corrective action — for example, if a failover package has in fact failed while it was
detached, Serviceguard will halt it and restart it on another eligible node.

CAUTION: Serviceguard does not check LVM volume groups, mount points, and relocatable IP
addresses when re-attaching packages.

• cmviewcl (1m) reports the status and state of detached packages as detached.
This is true even if a problem has occurred since the package was detached and some or all of the
package components are not healthy or not running.

• Because Serviceguard assumes that a detached package has remained healthy, the package is
considered to be UP for dependency purposes.
This means, for example, that if you halt node1, detaching pkgA, and pkgB depends on pkgA to be
UP on ANY_NODE, pkgB on node2 will continue to run (or can start) while pkgA is detached. See
About Package Dependencies for more information about dependencies.

• As always, packages cannot start on a halted node or in a halted cluster.


• When a node that has detached packages comes back up after a reboot, one of the following happens:

◦ The node rejoins the cluster, and the detached packages move to the "running" or "failed" state.
Packages that move to the running state must be halted and rerun, because they may have several
inconsistencies after the reboot.
◦ The node does not rejoin the cluster, and the detached packages remain detached. Such packages
must be halted and rerun to avoid any inconsistencies caused by the reboot.

• If you halt a package and disable it before running cmhaltcl -d to detach other packages running in
the cluster, auto_run will be automatically re-enabled for this package when the cluster is started
again, forcing the package to start.
To prevent this behavior and keep the package halted and disabled after the cluster restarts, change
auto_run to no in the package configuration file, and re-apply the package, before running cmhaltcl
-d.

• After Live Application Detach (LAD), the status or value of cluster and node generic resources, and
the command status, are shown with default values. However, the generic resource command script
continues to run on the node.
After the cluster or node is re-attached, the generic resource commands show the actual status or
value of the generic resource.
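Because cmviewcl reports a detached package as detached regardless of its real health, it can be useful to list the detached packages so that you know which ones to check by hand. The following is a minimal sketch, not a supported tool. It assumes that cmviewcl -v -f line prints one package:<name>|state=<value> line per package; the function name list_detached is hypothetical, and you should confirm the field layout against the output on your own cluster.

```shell
# list_detached: read cmviewcl line-format output on stdin and print the
# names of packages whose state is reported as "detached".
# Assumed input layout (verify on your system): package:<name>|state=<value>
list_detached() {
    sed -n 's/^package:\([^|]*\)|state=detached$/\1/p'
}

# Typical use on a cluster node:
#   cmviewcl -v -f line | list_detached
```

Each package name printed is one that Serviceguard is not monitoring, so its services, storage, and addresses need to be verified manually.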



Halting a Node and Detaching its Packages
To halt a node and detach its packages, proceed as follows:

Procedure

1. Make sure that the conditions spelled out under Rules and Restrictions are met.
2. Halt any packages that do not qualify for Live Application Detach, like system multi-node packages.
For example:
cmhaltpkg -n node1 smnpak1 smnpak2

NOTE: If you do not do this, the cmhaltnode in the next step will fail.

3. Halt the node with the -d (detach) option:


cmhaltnode -d node1

NOTE: -d and -f are mutually exclusive. See cmhaltnode (1m) for more information.

To re-attach the packages, restart the node:


cmrunnode node1
You cannot halt or detach a node if any package on the given node is in the halt_aborted state;
cmhaltnode will fail. However, you can forcefully halt the node using cmhaltnode (1m) with the -f
option. The node is halted irrespective of the package state.
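The halt-and-detach steps above can be combined in a small wrapper. This is a sketch under stated assumptions: detach_node and run are hypothetical helper names, and by default the commands are only printed so that you can review them; set DRY_RUN=0 to execute them.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# detach_node NODE [PKG...]
# Halts the listed packages (those that do not qualify for Live Application
# Detach) on NODE, then halts NODE with -d so the remaining packages detach.
detach_node() {
    node=$1
    shift
    if [ "$#" -gt 0 ]; then
        run cmhaltpkg -n "$node" "$@" || return 1
    fi
    run cmhaltnode -d "$node"
}
```

For example, detach_node node1 smnpak1 smnpak2 prints the cmhaltpkg command followed by cmhaltnode -d node1. The function stops at the first failure because cmhaltnode -d fails if ineligible packages are still running.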

Halting a Detached Package


To halt a package that is detached on node1, proceed as follows:

Procedure

1. Log in as superuser on another node that is still running in the cluster.


2. Halt the package; for example:
cmhaltpkg -n node1 pkg1

Halting the Cluster and Detaching its Packages


Procedure

1. Make sure that the conditions spelled out under Rules and Restrictions are met.
2. Halt any packages that do not qualify for Live Application Detach, like system multi-node packages.
For example:
cmhaltpkg smnp1

NOTE: If you do not do this, the cmhaltcl in the next step will fail.

3. Halt the cluster with the -d (detach) option:



cmhaltcl -d

NOTE: -d and -f are mutually exclusive. See cmhaltcl (1m) for more information.

To re-attach the packages, restart the cluster:


cmruncl

Example: Halting the Cluster for Maintenance on the Heartbeat Subnets


Suppose that you need to do networking maintenance that will disrupt all the cluster's heartbeat subnets,
but it is essential that the packages continue to run while you do it. In this example we'll assume that
packages pkg1 through pkg5 are unsupported for Live Application Detach, and pkg6 through pkgn are
supported.
Proceed as follows:

Procedure

1. Halt all the unsupported packages:


cmhaltpkg pkg1 pkg2 pkg3 pkg4 pkg5

2. Halt the cluster, detaching the remaining packages:


cmhaltcl -d

3. Upgrade the heartbeat networks as needed.


4. Restart the cluster, automatically re-attaching pkg6 through pkgn and starting any other packages
that have auto_run set to yes in their package configuration file:
cmruncl

5. Start the remaining packages; for example:


cmmodpkg -e pkg1 pkg2 pkg3 pkg4 pkg5
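If you perform this kind of maintenance regularly, you might wrap the two halves of the procedure in helper functions and do the network work between them. The following is only a sketch: hb_maintenance_begin, hb_maintenance_end, and run are hypothetical names, and the commands are printed rather than executed unless you set DRY_RUN=0.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# hb_maintenance_begin PKG...
# Halts the packages that are unsupported for Live Application Detach, then
# halts the cluster with -d so that the remaining packages are detached.
hb_maintenance_begin() {
    run cmhaltpkg "$@" || return 1
    run cmhaltcl -d
}

# hb_maintenance_end PKG...
# Restarts the cluster (re-attaching the detached packages), then enables
# the packages that were halted before the maintenance window.
hb_maintenance_end() {
    run cmruncl || return 1
    run cmmodpkg -e "$@"
}
```

Pass the same list of unsupported packages to both functions, for example hb_maintenance_begin pkg1 pkg2 pkg3 pkg4 pkg5 before the network work and hb_maintenance_end pkg1 pkg2 pkg3 pkg4 pkg5 afterwards.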

Managing Packages and Services


This section describes the following tasks:

• Starting a Package on page 277


• Halting a Package on page 278
• Moving a Failover Package
• Changing Package Switching Behavior on page 280

Non-root users with the appropriate privileges can perform these tasks. See Controlling Access to the
Cluster for information about configuring access.
You can use Serviceguard Manager or the Serviceguard command line to perform these tasks.

Starting a Package
Ordinarily, a package configured as part of the cluster will start up on its primary node when the cluster
starts up. You may need to start a package manually after it has been halted manually. You can do this
either in Serviceguard Manager, or with Serviceguard commands as described below.

The cluster must be running, and if the package is dependent on other packages, those packages must
be either already running, or started by the same command that starts this package (see the subsection
that follows, and About Package Dependencies.)
You can use Serviceguard Manager to start a package, or Serviceguard commands as shown below.
Use the cmrunpkg command to run the package on a particular node, then use the cmmodpkg command
to enable switching for the package; for example:
cmrunpkg -n ftsys9 pkg1
cmmodpkg -e pkg1
This starts up the package on ftsys9, then enables package switching. This sequence is necessary
when a package has previously been halted on some node, since halting the package disables switching.

Starting a Package that Has Dependencies


Before starting a package, it is a good idea to use the cmviewcl command to check for package
dependencies.
You cannot start a package unless all the packages that it depends on are running. If you try, you’ll see a
Serviceguard message telling you why the operation failed, and the package will not start.
If this happens, you can repeat the run command, this time including the package(s) this package
depends on; Serviceguard will start all the packages in the correct order.

Halting a Package
You halt a package when you want to stop the package but leave the node running.
Halting a package has a different effect from halting the node. When you halt the node, its packages may
switch to adoptive nodes (assuming that switching is enabled for them); when you halt the package, it is
disabled from switching to another node, and must be restarted manually on another node or on the same
node.
System multi-node packages run on all cluster nodes simultaneously; halting these packages stops them
running on all nodes. A multi-node package can run on several nodes simultaneously; you can halt it on
all the nodes it is running on, or you can specify individual nodes.
You can use Serviceguard Manager to halt a package, or cmhaltpkg; for example:
cmhaltpkg pkg1
This halts pkg1 and disables it from switching to another node.

Halting a Package that Has Dependencies


Before halting a package, it is a good idea to use the cmviewcl command to check for package
dependencies.
You cannot halt a package unless all the packages that depend on it are down. If you try, you’ll see a
Serviceguard message telling you why the operation failed, and the package will remain up.
If this happens, you can repeat the halt command, this time including the dependent package(s);
Serviceguard will halt all the packages in the correct order. First, use cmviewcl to be sure that no
other running package has a dependency on any of the packages you are halting.

Handling Failures During Package Halt


When you halt a package using cmhaltpkg, errors can sometimes cause the command to fail.
Serviceguard provides a mechanism that aborts the halting process when such errors occur.



When you halt a package, if one of the non-native Serviceguard modules fails with an exit status of 3, the
halt is aborted and the package is moved to a partially_down status in a halt_aborted state.

NOTE: Non-native Serviceguard modules are those that are not delivered with the Serviceguard product.
These are additional modules such as those supplied with Serviceguard toolkit modules (for example,
Serviceguard Contributed Toolkit Suite, Oracle, NFS toolkit, EDB PPAS, Sybase, and so on).

This allows errors to be cleaned up manually during the halt process, minimizing the risk of
follow-on errors and reducing package downtime.
When a package is in the halt_aborted state, you can do one of the following:

• Fix the error manually in the module that caused the package halt to abort and re-run cmhaltpkg
<pkg_name>.

• Run cmhaltpkg with the -f option to forcefully halt the package. This halts the package even if it
is in the halt_aborted state.

For example, consider the following scenario:


You have a package pkgA that is up and running on a node. The package contains the following
modules:
sg-module1
sg-module2
non-sg-module1
non-sg-module2
sg-module3
sg-module4
Now suppose you run the command cmhaltpkg pkgA. If a failure is detected in non-sg-module2,
the package halt process is aborted at this point and the package is moved to the halt_aborted
state. The command exits and does not proceed to halt the sg-module3 and sg-module4
modules.
After fixing the error, if you re-run cmhaltpkg pkgA, the halt begins from sg-module1 and proceeds.

NOTE: This error handling mechanism is applicable only for failover packages and not for multi-node or
system multi-node packages.
It is applicable only for modular packages.
If a package is in the detached or maintenance mode, the package cannot be in halt_aborted state.

The following operations cannot be performed while a package is in the partially_down status:

• Reconfigure the package
• Run the package
• Halt a node (however, you can forcefully halt a node using cmhaltnode with the -f option)

• Halt a cluster (however, you can forcefully halt a cluster using cmhaltcl with the -f option)

• Delete the package
• Fail over the package automatically. You must halt the package completely and fail it over
manually.
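When cmhaltpkg fails, a script can check whether the package ended up in the halt_aborted state before deciding what to do next. The sketch below is illustrative only. It assumes that cmviewcl -v -f line prints a package:<name>|state=<value> line for each package, and pkg_state and halt_advice are hypothetical helper names; only the two recovery choices it prints come from the text above.

```shell
# pkg_state PKG: read cmviewcl line-format output on stdin and print the
# state recorded for PKG.
# Assumed input layout (verify on your system): package:<name>|state=<value>
pkg_state() {
    sed -n "s/^package:$1|state=\(.*\)\$/\1/p"
}

# halt_advice PKG STATE: print a suggested next step after a failed halt.
halt_advice() {
    case $2 in
        halt_aborted)
            echo "$1: fix the failing module and re-run 'cmhaltpkg $1', or force with 'cmhaltpkg -f $1'"
            ;;
        *)
            echo "$1: state is '$2', not halt_aborted; investigate before retrying"
            ;;
    esac
}

# Typical use on a cluster node:
#   state=$(cmviewcl -v -f line | pkg_state pkgA)
#   halt_advice pkgA "$state"
```

The helper deliberately prints advice instead of running cmhaltpkg -f itself, since forcing the halt skips the manual cleanup that the halt_aborted state is meant to allow.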



Moving a Failover Package
Before you move a failover package to a new node, it is a good idea to run cmviewcl -v -l package
and look at dependencies. If the package has dependencies, be sure they can be met on the new node.
To move the package, first halt it where it is running using the cmhaltpkg command. This action not only
halts the package, but also disables package switching.
After it halts, run the package on the new node using the cmrunpkg command, then re-enable switching
as described below.
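The move sequence can be sketched as a single function that stops as soon as one step fails, so that switching is re-enabled only after the package is actually running on the new node. The names move_pkg and run are hypothetical, and the commands are printed rather than executed unless you set DRY_RUN=0.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# move_pkg PKG NEW_NODE
# Halts PKG where it is running (which also disables switching), starts it
# on NEW_NODE, then re-enables package switching.
move_pkg() {
    run cmhaltpkg "$1" || return 1
    run cmrunpkg -n "$2" "$1" || return 1
    run cmmodpkg -e "$1"
}
```

For example, move_pkg pkg1 ftsys10 prints the three commands in order. Check dependencies with cmviewcl -v -l package first, as described above.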

Changing Package Switching Behavior


There are two options to consider:

• Whether the package can switch (fail over) or not.


• Whether the package can switch to a particular node or not.

For failover packages, if package switching is NO the package cannot move to any other node; if node
switching is NO, the package cannot move to that particular node. For multi-node packages, if package
switching is set to NO, the package cannot start on a new node joining the cluster; if node switching is set
to NO, the package cannot start on that node.
Both node switching and package switching can be changed dynamically while the cluster is running. The
initial setting for package switching is determined by the auto_run parameter, which is set in the package
configuration file. If auto_run is set to yes, then package switching is enabled when the package first
starts. The initial setting for node switching is to allow switching to all nodes that are configured to run the
package.
You can use Serviceguard Manager to change package switching behavior, or Serviceguard commands
as shown below. With Serviceguard commands, you can change the behavior either temporarily or
permanently.
To temporarily disable switching to other nodes for a running package, use the cmmodpkg command. For
example, if pkg1 is currently running, and you want to prevent it from starting up on another node, enter
the following:
cmmodpkg -d pkg1
This does not halt the package, but will prevent it from starting up elsewhere.
You can disable package switching to particular nodes by using the -n option of the cmmodpkg
command. The following prevents pkg1 from switching to node lptest3:
cmmodpkg -d -n lptest3 pkg1
To disable switching permanently, so that the change is still in effect the next time the cluster
restarts, change the auto_run flag in the package configuration file, then re-apply the
configuration. (See Reconfiguring a Package on a Running Cluster on page 295.)
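As an illustration of the permanent change, the following sketch rewrites the auto_run line of a package configuration file. It assumes a modular package file in which the parameter appears on its own line as auto_run followed by whitespace and a value; set_auto_run is a hypothetical helper name. The cmgetconf, cmcheckconf, and cmapplyconf invocations in the comments are assumptions about typical usage, so check the manpages for the exact options on your release.

```shell
# set_auto_run FILE yes|no
# Rewrites the auto_run line of a package configuration file in place.
# The original file is kept as FILE.bak.
set_auto_run() {
    cp "$1" "$1.bak" || return 1
    sed "s/^auto_run[[:space:]].*/auto_run $2/" "$1.bak" > "$1"
}

# Possible end-to-end use (options are assumptions; see the manpages):
#   cmgetconf -p pkg1 pkg1.conf
#   set_auto_run pkg1.conf no
#   cmcheckconf -P pkg1.conf && cmapplyconf -P pkg1.conf
```

Commented-out lines in the file are left alone because the pattern matches only lines that begin with auto_run.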

Maintaining a Package: Maintenance Mode


Serviceguard provides two ways to perform maintenance on components of a modular, failover package
while the package is running. These two methods are called maintenance mode and partial-startup
maintenance mode.



NOTE: If you need to do maintenance that requires halting a node, or the entire cluster, you should
consider Live Application Detach; see Halting a Node or the Cluster while Keeping Packages
Running.

• Maintenance mode is chiefly useful for modifying networks while the package is running.
• Partial-startup maintenance mode allows you to work on package services, file systems, and volume
groups.
• Neither maintenance mode nor partial-startup maintenance mode can be used for multi-node
packages, or system multi-node packages.
• Package maintenance does not alter the configuration of the package, as specified in the package
configuration file.
For information about reconfiguring a package, see Reconfiguring a Package.

NOTE: In order to run a package in partial-startup maintenance mode, you must first put it in
maintenance mode. This means that packages in partial-startup maintenance mode share the
characteristics described below for packages in maintenance mode, and the same rules and dependency
rules apply.

Characteristics of a Package Running in Maintenance Mode or Partial-Startup Maintenance Mode
Serviceguard treats a package in maintenance mode differently from other packages in important ways.
The following points apply to a package running in maintenance mode:

• Serviceguard ignores failures reported by package services, subnets, generic resources, and file
systems; these will not cause the package to fail.

NOTE: A failure in the package control script will, however, cause the package to fail. The package
will also fail if an external script (or pre-script) cannot be executed or does not exist.

• The package will not be automatically failed over, halted, or started.


• A package in maintenance mode still has its configured (or default) weight, meaning that its weight, if
any, is counted against the node's capacity; this applies whether the package is up or down. (See
About Package Weights for a discussion of weights and capacities.)
• Node-wide and cluster-wide events affect the package as follows:

◦ If the node the package is running on is halted or crashes, the package will no longer be in
maintenance mode but will not be automatically started.
◦ If the cluster is halted or crashes, the package will not be in maintenance mode when the cluster
comes back up. Serviceguard will attempt to start it if auto_run is set to yes in the package
configuration file.

• If node_fail_fast_enabled is set to yes, Serviceguard will not halt the node under any of the
following conditions:

◦ Subnet failure
◦ Generic resource failure



◦ A script does not exist or cannot run because of file permissions
◦ A script times out
◦ The limit of a restart count is exceeded

Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode

IMPORTANT: See the latest Serviceguard release notes for important information about version
requirements for package maintenance.

• The package must have package switching disabled before you can put it in maintenance mode.
• You can put a package in maintenance mode only on one node.

◦ The node must be active in the cluster and must be eligible to run the package (on the package's
node_name list).
◦ If the package is not running, you must specify the node name when you run cmmodpkg (1m) to
put the package in maintenance mode.
◦ If the package is running, you can put it into maintenance mode only on the node on which it is running.
◦ While the package is in maintenance mode on a node, you can run the package only on that node.

• You cannot put a package in maintenance mode, or take it out of maintenance mode, if doing so will
cause another running package to halt.
• Since package failures are ignored while in maintenance mode, you can take a running package out of
maintenance mode only if the package is healthy.
Serviceguard checks the state of the package’s services and subnets to determine if the package is
healthy. If there are any failed services, Serviceguard automatically restarts the failed services when
the package is taken out of maintenance mode. If there are any other failures, you must halt the
package before taking it out of maintenance mode.

• Generic resources configured in a package must be available (status 'up') before taking the package
out of maintenance mode.
• You cannot do online configuration as described under Reconfiguring a Package.
• You cannot configure new dependencies involving this package; that is, you cannot make it dependent
on another package, or make another package depend on it.
• You cannot use the -t option of any command that operates on a package that is in maintenance
mode; see Previewing the Effect of Cluster Changes for information about the -t option.

Additional Rules for Partial-Startup Maintenance Mode

• You must halt the package before taking it out of partial-startup maintenance mode.
• To run a package normally after running it in partial-startup maintenance mode, you must take it out of
maintenance mode, and then restart it.



Dependency Rules for a Package in Maintenance Mode or Partial-Startup Maintenance Mode
You cannot configure new dependencies involving a package running in maintenance mode, and in
addition the following rules apply (we'll call the package in maintenance mode pkgA).

• The packages that depend on pkgA must be down and disabled when you place pkgA in maintenance
mode. This applies to all types of dependency (including exclusionary dependencies) as described
under About Package Dependencies.

◦ You cannot enable a package that depends on pkgA.

◦ You cannot run a package that depends on pkgA, unless the dependent package itself is in
maintenance mode.

• Dependency rules governing packages that pkgA depends on to be UP are bypassed so that these
packages can halt and fail over as necessary while pkgA is in maintenance mode.

• If both packages in a dependency relationship are in maintenance mode, dependency rules are
ignored for those two packages.
For example, both packages in an exclusionary dependency can be run and halted in maintenance
mode at the same time.

NOTE: If you have a package configured with generic resources and you attempt to take it out of
maintenance mode back to the running state, the status of the generic resources is evaluated. If any
of the generic resources is 'down', the package cannot be taken out of maintenance mode.

Performing Maintenance Using Maintenance Mode


You can put a package in maintenance mode, perform maintenance, and take it out of maintenance
mode, whether the package is down or running.
This mode is mainly useful for making modifications to networking and generic resource components. To
modify other components of the package, such as services or storage, see Performing Maintenance
Using Partial-Startup Maintenance Mode.
If you want to reconfigure the package (using cmapplyconf (1m)), see Reconfiguring a Package and
Allowable Package States During Reconfiguration.

Procedure
Follow these steps to perform maintenance on a package's networking components.
In this example, we'll call the package pkg1 and assume it is running on node1.

Procedure

1. Place the package in maintenance mode:


cmmodpkg -m on -n node1 pkg1

2. Perform maintenance on the networks or resources and test manually that they are working correctly.

NOTE: If you now run cmviewcl, you'll see that the STATUS of pkg1 is up and its STATE is
maintenance.

3. If everything is working as expected, take the package out of maintenance mode:

cmmodpkg -m off pkg1
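Because package switching must be disabled before a package can be put in maintenance mode, a wrapper can take care of that rule on the way in. The following sketch uses the hypothetical helper names maint_on, maint_off, and run. Re-enabling switching in maint_off is an assumption about what you want afterwards, not a Serviceguard requirement. Commands are printed rather than executed unless you set DRY_RUN=0.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# maint_on NODE PKG
# Disables package switching (required first), then places PKG in
# maintenance mode on NODE.
maint_on() {
    run cmmodpkg -d "$2" || return 1
    run cmmodpkg -m on -n "$1" "$2"
}

# maint_off PKG
# Takes PKG out of maintenance mode, then re-enables package switching
# (assumption: you want automatic failover restored).
maint_off() {
    run cmmodpkg -m off "$1" || return 1
    run cmmodpkg -e "$1"
}
```

For example, run maint_on node1 pkg1, perform and test the network changes, then run maint_off pkg1.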

Performing Maintenance Using Partial-Startup Maintenance Mode


To put a package in partial-startup maintenance mode, you put it in maintenance mode, then restart it,
running only those modules that you will not be working on.

Procedure
Follow this procedure to perform maintenance on a package. In this example, we'll assume a package
pkg1 is running on node1, and that we want to do maintenance on the package's services.

Procedure

1. Halt the package:


cmhaltpkg pkg1

2. Place the package in maintenance mode:


cmmodpkg -m on -n node1 pkg1

NOTE: The order of the first two steps can be reversed.

3. Run the package in maintenance mode.


In this example, we'll start pkg1 such that only the modules up to and including the package_ip
module are started. (See Package Modules and Parameters for a list of package modules. The
modules used by a package are started in the order shown near the top of its package configuration
file.)
cmrunpkg -m sg/package_ip pkg1

4. Perform maintenance on the services and test manually that they are working correctly.

NOTE: If you now run cmviewcl, you'll see that the STATUS of pkg1 is up and its STATE is
maintenance.

5. Halt the package:


cmhaltpkg pkg1

NOTE: You can also use cmhaltpkg -s, which stops the modules started by cmrunpkg -m — in
this case, all the modules up to and including package_ip.

6. Run the package to ensure everything is working correctly:


cmrunpkg pkg1

NOTE: The package is still in maintenance mode.

7. If everything is working as expected, bring the package out of maintenance mode:


cmmodpkg -m off pkg1

8. Restart the package:


cmrunpkg pkg1



Excluding Modules in Partial-Startup Maintenance Mode
In the example above, we used cmrunpkg -m to run all the modules up to and including package_ip,
but none of those after it. But you might want to run the entire package apart from the module whose
components you are going to work on. In this case you can use the -e option:
cmrunpkg -e sg/service pkg1
This runs all the package's modules except the services module.
You can also use -e in combination with -m. This has the effect of starting all modules up to and including
the module identified by -m, except the module identified by -e. In this case the excluded (-e) module
must be earlier in the execution sequence (as listed near the top of the package's configuration file) than
the -m module. For example:
cmrunpkg -m sg/service -e sg/package_ip pkg1
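The ordering rule for combining -e with -m can be checked before you run the command. The sketch below is a hypothetical pre-check, not part of Serviceguard: module_index and check_partial_start are illustrative names, and the module order is passed in by hand (copy it from the top of your package configuration file).

```shell
# module_index NAME MODULE...
# Prints the 1-based position of NAME in the module list; fails if absent.
module_index() {
    name=$1
    shift
    i=1
    for m in "$@"; do
        if [ "$m" = "$name" ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# check_partial_start M_MODULE E_MODULE MODULE...
# Succeeds only if E_MODULE comes earlier in the sequence than M_MODULE.
check_partial_start() {
    m_mod=$1
    e_mod=$2
    shift 2
    m_pos=$(module_index "$m_mod" "$@") || return 1
    e_pos=$(module_index "$e_mod" "$@") || return 1
    [ "$e_pos" -lt "$m_pos" ]
}
```

With an illustrative order of sg/filesystem, sg/package_ip, sg/service, the call check_partial_start sg/service sg/package_ip sg/filesystem sg/package_ip sg/service succeeds, while swapping the first two arguments fails because the excluded module would come after the -m module.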

NOTE: The full execution sequence for starting a package is:

1. The master control script itself


2. Persistent reservation

Reconfiguring a Cluster
You can reconfigure a cluster either when it is halted or while it is still running. Some operations can only
be done when the cluster is halted. The table that follows shows the required cluster state for many kinds
of changes.

Table 13: Types of Changes to the Cluster Configuration

Change to the Cluster Configuration | Required Cluster State
Add a new node | All cluster nodes must be running.
Delete a node | A node can be deleted even though it is unavailable or unreachable.
Change Maximum Configured Packages | Cluster can be running.
Change Quorum Server Configuration | Cluster can be running; see What Happens when You Change the Quorum Configuration Online.
Change Cluster Lock Configuration (lock LUN) | Cluster can be running. See Updating the Cluster Lock LUN Configuration Online and What Happens when You Change the Quorum Configuration Online.
Add NICs and their IP addresses to the cluster configuration | Cluster can be running. See Changing the Cluster Networking Configuration while the Cluster Is Running on page 290.
Delete NICs and their IP addresses from the cluster configuration | Cluster can be running. See Changing the Cluster Networking Configuration while the Cluster Is Running on page 290.
Change the designation of an existing interface from HEARTBEAT_IP to STATIONARY_IP, or vice versa | Cluster can be running. See Changing the Cluster Networking Configuration while the Cluster Is Running on page 290.
Change an interface from IPv4 to IPv6, or vice versa | Cluster can be running. See Changing the Cluster Networking Configuration while the Cluster Is Running on page 290.
Reconfigure IP addresses for a NIC used by the cluster | Must delete the interface from the cluster configuration, reconfigure it, then add it back into the cluster configuration. See What You Must Keep in Mind. Cluster can be running throughout.
Change NETWORK_POLLING_INTERVAL | Cluster can be running.
Change IP Monitor parameters: SUBNET, IP_MONITOR, POLLING_TARGET | Cluster can be running. See the entries for these parameters under Cluster Configuration Parameters on page 111 for more information.
Change MEMBER_TIMEOUT and AUTO_START_TIMEOUT | Cluster can be running.
Change Access Control Policy | Cluster and package can be running.
Change SITE, SITE_NAME | Node(s) associated with the corresponding SITE or SITE_NAME entry must be down. When the cluster is running you cannot add SITE or SITE_NAME entries.
Change ROOT_DISK_MONITOR, ROOT_DISK_MONITOR_INTERVAL, and ROOT_DISK_MONITOR_EXCLUDE_NODES | Cluster can be running. For more information about the Root Disk Monitoring property, see Configuring Root Disk Monitoring parameter.
Change Cluster Generic Resource parameters | Cluster can be running for a few parameters and must be down for others. For more information about the supported operations, see Online reconfiguration of cluster generic resources and Offline reconfiguration of cluster generic resources.

Previewing the Effect of Cluster Changes


Many variables affect package placement, including the availability of cluster nodes; the availability of
networks and other resources on those nodes; failover and failback policies; and package weights,
dependencies, and priorities, if you have configured them. You can preview the effect on packages of
certain actions or events before they actually occur.
For example, you might want to check to see if the packages are placed as you expect when the cluster
first comes up; or preview what happens to the packages running on a given node if the node halts, or if
the node is then restarted; or you might want to see the effect on other packages if another, currently
disabled, package is enabled, or if a package halts and cannot restart because none of the nodes on its
node_list is available.
Serviceguard provides two ways to do this: you can use the preview mode of Serviceguard commands, or
you can use the cmeval (1m) command to simulate different cluster states.
Alternatively, you might want to model changes to the cluster as a whole; cmeval allows you to do this;
see Using cmeval.

What You Can Preview


You can preview any of the following, or all of them simultaneously:

• Cluster bring-up (cmruncl)

• Cluster node state changes (cmrunnode, cmhaltnode)

• Package state changes (cmrunpkg, cmhaltpkg)

• Package movement from one node to another


• Package switching changes (cmmodpkg -e)

• Availability of package subnets, resources, and storage


• Changes in package priority, node order, dependency, failover and failback policy, node capacity and
package weight

Using cmeval
You can use cmeval to evaluate the effect of cluster changes on Serviceguard packages. You can also
use it simply to preview changes you are considering making to the cluster as a whole.
You can use cmeval safely in a production environment; it does not affect the state of the cluster or
packages. Unlike command preview mode (the -t option discussed above), cmeval does not require you to be
logged in to the cluster being evaluated, and in fact that cluster does not have to be running, though it
must use the same Serviceguard release and patch version as the system on which you run cmeval.
Use cmeval rather than command preview mode when you want to see more than the effect of a single
command, and especially when you want to see the results of large-scale changes, or changes that may
interact in complex ways, such as changes to package priorities, node order, dependencies and so on.
Using cmeval involves three major steps:

1. Use cmviewcl -v -f line to write the current cluster configuration out to a file.

2. Edit the file to include the events or changes you want to preview.
3. Using the file from Step 2 as input, run cmeval to preview the results of the changes.

For example, assume that pkg1 is a high-priority package whose primary node is node1, and which
depends on pkg2 and pkg3 to be running on the same node. These lower-priority packages are currently
running on node2. pkg1 is down and disabled, and you want to see the effect of enabling it.
In the output of cmviewcl -v -f line, you would find the line package:pkg1|autorun=disabled
and change it to package:pkg1|autorun=enabled. You should also make sure that the nodes the
package is configured to run on are shown as available; for example: package:pkg1|node:node1|
available=yes. Then save the file (for example, as newstate.in) and run cmeval:
cmeval -v newstate.in



You would see output something like this:
package:pkg3|node:node2|action:failing
package:pkg2|node:node2|action:failing
package:pkg2|node:node1|action:starting
package:pkg3|node:node1|action:starting
package:pkg1|node:node1|action:starting
This shows that pkg1, when enabled, will “drag” pkg2 and pkg3 to its primary node, node1. It can do
this because of its higher priority; see Dragging Rules for Simple Dependencies. Running cmeval
confirms that all three packages will successfully start on node1 (assuming conditions do not change
between now and when you actually enable pkg1, and there are no failures in the run scripts.)
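The editing step in this example can also be done with a filter instead of a text editor. The following sketch changes only the autorun line for the named package and passes every other line through unchanged; enable_autorun is a hypothetical helper name, and the line layout is the one shown in the example above.

```shell
# enable_autorun PKG
# Filters cmviewcl line-format text from stdin, changing PKG's
# autorun=disabled line to autorun=enabled and leaving all else as is.
enable_autorun() {
    sed "s/^package:$1|autorun=disabled\$/package:$1|autorun=enabled/"
}

# Typical use on a cluster node:
#   cmviewcl -v -f line | enable_autorun pkg1 > newstate.in
#   cmeval -v newstate.in
```

You still need to confirm that the nodes the package is configured to run on are shown as available in the resulting file before running cmeval.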

NOTE:

• cmeval cannot predict run and halt script failures.

• cmeval is not supported with cluster generic resource configured in a cluster.

This is a simple example; you can use cmeval for much more complex scenarios; see What You Can
Preview.

IMPORTANT: For detailed information and examples, see the cmeval (1m) manpage.

Reconfiguring a Halted Cluster


You can make a permanent change in cluster configuration when the cluster is halted. This procedure
must be used for changes marked “Cluster must not be running”, but it can be used for any other cluster
configuration changes as well.
Use the following steps:

1. Halt the cluster on all nodes.


2. On one node, reconfigure the cluster as described in Building an HA Cluster Configuration. You can
use cmgetconf to generate a template file, which you then edit.

3. Make sure that all nodes listed in the cluster configuration file are powered up and accessible. Use
cmapplyconf to copy the binary cluster configuration file to all nodes. This file overwrites any
previous version of the binary cluster configuration file.
4. Use cmruncl to start the cluster on all nodes, or on a subset of nodes.
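The steps above can be sketched as a small script. This is an illustration only, under these assumptions: the file name clconfig.conf and the function names are hypothetical, the cmcheckconf call is added as a verification step borrowed from the other procedures in this chapter, and the script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the halted-cluster procedure. With DRY_RUN=1 (the default here)
# each command is only printed; set DRY_RUN=0 to execute on a cluster node.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reconfigure_halted_cluster CONFIG_FILE
# CONFIG_FILE is the already edited cluster configuration file
# (the name used below is illustrative).
reconfigure_halted_cluster() {
    conf="$1"
    run cmhaltcl -f            || return 1   # step 1: halt the cluster on all nodes
    run cmcheckconf -C "$conf" || return 1   # verify the edited file from step 2
    run cmapplyconf -C "$conf" || return 1   # step 3: distribute the binary configuration
    run cmruncl                              # step 4: start the cluster
}

reconfigure_halted_cluster clconfig.conf
```

Stopping at the first failing command keeps a configuration that did not pass cmcheckconf from being applied.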

Reconfiguring a Running Cluster


You can add new nodes to the cluster configuration or delete nodes from the cluster configuration while
the cluster is up and running. Note the following, however:

• You cannot remove an active node from the cluster. You must halt the node first.
• The only configuration change allowed while a node is unreachable (for example, completely
disconnected from the network) is to delete the unreachable node from the cluster configuration. If
there are also packages that depend upon that node, the package configuration must also be modified
to delete the node. This all must be done in one configuration request (cmapplyconf command).

• The access control list for the cluster can be changed while the cluster is running.



Changes to the package configuration are described in a later section.
The following sections describe how to perform dynamic reconfiguration tasks.

Adding Nodes to the Configuration While the Cluster is Running


Use the following procedure to add a node. For this example, nodes ftsys8 and ftsys9 are already
configured in a running cluster named cluster1, and you are adding node ftsys10.

NOTE: Before you start, make sure you have configured access to ftsys10 as described under
Configuring Root-Level Access.

1. Use the following command to store a current copy of the existing cluster configuration in a temporary
file in case you need to revert to it:
cmgetconf -c cluster1 temp.conf

2. Specify a new set of nodes to be configured and generate a template of the new configuration (all on
one line):
cmquerycl -C clconfig.conf -c cluster1 -n ftsys8 -n ftsys9 -n ftsys10

3. Edit clconfig.conf to check the information about the new node.

4. Verify the new configuration:


cmcheckconf -C clconfig.conf

5. Apply the changes to the configuration and send the new binary configuration file to all cluster nodes:
cmapplyconf -C clconfig.conf

Use cmrunnode to start the new node, and, if you so decide, set the AUTOSTART_CMCLD parameter to
1 in the $SGAUTOSTART file (see Understanding the Location of Serviceguard Files on page 169) to
enable the new node to join the cluster automatically each time it reboots.
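When the node list is long, the cmquerycl command line from step 2 can be assembled from a list of node names. The helper below is a hypothetical sketch (print_cmquerycl is not a Serviceguard command); it only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# print_cmquerycl CLUSTER OUTPUT_FILE NODE...
# Hypothetical helper: prints the cmquerycl command line for a node list.
print_cmquerycl() {
    cluster="$1"
    outfile="$2"
    shift 2
    cmd="cmquerycl -C $outfile -c $cluster"
    for node in "$@"; do
        cmd="$cmd -n $node"
    done
    echo "$cmd"
}

# Reproduces the command from step 2 of this example.
print_cmquerycl cluster1 clconfig.conf ftsys8 ftsys9 ftsys10
```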

Removing Nodes from the Cluster while the Cluster Is Running


You can use Serviceguard Manager to delete nodes, or Serviceguard commands as shown below. The
following restrictions apply:

• The node must be halted. See Removing Nodes from Participation in a Running Cluster on page
271.
• If the node you want to delete is unreachable (disconnected from the LAN, for example), you can
delete the node only if there are no packages which specify the unreachable node. If there are
packages that depend on the unreachable node, halt the cluster; see Halting the Entire Cluster on
page 272.

Use the following procedure to delete a node with Serviceguard commands. In this example, nodes
ftsys8, ftsys9 and ftsys10 are already configured in a running cluster named cluster1, and you
are deleting node ftsys10.

NOTE: If you want to remove a node from the cluster, run the cmapplyconf command from another
node in the same cluster. If you try to issue the command on the node you want removed, you will get an
error message.



1. Use the following command to store a current copy of the existing cluster configuration in a temporary
file:
cmgetconf -c cluster1 temp.conf

2. Specify the new set of nodes to be configured (omitting ftsys10) and generate a template of the new
configuration:
cmquerycl -C clconfig.conf -c cluster1 -n ftsys8 -n ftsys9

3. Edit the file clconfig.conf to check the information about the nodes that remain in the cluster.

4. Halt the node you are going to remove (ftsys10 in this example):
cmhaltnode -f -v ftsys10

5. Verify the new configuration:


cmcheckconf -C clconfig.conf

6. From ftsys8 or ftsys9, apply the changes to the configuration and distribute the new binary
configuration file to all cluster nodes:
cmapplyconf -C clconfig.conf

NOTE: If you are trying to remove an unreachable node on which many packages are configured to run,
you may see the following message:
The configuration change is too large to process while the cluster is running.
Split the configuration change into multiple requests or halt the cluster.
In this situation, you must halt the cluster to remove the node.

Changing the Cluster Networking Configuration while the Cluster Is Running

What You Can Do


Online operations you can perform include:

• Add a network interface and its HEARTBEAT_IP or STATIONARY_IP.


• Delete a network interface and its HEARTBEAT_IP or STATIONARY_IP.
• Change a HEARTBEAT_IP or STATIONARY_IP interface from IPv4 to IPv6, or vice versa.
• Change the designation of an existing interface from HEARTBEAT_IP to STATIONARY_IP, or vice
versa.
• Change the NETWORK_POLLING_INTERVAL.
• Change IP Monitor parameters: SUBNET, IP_MONITOR, POLLING_TARGET; see the entries for
these parameters under Cluster Configuration Parameters on page 111 for more information.
• A combination of any of these in one transaction (cmapplyconf), given the restrictions below.

What You Must Keep in Mind
The following restrictions apply:

• You must not change the configuration of all heartbeats at one time, or change or delete the only
configured heartbeat.
At least one working heartbeat must remain unchanged.

• You cannot add interfaces or modify their characteristics unless those interfaces, and all other
interfaces in the cluster configuration, are healthy.
There must be no bad NICs or non-functional or locally switched subnets in the configuration, unless
you are deleting those components in the same operation.

• You cannot change the designation of an existing interface from HEARTBEAT_IP to STATIONARY_IP,
or vice versa, without also making the same change to all peer network interfaces on the same subnet
on all other nodes in the cluster.
Similarly, you cannot change an interface from IPv4 to IPv6 without also making the same change to
all peer network interfaces on the same subnet on all other nodes in the cluster.

• You cannot change the designation of an interface from STATIONARY_IP to HEARTBEAT_IP unless
the subnet is common to all nodes.
Remember that the HEARTBEAT_IP must be an IPv4 address, and must be on the same subnet on all
nodes, except in cross-subnet configurations (see Cross-Subnet Configurations).

• You cannot delete a subnet or IP address from a node while a package that uses it (as a
monitored_subnet, ip_subnet, or ip_address) is configured to run on that node.
Information about these parameters begins at monitored_subnet.

• You cannot change the IP configuration of an interface (NIC) used by the cluster in a single transaction
(cmapplyconf).
You must first delete the NIC from the cluster configuration, then reconfigure the NIC (using
ifconfig, for example), then add the NIC back into the cluster.
Examples of when you must do this include:

◦ moving a NIC from one subnet to another


◦ adding an IP address to a NIC
◦ removing an IP address from a NIC

CAUTION: Do not add IP addresses to network interfaces that are configured into the Serviceguard
cluster, unless those IP addresses themselves will be immediately configured into the cluster as
stationary IP addresses. If you configure any address other than a stationary IP address on a
Serviceguard network interface, it could collide with a relocatable package address assigned by
Serviceguard.

Some sample procedures follow.



Example: Adding a Heartbeat LAN
Suppose that a subnet 15.13.170.0 is shared by nodes ftsys9 and ftsys10 in a two-node cluster
cluster1, and you want to add it to the cluster configuration as a heartbeat subnet. Proceed as follows.

Procedure

1. Run cmquerycl to get a cluster configuration template file that includes networking information for
interfaces that are available to be added to the cluster configuration:
cmquerycl -c cluster1 -C clconfig.conf

NOTE: As of Serviceguard A.11.18, cmquerycl -c produces output that includes commented-out
entries for interfaces that are not currently part of the cluster configuration, but are available.

The networking portion of the resulting clconfig.conf file looks something like this:
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
#NETWORK_INTERFACE lan0
#STATIONARY_IP 15.13.170.18
NETWORK_INTERFACE lan3
NODE_NAME ftsys10
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.19
#NETWORK_INTERFACE lan0
#STATIONARY_IP 15.13.170.19
NETWORK_INTERFACE lan3

2. Edit the file to uncomment the entries for the subnet that is being added (lan0 in this example), and
change STATIONARY_IP to HEARTBEAT_IP:
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
NETWORK_INTERFACE lan0
HEARTBEAT_IP 15.13.170.18
NETWORK_INTERFACE lan3
NODE_NAME ftsys10
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.19
NETWORK_INTERFACE lan0
HEARTBEAT_IP 15.13.170.19
NETWORK_INTERFACE lan3

3. Verify the new configuration:


cmcheckconf -C clconfig.conf

4. Apply the changes to the configuration and distribute the new binary configuration file to all cluster
nodes:
cmapplyconf -C clconfig.conf
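The edit in step 2 can also be made with a filter instead of by hand. The following is a minimal sketch, assuming the commented-out entries look like the template shown in step 1 (an optional space after the # is tolerated); promote_to_heartbeat is a hypothetical name, not a Serviceguard command. Review the result before running cmcheckconf.

```shell
#!/bin/sh
# promote_to_heartbeat IFACE
# Hypothetical helper: reads a cluster configuration template on stdin,
# uncomments the NETWORK_INTERFACE entry for IFACE, and turns the
# commented-out STATIONARY_IP line that directly follows it into HEARTBEAT_IP.
promote_to_heartbeat() {
    awk -v iface="$1" '
        {
            commented = ($0 ~ /^[ \t]*#/)
            line = $0
            sub(/^[ \t]*#[ \t]*/, "", line)   # drop a leading "#", if any
            split(line, f, /[ \t]+/)
        }
        commented && f[1] == "NETWORK_INTERFACE" && f[2] == iface {
            print "NETWORK_INTERFACE " iface
            pending = 1
            next
        }
        commented && pending && f[1] == "STATIONARY_IP" {
            print "HEARTBEAT_IP " f[2]
            pending = 0
            next
        }
        { pending = 0; print }
    '
}

# Demonstration using the ftsys9 entries from step 1.
promote_to_heartbeat lan0 <<'EOF'
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
#NETWORK_INTERFACE lan0
#STATIONARY_IP 15.13.170.18
NETWORK_INTERFACE lan3
EOF
```

All other lines, including unrelated comments, pass through unchanged.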

If you were configuring the subnet for data instead, and wanted to add it to a package configuration, you
would now need to:



1. Halt the package
2. Add the new networking information to the package configuration file
3. Apply the new package configuration, and redistribute the control script if necessary.

For more information, see Reconfiguring a Package on a Running Cluster on page 295.

Example: Deleting a Subnet Used by a Package


In this example, we are deleting subnet 15.13.170.0 (lan0). Proceed as follows.

Procedure

1. Halt any package that uses this subnet and delete the corresponding networking information
(monitored_subnet, ip_subnet, ip_address; see the descriptions for these parameters starting with
monitored_subnet).
See Reconfiguring a Package on a Running Cluster on page 295 for more information.
2. Run cmquerycl to get the cluster configuration file:
cmquerycl -c cluster1 -C clconfig.conf

3. Comment out the network interfaces lan0 and lan3 and their IP addresses, if any, on all
affected nodes. The networking portion of the resulting file looks something like this:
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
# NETWORK_INTERFACE lan0
# STATIONARY_IP 15.13.170.18
# NETWORK_INTERFACE lan3
NODE_NAME ftsys10
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.19
# NETWORK_INTERFACE lan0
# STATIONARY_IP 15.13.170.19
# NETWORK_INTERFACE lan3

4. Verify the new configuration:


cmcheckconf -C clconfig.conf

5. Apply the changes to the configuration and distribute the new binary configuration file to all cluster
nodes:
cmapplyconf -C clconfig.conf
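The commenting-out in step 3 can be sketched as a filter as well. This is an illustration under stated assumptions: comment_out_interfaces is a hypothetical name, and the input is assumed to use the NODE_NAME, NETWORK_INTERFACE, HEARTBEAT_IP, and STATIONARY_IP layout shown in this example.

```shell
#!/bin/sh
# comment_out_interfaces IFACE...
# Hypothetical helper: reads a cluster configuration file on stdin and
# comments out each listed NETWORK_INTERFACE together with the
# HEARTBEAT_IP or STATIONARY_IP lines that belong to it.
comment_out_interfaces() {
    awk -v list="$*" '
        BEGIN {
            n = split(list, names, " ")
            for (i = 1; i <= n; i++) drop[names[i]] = 1
        }
        $1 == "NODE_NAME"         { active = 0 }
        $1 == "NETWORK_INTERFACE" { active = ($2 in drop) }
        active && ($1 == "NETWORK_INTERFACE" || $1 == "HEARTBEAT_IP" || $1 == "STATIONARY_IP") {
            print "# " $0
            next
        }
        { print }
    '
}

# Demonstration using the ftsys9 entries from this example.
comment_out_interfaces lan0 lan3 <<'EOF'
NODE_NAME ftsys9
NETWORK_INTERFACE lan1
HEARTBEAT_IP 192.3.17.18
NETWORK_INTERFACE lan0
STATIONARY_IP 15.13.170.18
NETWORK_INTERFACE lan3
EOF
```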

Updating the Cluster Lock LUN Configuration Online


Proceed as follows.

IMPORTANT: See What Happens when You Change the Quorum Configuration Online for
important information.



Procedure

1. In the cluster configuration file, modify the value of CLUSTER_LOCK_LUN for each node.
2. Run cmcheckconf to check the configuration.

3. Run cmapplyconf to apply the configuration.

If you need to replace the physical device, see Replacing a Lock LUN.

Resetting the cluster generic resource restart counter


The cluster generic resource restart counter tracks the number of times a cluster generic resource
command has been automatically restarted. This value is used to determine when the cluster generic
resource command has exceeded its maximum number of allowable automatic restarts.
When a cluster generic resource command successfully restarts after several attempts, the cluster
manager does not automatically reset the restart count. You can reset the counter online using the
cmsetresource -r <resource-name> -R command.
For example, cmsetresource -r cpu_monitor -R resets the restart count of the cluster generic
resource cpu_monitor to zero. You can view the current value of the restart counter in the output of
cmviewcl -v.

Changing MAX_CONFIGURED_PACKAGES
As of Serviceguard A.11.18, you can change MAX_CONFIGURED_PACKAGES while the cluster is
running. The default for MAX_CONFIGURED_PACKAGES is the maximum number allowed in the
cluster. You can use Serviceguard Manager to change MAX_CONFIGURED_PACKAGES, or
Serviceguard commands as shown below.
Use the cmgetconf command to obtain a current copy of the cluster's existing configuration, for
example:
cmgetconf -c <cluster_name> clconfig.conf
Edit the clconfig.conf file to include the new value for MAX_CONFIGURED_PACKAGES. Then use
the cmcheckconf command to verify the new configuration. Using the -k or -K option can significantly
reduce the response time.
Use the cmapplyconf command to apply the changes to the configuration and send the new
configuration file to all cluster nodes. Using -k or -K can significantly reduce the response time.
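The edit itself is a one-line change and can be scripted. The following is a minimal sketch; set_max_configured_packages is a hypothetical helper, and the sample values are illustrative. It rewrites only an uncommented MAX_CONFIGURED_PACKAGES line and leaves explanatory comments in the file untouched.

```shell
#!/bin/sh
# set_max_configured_packages VALUE
# Hypothetical helper: reads a cluster configuration file on stdin and
# writes it to stdout with MAX_CONFIGURED_PACKAGES set to VALUE.
set_max_configured_packages() {
    value="$1"
    case "$value" in
        ''|*[!0-9]*)
            echo "value must be a non-negative integer" >&2
            return 1
            ;;
    esac
    sed "s/^\([[:space:]]*MAX_CONFIGURED_PACKAGES\)[[:space:]].*/\1 $value/"
}

# Demonstration with illustrative values.
printf 'CLUSTER_NAME cluster1\nMAX_CONFIGURED_PACKAGES 300\n' |
    set_max_configured_packages 150
```

Write the result to a new file, then verify it with cmcheckconf and apply it with cmapplyconf as described above.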

Changing the VxVM Storage Configuration


You can add VxVM disk groups to the cluster configuration while the cluster is running. Similarly, you can
delete VxVM disk groups provided they are not being used by a cluster node at that time.

NOTE:
If you are removing a disk group from the cluster configuration, ensure that you also modify or delete any
package configuration file that imports and deports this disk group. Be sure to remove the disk group from
the configuration of any package that used it, and the corresponding dependency_ parameters.

Reconfiguring a Package
You reconfigure a package in much the same way as you originally configured it; for modular packages,
see Configuring Packages and Their Services.



The cluster can be either halted or running during package reconfiguration, and in some cases the
package itself can be running; the types of change you can make and the times when they take effect
depend on whether the package is running or not.
If you reconfigure a package while it is running, it is possible that the package could fail later, even if the
cmapplyconf succeeded.
For example, consider a package with two volume groups. When this package started, it activated both
volume groups. While the package is running, you could change its configuration to list only one of the
volume groups, and cmapplyconf would succeed. If you then issued the cmhaltpkg command, however, the halt
would fail. The modified package would not deactivate both of the volume groups that it had activated at
startup, because it would only see the one volume group in its current configuration file.
For more information, see Allowable Package States During Reconfiguration on page 297.
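One way to catch the volume group case before applying a change is to compare the old and new package configuration files. The following is a minimal sketch, assuming volume groups are listed one per line with the vg parameter of a modular package configuration file; removed_vgs and the file contents are hypothetical.

```shell
#!/bin/sh
# removed_vgs OLD_CONF NEW_CONF
# Hypothetical helper: prints each volume group that is listed with the
# "vg" parameter in OLD_CONF but is missing from NEW_CONF.
removed_vgs() {
    old="$1"
    new="$2"
    awk '
        FILENAME == ARGV[1] { if ($1 == "vg") kept[$2] = 1; next }   # new file
        $1 == "vg" && !($2 in kept) { print $2 }                     # old file
    ' "$new" "$old"
}

# Demonstration with two illustrative configuration files.
tmp=$(mktemp -d)
printf 'package_name pkg1\nvg vg01\nvg vg02\n' > "$tmp/old.conf"
printf 'package_name pkg1\nvg vg01\n' > "$tmp/new.conf"
removed_vgs "$tmp/old.conf" "$tmp/new.conf"
rm -rf "$tmp"
```

Any name printed identifies a volume group that a running package may still have active, so the change is better applied while the package is halted.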

Reconfiguring a Package on a Running Cluster


You can reconfigure a package while the cluster is running, and in some cases you can reconfigure the
package while the package itself is running; see Allowable Package States During Reconfiguration on
page 297. You can do this in Serviceguard Manager or use Serviceguard commands.
To modify the package with Serviceguard commands, use the following procedure (pkg1 is used as an
example):

1. Halt the package if necessary:


cmhaltpkg pkg1
See Allowable Package States During Reconfiguration on page 297 to determine whether this step
is needed.

2. If it is not already available, you can obtain a copy of the package's configuration file by using the
cmgetconf command, specifying the package name.
cmgetconf -p pkg1 pkg1.conf

3. Edit the package configuration file.

IMPORTANT: Restrictions on package names, dependency names, and service names have
become more stringent as of A.11.18. Packages that have or contain names that do not conform
to the new rules (spelled out under package_name) will continue to run, but if you reconfigure
these packages, you will need to change the names that do not conform; cmcheckconf and
cmapplyconf will enforce the new rules.

4. Verify your changes as follows:


cmcheckconf -v -P pkg1.conf

5. Distribute your changes to all nodes:


cmapplyconf -v -P pkg1.conf

Renaming or Replacing an External Script Used by a Running Package


In most cases, you can rename an external_script while the package that uses it is running, but you need
to be careful; follow the instructions below.



Procedure

1. Make a copy of the old script, save it with the new name, and edit the copy as needed.
2. Edit the package configuration file to use the new name.
3. Distribute the new script to all nodes that are configured for that package.
Make sure you place the new script in the correct directory with the proper file modes and ownership.
4. Run cmcheckconf to validate the package configuration with the new external script.

CAUTION: If cmcheckconf fails, do not proceed to the next step until you have corrected all the
errors.

5. Run cmapplyconf on the running package.


This will stop any resources started by the original script, and then start any resources needed by the
new script.
6. You can now safely delete the original external script on all nodes that are configured to run the
package.

Reconfiguring a Package on a Halted Cluster


You can also make permanent changes in the package configuration while the cluster is not running. Use
the same steps as in Reconfiguring a Package on a Running Cluster on page 295.

Adding a Package to a Running Cluster


You can create a new package and add it to the cluster configuration while the cluster is up and while
other packages are running. The number of packages you can add is subject to the value of
MAX_CONFIGURED_PACKAGES in the cluster configuration file.
To create the package, follow the steps in the chapter Configuring Packages and Their Services. Then
use a command such as the following to verify the configuration of the newly created pkg1 on a running
cluster:
cmcheckconf -P $SGCONF/pkg1/pkg1conf.conf
Use a command such as the following to distribute the new package configuration to all nodes in the
cluster:
cmapplyconf -P $SGCONF/pkg1/pkg1conf.conf

Deleting a Package from a Running Cluster


Serviceguard will not allow you to delete a package if any other package is dependent on it. To check for
dependencies, use cmviewcl -v -l <package>. System multi-node packages cannot be deleted
from a running cluster.
You can use Serviceguard Manager to delete the package.
On the Serviceguard command line, you can (in most cases) delete a package from all cluster nodes by
using the cmdeleteconf command. This removes the package information from the binary configuration
file on all the nodes in the cluster. The command can only be executed when the package is down; the
cluster can be up.
The following example halts the failover package mypkg and removes the package configuration from the
cluster:
cmhaltpkg mypkg
cmdeleteconf -p mypkg



The command prompts for a verification before deleting the files unless you use the -f option. The
directory $SGCONF/mypkg is not deleted by this command.

Resetting the Service Restart Counter


The service restart counter tracks the number of times a package service has been automatically
restarted. This value is used to determine when the package service has exceeded its maximum number
of allowable automatic restarts.
When a package service successfully restarts after several attempts, the package manager does not
automatically reset the restart count. You can reset the counter online using cmmodpkg -R -s, for
example:
cmmodpkg -R -s myservice pkg1
This sets the counter back to zero. The current value of the restart counter appears in the output of
cmviewcl -v.

Allowable Package States During Reconfiguration


In many cases, you can make changes to a package’s configuration while the package is running. The
table that follows shows exceptions — cases in which the package must not be running, or in which the
results might not be what you expect.

CAUTION: Be extremely cautious about changing a package's configuration while the package is
running.
If you reconfigure a package online (by executing cmapplyconf on a package while the package
itself is running) it is possible that the reconfiguration fails, even if the cmapplyconf succeeds,
validating the changes with no errors.
For example, if a file system is added to the package while the package is running, cmapplyconf
does various checks to verify that the file system and its mount point exist. But the actual file system
check and mount of the file system can be done only after cmapplyconf succeeds; and if one of
these tasks fails in a running package, the package reconfiguration fails.
Another example involves renaming, modifying, or replacing an external script while the package
that uses it is running. If the package depends on resources that are managed by the script, the
online reconfiguration fails when you replace the script. See Renaming or Replacing an External
Script Used by a Running Package on page 295.

NOTE: Changes that are allowed, but which Hewlett Packard Enterprise does not recommend, are
labeled “should not be running”.

IMPORTANT: Actions not listed in the table can be performed for both types of package while the
package is running.

In all cases the cluster can be running, and packages other than the one being reconfigured can be
running. You can make changes to package configuration files at any time; but do not apply them (using
cmapplyconf or Serviceguard Manager) to a running package in the cases indicated in the table.

NOTE: All the nodes in the cluster must be powered up and accessible when you make package
configuration changes.



Table 14: Types of Changes to Packages

Change to the Package Required Package State

Delete a package Package must not be running.

NOTE: You cannot delete a package if another package has a dependency on it.

Change package type Package must not be running.

Add or delete a module Package can be running.

Add or delete a service Package can be running.


Serviceguard treats any change to service_name or service_cmd as
deleting the existing service and adding a new one, meaning that the
existing service is halted.

Change service_restart Package can be running.


Serviceguard will not allow the change if the new value is less than the
current restart count. (You can use cmmodpkg -R -s <service_name>
<package> to reset the restart count if you need to.)

Add or remove an ip_subnet Package can be running.


See ip_subnet for important information. Serviceguard will reject the
change if you are trying to add an ip_subnet that is not configured on
all the nodes on the package's node_name list.

Add or remove an ip_address Package can be running.


See ip_subnet and ip_address for important information.
Serviceguard will reject the change if you are trying to add an
ip_address that cannot be configured on the specified ip_subnet, or is
on a subnet that is not configured on all the nodes on the package's
node_name list.

Add or delete nodes from a package’s ip_subnet_node list in cross-subnet configurations Package can be running.
Serviceguard will reject the change if you are trying to add a node on
which the specified ip_subnet is not configured.

Add or remove monitoring for a subnet: monitored_subnet for a modular package Package can be running.
Serviceguard will not allow the change if the subnet being added is
down, as that would cause the running package to fail.

Add, change, or delete a pv Package must not be running.

NOTE: pv (see pv on page 244) is for use by Hewlett Packard Enterprise partners only.

Add a volume group Package can be running.


Remove a volume group Package can be running.

CAUTION: Serviceguard ignores the change if the volume group
is removed and its associated logical volumes are in use by the
same or a different package within the cluster.

Change a file system Package should not be running (unless you are only changing
fs_umount_opt).
Changing file-system options other than fs_umount_opt may cause
problems because the file system must be unmounted (using the
existing fs_umount_opt) and remounted with the new options; the
CAUTION under “Remove a file system: modular package” applies in
this case as well.
If only fs_umount_opt is being changed, the file system will not be
unmounted; the new option will take effect when the package is halted
or the file system is unmounted for some other reason.

Add a file system Package can be running.


During the package reconfiguration, if the fsck command on a file
system fails, the package does not start.

CAUTION: To avoid this issue, run the fsck command on the
file system outside the package, and then add the file system to
the modular package.

Remove a file system Package can be running.

CAUTION: Removing a file system may cause problems if the
file system cannot be unmounted because it's in use by a
running process. In this case Serviceguard kills the process and
keeps the package running with errors. For more information,
see Handling Failures During Online Package
Reconfiguration.

Change concurrent_fsck_operations, fs_mount_retry_count, fs_umount_retry_count Package can be running.
These changes in themselves will not cause any file system to be
unmounted.

Add, change, or delete external scripts and pre-scripts Package can be running.
Changes take effect when applied, whether or not the package is
running. If you add a script, Serviceguard validates it and then (if there
are no errors) runs it when you apply the change. If you delete a script,
Serviceguard stops it when you apply the change.

Change package auto_run Package can be either running or halted.


See Choosing Switching and Failover Behavior.


Add or delete a configured dependency Both packages can be either running or halted.
Special rules apply to packages in maintenance mode.
For dependency purposes, a package being reconfigured is
considered to be UP. This means that if pkgA depends on pkgB, and
pkgA is down and pkgB is being reconfigured, pkgA will run if it
becomes eligible to do so, even if pkgB's reconfiguration is not yet
complete.
Hewlett Packard Enterprise recommends that you separate package
dependency changes from changes that affect resources and services
that the newly dependent package will also depend on; reconfigure the
resources and services first and apply the changes, then configure the
package dependency.
For more information see About Package Dependencies.

Add a generic resource of evaluation type during_package_start Package can be running provided the status of the generic resource is
not 'down'. For information on online changes to generic resources,
see Online Reconfiguration of Generic Resources.

Add a generic resource of evaluation type before_package_start Package can be running if the status of the generic resource is 'up'; otherwise the
package must be halted.

Remove a generic resource Package can be running.

Change the generic_resource_evaluation_type Package can be running if the status of the generic resource is 'up'.
Not allowed if changing the generic_resource_evaluation_type causes
the package to fail.
For information on online changes to generic resources, see Online
Reconfiguration of Generic Resources.

Change the generic_resource_up_criteria Package can be running for resources of evaluation type
before_package_start or during_package_start provided the new up
criteria does not cause the resource status to evaluate to 'down'.
Not allowed if changing the generic_resource_up_criteria causes the
package to fail.
For information on online changes to generic resources, see Online
Reconfiguration of Generic Resources.

Add the VMware VMFS package parameters: vmdk_file_name, datastore_name, scsi_controller, disk_type Package can be running. For more information about online changes
to VMware VMFS parameters, see Online Reconfiguration of
VMware VMFS Parameters on page 130.
NOTE: You cannot modify or remove the VMware VMFS parameters, as this is not supported.


Change modular serviceguard-xdc package parameters: xdc/xdc/rpo_target, xdc/xdc/raid_monitor_interval, xdc/xdc/raid_device, xdc/xdc/device_0, xdc/xdc/device_1 Package can be running. See Online Reconfiguration of
serviceguard-xdc Modular Package Parameters on page 135.

Add, change, or delete an email attribute Package can be running.
NOTE: Do not include the email module in the modular package. The
serviceguard-xdc and toolkit packages automatically include the
email module in the modular package.

NOTE: Consider a configuration in which the volume group and the corresponding filesystem are present
in two different packages. To perform online reconfiguration of such packages, the package with the
volume group must be reconfigured before you reconfigure the filesystem package. Hewlett Packard
Enterprise recommends that you do not perform online reconfiguration for both these packages in a
single command as it might cause one or more packages to fail.

Changes that Will Trigger Warnings


Changes to the following will trigger warnings, giving you a chance to cancel, if the change would cause
the package to fail.

NOTE: You will not be able to cancel if you use cmapplyconf -f.

• Package nodes
• Package dependencies
• Package weights (and also node capacity, defined in the cluster configuration file)
• Package priority
• auto_run
• failback_policy

Online Reconfiguration of Modular package


To modify the configuration of a modular package while the package is up and running:

1. Obtain a copy of the package configuration file, if it is not already available:


#cmgetconf -p pkg1 pkg1.conf

2. Edit the package configuration file.


3. Verify the package configuration changes:



#cmcheckconf -v -P pkg1.conf

4. Apply the changes to the configuration:


#cmapplyconf -v -P pkg1.conf
Once the cmapplyconf command succeeds, check the following:

a. Check syslog and the package log files for any logged failures.

b. If the online package reconfiguration fails for any reason, Serviceguard sets the
online_modification_failed flag to "yes". Verify the flag status using the cmviewcl -f line
output.

To handle the failure during online package reconfiguration, see Handling Failures During Online
Package Reconfiguration.
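The flag check can be scripted by filtering the line output. The following is a sketch only. It assumes that the flag appears in the cmviewcl -f line output as package:NAME|online_modification_failed=yes, by analogy with the name=value lines used for cmeval input; this layout is an assumption, not taken from the manpage, so confirm it against the cmviewcl (5) manpage. The function name failed_online_mods is hypothetical.

```shell
#!/bin/sh
# failed_online_mods
# Hypothetical helper: reads "cmviewcl -f line" style output on stdin and
# prints the name of each package whose online_modification_failed flag is
# "yes". The line layout matched here is an assumption (see the text above).
failed_online_mods() {
    awk -F'|' '
        $1 ~ /^package:/ && $2 == "online_modification_failed=yes" {
            print substr($1, 9)
        }'
}

# Demonstration with made-up lines in the assumed layout.
failed_online_mods <<'EOF'
package:pkg1|online_modification_failed=yes
package:pkg2|online_modification_failed=no
EOF
```

On a cluster node the input would come from the command itself, for example cmviewcl -f line | failed_online_mods.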
Recommendations

• Hewlett Packard Enterprise recommends that when you modify package parameters online, you
change only one module at a time.
• Reconfigure only one package online at a time.
• If you are adding a new module or parameter while the package is up, first make the changes in the
Serviceguard package, and then configure the application to use the changes.
For example, to add a mount point:

1. Edit the package configuration file and add the mount point.
2. Verify the package configuration file:
#cmcheckconf -P <pkg_name>

3. Apply the package configuration once the verification is successful:


#cmapplyconf -P <pkg_name>

4. Configure the application to use the mount point that is added.

• If you are deleting a module, or a parameter from a module, first remove the configuration from
the application, and then delete it from the Serviceguard package.
For example, to delete a mount point:

1. Remove the mount point from the application.


2. Edit the package configuration file and remove the mount point.
3. Verify the package configuration file:
#cmcheckconf -P <pkg_name>

4. Apply the package configuration once the verification is successful:


#cmapplyconf -P <pkg_name>



For limitations on online reconfiguration of a serviceguard-xdc package, see Online Reconfiguration of
serviceguard-xdc Modular Package Parameters on page 135.

Handling Failures During Online Package Reconfiguration


If any failure occurs during online package reconfiguration, the online_modification_failed flag
is set to yes and the following restrictions apply to the package:

• Global switching of the package is disabled, so the package cannot fail over to an
adoptive node. For more information, see the cmviewcl (5) manpage.
• Live application detach (LAD) of the node where the package reconfiguration has failed is not
allowed.
• The package cannot be put into maintenance mode while the online_modification_failed flag
is set to yes.
• You cannot modify the package configuration online. For more information, see the cmapplyconf (1m)
manpage.

The online_modification_failed flag can be cleared in one of the following ways:

• Halt the package using the cmhaltpkg command. For more information, see the cmhaltpkg (1m)
manpage.
• Run the cmmodpkg command with the -f option, which clears the flag. Use the -f option only
after fixing the errors found during the previous online reconfiguration of the package. This
option applies to both failover and multi-node packages.

For example, suppose you enter a wrong fs_type value while adding a new filesystem to pkg1.
The package log and syslog then contain entries like the following:
*****************************
Package log during this time:
*****************************
Nov 28 23:41:19 root@test1.ind.hp.com master_control_script.sh[23516]:
###### reconfiguring package pkg1 ######
Nov 28 23:41:20 root@test1.ind.hp.com pr_util.sh[23621]: New VG vg_dd0
Nov 28 23:41:20 root@test1.ind.hp.com pr_util.sh[23621]: sg_activate_pr:
activating PR on /dev/sdc
Nov 28 23:41:21 root@test1.ind.hp.com volume_group.sh[23687]: New VG vg_dd0
Nov 28 23:41:21 root@test1.ind.hp.com volume_group.sh[23687]: Attempting to
addtag to vg vg_dd0...
Nov 28 23:41:21 root@test1.ind.hp.com volume_group.sh[23687]: addtag was
successful on vg vg_dd0.
Nov 28 23:41:21 root@test1.ind.hp.com volume_group.sh[23687]: Activating
volume group vg_dd0 .
Nov 28 23:41:22 root@test1.ind.hp.com filesystem.sh[23808]: FS added or
changed /dev/vg_dd0/lvol3
Nov 28 23:41:22 root@test1.ind.hp.com filesystem.sh[23808]: Checking
filesystems:
/dev/vg_dd0/lvol3
e2fsck 1.41.12 (17-May-2010)
mount: wrong fs type, bad option, bad superblock on /dev/vg_dd0/lvol3,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
:
:



:
Nov 28 23:41:22 root@test1.ind.hp.com
master_control_script.sh[23516]:#############################################
############################
Nov 28 23:41:22 root@test1.ind.hp.com master_control_script.sh[23516]:
###### Package reconfigure completed with failures for pkg1 ######
Nov 28 23:41:22 root@test1.ind.hp.com master_control_script.sh[23516]:
###### Below is the summary of changes for pkg1 ######
Nov 28 23:41:20 root@test1.ind.hp.com pr_util.sh[23621]: SUCCESS:
Successfully activated PR on /dev/sdc
Nov 28 23:41:22 root@test1.ind.hp.com filesystem.sh[23808]: ERROR: Failed to
fsck /dev/vg_dd0/lvol3.
Nov 28 23:41:22 root@test1.ind.hp.com filesystem.sh[23808]: ERROR: Will not
attempt to apply remaining changes due to the error encountered
Nov 28 23:41:22 root@test1.ind.hp.com filesystem.sh[23808]: WARNING: Not
attempting to mount /dev/vg_dd0/lvol3 on /mnt1
Nov 28 23:41:22 root@test1.ind.hp.com master_control_script.sh[23516]:
#########################################################################

************************
syslog during this time:
************************
Nov 28 23:41:22 test1 cmserviced[18979]: Package Script for pkg1 failed with
an exit(18).
Nov 28 23:41:22 test1 cmcld[18900]: Reconfigured package pkg1 on node test1.
Nov 28 23:41:22 test1 cmcld[18900]: Online reconfiguration of package pkg1
on node test1 failed. Check the package log file for complete information.
Nov 28 23:41:22 test1 cmcld[18900]: Request from node test1 to disable
global switching for package pkg1.
To rectify the failures, do one of the following:

1. Halt the package.


2. Make the required changes.
3. Restart the package.

or

1. Verify and fix the problem using the fsck command.

2. Verify the filesystem before mounting the device:


#e2fsck /dev/vg_dd0/lvol3

3. Mount the filesystem:


#mount /dev/vg_dd0/lvol3 /mnt1

4. Clear the online_modification_failed flag:


#cmmodpkg -f pkg1

5. Enable the global switching for the package:


#cmmodpkg -e pkg1



The following table describes how to fix errors encountered in the affected modules during online
addition to a package.

Table 15: Modules affected during online addition

Description: Extending MD (for serviceguard-xdc packages only)
Modules affected during online addition: ext/xdc (xdc.sh)
How to rectify the failures or apply those that are not attempted: You cannot rectify the failures in the XDC module manually; you must restart the package.
Example: (none)

Description: Adding an external pre script to the package
Modules affected during online addition: sg/external_pre_script (external.sh)
How to rectify the failures or apply those that are not attempted: If an external pre script that is added to the package configuration failed to start, run the script manually with the start option:
script_name start
Example: To start the external pre script:
#extern_pre_script.sh start

Description: Extending storage to the package
Modules affected during online addition: sg/filesystem (filesystem.sh), sg/volume_group (volume_group.sh), sg/pr_cntl (pr_cntl.sh), sg/vmfs (vmfs.sh)
How to rectify the failures or apply those that are not attempted: If addition of storage has failed, ensure the following:
• The configured VMFS disks in the package are attached using VMware recommended methods.
• Persistent reservation is added to the disk. This is not applicable if the storage is of type VMFS.
• The volume group is activated with hosttags.
• The file system is verified and repaired.
• The mount point is mounted.
For more information, see the sg_persist(1m), vgchange(1m), fsck(1m), multipath(1m), and mount(1m) manpages.
Example: To add PR keys, you must register on all the paths and then reserve on one path.
To view all the paths:
#multipath -ll
To view the node PR keys:
#cmviewcl -v -f line | grep node_pr_key
NOTE: In this example, the node_pr_key is 72810001.
#sg_persist --out -G --param-sark=72810001 /dev/sde
#sg_persist --out -R --param-rk=72810001 --prout-type=5 /dev/sde
Add hosttags for node test1.ind.hp.com:
#vgchange --addtag test1.ind.hp.com vg_dd1
Activate the disk:
#vgchange -a y vg_dd1
Run the following commands:
#e2fsck -y /dev/vg_dd1/lvol1
#mount -t ext3 /dev/vg_dd1/lvol2 /mnt1

Description: Adding an IP to the package
Modules affected during online addition: sg/package_ip (package_ip.sh)
How to rectify the failures or apply those that are not attempted: If an IP address that is added to the package configuration failed to add or was not attempted, use the cmmodnet command to add the IP address. For more information, see the cmmodnet (1m) manpage.
Example: To add the IP 10.149.2.5 to the package:
#cmmodnet -a -I 10.149.2.5 10.149.2.0

Description: Adding an external script to the package
Modules affected during online addition: sg/external_script (external.sh)
How to rectify the failures or apply those that are not attempted: If an external script that is added to the package configuration failed to start, run the script manually with the start option:
script_name start
Example: To start the external script:
#extern_script.sh start

Description: Adding a service to the package
Modules affected during online addition: sg/service (service.sh)
How to rectify the failures or apply those that are not attempted: If a service that is added to the package configuration failed to start or was not attempted, use the cmrunserv command to start the service. For more information, see the cmrunserv (1m) manpage.
Example: To run the process as a service:
#cmrunserv db1 /var/opt/db/database1
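The "register on all the paths, then reserve on one path" rule for PR keys can be expressed as a loop. The sketch below reuses the sg_persist options shown in the table; add_pr_key is a hypothetical helper name, and choosing the first path for the reservation is an assumption of the sketch. The key and the device paths are supplied by the caller, for example from the cmviewcl and multipath -ll output.

```shell
# Sketch: register a persistent reservation key on every path to a disk,
# then reserve on one path (here, the first path given).
# add_pr_key is a hypothetical helper name, not a Serviceguard command.
add_pr_key() {
    key=$1
    shift
    first_path=$1

    # Register the key on each path; stop if any registration fails.
    for dev_path in "$@"; do
        sg_persist --out -G --param-sark="$key" "$dev_path" || return 1
    done

    # Reserve on a single path, using the same reservation type as the example.
    sg_persist --out -R --param-rk="$key" --prout-type=5 "$first_path"
}
```

For a disk with a single path, as in the table example, the call would be add_pr_key 72810001 /dev/sde.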

The following table describes how to fix errors encountered in the affected modules during online
deletion from a package.

Table 16: Modules affected during online deletion

Description: Removing service from the package
Modules affected during online deletion: sg/service (service.sh)
How to fix the failures or apply those that are not attempted: If a service that is deleted from the package configuration failed to halt or was not attempted, use the cmhaltserv command to halt the service. For more information, see the cmhaltserv (1m) manpage.
Example: To halt the service db1:
#cmhaltserv db1

Description: Removing external script from the package
Modules affected during online deletion: sg/external_script (external.sh)
How to fix the failures or apply those that are not attempted: If an external script that is deleted from the package configuration failed to stop or was not attempted, run the script with the stop option:
script_name stop
Example: To halt the external script:
#extern_script.sh stop

Description: Removing IP from the package
Modules affected during online deletion: sg/package_ip (package_ip.sh)
How to fix the failures or apply those that are not attempted: If an IP address that is deleted from the package configuration failed to remove or was not attempted, use the cmmodnet command to remove the IP address. For more information, see the cmmodnet (1m) manpage.
Example: To remove the IP 10.149.2.5 from the package:
#cmmodnet -r -I 10.149.2.5 10.149.2.0

Description: Removing storage from the package
Modules affected during online deletion: sg/filesystem (filesystem.sh), sg/volume_group (volume_group.sh), sg/pr_cntl (pr_cntl.sh)
How to fix the failures or apply those that are not attempted: If deletion of storage from the package has failed or was not attempted, ensure the following:
• The mount point is unmounted.
• The hosttags are deleted from the disk.
• The volume group is deactivated with hosttags.
• Persistent reservation is removed from the disk.
For more information, see the sg_persist(1m), vgchange(1m), pr_cleanup(1m), multipath(1m), and mount(1m) manpages.
Example: To unmount the mount point mnt1:
#umount /mnt1
To delete the hosttags from the vg_dd0 on node test1.ind.hp.com:
#vgchange --deltag test1.ind.hp.com vg_dd1
To remove the persistent reservation from the disk /dev/sde:
#pr_cleanup -lun /dev/sde

Description: Removing external pre script from the package
Modules affected during online deletion: sg/external_pre_script (external.sh)
How to fix the failures or apply those that are not attempted: If an external pre script that is deleted from the package configuration failed to stop or was not attempted, run the script with the stop option:
script_name stop
Example: To halt the external pre script:
#extern_pre_script.sh stop

Description: Removing MD from the package (for XDC packages)
Modules affected during online deletion: ext/xdc (xdc.sh)
How to fix the failures or apply those that are not attempted: You cannot rectify the failures in the XDC module manually; you must restart the package.
Example: (none)

Migrate generic resources from package to cluster


You might want to move a generic resource configured in a package to the cluster, for example when the
same generic resource is used in more than one package, or when the generic resource status must be
available before the package starts.



Reconfiguring a package for generic resource when the cluster is running or halted
You can reconfigure a package while the cluster is running. In some cases you can reconfigure the
package while the package itself is running. For more information, see Allowable Package States
During Reconfiguration. You can reconfigure the package from Serviceguard Manager or from the
Serviceguard CLI.
To modify the package configuration for generic resources and move them into cluster generic resources
using Serviceguard commands, complete the following steps. Here pkg1 is used as an example package
name.

Procedure

1. Obtain a copy of the package's configuration file if it is not already available.


cmgetconf -p pkg1 pkg1.conf
For example, assume that the package pkg1 currently uses the following generic resource
and service:

generic_resource_name cpu_monitor
generic_resource_evaluation_type during_package_start
generic_resource_up_criteria <= 60
service_name cpu_monitor_script
service_cmd $SGCONF/generic_resource_monitors/cpu_monitor.sh
service_restart unlimited
service_fail_fast_enabled no
service_halt_timeout 30
service_halt_on_maintenance no

2. Edit the package configuration file.

a. Remove or comment out the service parameters associated with the generic resource.
b. Remove or comment out the package generic resource parameters that you plan to
move to the cluster.

CAUTION: When moving generic resources from package to cluster, you must remove the service
associated with the package generic resource. Skipping this step results in unexpected problems.

3. Verify your changes:


cmcheckconf -v -P pkg1.conf

4. Distribute your changes to all nodes. This step reconfigures the running package:
cmapplyconf -v -P pkg1.conf

5. After the package reconfiguration completes, obtain a copy of the cluster configuration file, if it is
not already available, by using the cmgetconf command and specifying the cluster name:
cmgetconf -c cluster1 cluster.conf

6. Add the generic resource parameters to cluster configuration.

GENERIC_RESOURCE_NAME cpu_monitor
GENERIC_RESOURCE_TYPE extended
GENERIC_RESOURCE_CMD $SGCONF/generic_resource_monitors/cpu_monitor.sh
GENERIC_RESOURCE_SCOPE NODE
GENERIC_RESOURCE_RESTART unlimited
GENERIC_RESOURCE_HALT_TIMEOUT 30000000

7. Verify the changes:
cmcheckconf -v -C cluster.conf

8. Distribute your changes to all nodes. This step reforms the running cluster by adding the cluster
generic resource:
cmapplyconf -v -C cluster.conf

9. Retrieve the package configuration.


cmgetconf -p pkg1 pkg1.conf

10. Add the cluster generic resource to package configuration.

generic_resource_name cpu_monitor
generic_resource_evaluation_type during_package_start
generic_resource_up_criteria <= 60

11. Verify your changes as follows:


cmcheckconf -v -P pkg1.conf

12. Distribute your changes to all nodes. This step reforms the running package by adding the generic
resource:
cmapplyconf -v -P pkg1.conf

NOTE: The above procedure modifies the cluster and package configuration in separate
operations. You can also combine them into a single operation. To modify both the cluster
and package in a single operation, skip steps 7-8 and perform the following steps
instead of steps 11-12.
Verify your changes for both cluster and package configuration:
cmcheckconf -v -C cluster.conf -P pkg1.conf
Distribute your changes to all nodes. This step reforms the running cluster and package by adding
the generic resource to the cluster and package:
cmapplyconf -v -C cluster.conf -P pkg1.conf

13. After completing the above steps, make sure that the cluster and package generic resources are up
and running as per the configuration. Check the syslog and package logs for any errors and resolve
them.
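Because the CAUTION in step 2 makes removing the monitoring service mandatory, it can help to check the edited package file before applying it in step 4. The following is a minimal sketch under stated assumptions: monitor_service_present is a hypothetical helper name, and it assumes that comment lines in the package configuration file start with #, optionally preceded by white space.

```shell
# Sketch: succeed (exit status 0) if an uncommented line in the package
# configuration file still references the given monitor script.
# monitor_service_present is a hypothetical helper name.
#
# Possible use before step 4:
#   if monitor_service_present pkg1.conf cpu_monitor.sh; then
#       echo "Remove the cpu_monitor service from pkg1.conf first." >&2
#   fi
monitor_service_present() {
    conf_file=$1
    script_name=$2

    # Drop comment lines, then search the rest for the literal script name.
    grep -v '^[[:space:]]*#' "$conf_file" | grep -qF -- "$script_name"
}
```

The check only looks for the script name, so it does not confirm that the remaining service parameters (service_name, service_restart, and so on) were removed as well.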

Responding to Cluster Events


Serviceguard does not require much ongoing system administration intervention. As long as there are no
failures, your cluster will be monitored and protected. In the event of a failure, those packages that you
have designated to be transferred to another node will be transferred automatically. Your ongoing
responsibility as the system administrator will be to monitor the cluster and determine whether a
package transfer has occurred. If a transfer has occurred, you have to determine the cause and take
corrective actions.
The typical corrective actions to take in the event of a package transfer include:

• Determining when a transfer has occurred.


• Determining the cause of a transfer.
• Repairing any hardware failures.
• Correcting any software problems.



• Restarting nodes.
• Transferring packages back to their original nodes.
• Enabling package switching.

Single-Node Operation
In a multi-node cluster, you could have a situation in which all but one node has failed, or you have shut
down all but one node, leaving your cluster in single-node operation. This remaining node will probably
have applications running on it. As long as the Serviceguard daemon cmcld is active, other nodes can
rejoin the cluster.
If the Serviceguard daemon fails when the cluster is in single-node operation, it will leave the single node
up and your applications running.

NOTE: This means that Serviceguard itself is no longer running.

It is not necessary to halt the single node in this scenario, since the application is still running, and no
other node is currently available for package switching. (This is different from the loss of the Serviceguard
daemon in a multi-node cluster, which halts the node (system reset), and causes packages to be
switched to adoptive nodes.)
You should not try to restart Serviceguard, since data corruption might occur if another node were to
attempt to start up a new instance of the application that is still running on the single node.
Instead of restarting the cluster, choose an appropriate time to shut down the applications and reboot the
node; this will allow Serviceguard to restart the cluster after the reboot.

Removing Serviceguard from a System


If you want to disable a node permanently from Serviceguard, use the rpm -e command to delete the
software.

CAUTION:
Remove the node from the cluster first. If you run the rpm -e command on a server that is still a
member of a cluster, it will cause that cluster to halt, and the cluster to be deleted.

To remove Serviceguard:

1. If the node is an active member of a cluster, halt the node first.


2. If the node is included in a cluster configuration, remove the node from the configuration.
3. If you are removing Serviceguard from more than one node, run rpm -e on one node at a time.



Understanding Site Aware Disaster Tolerant Architecture
With Serviceguard A.12.00.20, a new framework called SADTA (Site Aware Disaster Tolerant
Architecture) has been introduced to provide disaster recovery capabilities for complex application
workloads. These workloads may be a combination of components that need to run on single or multiple
node(s) in the cluster simultaneously and may require availability of other software components.
This solution allows you to create site-aware clusters where complex application workloads run on the
production site. In the event of failures on the production site, SADTA automatically detects these failures
and takes appropriate action to ensure the availability of your complex application workloads.

NOTE:

• If you have installed a version earlier than 12.00.30, ensure that you have configured an equal
number of nodes on both sites.
• If you have installed 12.00.30 or a later version, asymmetric node configuration in a Metrocluster
environment is supported with Smart Quorum enabled. For more information about Smart Quorum,
see Understanding the Smart Quorum on page 324.

SADTA attempts to restart the complex application workloads on the other available nodes within the
production site if the site is not completely lost. If the entire site has failed, SADTA initiates a site
takeover and the application(s) run on the recovery site.
The following are the main components of SADTA:

• Site
• Complex workload
• Redundant configuration
• Site controller package
• Site safety latch and its status

Terms and Concepts


Site
A site is a logical grouping of nodes; the sites are located apart from each other at different data centers.



Figure 38: Typical Cluster Configuration with Sites

Figure 38: Typical Cluster Configuration with Sites on page 313 depicts a four node Serviceguard
cluster with sites, Site A and Site B, where Site A consists of Node 1 and Node 2 and Site B consists of
Node 3 and Node 4. The SITE_NAME and SITE parameters must be defined in the cluster configuration
file. For more information on how to configure sites in a cluster, see the parameter descriptions under
Cluster Configuration Parameters on page 111.

Complex Workload
A complex workload is a set of failover or multi-node package(s) or both configured to run on a site with
or without any dependencies among them. The workload may optionally include components that need to
be brought up on the disaster recovery site after bringing up all the components of the workload on the
primary site. Figure 39: Sample Complex Workload Configuration on page 313 shows the complex
workload packages configured on Site A, which is the primary site.

Figure 39: Sample Complex Workload Configuration

Redundant Configuration
The SADTA framework relies on redundant configuration to provide disaster recovery capabilities for
complex workloads. SADTA requires that each of the workload packages has a redundantly configured
recovery package, which is brought up on the recovery site in case of failure on the primary site.
SADTA first attempts to restart the workload packages on the primary site. If a restart on the
primary site is not possible, SADTA ensures that the workload is completely halted on the primary
site and brings up the recovery workload packages on the recovery site. Figure 40: Sample Redundant
Complex Workload Configuration on page 314 shows the complex workload packages configured
redundantly on Site B, which is the recovery site.

Figure 40: Sample Redundant Complex Workload Configuration

Site Controller Package


The site controller package is a failover package that manages the complex workload packages and their
recovery packages, and is configured to run on all the nodes in the cluster.
The site controller package is configured to be aware of the redundantly configured complex workload
packages on both sites. The site controller package monitors the primary workload packages and in case
of failure, will try to restart the packages within the same site. In case of failure where the workloads can
no longer run on the primary site, the site controller package will initiate a site takeover.
Site Takeover
In the event of a failure where the workload packages can no longer run on the primary site, the site
controller package ensures that the workload is no longer running on the primary site and brings up the
recovery workload packages on the recovery site. This is called a site takeover. The site controller
package ensures that at no point in time the workload packages are running on both the sites.
The site controller package has different monitoring levels for the workload packages:

• Critical packages
When a package is configured as critical in the site controller package, the site controller package
monitors only these critical packages for any failures. If there are managed packages configured along
with the critical packages, the site controller package does not monitor the managed packages for any
failures. You can specify any number of critical packages and these packages can be of type failover
or multi-node packages. During monitoring, even if a single critical package fails and cannot be
brought up on any other node in the production site, the site controller package initiates a site takeover
to the recovery site.

• Managed packages
When a package is configured as managed in the site controller package, the site controller package
monitors all the managed packages for any failures only when no critical packages are configured. You



can specify any number of managed packages and these packages can be of type failover or multi-
node packages. During monitoring, if all the managed packages fail and cannot be brought up on any
other node in the production site, the site controller package initiates a site takeover to a recovery site.

• Remote managed packages


If your solution requires a component that must start on the disaster recovery site, you can use a
remote managed package, for example, if you need to use SAP HANA system replication.
When a package is configured as remote managed in the site controller package, the site controller
package starts and stops the remote managed package on the recovery site. You can specify any
number of remote managed packages and these packages can be of type failover or multi-node
packages. These packages are brought up on the recovery site only after the critical and managed
packages are brought up on the production site. However, the site controller package does not monitor
the remote managed packages for any failures. If the remote managed package fails, the site
controller package does not halt the complex workload packages (that is, critical and managed
packages) running on the production site.

During monitoring, if the site controller package detects a failure, then the site controller package tries to
fail over the package to the nodes within the site to resume the service in the production site. However, if
the packages cannot be accommodated on the production site, then the site controller package initiates a
site takeover.
The following are the scenarios where the site controller package initiates a site takeover:

• When any one of the critical packages fails (not administratively halted) and it cannot be brought up on
any other node in the production site, the site controller package initiates a site takeover to the
recovery site.
• When no critical packages are configured and all the managed packages fail in a production site (not
administratively halted) and they cannot be brought up on any other node in the production site.

NOTE:

• A multi-node package is considered failed only if the package fails on all the nodes where it is
configured to run.
• If any of the critical or managed packages are administratively halted individually, then a site takeover
is not initiated in the event of failure of the critical or managed package respectively.
• If the site controller package or any complex workload packages (that is, critical, managed or remote
managed packages) are in detached or maintenance mode, then site takeover will not be initiated in
the event of a failure.
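The takeover rules above reduce to a small decision function. The sketch below only illustrates the documented logic and is not how the site controller package is implemented: should_take_over is a hypothetical name, and the caller is assumed to have already counted as "failed" only those packages that failed (were not administratively halted), are not in detached or maintenance mode, and could not be restarted on any other node in the production site.

```shell
# Sketch of the site takeover rule. Arguments:
#   $1 critical packages configured   $2 critical packages failed
#   $3 managed packages configured    $4 managed packages failed
# Prints "takeover" or "stay". should_take_over is a hypothetical name.
should_take_over() {
    crit_total=$1
    crit_failed=$2
    mgd_total=$3
    mgd_failed=$4

    if [ "$crit_total" -gt 0 ]; then
        # Critical packages configured: only they are monitored, and a
        # single failed critical package triggers a takeover.
        if [ "$crit_failed" -ge 1 ]; then
            echo takeover
        else
            echo stay
        fi
    elif [ "$mgd_total" -gt 0 ] && [ "$mgd_failed" -eq "$mgd_total" ]; then
        # No critical packages: takeover only when all managed packages failed.
        echo takeover
    else
        echo stay
    fi
}
```

For example, should_take_over 2 0 3 3 prints stay, because managed packages are not monitored when critical packages are configured, while should_take_over 0 0 3 3 prints takeover.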

Site Safety Latch and its Status


Site safety latch is a mechanism to ensure that only one set of complex workload packages runs in the
entire site aware Serviceguard cluster. The site safety latch is implemented using a generic resource
which is configured in the site controller package and all the workload packages.

NOTE: Only the root user can reset the value of the site safety latch in the configuration file using the
cmresetsc command. For more information, see the cmresetsc (1m) manpage.

Site safety latch status can be one of the following:

• unknown
When the cluster has started and the site controller package has not yet modified the site safety latch,
the status is unknown.

• intermediate
Until the complex workload packages are brought up, the site safety latch status on the production site
will be in intermediate state. This is a transient state. Once the complex workload packages are
running on the production site, the site safety latch value is immediately changed from
intermediate state by the site controller package.

• passive
If the site safety latch status is passive on a site, the site controller package treats that site as the
passive site, and the site is available for running the recovery workload packages on the disaster
recovery site.
• active
If the site safety latch status is active on a site, the site controller package treats that site as the
active site, and the site is running the workload packages on the production site. The site
controller package then starts the remote managed package on the passive site.

Status of Site Safety Latch


You can view the status of the site safety latch using the cmviewcl -v -f line or cmviewcl -v -p
command. The status can be active, passive, intermediate, or unknown. For a description
of the site safety latch status, see Site Safety Latch and its Status on page 315.
Reset a Site Safety Latch
Under certain circumstances, if the site controller package start up fails, the status of the site safety latch
changes to intermediate state. To resolve this problem, you must reset the site safety latch status to
passive state using the cmresetsc command. For more information, see the cmresetsc (1m) manpage.

NOTE:

• If the status of the site safety latch on both the sites changes to intermediate state, ensure that you
run the cmresetsc command on one node from each site.

• If the status of the site safety latch changes to intermediate state on one of the nodes belonging to
the site, ensure that you run the cmresetsc command on that node.

• If the site controller package logs the following message in the package log file, then you need to run
the cmresetsc command:
sc.sh[15366]: Site Controller start up on the site "A" has failed
sc.sh[15366]: Clean up the site and start manager again
sc.sh[15366]: Check for any error messages in the package log file on all
sc.sh[15366]: nodes in the site "A" for the packages managed
sc.sh[15366]: by Site Controller (manager)
sc.sh[15366]: Fix any issue reported in the package log files and enable
sc.sh[15366]: node switching for the packages on nodes that have failed.
sc.sh[15366]: Reset the site "A" using "/usr/local/cmcluster/bin/cmresetsc"
sc.sh[15366]: command and start manager again
sc.sh[15366]: Site Controller startup failed
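To decide whether a reset is needed, you can search the site controller package log for the final line of the message shown above. This is a minimal sketch: needs_latch_reset is a hypothetical helper name, and the location of the package log file is not assumed here, so the caller passes it in.

```shell
# Sketch: succeed (exit status 0) if the given site controller package log
# contains the start-up failure message that calls for cmresetsc.
# needs_latch_reset is a hypothetical helper name.
needs_latch_reset() {
    log_file=$1

    # Match the literal message text from the package log.
    grep -qF 'Site Controller startup failed' "$log_file"
}
```

A match only indicates that the message was logged at some point. Fix the issues reported in the package log files, as the message instructs, before running cmresetsc.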



How to Deploy and Configure the Complex Workloads for Disaster Recovery using SADTA
This section describes how to deploy and configure the complex workload for disaster recovery using
SADTA.

Configuring the Workload Packages and its Recovery Packages


Assume that there are two sites configured in a cluster (that is, Site A and Site B), with the
packages critical_pkg, managed_pkg, and rem_mng_pkg configured on the active site and
critical_pkg1, managed_pkg1, and rem_mng_pkg1 configured on the passive site, as shown in
Figure 41: Typical Cluster Configuration with Sites on page 317.

Figure 41: Typical Cluster Configuration with Sites

To configure workload packages and their recovery packages using SADTA, follow these steps:

1. Configure site-aware Serviceguard cluster with sites. For more information about how to configure
site-aware Serviceguard cluster with sites, see Configure Site-aware Serviceguard Cluster with
Sites on page 317.
2. Configure the complex workload packages. For more information about how to configure complex
workload packages, see Configure the Complex Workload Packages on page 318.
3. Configure the redundant complex workload packages. For more information about how to configure
redundant complex workload packages, see Configure the Redundant Complex Workload
Packages on page 318.
4. Configure the site controller package. For more information about how to configure site controller
package, see Configuring the Site Controller Package for the Complex Workload on page 318.

Configure Site-aware Serviceguard Cluster with Sites


To configure site-aware Serviceguard cluster with sites, see Site on page 312.

NOTE:
Ensure that you have configured an equal number of nodes on both sites.

Configure the Complex Workload Packages


The complex workload packages can be of type failover or multi-node packages configured in the
package configuration file. For more information about how to configure failover or multi-node
packages, see Configuring Packages and Their Services. However, the same
generic_resource_name parameter must be present in all the workload packages using the
corresponding site safety latch. The following table describes how to configure generic resource
parameters for critical, managed, and remote managed packages.

Table 17: Configuring Generic Resource Parameters for critical, managed, and remote managed packages

Package Type: Critical Package and Managed Package
Generic Resource Evaluation Type: before_package_start
Generic Resource UP Criteria: >3
Example:
GENERIC_RESOURCE_NAME sitecontroller_genres
GENERIC_RESOURCE_EVALUATION_TYPE = before_package_start
GENERIC_RESOURCE_UP_CRITERIA >3

Package Type: Remote Managed Package
Generic Resource Evaluation Type: before_package_start
Generic Resource UP Criteria: ==3
Example:
GENERIC_RESOURCE_NAME sitecontroller_genres
GENERIC_RESOURCE_EVALUATION_TYPE = before_package_start
GENERIC_RESOURCE_UP_CRITERIA ==3

Configure the Redundant Complex Workload Packages


You must also configure a redundant set of the complex workload packages, which will be brought up on
the recovery site as shown in Figure 40: Sample Redundant Complex Workload Configuration on page 314.
Each of these redundant complex workload packages must be configured to use the site safety latch
information.

Configuring the Site Controller Package for the Complex Workload


Once the disaster tolerant redundant complex workloads are configured at each site, the site controller
package is the final component to be configured. This section describes the procedure to configure the
site controller package.
Guidelines to Configure Site Controller Package

The default value of the failover_policy parameter for the site controller package is
site_preferred. You can set the value to site_preferred_manual, based on your requirement.
The site_preferred failover policy provides automatic failover of packages within a site and
across sites. The site_preferred value implies that during a site controller package failover, while
selecting nodes from the list of the node_name entries, the site controller package fails over to the nodes
that belong to the site of the node it last ran on, rather than the nodes that belong to the other site. The
site_preferred_manual failover policy provides automatic failover of packages within a site and
manual failover across sites.
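
The difference between the two policies can be sketched as a small Python model. This is only an illustration of the behavior described above, not Serviceguard code; the function name, its arguments, and the node and site names are hypothetical.

```python
def pick_failover_node(policy, node_order, node_site, last_node, available):
    """Return the node to fail over to, or None if no automatic choice exists.

    policy     -- "site_preferred" or "site_preferred_manual"
    node_order -- node_name entries in configuration order
    node_site  -- mapping of node name to site name
    last_node  -- node the site controller package last ran on
    available  -- set of nodes currently eligible to run the package
    """
    last_site = node_site[last_node]
    candidates = [n for n in node_order if n in available]

    # Both policies try the nodes of the last-run site first.
    same_site = [n for n in candidates if node_site[n] == last_site]
    if same_site:
        return same_site[0]

    if policy == "site_preferred":
        # Automatic failover across sites is allowed.
        return candidates[0] if candidates else None
    if policy == "site_preferred_manual":
        # Cross-site failover needs an administrator; nothing automatic.
        return None
    raise ValueError(f"unknown failover policy: {policy}")
```

With nodes a1 and a2 on one site and b1 on the other, a package that last ran on a1 moves to a2 under either policy while a2 is available. Once only b1 is left, the model returns b1 for site_preferred and None for site_preferred_manual.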

The table describes how to configure generic resource parameters for site controller package.

Table 18: Configuring Generic Resource Parameters for Site Controller Package

Package Type          Generic Resource       Generic Resource   Example
                      Evaluation Type        UP Criteria

Site Controller       during_package_start   >1                 GENERIC_RESOURCE_NAME sitecontroller_genres
                                                                GENERIC_RESOURCE_EVALUATION_TYPE = during_package_start
                                                                GENERIC_RESOURCE_UP_CRITERIA >1

To configure the site controller package for the complex workload:

1. Create a site controller package configuration file using the sg/sc module:
#cmmakepkg -m sg/sc pkg_sc.config

2. Edit the site controller package configuration file and specify the generic resource parameters for
critical_package, managed_package, and remote_managed_package parameters as described in
Configuring Generic Resource Parameters for critical, managed, and remote managed
packages.
3. Edit the site controller package configuration file and specify the generic resource parameters for site
controller package as described in Configuring Generic Resource Parameters for Site Controller
Package.
4. Verify the site controller package configuration file:
#cmcheckconf -v -P pkg_sc.config

5. Ensure that all the workloads are present, and then apply the configuration:
#cmapplyconf -v -P pkg_sc.config

6. View the site controller packages configured after applying the site controller package configuration:
#cmviewcl -v -p pkg_sc

The sg/sc module provides the following attributes sc_site, managed_package,
critical_package, and remote_managed_package to specify the complex workload’s redundant
configuration. The sc_site must be configured using the names of the sites defined in the cluster
configuration file.
The following configurations must be done in the site controller package configuration file:

Understanding Site Aware Disaster Tolerant Architecture 319


Table 19: Site controller package configuration file parameters

Parameter                  Description

node_name                  The node_name parameter must be specified in an order where
                           all the nodes of the preferred site appear before the remote
                           adoptive site nodes.

auto_run                   The auto_run parameter must be set to NO.

generic_resource_name      The generic_resource_name parameter must be specified. For
                           more information on the generic resource parameters, see
                           Package Parameter Explanations.

                           NOTE: If you have specified sg/sc module in the site controller
                           package configuration file, then you can specify only one
                           generic resource.

sc_site_monitor_interval   The sc_site_monitor_interval parameter specifies the time
                           interval, in seconds, at which the site controller package monitors
                           the complex workload packages. The default value is 30
                           seconds. Hewlett Packard Enterprise recommends that you do
                           not specify a value less than 30 seconds.

critical_package           Specify the critical package. Can be of type failover or multi-node
                           packages.

managed_package            Specify the managed packages. Can be of type failover or
                           multi-node packages.

remote_managed_package     Specify the remote managed packages. Can be of type failover
                           or multi-node packages.
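
The node_name ordering rule in the table can be checked mechanically before the configuration is applied. The following Python sketch is a hypothetical helper, not part of Serviceguard; it assumes you supply the node_name entries in file order together with a mapping of each node to its site.

```python
def node_order_is_valid(node_names, node_site, preferred_site):
    """Return True if every preferred-site node precedes all remote-site nodes.

    node_names     -- node_name entries in the order they appear in the file
    node_site      -- mapping of node name to site name (hypothetical input)
    preferred_site -- name of the preferred site
    """
    seen_remote = False
    for name in node_names:
        if node_site[name] == preferred_site:
            if seen_remote:
                # A preferred-site node appears after a remote node.
                return False
        else:
            seen_remote = True
    return True
```

For example, with a1 and a2 on the preferred site and b1 on the remote site, the order a1, a2, b1 passes the check, while a1, b1, a2 does not.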

Starting the Disaster Tolerant Complex Workload


After you have configured SADTA in your environment with the complex workload, you must start
the disaster tolerant complex workload in the site-aware Serviceguard cluster.
To start the disaster tolerant complex workload:

1. View the complex workload configuration:


#cmviewcl -v -p pkg_sc

2. Enable all the nodes in the cluster for the site controller package:
#cmmodpkg -e -n site1node_1 -n site1node_2 -n site2node_1 -n site2node_2 pkg_sc

3. Start the site controller package:


#cmmodpkg -e pkg_sc
or
#cmrunpkg pkg_sc



The site controller package and the complex workload packages start up on the local site.

4. Enable the global switching for the package:


#cmmodpkg -e pkg_sc

5. Check the site controller package log file to ensure clean startup.

Checking the Site Controller Packages


The table describes how to validate the site controller package, the command to use, and its description.

Table 20: Validation of Site Controller Packages and its Workload Packages

Validations/Checks             Command              Description

Check the critical, managed,   cmapplyconf          Checks for the following:
and remote managed package     cmcheckconf [-P/-p]  • Must be either multi-node or failover type.
configuration                                       • Must be configured in the cluster.
                                                    • Does not contain sg/sc module.
                                                    • Must not be a site controller package.
                                                    • Must not have failover_policy parameter set to
                                                      site_preferred or site_preferred_manual.
                                                    • Whether the node name in the package configuration
                                                      file matches the node name configured with the
                                                      critical, managed, and remote managed packages.

Check if the sites             cmapplyconf          Checks if the site values in this package are
configured are valid           cmcheckconf [-P/-p]  the sites that are configured in the cluster
                                                    configuration.

Verify if the site safety      cmapplyconf          Checks for the following:
latch prerequisites are        cmcheckconf [-P/-p]  • Only one generic resource is configured
configured properly for a                             per site controller.
complex workload                                    • The critical, managed, and remote managed
                                                      packages must be dependent on the generic
                                                      resource configured as part of site controller.

Check the auto_run             cmapplyconf          The auto_run parameter for site controller
parameter of the site          cmcheckconf [-P/-p]  must be set to NO.
controller package.

Check the failover_policy      cmapplyconf          The failover_policy parameter must be set to
parameter for the site         cmcheckconf [-P/-p]  site_preferred or site_preferred_manual.
controller package.

Failure Scenarios
The site controller package initiates a site takeover when the site or the complex workload has failed.
The site failover sequence is as follows:

• Remote managed packages are brought down on the passive site.
• Any running critical or managed packages are brought down on the active site.
• The passive site is marked as the new active site, and the critical and managed packages on this site
are started.
• The earlier active site is marked as the new passive site, and the remote managed packages on this
site are started.
• The site controller package continues to monitor these workloads.
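
The ordering of these steps can be illustrated with a small Python model. It is a sketch only: the Package class, the role strings, and the log messages are hypothetical, and the model ignores the case where the failed site cannot actually start any package.

```python
from dataclasses import dataclass


@dataclass
class Package:
    name: str
    role: str            # "critical", "managed", or "remote_managed"
    site: str
    running: bool = False


def site_failover(packages, active, passive):
    """Apply the site failover sequence; return (new_active, new_passive, log)."""
    log = []
    workload_roles = ("critical", "managed")

    # Step 1: bring down remote managed packages on the passive site.
    for p in packages:
        if p.site == passive and p.role == "remote_managed" and p.running:
            p.running = False
            log.append(f"halt {p.name}")

    # Step 2: bring down running critical or managed packages on the active site.
    for p in packages:
        if p.site == active and p.role in workload_roles and p.running:
            p.running = False
            log.append(f"halt {p.name}")

    # Step 3: the passive site becomes active; start its critical and managed packages.
    new_active, new_passive = passive, active
    for p in packages:
        if p.site == new_active and p.role in workload_roles:
            p.running = True
            log.append(f"start {p.name}")

    # Step 4: the old active site becomes passive; start its remote managed packages.
    for p in packages:
        if p.site == new_passive and p.role == "remote_managed":
            p.running = True
            log.append(f"start {p.name}")

    return new_active, new_passive, log
```

The returned log lists the halt and start operations in the order they were applied, which makes it easy to compare the model against the entries in a site controller package log file.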

Online Reconfiguration of Site Controller Package


Online operations, such as the addition and deletion of workload packages to and from a site controller
package and the modification of the site controller monitor interval, are supported. The following
operations can be performed online:

• Addition of any number of critical_package, managed_package, and remote_managed_package
packages. The critical_package and managed_package packages must be running on the active site
and the remote_managed_package packages must be running on the passive site before you add them
to the site controller package. Then, apply the configuration using the cmapplyconf command.
• Addition of critical_package, managed_package, and remote_managed_package packages and
modification of the sc_site_monitor_interval parameter.
• Deletion of any number of critical_package, managed_package, and remote_managed_package
packages.
• Deletion of critical_package, managed_package, and remote_managed_package packages and
modification of the sc_site_monitor_interval parameter.
• Modification of the sc_site_monitor_interval parameter.

The following operation cannot be performed online:

• Addition and removal of critical_package, managed_package, or remote_managed_package packages in
one single operation.
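
A quick way to see whether a planned change mixes additions and removals is to compare the current and proposed sets of workload packages. The Python sketch below is a hypothetical pre-check, not a Serviceguard tool; it treats all critical, managed, and remote managed package names as one set.

```python
def online_change_allowed(current, proposed):
    """Return True if the change only adds or only removes workload packages.

    current  -- set of workload package names configured now
    proposed -- set of workload package names after the planned change
    A change of sc_site_monitor_interval may accompany either case,
    so it is not modeled here.
    """
    added = proposed - current
    removed = current - proposed
    # Adding and removing packages in one single operation is not supported online.
    return not (added and removed)
```

If the check returns False, one option is to split the change into two configurations and apply them one after the other, for example the deletions first and then the additions.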

Managing SADTA Configuration


This section describes how to manage a SADTA configuration in which complex workloads are
configured.



Moving the Site Controller Package Without Affecting Workloads
You can halt the site controller package that is running on any node and move the site controller package
to another node on any site, without affecting the workloads. During the time period between halt and run
operations, the site controller package does not monitor the workload packages for any failures.
Procedure
To move the site controller package without affecting workloads, use one of the following options. For
example, assume that site controller package sc_pkg1 is running on node1.
Option 1: Touch the PACKAGENAME_DETACH file

1. Run the touch command in the package's run directory (that is,
$[SGRUN]/log/$[SG_PACKAGE_NAME]_DETACH):
#touch PACKAGENAME_DETACH

2. Halt the site controller package:


#cmhaltpkg -n node1 sc_pkg1

3. Log in to the other node in the cluster and start the site controller package:
#cmrunpkg -n node2 sc_pkg1

Option 2: Package maintenance mode:

1. Place the site controller package in maintenance mode:


#cmmodpkg -m on -n node1 sc_pkg1

2. Halt the site controller package and perform any maintenance operations:
#cmhaltpkg sc_pkg1

3. Bring the package out of maintenance mode:


#cmmodpkg -m off sc_pkg1

4. Log in to the other node in the cluster and start the site controller package:
#cmrunpkg sc_pkg1

Rules for a Site Controller Package in Maintenance Mode


• If the site controller package is placed in maintenance mode, the site controller package does not
monitor for any failures in workloads.
• If the site controller package is taken out of maintenance mode, the site controller package resumes
monitoring for any failure in workloads.
• If the site controller package in maintenance mode is halted using the cmhaltpkg command, then only
the site controller package is halted but not the workload packages.
• If the site controller package in maintenance mode is halted using the cmhaltnode command, then both
the site controller package and the workload packages running on that node are halted.
• If the workload packages are in maintenance mode, the site controller package considers the workload
packages to be up and running, regardless of whether the package status is UP or DOWN.
• If any of the workload packages is in maintenance mode and the site controller package is halted, then
it halts all the workloads except the ones that are in maintenance mode and logs a message in the
package log file.
• When any workload package is in maintenance mode on any site, the site controller package does not
start.
• During a site failover, when any of the workloads (critical, managed, or remote managed package) is
in maintenance mode, the site failover does not happen and a message is logged in the package log file.
• If the site controller package is in maintenance mode when it starts, the site controller package
starts all the workloads, but it does not monitor any workloads on any site.

Detaching a Node When Running Site Controller Package


Using LAD (Live Application Detach) you can detach a node without affecting the site controller package
and its workloads.
The following are the rules for a site controller package in detached state:

• Site controller package cannot be started when the workloads are in the detached state.
• When you halt a site controller package with some of its workloads in detached state, then the site
controller package does not halt the workloads that are in detached state.
• When you halt a site controller package which is in detached state with some of its workloads also in
detached state, then the site controller package does not halt the workloads that are in detached state.
• When you halt a site controller package which is in detached state and none of its workloads are in
detached state, then site controller package will halt all the workloads.
• Site controller package monitors the workload packages, even if the node on which site controller
package is running is in detached state. In this case, the site failover is not initiated when there is a
failure.
• If you reattach a site controller package, then it continues to monitor the workloads that are not in the
detached state or in maintenance mode.

NOTE:
When you detach a node on which a site controller package is running and then halt the site controller
package on that node, the site controller package will halt all the workloads. Before you restart the site
controller package on any node, ensure that you reattach the detached node.

Understanding the Smart Quorum


Serviceguard for Linux 12.00.30 introduces Smart Quorum feature to handle the quorum grant requests
between the nodes of a cluster and the quorum server. This new approach is introduced to increase the
availability of critical workloads which can be deployed only in clusters configured with site-aware failover
capability.
If this feature is enabled (ON), the quorum server grants the quorum to the site running the critical
workloads in the event of a network split between the sites configured in a cluster. This
mechanism helps to prevent failover of an active workload and therefore increases the availability of an
application service.
Smart Quorum supports the following at each site:



• Equal number of node configuration
• Asymmetric number of node configuration

How to Use the Smart Quorum


To use Smart Quorum feature, you must enable QS_SMART_QUORUM parameter in the cluster
configuration file. For more information about this parameter, see Cluster Configuration Parameters on
page 111 and the cmquerycl (1m) manpage.
Also, ensure that sites are configured in the cluster and that a generic resource with the predefined name
sitecontroller_genres is configured. This generic resource determines the status of the site
and must be part of the site controller package. If there is a split between the sites, Smart Quorum
decides which site is granted quorum based on the workload status information. The site that is
running the critical workload is granted the quorum, and the other site automatically shuts down because
quorum is denied to it.
In a split-brain scenario, each group of nodes formed by the split elects a coordinator. The
node selected as group coordinator sends a request to the quorum server that contains the group members
and the workload state derived from the generic resource named sitecontroller_genres. The quorum server
grants the quorum to a group only if the workload state is ACTIVE in the group.
If the workload state is PASSIVE, the quorum server waits for a user-defined arbitration wait period (set
in the QS_ARBITRATION_WAIT parameter) to allow the other groups to send their requests. In such
scenarios, the quorum server grants quorum as follows:

• If it receives a request from an active group within the wait period, it grants the request to that
group.
or

• If the QS_ARBITRATION_WAIT time expires and the quorum server has not received a request
from the other group within the wait period, the quorum server promotes the passive site to become
active. See the description of the QS_ARBITRATION_WAIT parameter under Cluster Configuration
Parameters on page 111.
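
The arbitration rule described above can be sketched as a small Python model. This is an illustration under stated assumptions, not the quorum server implementation: the request tuples, the function name, and the choice to treat a request that arrives exactly at the end of the wait period as still in time are all hypothetical.

```python
def grant_quorum(requests, arbitration_wait):
    """Return the name of the group that is granted quorum, or None.

    requests         -- list of (arrival_seconds, group_name, workload_state),
                        where workload_state is "ACTIVE" or "PASSIVE"
    arbitration_wait -- wait period in seconds (models QS_ARBITRATION_WAIT)
    """
    if not requests:
        return None

    ordered = sorted(requests)
    first_time, first_group, first_state = ordered[0]

    # A group whose workload state is ACTIVE is granted quorum directly.
    if first_state == "ACTIVE":
        return first_group

    # The first requester is PASSIVE: wait for an ACTIVE group to ask.
    deadline = first_time + arbitration_wait
    for arrival, group, state in ordered[1:]:
        if arrival <= deadline and state == "ACTIVE":
            return group

    # No ACTIVE group asked in time: the passive site is promoted.
    return first_group
```

In this model a passive group that asks first still loses to an active group that asks within the wait period, and wins only when the active group stays silent, for example because the active site is down.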

Prerequisites

• All the cluster nodes must have Serviceguard version A.12.00.30 or later.
• Quorum server version must be A.12.00.30 or later.
• The cluster must be a site-aware cluster.
• The cluster must be configured with a generic resource named as sitecontroller_genres.

A group which did not receive the quorum automatically shuts down to prevent the cluster islands.

Examples
Example 1: Assume that there are two sites configured in a cluster, that is, Site A and Site B. The site
where the critical workloads are up and running is the ACTIVE site and the other is the PASSIVE site. In
the event of a split between the sites, the nodes of Site A cannot communicate with the nodes of Site B.
The following conditions describe how the quorum server responds:

• Figure 42: Typical cluster configuration when there is a split between two sites with equal
number of active nodes on page 326 illustrates two sites configured with an equal number of nodes,
with Smart Quorum enabled at the quorum server. If there is a split between the sites, the quorum server
grants quorum to Site A, which is running the critical workload (ACTIVE site).

Figure 42: Typical cluster configuration when there is a split between two sites with equal
number of active nodes
• Figure 43: Typical cluster configuration when there is a split between two sites with unequal
number of active nodes on page 327 illustrates two sites configured with an unequal number of nodes,
with Smart Quorum enabled at the quorum server. If there is a split between the sites, the quorum server
grants quorum to Site A, which is running the critical workload (ACTIVE site), even if Site A has fewer
nodes than the other site (Site B).



Figure 43: Typical cluster configuration when there is a split between two sites with unequal
number of active nodes
• Figure 44: Typical cluster configuration when the active site goes down due to a disaster on
page 328 illustrates a case in which the active site running the critical workload goes down completely
due to a disaster and Smart Quorum is enabled at the quorum server. The quorum server waits for the
time specified in QS_ARBITRATION_WAIT and then grants the quorum to the passive site (Site B). After
cluster reformation is complete, the cluster fails over the critical workload from Site A to Site B. This
happens irrespective of the number of nodes at the passive site.



Figure 44: Typical cluster configuration when the active site goes down due to a disaster

Example 2: Assume that there are two sites configured in a cluster, that is, Site A and Site B, where Site
A has Node 1, Node 2, and Node 3 and Site B has Node 4, Node 5, and Node 6 as in Figure 45: Typical
cluster configuration when there is a split across site on page 329. If the split occurs in such a way
that Node 1, Node 2, Node 3, and Node 4 form one sub-cluster and Node 5 and Node 6 form another
sub-cluster, the group that has the majority of the nodes spans both sites and forms a cluster
without any support from an external quorum server.



Figure 45: Typical cluster configuration when there is a split across site

Limitation
During startup of the workload packages, the site controller package does not honor the node order
defined in the complex workload packages. For more information about the node, see Dragging Rules
for Simple Dependencies.

Simulating a Serviceguard Cluster
Cluster simulation allows administrators to simulate different kinds of failures, such as node, network
interface, and workload (that is, Serviceguard package) failures. Cluster simulation evaluates and
analyzes what happens to the packages due to the simulated failures, for example, whether the packages
fail over and start on a specified node. The cluster states reported by the simulated cluster exactly
match those of a real Serviceguard cluster. This helps in analyzing the high availability design for
various failure scenarios.
Cluster simulation also enables the administrators to do the following:

• Run failure simulations on Serviceguard clusters.


• Import the state of the Serviceguard cluster in production environment and run the failure simulation.
• Import the entire state of a deployed Serviceguard cluster into simulated environment for future
availability analysis.
• Repeat a pre-defined set of all possible failures in a cluster and check for points of failure.

Advantages

• The commands in the cluster simulation are similar to the Serviceguard cluster commands.
• You can run cluster simulation on production clusters since it does not interfere with the deployed
clusters.
• There is no need for the actual hardware to simulate a cluster. You can simulate up to 32 nodes in the
cluster using a single node.
• You can also have multiple simulation sessions or clusters running on the same node simultaneously.

Modes of Simulation
The simulation commands are supported with two modes, namely:

• Simulation prompt mode: In this mode, you can use the setcluster command to set the cluster or
session on which all further commands are run, without having to specify its name explicitly with the
--session option. For example:
#cmsimulatecl
clsim> setcluster test_cluster
clsim:test_cluster>

• Simulation command-line interface mode: In this mode, you can run the Serviceguard simulation
command from the shell. For example:
#cmsimulatecl cmapplyconf -C test_cluster.ascii

Not Supported
The following Serviceguard features are not supported on the simulated cluster:

• The cmquerycl, cmgetconf, cmcheckconf, and cmcheckdisk commands.

• The -t preview option with any Serviceguard command.

• LAD, Generic Resource, Load Balancing, serviceguard-xdc, Cluster Analytics, VxFS, and Online and
offline reconfiguration of cluster and package.



• Failure scenarios of storage.
• The cmrunpkg -m and cmrunpkg -e options in maintenance mode.

Simulating the Cluster


This section describes how to create, import, halt, run, and delete the cluster in the simulator view. You
can simulate a cluster in two ways:

• Create a cluster using cmapplyconf in a simulation environment

• Create a cluster by importing the existing cluster into simulation environment

The session name is the same as the cluster name. For information about how to create a cluster
configuration file, see Cluster Configuration Parameters on page 111 and the cmsimulatecl (1m) manpage.
For example, assume that you have the cluster configuration file test_cluster.ascii and the package
configuration files PKG1.conf and PKG2.conf. The following sections describe how to create a simulated
cluster using these configuration files.

Creating the Simulated Cluster


To create the simulated cluster:

1. Edit the cluster configuration file and apply the changes to the configuration file using cmapplyconf
command.
#cmsimulatecl cmapplyconf -C test_cluster.ascii

2. View the status of the cluster:


#cmsimulatecl --session test_cluster cmviewcl

Importing the Cluster State


You can import the cluster state in three ways and view the status of the cluster in the simulation
environment:

• Import Existing Local Cluster State into Simulation Environment on page 331
• Import Existing Remote Cluster State into Simulation Environment on page 332
• Import the Status of the Cluster State Stored in a File on page 332

Import Existing Local Cluster State into Simulation Environment


You can import the existing local cluster into the simulator on any one of the cluster nodes. To import
the cluster, run the following command. You can then view the status of the imported simulated cluster
using the cmviewcl command:
#cmsimulatecl importcluster
Cluster test_cluster is imported. Please use cmviewcl to view the cluster

NOTE:
The importcluster command can import only the state of a cluster of same Serviceguard version.



Import Existing Remote Cluster State into Simulation Environment
You can import the state of the cluster using -c option, including the state of all its packages into the
simulator.
If you do not specify any option, the importcluster command imports the state of the cluster
configured on the node where simulator is running. You can also view the status of the cluster using
cmviewcl command.
#cmsimulatecl importcluster -c test_cluster
Cluster test_cluster is imported. Please use cmviewcl to view the cluster

Import the Status of the Cluster State Stored in a File


You can import the state of the cluster stored in the <cluster_state_file> file using -l option. The
state of any cluster can be saved on a file by redirecting the output of the cmviewcl -v -f line
command to <cluster_state_file> file.
#cmsimulatecl importcluster -l test_cluster_cmviewcl_line
Cluster test_cluster is imported. Please use cmviewcl to view the cluster

Listing the Existing Session


The listsession command lists all the sessions available on the current node. You can also list the
sessions in the simulator prompt mode.
To list the existing sessions:
#cmsimulatecl listsession
db_cluster
test_cluster
data_cluster

Managing the Cluster


This section describes how to manage the Serviceguard cluster in the simulated environment. Non-root
users with the appropriate privileges can perform these tasks.

• To start the cluster you can use cmruncl command:


#cmsimulatecl --session test_cluster cmruncl
Waiting for cluster to form ..... done
Cluster successfully formed
For more information, see the cmruncl (1m) manpage.

• To halt the cluster you can use cmhaltcl command:


#cmsimulatecl --session test_cluster cmhaltcl
Waiting for nodes to halt ..... done
Successfully halted all nodes specified
For more information, see the cmhaltcl (1m) manpage.

• To delete the cluster you can use cmdeleteconf command:



#cmsimulatecl cmdeleteconf -c test_cluster
Delete cluster sim1_cluster ([y]/n)y
Completed the cluster deletion
For more information, see the cmdeleteconf (1m) manpage.

Managing the Nodes in the Simulated Cluster


This section describes how to manage the nodes in the Serviceguard cluster in the simulated
environment. Once the node is set as active all further commands can be run on that node. The -n option
is not required.

• To set the active node in the cluster, on which all the simulator commands can be run:
#cmsimulatecl --session test_cluster setnode test1
Node test1 is set as active node for session test_cluster.

NOTE: If no nodes are set by the user, the first node in the cluster configuration file will be chosen to
run all the simulator commands.

• To display the active node of the cluster, on which all the simulator commands are executed. For
example, to check on which node in "test_cluster" the simulator commands run:
#cmsimulatecl --session test_cluster getnode
test1 is active node for session test_cluster.

• To start the node you can use cmrunnode command:


#cmsimulatecl --session <sessionName> cmrunnode -n <node_name>
For example,
#cmsimulatecl --session test_cluster cmrunnode test1
Cluster successfully formed
cmrunnode: Completed successfully

• To halt the node you can use cmhaltnode command:


#cmsimulatecl --session <sessionName> cmhaltnode -n <node_name> ... [-f]
For example,
#cmsimulatecl --session test_cluster cmhaltnode -f
Disabling all packages from starting on nodes to be halted
Warning: Do not modify or enable packages until the halt operation is completed
Waiting for nodes to halt ..... done
Successfully halted all nodes specified
Halt operation complete

Simulation Scenarios for the Package


This section describes how to create, halt, and run the package in the simulator view. For more
information about how to create a package configuration file, see Dragging Rules for Simple
Dependencies and the cmsimulatecl (1m) manpage.



Creating a Simulated Package
To create a simulated package:

1. Edit the package configuration file and apply the changes to the configuration using the cmapplyconf
command.
#cmsimulatecl cmapplyconf -P PKG1.conf

2. View the status of the package:


#cmsimulatecl --session test_cluster cmviewcl

Limitation

• During package reconfiguration, you can only add packages; you cannot perform any other
operations.
• You cannot add nodes once the cluster is created.
• You cannot modify the configuration of a cluster, node, or package once it has been added.

Running a Package
You can use cmrunpkg to run the package on a particular node. For example:
#cmsimulatecl --session test_cluster cmrunpkg PKG1
Running package PKG1 on node test1
Successfully started package PKG1 on node test1
cmrunpkg: All specified packages are running
For more information, see the cmrunpkg (1m) manpage.

Halting a Package
You can halt a package using the following command:
#cmsimulatecl --session test_cluster cmhaltpkg PKG1
One or more packages or package instances have been halted
cmhaltpkg: Completed successfully on all packages specified
For more information, see the cmhaltpkg (1m) manpage.

Deleting a Package
You can delete the package using the following command:
#cmsimulatecl --session test_cluster cmdeleteconf -p PKG1
Modify the package configuration ([y]/n)y
Completed the package deletion
For more information, see the cmdeleteconf (1m) manpage.

Enabling or Disabling Switching Attributes for a Package


To enable or disable switching attributes for a package, use the cmmodpkg command with the -e or -d
option. For example, to disable switching for PKG1:

#cmsimulatecl --session test_cluster cmmodpkg -d PKG1
cmmodpkg: Completed successfully on all packages specified
For more information, see the cmmodpkg (1m) manpage.

Simulating Failure Scenarios


You can simulate the failure or recovery of a node, package, or interface on a simulated cluster. After
simulating a failure, you can use the cmviewcl command in the simulator to verify the resulting
package placement and status. For more information, see the cmsimulatecl (1m) manpage.
For example, when node test1 fails in a cluster, the packages that fail over to the other nodes might
not return to node test1. When that node is recovered, only the packages that have the
failback_policy parameter configured may fail back to node test1.
Failure and recovery of a node

• To simulate a failure on a specified node:


#cmsimulatecl --session test_cluster fail -n test1

NOTE:
To verify whether the packages running on test1 failed over or not, and where the packages are
currently running:
#cmsimulatecl --session test_cluster cmviewcl

• To recover from the last node failure:


#cmsimulatecl --session test_cluster recover -n test1

Failure and recovery of a package

• To simulate a failure on a specified package:


#cmsimulatecl --session test_cluster fail -p PKG1

NOTE:
To verify whether the failure of PKG1 affects any other packages:
#cmsimulatecl --session test_cluster cmviewcl

• To recover from the last package failure:


#cmsimulatecl --session test_cluster recover -p PKG1

Failure and recovery of an interface

• To simulate an interface failure on a node:


#cmsimulatecl --session test_cluster fail -n test1 -i eth0



NOTE:
To verify whether the packages running on test1 failed over or not, and where the packages are
currently running:
#cmsimulatecl --session test_cluster cmviewcl

• To recover from the last interface failure on a node:


#cmsimulatecl --session test_cluster recover -n test1 -i eth0

NOTE:
You can recover only from the most recent failure; recovery from multiple sequential failures is not
supported. For example, if you fail node1 and then node2, you can recover only node2.
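
The restriction in the preceding note can be pictured with a small model. The following is only an illustrative sketch in Python, not the simulator's implementation; the class and method names are hypothetical.

```python
class FailureSimulator:
    """Toy model: only the most recent simulated failure can be recovered."""

    def __init__(self):
        self.failed = set()        # objects currently marked as failed
        self.last_failure = None   # the only object eligible for recovery

    def fail(self, target):
        # Record the failure and remember it as the most recent one.
        self.failed.add(target)
        self.last_failure = target

    def recover(self, target):
        # Reject recovery of anything other than the most recent failure.
        if target != self.last_failure:
            raise ValueError(f"only the last failure can be recovered, not {target!r}")
        self.failed.discard(target)
        self.last_failure = None


sim = FailureSimulator()
sim.fail("node1")
sim.fail("node2")
sim.recover("node2")  # allowed: node2 was the last failure
```

In this model, a later attempt to recover node1 raises an error, which mirrors the statement that multiple sequential failures cannot be recovered.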



Cluster Analytics
Serviceguard cluster analytics provides a mechanism for performing "availability audits" on
Serviceguard clusters, nodes, and the application packages running on those clusters.
Key Performance Indicators and Key Events of Cluster Analytics
A Key Performance Indicator (KPI) is a metric of a Serviceguard cluster object, such as a
cluster, node, or package, for which a value can be obtained from the cluster analytics engine for a
specified time period. For example, a package availability value can be computed for a time period
selected by the user. For more information, see the cmcashowkpi (1m) manpage.
A Key Event (KE) in a Serviceguard cluster is an event that is important from the
perspective of availability and change management. Key events matter because the [date stamp,
key event] pairs allow you to reconstruct the entire history of any monitored object in the cluster. The
availability of a monitored entity is computed by looking at the key events of the entity and the
corresponding date stamps.
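
To illustrate how an availability value can be derived from [date stamp, key event] pairs, here is a minimal Python sketch. It is not the analytics engine's algorithm; the event names "UP" and "DOWN" and the function name are assumptions made for the example.

```python
from datetime import datetime


def availability_percent(events, start, end):
    """Return the percentage of [start, end] during which the object was up.

    events: list of (datetime, "UP" | "DOWN") pairs sorted by date stamp.
    The event names are hypothetical placeholders for real key events.
    """
    total = (end - start).total_seconds()
    if total <= 0:
        raise ValueError("end must be later than start")
    up_seconds = 0.0
    state = "DOWN"   # assumed state before the first recorded event
    cursor = start
    for stamp, event in events:
        if stamp <= start:
            state = event          # events before the window only set the state
            continue
        if stamp >= end:
            break
        if state == "UP":
            up_seconds += (stamp - cursor).total_seconds()
        cursor = stamp
        state = event
    if state == "UP":
        up_seconds += (end - cursor).total_seconds()
    return 100.0 * up_seconds / total
```

For example, an object that is up from 00:00 to 06:00 and again from 12:00 to 24:00 on the same day is up for 18 of 24 hours in that window.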
Supported KPIs of Cluster Analytics
The table describes the supported KPIs of the cluster, nodes, and packages:

NOTE: You can obtain better visualizations of the KPIs by using the GUI.

Table 21: KPIs of the Cluster, Nodes, and Packages

KPIs of the Cluster


AVAILABILITY The percentage of the total time specified for which the cluster is up. By default,
the total time is the difference between the current time and the time at which the cluster was
created. You can also specify the start and end time for which the KPI value is computed.

MAINTENANCE The percentage of the total time specified for which the cluster was brought down
gracefully. By default, the total time is the difference between the current time and the time at
which the cluster was created. You can also specify the start and end time for which the KPI value
is computed.

LAST_REFORMATION_TIME The time when the cluster was last reformed.

CREATION_TIME The time when the cluster was created.

REFORMATION_COUNT The number of times the cluster has reformed. This can be an indicator of
cluster stability. (See footnote 1.)

LAST_MODIFICATION_TIME The time when the cluster was last modified.

FAILURE_PROTECTION_LEVEL The ratio of available nodes to the total number of nodes configured in
the cluster.



KPIs of Nodes in the Cluster
AVAILABILITY The percentage of the total time specified for which the node is up in the cluster. By
default, the total time is the difference between the current time and the time at which the node
was added to the cluster. You can also specify the start and end time for which the KPI value is
computed.

MAINTENANCE The percentage of the total time specified for which the node was brought down
gracefully. By default, the total time is the difference between the current time and the time at
which the node was added to the cluster. You can also specify the start and end time for which the
KPI value is computed.

UNAVAILABILITY The percentage of the total time specified for which the node was unavailable due
to technical issues. By default, the total time is the difference between the current time and the
time at which the node was added to the cluster. You can also specify the start and end time for
which the KPI value is computed.

LAST_JOINING_TIME The last time at which the node was added to the cluster or configured to be
part of the cluster.

LAST_HALT_TIME The last time at which the node was halted successfully.

KPIs of Packages
AVAILABILITY The percentage of the total time specified for which the package is up in the cluster.
By default, the total time is the difference between the current time and the time at which the
package was configured in the cluster. You can also specify the start and end time for which the
KPI value is computed.

MAINTENANCE The percentage of the total time specified for which the package was brought down
gracefully. By default, the total time is the difference between the current time and the time at
which the package was configured in the cluster. You can also specify the start and end time for
which the KPI value is computed.

UNAVAILABILITY The percentage of the total time specified for which the package was unavailable
due to technical issues. By default, the total time is the difference between the current time and
the time at which the package was configured in the cluster. You can also specify the start and end
time for which the KPI value is computed.



LAST_FAILOVER_TIME The time at which the package last started on an adoptive node after a failure.

FAILOVER_COUNT The number of times the package has failed over (for failover packages). (See
footnote 2.)

FAILURE_PROTECTION_LEVEL The ratio of available nodes to the total number of nodes configured in
the package. (See footnote 3.)

LAST_CREATION_TIME The time when the package was created.

LAST_MODIFICATION_TIME The time when the package was last modified.

MAX_TIME_TO_START The maximum time taken by the package to start on a node.

MAX_TIME_TO_STOP The maximum time taken by the package to stop on a node.

MIN_TIME_TO_START The minimum time taken by the package to start on a node.

MIN_TIME_TO_STOP The minimum time taken by the package to stop on a node.

1 When the cluster is halted, the REFORMATION_COUNT KPI value is incremented by ‘n’, where ‘n’ is the
number of nodes up in the cluster at that instant. The REFORMATION_COUNT KPI value is also incremented
by ‘1’ when the cluster is started.
2 When a node is halted using the cmhaltnode -f command, the LAST_FAILOVER_TIME and FAILOVER_COUNT KPI
values of a package remain unchanged.
3 The node switching attribute and global switching attribute of the package are not considered when
computing the FAILURE_PROTECTION_LEVEL KPI. For more information, see Package Switching Attributes on page 258.
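
The arithmetic behind two of these KPIs can be sketched as follows. This Python fragment is only an illustration of the definitions in the table and in footnote 1; the function names and event labels are hypothetical, and whether the tool reports the ratio as a fraction or in another form is not stated here.

```python
def failure_protection_level(available_nodes, configured_nodes):
    """Ratio of available nodes to the total number of configured nodes."""
    if configured_nodes <= 0:
        raise ValueError("configured_nodes must be positive")
    return available_nodes / configured_nodes


def reformation_count_after(count, event, nodes_up=0):
    """Apply footnote 1: +1 on cluster start, +n on cluster halt (n = nodes up)."""
    if event == "cluster_start":   # hypothetical event label
        return count + 1
    if event == "cluster_halt":    # hypothetical event label
        return count + nodes_up
    return count
```

For instance, a four-node cluster with three nodes available has a ratio of 3/4, and halting a cluster while three nodes are up adds 3 to the reformation count.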

Upgrading the Cluster Analytics Software


Prerequisites
The following are the prerequisites for upgrading serviceguard-analytics to version A.12.00.20:

• sqlite-3.3 or later for Red Hat Enterprise Linux 5
• sqlite-3.6 or later for Red Hat Enterprise Linux 6
• sqlite3-3.7 or later for SUSE Linux Enterprise Server 11

Upgrading serviceguard-analytics Software


Before you upgrade to the patch, ensure that Serviceguard for Linux 12.00.00 is installed on your system.
For more information about how to upgrade from A.12.00.X to A.12.00.Y, see the following documents
available at http://www.hpe.com/info/linux-serviceguard-docs:



• HPE Serviceguard for Linux Base edition 12.00.40 Release Notes
• HPE Serviceguard for Linux Advanced edition 12.00.40 Release Notes
• HPE Serviceguard for Linux Enterprise edition 12.00.40 Release Notes

Verifying serviceguard-analytics Installation


After you install serviceguard-analytics, run the following command to verify that the software is
installed:
#rpm -qa | grep serviceguard-analytics
The output lists the product name, serviceguard-analytics-A.12.00.20-0. The presence of this line
in the output confirms that the installation was successful.
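
If you capture the output of the package query in a script, the same check can be expressed as a simple filter. The following Python sketch assumes the output has already been read into a string; the function name and the sqlite sample line are hypothetical.

```python
def installed_packages(rpm_output, name="serviceguard-analytics"):
    """Return the lines of a package listing that mention the given name."""
    return [line.strip() for line in rpm_output.splitlines() if name in line]


# Sample listing: the second line is the product name given in this section,
# the first line is made up for illustration.
sample = "sqlite-3.6.20-1.el6\nserviceguard-analytics-A.12.00.20-0\n"
matches = installed_packages(sample)
```

An empty result would indicate that the product is not listed and the installation should be rechecked.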

Removing serviceguard-analytics Software


To remove serviceguard-analytics software:
On Red Hat 5:
#rpm -e serviceguard-analytics-A.12.00.20-0.rhel5
On Red Hat 6:
#rpm -e serviceguard-analytics-A.12.00.20-0.rhel6
On SUSE:
#rpm -e serviceguard-analytics-A.12.00.20-0.sles11

Configuring NFS as Shared Storage


To configure NFS as shared storage:

1. Ensure that the cluster analytics daemon is stopped:

#cmcaadmin stop

2. Export a location from the NFS server to all the nodes that are part of the cluster.
3. Mount the exported location:
#mount -t nfs <nfs_server:location> -o rw -o nolock <mnt_dir>

4. Add the following entry to the /etc/fstab file:

<nfs_server:location> <mnt_dir> nfs rw,nolock 0 0

5. Edit the $SGCONF/cmanalytics.conf file and include the following line:

SG_CA_DIRECTORY=<mnt_dir>

NOTE:
Any changes made to the $SGCONF/cmanalytics.conf file do not take effect until you start (or
restart) the cluster analytics daemon.

6. Repeat steps 3 to 5 on all the nodes in the cluster.



Limitation
You must ensure that the <mnt_dir> path is always available and is the same on all the nodes in the
cluster.
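
When the same settings must be repeated on every node, it can help to generate the entries from one place. The following Python sketch only builds the text of the /etc/fstab entry and the cmanalytics.conf line shown in the procedure and checks the limitation above; the server name, export path, mount directory, and function names are hypothetical.

```python
def nfs_fstab_entry(nfs_server, location, mnt_dir):
    """Build the /etc/fstab entry used in the procedure (rw,nolock)."""
    return f"{nfs_server}:{location} {mnt_dir} nfs rw,nolock 0 0"


def analytics_conf_line(mnt_dir):
    """Build the SG_CA_DIRECTORY line for $SGCONF/cmanalytics.conf."""
    return f"SG_CA_DIRECTORY={mnt_dir}"


def same_mount_dir(mount_dir_by_node):
    """Check the limitation: the mount directory must be identical on all nodes."""
    return len(set(mount_dir_by_node.values())) == 1


entry = nfs_fstab_entry("nfs1", "/export/sgca", "/mnt/sgca")
conf_line = analytics_conf_line("/mnt/sgca")
```

The sketch does not mount anything or edit any file; it only shows the shape of the values that each node needs.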

Cluster Analytics Database Migration to Shared Storage


To migrate an existing cluster analytics database to NFS shared storage:

1. Ensure that the cluster analytics daemon is stopped:

#cmcaadmin stop

2. Copy the $SGCONF/cluster.db3 file from the node (local or remote) that has the latest event messages
stored to the NFS shared storage location specified by the SG_CA_DIRECTORY parameter
in the $SGCONF/cmanalytics.conf file.

Starting Cluster Analytics Daemon


The cluster analytics daemon must run on only one node in the cluster. You can start the
cmanalyticsd daemon even before you configure or run the cluster. Use the cmcaadmin start
command from any node in the cluster to start the cluster analytics daemon. The daemon is started
only on the cluster coordinator node. For example:
#cmcaadmin start

• If the cluster is not configured, the cmcaadmin start command displays the following messages:

cmcaadmin: Starting Cluster Analytics daemon based on entries in $SGCONF/cmclnodelist
cmcaadmin: Started Cluster Analytics daemon successfully on node test1

• If the cluster is configured and down, the cmcaadmin start command displays the following
messages:

cmcaadmin: Starting Cluster Analytics daemon on node test1
cmcaadmin: Started Cluster Analytics daemon successfully on node test1

• If the cluster is configured and up, the cmcaadmin start command displays the following
messages:

cmcaadmin: Starting Cluster Analytics daemon on cluster coordinator test1
cmcaadmin: Started Cluster Analytics daemon successfully on node test1

NOTE: If the cluster is not configured, ensure that the nodes that are to be configured in the cluster are
listed in the $SGCONF/cmclnodelist file. If the $SGCONF/cmclnodelist file does not exist, the
cluster analytics daemon starts on the node from which the cmcaadmin start command was issued.

Cluster Event Message Consolidation


The cluster analytics daemon process performs the following operations related to the database as part of
its initialization sequence and uses the database for consolidation of all cluster event messages:



• Opens the existing $SGCONF/cluster.db3 database file if the cluster and the database file are already
present on the system.
• Creates a new database file if the cluster is configured or running and its analytics database file has
not been created yet.
• Backs up the existing database file if a change in the cluster name is detected, and
creates a new database file by appending the current date and time.

Stopping Cluster Analytics Daemon


Use the cmcaadmin stop command to stop the cluster analytics daemon. For example:
#cmcaadmin stop

• If the cluster is not configured, the cmcaadmin stop command displays the following messages:

cmcaadmin: Stopping Cluster Analytics daemon based on entries in $SGCONF/cmclnodelist
cmcaadmin: Stopped Cluster Analytics daemon successfully on node test1

• If the cluster is configured and down, the cmcaadmin stop command displays the following
messages:

cmcaadmin: Stopping Cluster Analytics daemon on node test1
cmcaadmin: Stopped Cluster Analytics daemon successfully on node test1

• If the cluster is configured and up, the cmcaadmin stop command displays the following messages:

cmcaadmin: Stopping Cluster Analytics daemon on cluster coordinator test1
cmcaadmin: Stopped Cluster Analytics daemon successfully on node test1

Verifying Cluster Analytics Daemon


To verify whether the cluster analytics daemon is running on any node in a cluster:
#cmcaadmin status
This command reports the state of the cluster analytics daemon, that is, whether the daemon is
running or not.

• If the cluster is not configured and the cluster analytics daemon is not running, the cmcaadmin
status command displays the following messages:

cmcaadmin: Checking Cluster Analytics daemon based on entries in $SGCONF/cmclnodelist
cmcaadmin: Cluster Analytics daemon is not running

• If the cluster is not configured and the cluster analytics daemon is running, the cmcaadmin status
command displays the following messages:

cmcaadmin: Checking Cluster Analytics daemon based on entries in $SGCONF/cmclnodelist
cmcaadmin: Cluster Analytics daemon is running on node test1

• If the cluster is configured and down and the cluster analytics daemon is not running, the cmcaadmin
status command displays the following messages:

cmcaadmin: Checking Cluster Analytics daemon on node test1
cmcaadmin: Cluster Analytics daemon is not running

• If the cluster is configured and down and the cluster analytics daemon is running, the cmcaadmin
status command displays the following messages:

cmcaadmin: Checking Cluster Analytics daemon on node test1
cmcaadmin: Cluster Analytics daemon is running on node test1

• If the cluster is running and the cluster analytics daemon is not running, the cmcaadmin status
command displays the following messages:

cmcaadmin: Checking Cluster Analytics daemon on cluster coordinator test1
cmcaadmin: Cluster Analytics daemon is not running

• If the cluster is running and the cluster analytics daemon is also running, the cmcaadmin status
command displays the following messages:

cmcaadmin: Checking Cluster Analytics daemon on cluster coordinator test1
cmcaadmin: Cluster Analytics daemon is running on node test1
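
A monitoring script could interpret these messages mechanically. The following Python sketch assumes the message wording shown in this section and that the command output has already been captured as text; the function name is hypothetical, and the wording should be confirmed against your installed version before relying on it.

```python
def daemon_running(status_output):
    """Interpret captured 'cmcaadmin status' output.

    Returns True or False when a status line is recognized, or None when
    no recognizable status line is present.
    """
    for line in status_output.splitlines():
        line = line.strip()
        if line.startswith("cmcaadmin: Cluster Analytics daemon is not running"):
            return False
        if line.startswith("cmcaadmin: Cluster Analytics daemon is running"):
            return True
    return None


sample = (
    "cmcaadmin: Checking Cluster Analytics daemon on cluster coordinator test1\n"
    "cmcaadmin: Cluster Analytics daemon is running on node test1\n"
)
state = daemon_running(sample)
```

Returning None for unrecognized output keeps the caller from mistaking an error message for a stopped daemon.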

Removing Cluster Analytics State Configuration File


The cluster analytics daemon stores its state information in a configuration file for internal
administration purposes. Under certain circumstances, you must remove the cluster analytics state
configuration file using the cmcaadmin cleanup command, which responds as follows:

• If the cluster is not configured and the cluster analytics daemon is running on one of the nodes listed in
the $SGCONF/cmclnodelist file, then the cmcaadmin cleanup command displays the following
message:
cmcaadmin: Cleaning Cluster Analytics state configuration on node test1
ERROR: Fail to clean Cluster Analytics state configuration, Cluster Analytics daemon is running on node test1

• If the cluster is not configured and the cluster analytics daemon is not running on any node listed in the
$SGCONF/cmclnodelist file, then the cmcaadmin cleanup command displays the following
message:

cmcaadmin: Cleaning Cluster Analytics state configuration on node test1


cmcaadmin: Cleaned Cluster Analytics state configuration successfully on node test1

• If the cluster is configured and down but cluster analytics daemon is halted, then the cmcaadmin
cleanup command displays the following message:

cmcaadmin: Cleaning Cluster Analytics state configuration on node test1


ERROR: Fail to clean Cluster Analytics state configuration as node is part of cluster

• If the cluster is configured and down but cluster analytics daemon is running, then the cmcaadmin
cleanup command displays the following messages:
cmcaadmin: Cleaning Cluster Analytics state configuration on node test1
ERROR: Fail to clean Cluster Analytics state configuration, Cluster Analytics daemon is running on node test1

or

cmcaadmin: Cleaning Cluster Analytics state configuration on node test1


ERROR: Fail to clean Cluster Analytics state configuration as node is part of cluster

Command to Retrieve KPIs


Using the cmcashowkpi command, you can retrieve KPIs for an object type, such as a cluster, node, or
package. You can run the cmcashowkpi command from any node in the cluster, regardless of whether that
node is in the UP or DOWN state. For a detailed description of the KPIs of a node, cluster, or package,
see the cmcashowkpi (1m) manpage.
For example,



#cmcashowkpi -o node test1 -s "2013-10-07 05:07:13"

NOTE:

• The KPI value cannot be calculated for a specific date or time.


• The date must be either in YYYY-MM-DD HH:MM:SS or YY-MM-DD format only.

• The $SGCONF/cmclnodelist file must be populated to run the cmcashowkpi command.
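
Before passing a date to the command from a script, you may want to check it against the formats listed in the note. The following Python sketch takes the two formats literally as written above (YYYY-MM-DD HH:MM:SS and YY-MM-DD); this is an assumption, so check the cmcashowkpi (1m) manpage for the formats that the command actually accepts. The function name is hypothetical.

```python
from datetime import datetime

# Formats taken literally from the note above (an assumption, not a specification).
ACCEPTED_FORMATS = ("%Y-%m-%d %H:%M:%S", "%y-%m-%d")


def is_valid_kpi_date(text):
    """Return True if text matches one of the date formats listed in the note."""
    for fmt in ACCEPTED_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False
```

For example, the value used in the command above, "2013-10-07 05:07:13", passes this check, while a value such as "07/10/2013" does not.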

Limitation
The data retrieval operation on cluster event database file is not supported.

Integrating Application Tuner Express
HPE Application Tuner Express (HPE-ATX) is a utility that enables applications to achieve maximum
performance while running on larger x86 servers. For more information about HPE ATX, see the HPE ATX
documentation.
You can integrate ATX with Serviceguard to run applications with ATX. In a Serviceguard environment,
you can configure the applications as Serviceguard packages.

Procedure

1. Define a Package Environment Variable (PEV) in the package configuration file.


For more information about PEV, see pev_
2. Set the variable name to PEV_ATX.
The value of the variable supplies the command-line options required by HPE ATX.
3. Apply the package configuration with the PEV.
4. Modify the start of the application to append hpe-atx $PEV_ATX to the application startup
script.
For more information about how to start or stop applications in the Serviceguard environment, see
Integrating HA Applications in Multiple Systems.
5. Start the application using the cmrunpkg command.

NOTE: You can apply the PEV_ATX parameter while the package is online, but the changes take effect only
after the package restarts.

6. View the package log information for any errors or warnings when the application starts with HPE ATX.

Serviceguard utility functions


You can use the utility functions defined by Serviceguard to perform ATX-specific validations, such as
verifying the validity of the ATX license. You can call these functions from the validate
section of the external_scripts defined in the package. For more information about external scripts, see
About External Scripts.



Troubleshooting Your Cluster
This chapter describes how to verify cluster operation, how to review cluster status, how to add and
replace hardware, and how to solve some typical cluster problems. Topics are as follows:

• Testing Cluster Operation


• Monitoring Hardware
• Replacing Disks
• Replacing LAN Cards
• Replacing a Failed Quorum Server System
• Troubleshooting Approaches
• Solving Problems
• Troubleshooting serviceguard-xdc package
• Troubleshooting cmvmusermgmt Utility on page 361

Testing Cluster Operation


Once you have configured your Serviceguard cluster, you should verify that the various components of
the cluster behave correctly in case of a failure. In this section, the following procedures test that the
cluster responds properly in the event of a package failure, a node failure, or a LAN failure.

CAUTION:
In testing the cluster in the following procedures, be aware that you are causing various
components of the cluster to fail, so that you can determine that the cluster responds correctly to
failure situations. As a result, the availability of nodes and applications may be disrupted.

Testing the Package Manager


To test that the package manager is operating correctly, perform the following procedure for each
package on the cluster:

1. Obtain the PID number of a service in the package by entering

ps -ef | grep <service_cmd>

where service_cmd is the executable specified in the package configuration file by means of the
service_cmd parameter. The service selected must have the default service_restart value (none).

2. To kill the service_cmd PID, enter

kill <PID>
3. To view the package status, enter

cmviewcl -v



The package should be running on the specified adoptive node.

4. Halt the package, then move it back to the primary node using the cmhaltpkg, cmmodpkg, and
cmrunpkg commands:

cmhaltpkg <PackageName>

cmmodpkg -e -n <PrimaryNode> <PackageName>

cmrunpkg -v <PackageName>

Depending on the specific databases you are running, perform the appropriate database recovery.

You can also test the package manager using generic resources. Perform the following procedure for
each package on the cluster:

1. Obtain the generic resource that is configured in a package by entering


cmviewcl -v -p <pkg_name>

2. Set the status of generic resource to DOWN using the following command:
cmsetresource -r <res1> -s down

3. To view the package status, enter


cmviewcl -v
The package should be running on the specified adoptive node.

4. Move the package back to the primary node (see Moving a Failover Package).

NOTE: If a monitoring script is configured for this generic resource, the monitoring script also
attempts to set the status of the generic resource.

Testing the Cluster Manager


To test that the cluster manager is operating correctly, perform the following steps for each node on the
cluster:

1. Turn off the power to the node.


2. To observe the cluster reforming, enter the following command on some other configured node:

cmviewcl -v

You should be able to observe that the powered down node is halted, and that its packages have been
correctly switched to other nodes.

3. Turn on the power to the node.


4. To verify that the node is rejoining the cluster, enter the following command on any configured node:

cmviewcl -v

The node should be recognized by the cluster, but its packages should not be running.

5. Move the packages back to the original node:



cmhaltpkg <pkgname>

cmmodpkg -e -n <originalnode> <pkgname>

cmrunpkg <pkgname>

Depending on the specific databases you are running, perform the appropriate database recovery.

6. Repeat this procedure for all nodes in the cluster one at a time.

Monitoring Hardware
Good standard practice in handling a high availability system includes careful fault monitoring so as to
prevent failures if possible or at least to react to them swiftly when they occur. For information about disk
monitoring, see Creating a Disk Monitor Configuration on page 254. In addition, the following should
be monitored for errors or warnings of all kinds:

• CPUs
• Memory
• NICs
• Power sources
• All cables
• Disk interface cards

Some monitoring can be done through simple physical inspection, but for the most comprehensive
monitoring, you should examine the system log file (/var/log/messages) periodically for reports on all
configured HA devices. The presence of errors relating to a device will show the need for maintenance.
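
The periodic review of the system log can be partly automated. The following Python sketch filters log lines for error and warning keywords, optionally restricted to one device; the keyword list and function name are assumptions, and real log messages may use other wording.

```python
import re

# Keywords treated as signs of trouble (an illustrative choice, not a complete list).
PATTERN = re.compile(r"\b(?:error|warning|fail(?:ed|ure)?)\b", re.IGNORECASE)


def suspicious_lines(lines, device=None):
    """Return the log lines that mention an error or warning keyword.

    If device is given, only lines that mention that device are considered.
    """
    hits = []
    for line in lines:
        if device is not None and device not in line:
            continue
        if PATTERN.search(line):
            hits.append(line.rstrip("\n"))
    return hits
```

To apply it to the system log, you could open /var/log/messages and pass the file object as lines, for example suspicious_lines(log_file, device="/dev/sda1").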

Replacing Disks
The procedure for replacing a faulty disk mechanism depends on the type of disk configuration you are
using. Refer to your Smart Array documentation for issues related to your Smart Array.

Replacing a Faulty Mechanism in a Disk Array


You can replace a failed disk mechanism by simply removing it from the array and replacing it with a new
mechanism of the same type. The resynchronization is handled by the array itself. There may be some
impact on disk performance until the resynchronization is complete. For details on the process of hot
plugging disk mechanisms, refer to your disk array documentation.

Replacing a Lock LUN


You can replace an unusable lock LUN while the cluster is running. You can do this without any cluster
reconfiguration if you do not change the devicefile name; or, if you do need to change the devicefile, you
can do the necessary reconfiguration while the cluster is running.
If you need to use a different devicefile, you must change the name of the devicefile in the cluster
configuration file; see Updating the Cluster Lock LUN Configuration Online.



CAUTION: Before you start, make sure that all nodes have logged a message such as the following
in syslog:
WARNING: Cluster lock LUN /dev/sda1 is corrupt: bad label. Until this
situation is corrected, a single failure could cause all nodes in the
cluster to crash.

Once all nodes have logged this message, use a command such as the following to specify the new
cluster lock LUN:
cmdisklock reset /dev/sda1

CAUTION: You are responsible for determining that the device is not being used by LVM or any
other subsystem on any node connected to the device before using cmdisklock. If you use
cmdisklock without taking this precaution, you could lose data.

NOTE: cmdisklock is needed only when you are repairing or replacing a lock LUN; see the
cmdisklock (1m) manpage for more information.

Serviceguard checks the lock LUN every 75 seconds. After using the cmdisklock command, review the
syslog file of an active cluster node; within 75 seconds you should see a
message showing that the lock disk is healthy again.

Revoking Persistent Reservations after a Catastrophic Failure
For information about persistent reservations (PR) and how they work, see About Persistent
Reservations.
Under normal circumstances, Serviceguard clears all persistent reservations when a package halts. In the
case of a catastrophic cluster failure however, you may need to do the cleanup yourself as part of the
recovery. Use the $SGCONF/scripts/sg/pr_cleanup script to do this. (The script is also in
$SGCONF/bin/. See Understanding the Location of Serviceguard Files on page 169 for the locations
of Serviceguard directories on various Linux distributions.)
Invoke the script as follows, specifying either the device special file (DSF) of a LUN, or a file containing a
list of DSF names:
pr_cleanup lun -v -k <key> [-f <filename_path> | <list of DSFs>]

• lun, if used, specifies that a LUN, rather than a volume group, is to be operated on.

• -v, if used, specifies verbose output detailing the actions the script performs and their status.

• -k <key> , if used, specifies the key to be used in the clear operation.

• -f <filename_path>, if used, specifies that the names of the DSFs to be operated on are listed in
the file specified by <filename_path>. Each DSF must be listed on a separate line.
• <list of DSFs> specifies one or more DSFs on the command line, if -f <filename_path> is not
used.

Examples
The following command clears all the PR reservations registered with the key abc12 on the set of
LUNs listed in the file /tmp/pr_device_list:



pr_cleanup -k abc12 lun -f /tmp/pr_device_list
pr_device_list contains entries such as the following:
/dev/sdb1
/dev/sdb2
Alternatively you could enter the device-file names on the command line:
pr_cleanup -k abc12 lun /dev/sdb1 /dev/sdb2
The next command clears all the PR reservations registered with the PR key abcde on the underlying
LUNs of the volume group vg01:
pr_cleanup -k abcde vg01

NOTE:
Because the keyword lun is not included, the device is assumed to be a volume group.
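
If you drive the cleanup from a recovery script, the three invocation forms shown in the examples can be assembled programmatically. The following Python sketch only builds the argument list and does not run the script; the function and parameter names are hypothetical, and the argument order follows the examples above.

```python
def pr_cleanup_args(key, dsfs=None, dsf_file=None, volume_group=None):
    """Build a pr_cleanup argument list in one of the three forms shown above.

    Exactly one of dsfs (list of DSF names), dsf_file (path to a file that
    lists DSFs), or volume_group must be given.
    """
    chosen = [value for value in (dsfs, dsf_file, volume_group) if value]
    if len(chosen) != 1:
        raise ValueError("specify exactly one of dsfs, dsf_file, or volume_group")
    args = ["pr_cleanup", "-k", key]
    if volume_group:
        # Without the lun keyword, the device is treated as a volume group.
        return args + [volume_group]
    args.append("lun")
    if dsf_file:
        return args + ["-f", dsf_file]
    return args + list(dsfs)
```

The resulting list can be passed to a process launcher such as subprocess.run once you have confirmed the location of the pr_cleanup script on your distribution.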

Replacing LAN Cards


If you need to replace a LAN card, use the following steps. It is not necessary to bring the cluster down to
do this.

Procedure

1. Halt the node using the cmhaltnode command.

2. Shut down the system:


shutdown -h
Then power off the system.
3. Remove the defective LAN card.
4. Install the new LAN card. The new card must be exactly the same card type, and it must be installed in
the same slot as the card you removed.
5. Power up the system.
6. On Red Hat Enterprise Linux 5 only, the kudzu program detects and reports the hardware changes.
Accept the changes and add any information needed for the new LAN card. On SUSE systems, run
YAST2 after the system boots and adjust the NIC settings of the new LAN card. If the old
LAN card was part of a “bond”, the new LAN card must be made part of the bond. See
Implementing Channel Bonding (Red Hat) on page 175 or Implementing Channel Bonding
(SUSE) on page 177.
7. If necessary, add the node back into the cluster using the cmrunnode command.
(You can omit this step if the node is configured to join the cluster automatically.)

Now Serviceguard will detect that the MAC address (LLA) of the card has changed from the value stored
in the cluster binary configuration file, and it will notify the other nodes in the cluster of the new MAC
address. The cluster will operate normally after this.
Hewlett Packard Enterprise recommends that you update the new MAC address in the cluster binary
configuration file by re-applying the cluster configuration. Use the following steps for online
reconfiguration:

1. Use the cmgetconf command to obtain a fresh ASCII configuration file, as follows:



cmgetconf config.conf
2. Use the cmapplyconf command to apply the configuration and copy the new binary file to all cluster
nodes:

cmapplyconf -C config.conf

This procedure updates the binary file with the new MAC address and thus avoids data inconsistency
between the outputs of the cmviewconf and ifconfig commands.

Replacing a Failed Quorum Server System


When a quorum server fails or becomes unavailable to the clusters it is providing quorum services for, this
will not cause a failure on any cluster. However, the loss of the quorum server does increase the
vulnerability of the clusters in case there is an additional failure. Use the following procedure to replace a
defective quorum server system. If you use this procedure, you do not need to change the configuration
of any cluster nodes.

IMPORTANT: Make sure you read the latest version of the HPE Serviceguard Quorum Server
Release Notes before you proceed. You can find them at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server Software).
You should also consult the Quorum Server white papers at the same location.

1. Remove the old quorum server system from the network.


2. Set up the new system and configure it with the old quorum server’s IP address and hostname.
3. Install and configure the quorum server software on the new system. Be sure to include in the new QS
authorization file (for example, /usr/local/qs/conf/qs_authfile) all of the nodes that were
configured for the old quorum server. Refer to the qs(1) man page for details about configuring the
QS authorization file.

NOTE: The quorum server reads the authorization file at startup. Whenever you modify the file
qs_authfile, run the following command to force a re-read of the file. For example, on a Red Hat
distribution:

/usr/local/qs/bin/qs -update

On a SUSE distribution:

/opt/qs/bin/qs -update

4. Start the quorum server as follows:

• Use the init q command to run the quorum server.


Or

• Create a package in another cluster for the Quorum Server, as described in the Release Notes for
your version of Quorum Server. They can be found at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server Software).

5. All nodes in all clusters that were using the old quorum server will connect to the new quorum server.
Use the cmviewcl -v command from any cluster that is using the quorum server to verify that the
nodes in that cluster have connected to the QS.



6. The quorum server log file on the new quorum server will show a log message like the following for
each cluster that uses the quorum server:
Request for lock /sg/<ClusterName> succeeded. New lock owners: N1, N2

7. To check that the quorum server has been correctly configured and to verify the connectivity of a node
to the quorum server, run the following command from a cluster node:

cmquerycl -q <QSHostName> -n <Node1> -n <Node2>


...

The command will output an error message if the specified nodes cannot communicate with the
quorum server.
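The log check in step 6 can be scripted. The following is a minimal sketch, assuming each log entry occupies a single line in the form shown in step 6. The function name clusters_with_lock is hypothetical and is not part of the quorum server software, and <qs-log-file> stands for the quorum server log file on your system.

```shell
#!/bin/sh
# clusters_with_lock: read quorum server log text on stdin and print the name
# of each cluster for which a "Request for lock ... succeeded" entry exists.
# Assumes one log entry per line, in the format shown in step 6.
clusters_with_lock() {
    sed -n 's|.*Request for lock /sg/\([^ ]*\) succeeded\..*|\1|p' | sort -u
}

# Example use on the new quorum server (path is a placeholder):
#   clusters_with_lock < <qs-log-file>
```

Comparing the printed names with the list of clusters that used the old quorum server shows which clusters have not yet reconnected.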

CAUTION: Make sure that the old system does not rejoin the network with the old IP address.

NOTE: While the old quorum server is down and the new one is being set up:

• The cmquerycl, cmcheckconf, and cmapplyconf commands will not work.

• The cmruncl, cmhaltcl, cmrunnode, and cmhaltnode commands will work.

• If there is a node or network failure that creates a 50-50 membership split, the quorum server will not
be available as a tie-breaker, and the cluster will fail.

Troubleshooting Approaches
The following sections offer a few suggestions for troubleshooting by reviewing the state of the running
system and by examining cluster status data, log files, and configuration files. Topics include:

• Reviewing Package IP Addresses


• Reviewing the System Log File
• Reviewing Configuration Files
• Reviewing the Package Control Script
• Using cmquerycl and cmcheckconf

• Using cmviewcl

• Reviewing the LAN Configuration

Reviewing Package IP Addresses


The ifconfig command can be used to examine the LAN configuration. When run on ftsys9 after node
ftsys10 has been halted, it shows that the package IP addresses are assigned to eth1:1 and
eth1:2, along with the heartbeat IP address on eth1.
eth0 Link encap:Ethernet HWaddr 00:01:02:77:82:75
inet addr:15.13.169.106 Bcast:15.13.175.255 Mask:255.255.248.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:70826196 errors:0 dropped:0 overruns:1 frame:0
TX packets:5741486 errors:1 dropped:0 overruns:1 carrier:896
collisions:26706 txqueuelen:100
Interrupt:9 Base address:0xdc00
eth1 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C
inet addr:192.168.1.106 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2337841 errors:0 dropped:0 overruns:0 frame:0
TX packets:1171966 errors:0 dropped:0 overruns:0 carrier:0
collisions:6 txqueuelen:100
Interrupt:9 Base address:0xda00
eth1:1 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C
inet addr:192.168.1.200 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:9 Base address:0xda00
eth1:2 Link encap:Ethernet HWaddr 00:50:DA:64:8A:7C
inet addr:192.168.1.201 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:9 Base address:0xda00
lo Link encap:Local Loopback
inet addr:127.0.0.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP LOOPBACK RUNNING MULTICAST MTU:3924 Metric:1
RX packets:2562940 errors:0 dropped:0 overruns:1 frame:0
TX packets:2562940 errors:1 dropped:0 overruns:1 carrier:896
collisions:0 txqueuelen:0
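
To pick the package addresses out of output like the above, one option is to filter for alias interfaces (names of the form ethX:Y). The following is a minimal sketch, assuming the older ifconfig output format shown in this section (with "inet addr:" fields). The function name list_alias_addrs is hypothetical and is not a Serviceguard command.

```shell
#!/bin/sh
# list_alias_addrs: read ifconfig output on stdin and print
# "<alias-interface> <ip-address>" for each alias (ethX:Y) interface.
# Assumes the "Link encap" / "inet addr:" layout shown in this section.
list_alias_addrs() {
    awk '
        # An interface block starts with the interface name followed by "Link".
        $2 == "Link" { iface = $1 }
        # Alias interfaces have a colon in the name, for example eth1:1.
        $1 == "inet" && iface ~ /:/ {
            sub(/^addr:/, "", $2)
            print iface, $2
        }
    '
}

# Example use on a cluster node:
#   ifconfig | list_alias_addrs
```

For the sample output above, this would list eth1:1 and eth1:2 with their addresses and skip eth0, eth1, and lo.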

Reviewing the System Log File


Messages from the Cluster Manager and Package Manager are written to the system log file. The default
location of the log file may vary according to Linux distribution; the Red Hat default is
/var/log/messages. You can use a text editor, such as vi, or the more command to view the log file for
historical information on your cluster.
This log provides information on the following:

• Commands executed and their outcome.


• Major cluster events which may, or may not, be errors.
• Cluster status information.

NOTE:
Many other products running on Linux in addition to Serviceguard use the syslog file to save messages.
Refer to your Linux documentation for additional information on using the system log.

Sample System Log Entries


The following sample entries from the syslog file show a package that failed to run because of a
problem in the pkg5_run script. You would look at the pkg5_run.log for details.
Dec 14 14:33:48 star04 cmcld[2048]: Starting cluster management protocols.
Dec 14 14:33:48 star04 cmcld[2048]: Attempting to form a new cluster
Dec 14 14:33:53 star04 cmcld[2048]: 3 nodes have formed a new cluster
Dec 14 14:33:53 star04 cmcld[2048]: The new active cluster membership is:
star04(id=1) , star05(id=2), star06(id=3)
Dec 14 17:33:53 star04 cmlvmd[2049]: Clvmd initialized successfully.
Dec 14 14:34:44 star04 CM-CMD[2054]: cmrunpkg -v pkg5
Dec 14 14:34:44 star04 cmcld[2048]: Request from node star04 to start
package pkg5 on node star04.
Dec 14 14:34:44 star04 cmcld[2048]: Executing '/usr/local/cmcluster/conf/pkg5/pkg5_run
start' for package pkg5.
Dec 14 14:34:45 star04 LVM[2066]: vgchange -a n /dev/vg02
Dec 14 14:34:45 star04 cmcld[2048]: Package pkg5 run script exited with
NO_RESTART.
Dec 14 14:34:45 star04 cmcld[2048]: Examine the file
/usr/local/cmcluster/pkg5/pkg5_run.log for more details.

The following is an example of a successful package start:


Dec 14 14:39:27 star04 CM-CMD[2096]: cmruncl
Dec 14 14:39:27 star04 cmcld[2098]: Starting cluster management protocols.
Dec 14 14:39:27 star04 cmcld[2098]: Attempting to form a new cluster
Dec 14 14:39:27 star04 cmclconfd[2097]: Command execution message
Dec 14 14:39:33 star04 cmcld[2098]: 3 nodes have formed a new cluster
Dec 14 14:39:33 star04 cmcld[2098]: The new active cluster membership is:
star04(id=1), star05(id=2), star06(id=3)
Dec 14 17:39:33 star04 cmlvmd[2099]: Clvmd initialized successfully.
Dec 14 14:39:34 star04 cmcld[2098]: Executing '/usr/local/cmcluster/conf/pkg4/pkg4_run
start' for package pkg4.
Dec 14 14:39:34 star04 LVM[2107]: vgchange /dev/vg01
Dec 14 14:39:35 star04 CM-pkg4[2124]: cmmodnet -a -i 15.13.168.0 15.13.168.4
Dec 14 14:39:36 star04 CM-pkg4[2127]: cmrunserv Service4 /vg01/MyPing 127.0.0.1
>>/dev/null
Dec 14 14:39:36 star04 cmcld[2098]: Started package pkg4 on node star04.
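
On a busy system, failures like the one in the first sample can be hard to spot by eye. The following is a minimal sketch of a filter for them, assuming that cmcld writes each entry on a single line using the "Package <name> run script exited with" wording shown above. The function name failed_package_starts is hypothetical.

```shell
#!/bin/sh
# failed_package_starts: read syslog text on stdin and print the name of each
# package for which cmcld logged a "run script exited with" entry.
# Assumes the message wording shown in the sample entries in this section.
failed_package_starts() {
    sed -n 's/.*cmcld\[[0-9]*]: Package \([^ ]*\) run script exited with.*/\1/p' | sort -u
}

# Example use (Red Hat default log location):
#   failed_package_starts < /var/log/messages
```

Each name printed points to a package whose own run log should be examined next.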

Reviewing Configuration Files


Review the following ASCII configuration files:

• Cluster configuration file.


• Package configuration files.

Ensure that the files are complete and correct according to your configuration planning worksheets.

Using the cmquerycl and cmcheckconf Commands


In addition, cmquerycl and cmcheckconf can be used to troubleshoot your cluster just as they were
used to verify its configuration. The following example shows the commands used to verify the existing
cluster configuration on ftsys9 and ftsys10:

cmquerycl -v -C $SGCONF/verify.conf -n ftsys9 -n ftsys10


cmcheckconf -v -C $SGCONF/verify.conf

cmcheckconf checks:

• The network addresses and connections.


• Quorum Server connectivity, if a quorum server is configured.
• Lock LUN connectivity, if a lock LUN is used.
• The validity of configuration parameters of the cluster and packages for:

◦ The uniqueness of names.
◦ The existence and permission of scripts.

cmcheckconf does not check:

• The correct setup of the power circuits.


• The correctness of the package configuration script.

Reviewing the LAN Configuration


The following networking commands can be used to diagnose problems:

• ifconfig can be used to examine the LAN configuration. This command lists all IP addresses
assigned to each LAN interface card.
• arp -a can be used to check the arp tables.

• cmscancl can be used to test IP-level connectivity between network interfaces in the cluster.

• cmviewcl -v shows the status of primary LANs.

Use these commands on all nodes.

Solving Problems
Problems with Serviceguard may be of several types. The following is a list of common categories of
problem:

• Serviceguard Command Hangs.


• Cluster Re-formations.
• System Administration Errors.
• Package Control Script Hangs.
• Package Movement Errors.
• Node and Network Failures.
• Quorum Server Messages.

Name Resolution Problems


Many Serviceguard commands, including cmviewcl, depend on name resolution services to look up the
addresses of cluster nodes. When name services are not available (for example, if a name server is
down), Serviceguard commands may hang, or may return a network-related error message. If this
happens, use the host command on each cluster node to see whether name resolution is correct. For
example:

host ftsys9

ftsys9.cup.hp.com has address 15.13.172.229

If the output of this command does not include the correct IP address of the node, then check your name
resolution services further.
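
The comparison described above can be automated when there are many nodes. The following is a minimal sketch, assuming host prints lines of the form "<name> has address <ip>" as in the example. The function name resolves_to is hypothetical, and the expected address for each node must come from your own configuration records.

```shell
#!/bin/sh
# resolves_to: succeed (exit status 0) if the `host` output on stdin contains
# a "has address" line for the expected IPv4 address given as $1.
resolves_to() {
    expected_ip=$1
    awk -v ip="$expected_ip" '
        NF >= 4 && $(NF-2) == "has" && $(NF-1) == "address" && $NF == ip { found = 1 }
        END { exit !found }
    '
}

# Example use, with the node name and address from the example above:
#   host ftsys9 | resolves_to 15.13.172.229 || echo "ftsys9: name resolution problem"
```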

Networking and Security Configuration Errors


In many cases, a symptom such as Permission denied... or Connection refused... is the
result of an error in the networking or security configuration. Most such problems can be resolved by
correcting the entries in /etc/hosts. See Configuring Name Resolution for more information.

Halting a Detached Package


When you attempt to halt a detached package using the cmhaltpkg command and the given node is not
reachable, you will get an error message as follows:
Unable to halt the detached package <package_name> on node <node_name> as the
node is not reachable. Retry once the node is reachable.
In this case, make sure the node is powered up and accessible, then rerun the cmhaltpkg
command.

Cluster Re-formations Caused by Temporary Conditions


You may see Serviceguard error messages, such as the following, which indicate that a node is having
problems:
Member node_name seems unhealthy, not receiving heartbeats from it.
This may indicate a serious problem, such as a node failure, whose underlying cause is probably a
MEMBER_TIMEOUT setting that is too aggressive; see the next section, Cluster Re-formations
Caused by MEMBER_TIMEOUT Being Set too Low. Alternatively, it may be a transitory problem, such as
excessive network traffic or system load.
What to do: If you find that cluster nodes are failing because of temporary network or system-load
problems (which in turn delay heartbeat messages in the network or during processing), you
should solve the networking or load problem if you can. Failing that, you can increase the value of
MEMBER_TIMEOUT, as described in the next section.

Cluster Re-formations Caused by MEMBER_TIMEOUT Being Set too Low


If you have set the MEMBER_TIMEOUT parameter too low, the cluster daemon, cmcld, will write warnings
to syslog that indicate the problem. There are three in particular that you should watch for:

1. Warning: cmcld was unable to run for the last <n.n> seconds. Consult the
Managing Serviceguard manual for guidance on setting MEMBER_TIMEOUT, and
information on cmcld.
This means that cmcld was unable to get access to a CPU for a significant amount of time. If this
occurred while the cluster was re-forming, one or more nodes could have failed. Some commands
(such as cmhaltnode (1m), cmrunnode (1m), cmapplyconf (1m)), cause the cluster to re-
form, so there's a chance that running one of these commands could precipitate a node failure; that
chance is greater the longer the hang.
What to do: If this message appears once a month or more often, increase MEMBER_TIMEOUT to
more than 10 times the largest reported delay. For example, if the message that reports the largest
number says that cmcld was unable to run for the last 1.6 seconds, increase MEMBER_TIMEOUT to
more than 16 seconds.

2. This node is at risk of being evicted from the running cluster. Increase
MEMBER_TIMEOUT.

This means that the hang was long enough for other nodes to have noticed the delay in receiving
heartbeats and marked the node “unhealthy”. This is the beginning of the process of evicting the node
from the cluster; see What Happens when a Node Times Out for an explanation of that process.
What to do: In isolation, this could indicate a transitory problem, as described in the previous section. If
you have diagnosed and fixed such a problem and are confident that it won't recur, you need take no
further action; otherwise you should increase MEMBER_TIMEOUT as instructed in item 1.

3. Member node_name seems unhealthy, not receiving heartbeats from it.


This is the message that indicates that the node has been found “unhealthy” as described in
item 2.
What to do: See item 2.

For more information, including requirements and recommendations, see the MEMBER_TIMEOUT
discussion under Cluster Configuration Parameters on page 111.
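
The rule in item 1 (more than 10 times the largest reported delay) can be applied to a log file mechanically. The following is a minimal sketch, assuming the warning appears in syslog with the wording shown in item 1. The function name suggest_member_timeout is hypothetical; the result is in seconds and must be converted to the units used in your cluster configuration file.

```shell
#!/bin/sh
# suggest_member_timeout: read syslog text on stdin, find the largest delay
# reported by "cmcld was unable to run for the last <n.n> seconds", and print
# that delay together with ten times its value.
suggest_member_timeout() {
    awk '
        /cmcld was unable to run for the last/ {
            for (i = 1; i < NF; i++) {
                # The delay is the field that follows the word "last".
                if ($i == "last" && $(i+1) + 0 > max) max = $(i+1) + 0
            }
        }
        END {
            if (max > 0)
                printf "largest delay: %.1f s; MEMBER_TIMEOUT should exceed %.1f s\n", max, max * 10
        }
    '
}

# Example use (Red Hat default log location):
#   suggest_member_timeout < /var/log/messages
```

With a largest reported delay of 1.6 seconds, as in the example in item 1, the function reports a threshold of 16.0 seconds. It prints nothing if no such warning is found.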

System Administration Errors


There are a number of errors you can make when configuring Serviceguard that will not show up when
you start the cluster. Your cluster can be running, and everything appears to be fine, until there is a
hardware or software failure and control of your packages is not transferred to another node as you
would have expected.
These problems are caused specifically by errors in the cluster configuration file and package configuration
scripts. Examples of these errors include:

• Volume groups not defined on adoptive node.


• Mount point does not exist on adoptive node.
• Network errors on adoptive node (configuration errors).
• User information not correct on adoptive node.

You can use the following commands to check the status of your disks:

• df - to see if your package’s volume group is mounted.

• vgdisplay -v - to see if all volumes are present.

• strings /etc/lvmconf/*.conf - to ensure that the configuration is correct.

• fdisk -l /dev/sdx - to display information about a disk.

Package Control Script Hangs or Failures


When a RUN_SCRIPT_TIMEOUT or HALT_SCRIPT_TIMEOUT value is set, and the control script hangs,
causing the timeout to be exceeded, Serviceguard kills the script and marks the package “Halted.”
Similarly, when a package control script fails, Serviceguard kills the script and marks the package
“Halted.” In both cases, the following also take place:

• Control of the package will not be transferred.


• The run or halt instructions may not run to completion.
• Global switching will be disabled.
• The current node will be disabled from running the package.

Following such a failure, since the control script is terminated, some of the package’s resources may be
left activated. Specifically:

• Volume groups may be left active.


• File systems may still be mounted.
• IP addresses may still be installed.
• Services may still be running.

In this kind of situation, Serviceguard will not restart the package without manual intervention. You must
clean up manually before restarting the package. Use the following steps as guidelines:

1. Perform application-specific cleanup. Any application-specific actions the control script might have
taken should be undone to ensure that the package starts successfully on an alternate node. This might
include such things as shutting down application processes, removing lock files, and removing
temporary files.
2. Ensure that package IP addresses are removed from the system. This step is accomplished via the
cmmodnet(1m) command. First determine which package IP addresses are installed by inspecting
the output resulting from running the ifconfig command. If any of the IP addresses specified in the
package control script appear in the ifconfig output under the inet addr: in the ethX:Y block,
use cmmodnet to remove them:

cmmodnet -r -i <ip-address> <subnet>

where <ip-address> is the address indicated above and <subnet> is the result of masking the <ip-
address> with the mask found in the same line as the inet address in the ifconfig output.

3. Ensure that package volume groups are deactivated. First unmount any package logical volumes
which are being used for file systems. This is determined by inspecting the output resulting from
running the command df -l. If any package logical volumes, as specified by the LV[] array
variables in the package control script, appear under the “Filesystem” column, use umount to
unmount them:

fuser -ku <logical-volume>


umount <logical-volume>

Next, deactivate the package volume groups. These are specified by the VG[] array entries in the
package control script.

vgchange -a n <volume-group>

4. Finally, re-enable the package for switching.

cmmodpkg -e <package-name>

If, after cleaning up the node on which the timeout occurred, you want that node to remain available as an
alternate for running the package, remember to re-enable the package to run on the node:

cmmodpkg -e -n <node-name> <package-name>
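
The <subnet> value needed in step 2 is the bitwise AND of the IP address and the netmask, octet by octet. The following is a minimal sketch of that calculation for IPv4 addresses in dotted-decimal form without leading zeros. The function name subnet_of is hypothetical and is not a Serviceguard command.

```shell
#!/bin/sh
# subnet_of: print the IPv4 subnet obtained by masking address $1 with netmask $2.
# Note: uses plain shell variables (ip, mask, old_ifs) for POSIX sh portability.
subnet_of() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    set -- $ip $mask    # eight octets: $1-$4 are the address, $5-$8 the mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Example use with an address and mask taken from ifconfig output:
#   cmmodnet -r -i 192.168.1.200 "$(subnet_of 192.168.1.200 255.255.255.0)"
```

For the address 192.168.1.200 with mask 255.255.255.0, the result is 192.168.1.0; for 15.13.169.106 with mask 255.255.248.0, it is 15.13.168.0.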

The default Serviceguard control scripts are designed to take the straightforward steps needed to get an
application running or stopped. If the package administrator specifies a time limit within which these steps
need to occur and that limit is subsequently exceeded for any reason, Serviceguard takes the
conservative approach that the control script logic must either be hung or defective in some way. At that
point the control script cannot be trusted to perform cleanup actions correctly, thus the script is terminated
and the package administrator is given the opportunity to assess what cleanup steps must be taken.
If you want the package to switch automatically in the event of a control script timeout, set the
node_fail_fast_enabled parameter to YES. In this case, Serviceguard will cause a reboot on the node
where the control script timed out. This effectively cleans up any side effects of the package’s run or halt
attempt. In this case the package will be automatically restarted on any available alternate node for which
it is configured.

Node and Network Failures


These failures cause Serviceguard to transfer control of a package to another node. This is the normal
action of Serviceguard, but you have to be able to recognize when a transfer has taken place and decide
whether to leave the cluster in its current condition or to restore it to its original condition.
Possible node failures can be caused by the following conditions:

• Reboot
• Kernel Oops
• Hangs
• Power failures

You can use the following commands to check the status of your network and subnets:

• ifconfig - to display LAN status and check to see if the package IP is stacked on the LAN card.

• arp -a - to check the arp tables.

Since your cluster is unique, there are no cookbook solutions to all possible problems. But if you apply
these checks and commands and work your way through the log files, you should be able to identify
and solve most problems.

Troubleshooting the Quorum Server


NOTE:
See the HPE Serviceguard Quorum Server Version A.12.00.30 Release Notes for information about
configuring the Quorum Server. Do not proceed without reading the Release Notes for your version.

Authorization File Problems


The following kind of message in a Serviceguard node’s syslog file or in the output of cmviewcl -v
may indicate an authorization problem:
Access denied to quorum server 192.6.7.4

The reason may be that you have not updated the authorization file. Verify that the node is included in the
file, and try using qs -update (/usr/local/qs/bin/qs -update on a Red Hat distribution, /opt/qs/bin/qs -update
on a SUSE distribution) to re-read the quorum server authorization file.

Timeout Problems
The following kinds of message in a Serviceguard node’s syslog file may indicate timeout problems:
Unable to set client version at quorum server 192.6.7.2: reply timed out
Probe of quorum server 192.6.7.2 timed out
These messages could be an indication of an intermittent network problem, or the default quorum server
timeout may not be sufficient. You can set the QS_TIMEOUT_EXTENSION parameter to increase the timeout, or
you can increase the MEMBER_TIMEOUT value. See Cluster Configuration Parameters on page 111
for more information about these parameters.
A message such as the following in a Serviceguard node’s syslog file indicates that the node did not
receive a reply to its lock request in time. This could be because of a delay in communication between the
node and the Quorum Server or between the Quorum Server and other nodes in the cluster:
Attempt to get lock /sg/cluser1 unsuccessful. Reason:
request_timedout

Messages
The coordinator node in Serviceguard sometimes sends a request to the quorum server to set the lock
state. (This is different from a request to obtain the lock in tie-breaking.) If the quorum server’s connection
to one of the cluster nodes has not completed, the request to set may fail with a two-line message like the
following in the quorum server’s log file:
Oct 08 16:10:05:0: There is no connection to the applicant
2 for lock /sg/lockTest1
Oct 08 16:10:05:0:Request for lock /sg/lockTest1 from
applicant 1 failed: not connected to all applicants.
This condition can be ignored. The request will be retried a few seconds later and will succeed. The
following message is logged:
Oct 08 16:10:06:0: Request for lock /sg/lockTest1
succeeded. New lock owners: 1,2.

Lock LUN Messages


If the lock LUN device fails, the following message will be entered in the syslog file:
Oct 08 16:10:05:0: WARNING: Cluster lock lun /dev/sdc1 has failed.

Host IO Timeout Messages


Hewlett Packard Enterprise recommends that you use VMware Tools, a suite of utilities that
enhances the performance of the virtual machine’s guest operating system. VMware Tools also improves
virtual machine management, which enables some important functionality. For more information
about the benefits of using VMware Tools and for installation instructions, see Installing and Configuring
VMware Tools at http://www.vmware.com/support/ws55/doc/ws_newguest_tools_linux.html.
For installation of VMware Tools, you can also refer to the Installing and Configuring VMware Tools document at
https://www.vmware.com/pdf/vmware-tools-installation-configuration.pdf.
When Serviceguard is configured on a VMware VM (Virtual Machine) node on which VMware tools are
not installed, the following warning message is displayed while creating or editing the cluster:

Warning: Unable to obtain host IO timeout setting. Assigning
the most conservative value 70 seconds.

The value of the host_io_timeout parameter is set internally by Serviceguard based on the VM configuration.
The host_io_timeout parameter cannot be configured by the user. Serviceguard uses the
host_io_timeout parameter to allow all the IO requests from the VM guests through the VM host
virtualization layer to complete before cluster activities can resume following a cluster re-formation.
On physical nodes, the value of the host_io_timeout parameter is set to zero.
On VM nodes, if you have installed VMware Tools, you can obtain the value of the host_io_timeout parameter
using the cmvminfo -M command.

NOTE:
If VMware commands cannot be run on the VM node, Serviceguard sets the value of host_io_timeout
parameter to 70 seconds.

Troubleshooting serviceguard-xdc package


For information about how to troubleshoot issues related to the serviceguard-xdc package, see HPE
Serviceguard Extended Distance Cluster for Linux A.12.00.40 Deployment Guide at
http://www.hpe.com/info/linux-serviceguard-docs.

Troubleshooting cmvmusermgmt Utility


The cmvmusermgmt utility is used to manage the Serviceguard Credential Store (SCS).
If the cmvmusermgmt command fails repeatedly with the following error message:
ERROR: One more instance of "cmvmusermgmt" command is running on
then you need to manually clean up the lock file maintained by the cmvmusermgmt utility before using the
utility on any of the cluster nodes.
In this scenario, check the list of running processes (ps -ef | grep cmvmusermgmt) on all
cluster nodes for any instances of cmvmusermgmt. If an instance is running, wait for it to
complete. If no instance is running on any of the cluster nodes, delete the lock file
$SGRUN/cmvmusermgmt.lck on the node specified in the error message.

Support and other resources

Accessing Hewlett Packard Enterprise Support

• For live assistance, go to the Contact Hewlett Packard Enterprise Worldwide website:
www.hpe.com/assistance

• To access documentation and support services, go to the Hewlett Packard Enterprise Support Center
website:
www.hpe.com/support/hpesc

Information to collect

• Technical support registration number (if applicable)


• Product name, model or version, and serial number
• Operating system name and version
• Firmware version
• Error messages
• Product-specific reports and logs
• Add-on products or components
• Third-party products or components

Accessing updates

• Some software products provide a mechanism for accessing software updates through the product
interface. Review your product documentation to identify the recommended software update method.
• To download product updates, go to either of the following:

◦ Hewlett Packard Enterprise Support Center Get connected with updates page:
www.hpe.com/support/e-updates

◦ Updates location:
http://www.hpe.com/downloads/software

• To view and update your entitlements, and to link your contracts and warranties with your profile, go to
the Hewlett Packard Enterprise Support Center More Information on Access to Support Materials
page:
www.hpe.com/support/AccessToSupportMaterials

IMPORTANT: Access to some updates might require product entitlement when accessed
through the Hewlett Packard Enterprise Support Center. You must have an HP Passport set up
with relevant entitlements.

Websites
Website                                            Link
Hewlett Packard Enterprise Information Library     www.hpe.com/info/enterprise/docs
Hewlett Packard Enterprise Support Center          www.hpe.com/support/hpesc
Contact Hewlett Packard Enterprise Worldwide       www.hpe.com/assistance
Subscription Service/Support Alerts                www.hpe.com/support/e-updates
Software Depot                                     www.hpe.com/support/softwaredepot
Customer Self Repair                               www.hpe.com/support/selfrepair
Insight Remote Support                             www.hpe.com/info/insightremotesupport/docs
Serviceguard Solutions for HP-UX                   www.hpe.com/info/hpux-serviceguard-docs
Single Point of Connectivity Knowledge (SPOCK)     www.hpe.com/storage/spock
Storage compatibility matrix
Storage white papers and analyst reports           www.hpe.com/storage/whitepapers

Related documents
For additional information, see the latest documents at www.hpe.com/info/linux-serviceguard-docs:

• HPE Serviceguard for Linux Release Notes


• HPE Serviceguard for Linux Base edition Release Notes
• HPE Serviceguard for Linux Advanced edition Release Notes
• HPE Serviceguard for Linux Enterprise edition Release Notes
• HPE Serviceguard Quorum Server Release Notes
• HPE Serviceguard Extended Distance Cluster for Linux Deployment Guide
• Clusters for High Availability: a Primer of HPE Solutions. Second Edition. Hewlett Packard
Enterprise Press, 2001 (ISBN 0-13-089355-2)
• HPE Serviceguard for Linux Configuration Guide (for information on supported configurations)
• HPE Serviceguard for Linux Certification Matrix (for updated information on supported hardware
and Linux distributions)

Customer self repair


Hewlett Packard Enterprise customer self repair (CSR) programs allow you to repair your product. If a
CSR part needs to be replaced, it will be shipped directly to you so that you can install it at your
convenience. Some parts do not qualify for CSR. Your Hewlett Packard Enterprise authorized service
provider will determine whether a repair can be accomplished by CSR.
For more information about CSR, contact your local service provider or go to the CSR website:
www.hpe.com/support/selfrepair

Remote support
Remote support is available with supported devices as part of your warranty or contractual support
agreement. It provides intelligent event diagnosis, and automatic, secure submission of hardware event
notifications to Hewlett Packard Enterprise, which will initiate a fast and accurate resolution based on your
product’s service level. Hewlett Packard Enterprise strongly recommends that you register your device for
remote support.
For more information and device support details, go to the following website:
www.hpe.com/info/insightremotesupport/docs

Documentation feedback
Hewlett Packard Enterprise is committed to providing documentation that meets your needs. To help us
improve the documentation, send any errors, suggestions, or comments to Documentation Feedback
(docsfeedback@hpe.com). When submitting your feedback, include the document title, part number,
edition, and publication date located on the front cover of the document. For online help content, include
the product name, product version, help edition, and publication date located on the legal notices page.

Designing Highly Available Cluster Applications
This appendix describes how to create or port applications for high availability, with emphasis on the
following topics:

• Automating Application Operation


• Controlling the Speed of Application Failover
• Designing Applications to Run on Multiple Systems
• Restoring Client Connections
• Handling Application Failures
• Minimizing Planned Downtime

Designing for high availability means reducing the amount of unplanned and planned downtime that users
will experience. Unplanned downtime includes unscheduled events such as power outages, system
failures, network failures, disk crashes, or application failures. Planned downtime includes scheduled
events such as scheduled backups, system upgrades to new OS revisions, or hardware replacements.
Two key strategies should be kept in mind:

1. Design the application to handle a system reboot or panic. If you are modifying an existing application
for a highly available environment, determine what happens currently with the application after a
system panic. In a highly available environment there should be defined (and scripted) procedures for
restarting the application. Procedures for starting and stopping the application should be automatic,
with no user intervention required.
2. The application should not use any system-specific information such as the following if such use would
prevent it from failing over to another system and running properly:

• The application should not refer to uname() or gethostname().

• The application should not refer to the SPU ID.


• The application should not refer to the MAC (link-level) address.

Automating Application Operation


Can the application be started and stopped automatically or does it require operator intervention?
This section describes how to automate application operations to avoid the need for user intervention.
One of the first rules of high availability is to avoid manual intervention. If it takes a user at a terminal,
console or GUI interface to enter commands to bring up a subsystem, the user becomes a key part of the
system. It may take hours before a user can get to a system console to do the work necessary. The
hardware in question may be located in a far-off area where no trained users are available, the systems
may be located in a secure datacenter, or in off hours someone may have to connect via modem.
There are two principles to keep in mind for automating application relocation:

• Insulate users from outages.


• Applications must have defined startup and shutdown procedures.

You need to be aware of what happens currently when the system your application is running on is
rebooted, and whether changes need to be made in the application's response for high availability.

Insulate Users from Outages


Wherever possible, insulate your end users from outages. Issues include the following:

• Do not require user intervention to reconnect when a connection is lost due to a failed server.
• Where possible, warn users of slight delays due to a failover in progress.
• Minimize the reentry of data.
• Engineer the system for reserve capacity to minimize the performance degradation experienced by
users.

Define Application Startup and Shutdown


Applications must be restartable without manual intervention. If the application requires a switch to be
flipped on a piece of hardware, then automated restart is impossible. Procedures for application startup,
shutdown and monitoring must be created so that the HA software can perform these functions
automatically.
To ensure automated response, there should be defined procedures for starting up the application and
stopping the application. In Serviceguard these procedures are placed in the package control script.
These procedures must check for errors and return status to the HA control software. The startup and
shutdown should be command-line driven and not interactive unless all of the answers can be
predetermined and scripted.
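The requirements above can be sketched in code. The following Python fragment is only an illustration under stated assumptions, not Serviceguard's package control script: run_step and start_application are hypothetical helper names, and the command passed in is whatever start command your application provides. It runs the command without a terminal, enforces a timeout, and returns a status that the caller can pass back to the HA software.

```python
import subprocess
import sys


def run_step(command, timeout_seconds=60):
    """Run one startup or shutdown command non-interactively.

    Returns 0 on success and a nonzero status on any failure.
    (Hypothetical helper; the name is not part of Serviceguard.)
    """
    try:
        completed = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,  # never wait for operator input
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Command missing, not executable, or hung past the timeout.
        return 1
    return completed.returncode


def start_application(start_command):
    """Start the application and report the status to the caller."""
    status = run_step(start_command)
    if status != 0:
        print(f"startup failed with status {status}", file=sys.stderr)
    return status
```

A wrapper like this keeps the decision about what to do on failure with the caller, which only needs the returned status.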
In an HA failover environment, HA software restarts the application on a surviving system in the cluster
that has the necessary resources, such as access to the necessary disk drives. The application must be
restartable in two aspects:

• It must be able to restart and recover on the backup system (or on the same system if the application
restart option is chosen).
• It must be able to restart if it fails during the startup and the cause of the failure is resolved.

Application administrators need to learn to start up and shut down applications using the appropriate HA
commands. Inadvertently shutting down the application directly will initiate an unwanted failover.
Application administrators also need to be careful that they don't accidentally shut down a production
instance of an application rather than a test instance in a development environment.
A mechanism to monitor whether the application is active is necessary so that the HA software knows
when the application has failed. This may be as simple as a script that issues the command ps -ef |
grep xxx for all the processes belonging to the application.
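As an alternative to parsing ps output, a monitor can test whether a recorded process ID still exists. The sketch below assumes the application writes its PID to a file at startup; that convention, the file path, and the function names are assumptions for illustration and are not described in this manual.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def application_is_active(pid_file_path):
    """Return True if the PID stored in the (hypothetical) PID file is alive."""
    try:
        with open(pid_file_path) as pid_file:
            pid = int(pid_file.read().strip())
    except (OSError, ValueError):
        return False  # no PID file, or unreadable contents
    return pid_is_alive(pid)
```

A check of this kind only shows that the process exists; it does not show that the application is responding to requests.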
To reduce the impact on users, the application should not simply abort in case of error, since aborting
would cause an unneeded failover to a backup system. Applications should determine the exact error and
take specific action to recover from the error rather than, for example, aborting upon receipt of any error.

Controlling the Speed of Application Failover


What steps can be taken to ensure the fastest failover?
If a failure does occur causing the application to be moved (failed over) to another node, there are many
things the application can do to reduce the amount of time it takes to get the application back up and
running. The topics covered are as follows:


• Replicate Non-Data File Systems
• Use Raw Volumes
• Evaluate the Use of a Journaled Filesystem (JFS)
• Minimize Data Loss
• Use Restartable Transactions
• Use Checkpoints
• Design for Multiple Servers
• Design for Replicated Data Sites

Replicate Non-Data File Systems


Non-data file systems should be replicated rather than shared. There can only be one copy of the
application data itself. It will be located on a set of disks that is accessed by the system that is running the
application. After failover, if these data disks contain filesystems, they must go through filesystem recovery
(fsck) before the data can be accessed. The smaller these filesystems are, the faster the recovery will be.
Therefore, it is best to keep anything that can be replicated off the data filesystem. For example, there
should be a copy of the application executables on each system rather than having one copy of the
executables on a shared filesystem. Additionally, replicating the application executables makes a rolling
upgrade of the executables possible if this is desired.

Evaluate the Use of a Journaled Filesystem (JFS)


If a file system must be used, a JFS offers significantly faster file system recovery than an HFS. However,
performance of the JFS may vary with the application. Examples of appropriate journaled file systems are
VxFS, ext3, ext4, and XFS.

Minimize Data Loss


Minimize the amount of data that might be lost at the time of an unplanned outage. It is impossible to
prevent some data from being lost when a failure occurs. However, it is advisable to take certain actions
to minimize the amount of data that will be lost, as explained in the following discussion.

Minimize the Use and Amount of Memory-Based Data


Any in-memory data (the in-memory context) will be lost when a failure occurs. The application should be
designed to minimize the amount of in-memory data that exists unless this data can be easily
recalculated. When the application restarts on the standby node, it must recalculate or reread from disk
any information it needs to have in memory.
One way to measure the speed of failover is to calculate how long it takes the application to start up on a
normal system after a reboot. Does the application start up immediately? Or are there a number of steps
the application must go through before an end-user can connect to it? Ideally, the application can start up
quickly without having to reinitialize in-memory data structures or tables.
Performance concerns might dictate that data be kept in memory rather than written to the disk. However,
the risk associated with the loss of this data should be weighed against the performance impact of posting
the data to the disk.
Data that is read from a shared disk into memory, and then used as read-only data can be kept in
memory without concern.


Keep Logs Small
Some databases permit logs to be buffered in memory to increase online performance. Of course, when a
failure occurs, any in-flight transaction will be lost. However, minimizing the size of this in-memory log will
reduce the amount of completed transaction data that would be lost in case of failure.
Keeping the size of the on-disk log small allows the log to be archived or replicated more frequently,
reducing the risk of data loss if a disaster were to occur. There is, of course, a trade-off between online
performance and the size of the log.

Eliminate Need for Local Data


When possible, eliminate the need for local data. In a three-tier, client/server environment, the middle tier
can often be dataless (i.e., there is no local data that is client specific or needs to be modified). This
“application server” tier can then provide additional levels of availability, load-balancing, and failover.
However, this scenario requires that all data be stored either on the client (tier 1) or on the database
server (tier 3).

Use Restartable Transactions


Transactions need to be restartable so that the client does not need to re-enter or back out of the
transaction when a server fails, and the application is restarted on another system. In other words, if a
failure occurs in the middle of a transaction, there should be no need to start over again from the
beginning. This capability makes the application more robust and reduces the visibility of a failover to the
user.
A common example is a print job. Printer applications typically schedule jobs. When a job completes,
the scheduler goes on to the next job. If, however, the system dies in the middle of a long job (say it is
printing paychecks for 3 hours), what happens when the system comes back up again? Does the job
restart from the beginning, reprinting all the paychecks, does the job start from where it left off, or does
the scheduler assume that the job was done and not print the last hour's worth of paychecks? The correct
behavior in a highly available environment is to restart where it left off, ensuring that everyone gets one
and only one paycheck.
Another example is an application where a clerk is entering data about a new employee. Suppose this
application requires that employee numbers be unique, and that after the name and number of the new
employee is entered, a failure occurs. Since the employee number had been entered before the failure,
does the application refuse to allow it to be re-entered? Does it require that the partially entered
information be deleted first? More appropriately, in a highly available environment the application will
allow the clerk to easily restart the entry or to continue at the next data item.
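One way to make a long job such as the paycheck run restartable is to record each completed item in a journal on shared storage and skip recorded items after a restart. The sketch below illustrates that idea only; the journal format, the function names, and the handle_item callback are hypothetical.

```python
import os


def load_completed(journal_path):
    """Return the set of item IDs already recorded as done."""
    try:
        with open(journal_path) as journal:
            return {line.strip() for line in journal if line.strip()}
    except FileNotFoundError:
        return set()  # first run: nothing completed yet


def run_job(item_ids, journal_path, handle_item):
    """Process each item once, skipping items completed before a restart."""
    completed = load_completed(journal_path)
    with open(journal_path, "a") as journal:
        for item_id in item_ids:
            if item_id in completed:
                continue
            handle_item(item_id)
            # Record completion durably before moving to the next item.
            journal.write(item_id + "\n")
            journal.flush()
            os.fsync(journal.fileno())
```

A failure between handling an item and recording it would repeat that one item after restart, so a real application would also need the handler itself to tolerate a repeat.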

Use Checkpoints
Design applications to checkpoint complex transactions. A single transaction from the user's perspective
may result in several actual database transactions. Although this issue is related to restartable
transactions, here it is advisable to record progress locally on the client so that a transaction that was
interrupted by a system failure can be completed after the failover occurs.
For example, suppose the application being used is calculating pi. On the original system, the application
has gotten to the 1,000th decimal place, but the application has not yet written anything to disk. At that
moment, the node crashes. The application is restarted on the second node, but the application is
started up from scratch. The application must recalculate those 1,000 decimal places. However, if the
application had written the decimal places to disk on a regular basis, the application could have restarted
from where it left off.
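The checkpointing idea can be illustrated with a small sketch. The Python code below sums a list of numbers rather than computing pi, and it saves its progress every checkpoint_every items so that a restart resumes from the last saved position. The state layout, file path, and function names are assumptions made for the example.

```python
import json
import os


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as checkpoint_file:
            return json.load(checkpoint_file)
    except FileNotFoundError:
        return {"next_item": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state atomically so a crash never leaves a partial file."""
    temp_path = path + ".tmp"
    with open(temp_path, "w") as checkpoint_file:
        json.dump(state, checkpoint_file)
        checkpoint_file.flush()
        os.fsync(checkpoint_file.fileno())
    os.replace(temp_path, path)  # atomic rename over the old checkpoint


def process_items(items, path, checkpoint_every=100):
    """Sum the items, resuming from the last checkpoint after a restart."""
    state = load_checkpoint(path)
    for index in range(state["next_item"], len(items)):
        state["total"] += items[index]
        state["next_item"] = index + 1
        if state["next_item"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

The checkpoint_every parameter is the tuning knob: a smaller value loses less work after a failure but writes to disk more often.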

Balance Checkpoint Frequency with Performance


It is important to balance checkpoint frequency with performance. The trade-off with checkpointing to disk
is the impact of this checkpointing on performance. Obviously if you checkpoint too often the application
slows; if you don't checkpoint often enough, it will take longer to get the application back to its current
state after a failover. Ideally, the end-user should be able to decide how often to checkpoint. Applications
should provide customizable parameters so the end-user can tune the checkpoint frequency.

Design for Multiple Servers


If you use multiple active servers, multiple service points can provide relatively transparent service to a
client. However, this capability requires that the client be smart enough to have knowledge about the
multiple servers and the priority for addressing them. It also requires access to the data of the failed
server or replicated data.
For example, rather than having a single application which fails over to a second system, consider having
both systems running the application. After a failure of the first system, the second system simply takes
over the load of the first system. This eliminates the start up time of the application. There are many ways
to design this sort of architecture, and there are also many issues with this sort of design. This discussion
will not go into details other than to give a few examples.
The simplest method is to have two applications running in a master/slave relationship where the slave is
simply a hot standby application for the master. When the master fails, the slave on the second system
would still need to figure out what state the data was in (i.e., data recovery would still take place).
However, the time to fork the application and do the initial startup is saved.
Another possibility is having two applications that are both active. An example might be two application
servers which feed a database. Half of the clients connect to one application server and half of the clients
connect to the second application server. If one server fails, then all the clients connect to the remaining
application server.

Design for Replicated Data Sites


Replicated data sites are a benefit for both fast failover and disaster recovery. With replicated data, data
disks are not shared between systems. There is no data recovery that has to take place. This makes the
recovery time faster. However, there may be performance trade-offs associated with replicating data.
There are a number of ways to perform data replication, which should be fully investigated by the
application designer.
Many of the standard database products provide for data replication transparent to the client application.
By designing your application to use a standard database, the end-user can determine if data replication
is desired.

Designing Applications to Run on Multiple Systems


If an application can be failed over to a backup node, how will it work on that different system?
The previous sections discussed methods to ensure that an application can be automatically restarted.
This section will discuss some ways to ensure the application can run on multiple systems. Topics are as
follows:

• Avoid Node Specific Information
• Assign Unique Names to Applications
• Use uname(2) With Care
• Bind to a Fixed Port
• Bind to Relocatable IP Addresses
• Give Each Application its Own Volume Group
• Use Multiple Destinations for SNA Applications
• Avoid File Locking

Avoid Node Specific Information


Typically, when a new system is installed, an IP address must be assigned to each active network
interface. This IP address is always associated with the node and is called a stationary IP address.
The use of packages containing highly available applications adds the requirement for an additional set of
IP addresses, which are assigned to the applications themselves. These are known as relocatable
application IP addresses. Serviceguard’s network sensor monitors the node’s access to the subnet on
which these relocatable application IP addresses reside. When packages are configured in Serviceguard,
the associated subnetwork address is specified as a package dependency, and a list of nodes on which
the package can run is also provided. When failing a package over to a remote node, the subnetwork
must already be active on the target node.
Each application or package should be given a unique name as well as a relocatable IP address.
Following this rule separates the application from the system on which it runs, thus removing the need for
user knowledge of which system the application runs on. It also makes it easier to move the application
among different systems in a cluster for load balancing or other reasons. If two applications share a single
IP address, they must move together. Instead, using independent names and addresses allows them to
move separately.
For external access to the cluster, clients must know how to refer to the application. One option is to tell
the client which relocatable IP address is associated with the application. Another option is to think of the
application name as a host, and configure a name-to-address mapping in the Domain Name System
(DNS). In either case, the client will ultimately be communicating via the application’s relocatable IP
address. If the application moves to another node, the IP address will move with it, allowing the client to
use the application without knowing its current location. Remember that each network interface must
have a stationary IP address associated with it. This IP address does not move to a remote system in the
event of a network failure.

Obtain Enough IP Addresses


Each application receives a relocatable IP address that is separate from the stationary IP address
assigned to the system itself. Therefore, a single system might have many IP addresses, one for itself
and one for each of the applications that it normally runs. Therefore, IP addresses in a given subnet
range will be consumed faster than without high availability. It might be necessary to acquire additional IP
addresses.
Multiple IP addresses on the same network interface are supported only if they are on the same
subnetwork.

Allow Multiple Instances on Same System


Applications should be written so that multiple instances, each with its own application name and IP
address, can run on a single system. It might be necessary to invoke the application with a parameter
showing which instance is running. This allows distributing the users among several systems under
normal circumstances, but it also allows all of the users to be serviced in the case of a failure on a single
system.
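As a sketch of invoking the application with a parameter that identifies the instance, the Python fragment below parses an instance name and address from the command line. The option names, the default port, and the function name are hypothetical and chosen only for illustration.

```python
import argparse


def parse_instance_arguments(argv):
    """Parse the options that distinguish one application instance."""
    parser = argparse.ArgumentParser(
        description="Start one application instance."
    )
    # All option names below are assumptions for this example.
    parser.add_argument("--instance", required=True,
                        help="unique application name for this instance")
    parser.add_argument("--address", required=True,
                        help="relocatable IP address for this instance")
    parser.add_argument("--port", type=int, default=5500,
                        help="fixed port the instance listens on")
    return parser.parse_args(argv)
```

Because each instance receives its own name and address, two instances started with different arguments can run side by side on one node after a failover.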

Avoid Using SPU IDs or MAC Addresses


Design the application so that it does not rely on the SPU ID or MAC (link-level) addresses. The SPU ID
is a unique hardware ID contained in non-volatile memory, which cannot be changed. A MAC address
(also known as a NIC id) is a link-specific address associated with the LAN hardware. The use of these
addresses is a common problem for license servers, since for security reasons they want to use
hardware-specific identification to ensure the license isn't copied to multiple nodes. One workaround is to
have multiple licenses, one for each node the application will run on. Another way is to have a
cluster-wide mechanism that lists a set of SPU IDs or node names. If your application is running on a system in
the specified set, then the license is approved.
Previous generation HA software would move the MAC address of the network card along with the IP
address when services were moved to a backup system. This is no longer allowed in Serviceguard.
There were a couple of reasons for moving the MAC address, both of which are addressed below:

• Old network devices between the source and the destination such as routers had to be manually
programmed with MAC and IP address pairs. The solution to this problem is to move the MAC address
along with the IP address in case of failover.
• Up to 20 minute delays could occur while network device caches were updated due to timeouts
associated with systems going down. This is dealt with in current HA software by broadcasting a new
ARP translation of the old IP address with the new MAC address.

Assign Unique Names to Applications


A unique name should be assigned to each application. This name should then be configured in DNS so
that the name can be used as input to gethostbyname(3), as described in the following discussion.

Use DNS
DNS provides an API which can be used to map hostnames to IP addresses and vice versa. This is
useful for BSD socket applications such as telnet which are first told the target system name. The
application must then map the name to an IP address in order to establish a connection. However, some
calls should be used with caution.
Applications should not reference official hostnames or IP addresses. The official hostname and
corresponding IP address for the hostname refer to the primary LAN card and the stationary IP address
for that card. Therefore, any application that refers to, or requires the hostname or primary IP address
may not work in an HA environment where the network identity of the system that supports a given
application moves from one system to another, but the hostname does not move.
One way to look for problems in this area is to look for calls to gethostname(2) in the application. HA
services should use gethostname() with caution, since the response may change over time if the
application migrates. Applications that use gethostname() to determine the name for a call to
gethostbyname(3) should also be avoided for the same reason. Also, the gethostbyaddr() call
may return different answers over time if called with a stationary IP address.
Instead, the application should always refer to the application name and relocatable IP address rather
than the hostname and stationary IP address. It is appropriate for the application to call
gethostbyname(3), specifying the application name rather than the hostname. gethostbyname(3)
will return the IP address of the application. This IP address will move with the application to the new
node.
However, gethostbyname(3) should be used to locate the IP address of an application only if the
application name is configured in DNS. It is probably best to associate a different application name with
each independent HA service. This allows each application and its IP address to be moved to another
node without affecting other applications. Only the stationary IP addresses should be associated with the
hostname in DNS.
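The lookup described above can be sketched in Python, whose socket.gethostbyname() function wraps the same name resolution. The function names below are hypothetical, and the application name passed in is assumed to be configured in DNS with the relocatable IP address.

```python
import socket


def application_address(application_name):
    """Map an application name (not a hostname) to its IP address."""
    return socket.gethostbyname(application_name)


def connect_to_application(application_name, port, timeout_seconds=10.0):
    """Connect to the application by name, wherever it currently runs."""
    address = application_address(application_name)
    return socket.create_connection((address, port), timeout=timeout_seconds)
```

A client written this way never refers to the hostname of a particular node, so it needs no change when the package moves.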

Use uname(2) With Care


Related to the hostname issue discussed in the previous section is the application's use of uname(2),
which returns the official system name. The system name is unique to a given system whatever the
number of LAN cards in the system. By convention, the uname and hostname are the same, but they do
not have to be. Some applications, after connection to a system, might call uname(2) to validate for
security purposes that they are really on the correct system. This is not appropriate in an HA environment,
since the service is moved from one system to another, and neither the uname nor the hostname is
moved. Applications should develop alternate means of verifying where they are running. For example,
an application might check a list of hostnames that have been provided in a configuration file.
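A check against a configured list of hostnames might look like the following sketch. The file format (one hostname per line, with # comments) and the function names are assumptions for illustration; the manual does not define a format.

```python
import socket


def parse_permitted_nodes(lines):
    """Build the set of permitted node names from configuration lines."""
    nodes = set()
    for line in lines:
        name = line.split("#", 1)[0].strip().lower()  # drop comments
        if name:
            nodes.add(name)
    return nodes


def node_is_permitted(permitted_nodes, node_name=None):
    """Return True if the node (default: this one) is in the permitted set."""
    if node_name is None:
        node_name = socket.gethostname()
    return node_name.lower() in permitted_nodes
```

Listing every cluster node that may run the package lets the same check pass both before and after a failover.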

Bind to a Fixed Port


When binding a socket, a port address can be specified or one can be assigned dynamically. One issue
with binding to random ports is that a different port may be assigned if the application is later restarted on
another cluster node. This may be confusing to clients accessing the application.
The recommended method is using fixed ports that are the same on all nodes where the application will
run, instead of assigning port numbers dynamically. The application will then always return the same port
number regardless of which node is currently running the application. Application port assignments
should be put in /etc/services to keep track of them and to help ensure that someone will not choose
the same port number.
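An application can read its fixed port from the services database instead of hard-coding it. The sketch below uses Python's socket.getservbyname(); the function name and the fallback to a built-in default when no entry exists are design assumptions for this example, and any service name you pass is whatever you registered in /etc/services.

```python
import socket


def application_port(service_name, default_port, protocol="tcp"):
    """Look up the application's fixed port in the services database.

    Falls back to default_port if the service is not registered
    (the fallback is an assumption of this sketch).
    """
    try:
        return socket.getservbyname(service_name, protocol)
    except OSError:
        return default_port
```

For the lookup to return the same port everywhere, the entry must be identical in /etc/services on every node that can run the application.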

Bind to Relocatable IP Addresses


When sockets are bound, an IP address is specified in addition to the port number. This indicates the IP
address to use for communication and is meant to allow applications to limit which interfaces can
communicate with clients. An application can bind to INADDR_ANY as an indication that messages can
arrive on any interface.
Network applications can bind to a stationary IP address, a relocatable IP address, or INADDR_ANY. If the
stationary IP address is specified, then the application may fail when restarted on another node, because
the stationary IP address is not moved to the new system. If an application binds to the relocatable IP
address, then the application will behave correctly when moved to another system.
Many server-style applications will bind to INADDR_ANY, meaning that they will receive requests on any
interface. This allows clients to send to the stationary or relocatable IP addresses. However, in this case
the networking code cannot determine which source IP address is most appropriate for responses, so it
will always pick the stationary IP address.
For TCP stream sockets, the TCP level of the protocol stack resolves this problem for the client since it is
a connection-based protocol. On the client, TCP ignores the stationary IP address and continues to use
the previously bound relocatable IP address originally used by the client.
With UDP datagram sockets, however, there is a problem. The client may connect to multiple servers
utilizing the relocatable IP address and sort out the replies based on the source IP address in the server’s
response message. However, the source IP address given in this response will be the stationary IP
address rather than the relocatable application IP address. Therefore, when creating a UDP socket for
listening, the application must always call bind(2) with the appropriate relocatable application IP
address rather than INADDR_ANY.
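A minimal sketch of the UDP case in Python follows. The function name is hypothetical, and the address passed in is assumed to be the package's relocatable IP address, already active on the node.

```python
import socket


def open_udp_listener(relocatable_ip, port):
    """Create a UDP socket bound to one specific address.

    Binding to the relocatable address, rather than to all interfaces,
    makes replies carry that address as their source.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((relocatable_ip, port))
    except OSError:
        sock.close()
        raise
    return sock
```

If the relocatable address is not yet configured on the node, bind fails with an error, which also gives the startup procedure an early signal that the package's address is missing.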

Call bind() before connect()


When an application initiates its own connection, it should first call bind(2), specifying the application IP
address before calling connect(2). Otherwise the connect request will be sent using the stationary IP
address of the system's outbound LAN interface rather than the desired relocatable application IP
address. The remote side will receive this stationary IP address from its accept(2) call, possibly confusing
the software there and preventing it from working correctly.
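The bind-then-connect order can be sketched in Python as follows. The function name is hypothetical, and application_ip is assumed to be the relocatable IP address of the package that owns the connection.

```python
import socket


def connect_from_application_ip(application_ip, server_address,
                                timeout_seconds=10.0):
    """Open a TCP connection whose source is the application's address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout_seconds)
        # Bind first: port 0 lets the system choose the source port,
        # while the source address is fixed to the application's address.
        sock.bind((application_ip, 0))
        sock.connect(server_address)
    except OSError:
        sock.close()
        raise
    return sock
```

Python's socket.create_connection() accepts a source_address argument that performs the same two steps in one call.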

Give Each Application its Own Volume Group


Use separate volume groups for each application that uses data. If the application doesn't use disk, it is
not necessary to assign it a separate volume group. A volume group (group of disks) is the unit of storage
that can move between nodes. The greatest flexibility for load balancing exists when each application is
confined to its own volume group, i.e., two applications do not share the same set of disk drives. If two
applications do use the same volume group to store their data, then the applications must move together.
If the applications’ data stores are in separate volume groups, they can switch to different nodes in the
event of a failover.
The application data should be set up on different disk drives and if applicable, different mount points.
The application should be designed to allow for different disks and separate mount points. If possible, the
application should not assume a specific mount point.

Use Multiple Destinations for SNA Applications


SNA is point-to-point link-oriented; that is, the services cannot simply be moved to another system, since
that system has a different point-to-point link which originates in the mainframe. Therefore, backup links in
a node and/or backup links in other nodes should be configured so that SNA does not become a single
point of failure. Note that only one configuration for an SNA link can be active at a time. Therefore,
backup links that are used for other purposes should be reconfigured for the primary mission-critical
purpose upon failover.

Avoid File Locking


In an NFS environment, applications should avoid using file-locking mechanisms where the file to be
locked is on an NFS server. More generally, file locking should be avoided in an application on both local
and remote systems. If local file locking is employed and the system fails, the system acting as the backup
system will not have any knowledge of the locks maintained by the failed system. This may or may not cause
problems when the application restarts.
Remote file locking is the worse of the two situations, since the system doing the locking may be the
system that fails. Then, the lock might never be released, and other parts of the application will be unable
to access that data. In an NFS environment, file locking can cause long delays in case of NFS client
system failure and might even delay the failover itself.

Restoring Client Connections


How does a client reconnect to the server after a failure?
It is important to write client applications to specifically differentiate between the loss of a connection to
the server and other application-oriented errors that might be returned. The application should take
special action in case of connection loss.
One question to consider is how a client knows after a failure when to reconnect to the newly started
server. The typical scenario is that the user must simply restart the session or log in again. However, this
method is not very automated. For example, a well-tuned hardware and application system may fail over
in 5 minutes. But if users, after experiencing no response during the failure, give up after 2 minutes,
go for coffee, and don't come back for 28 minutes, the perceived downtime is actually 30 minutes, not 5.
Factors to consider are the number of reconnection attempts to make, the frequency of reconnection
attempts, and whether or not to notify the user of connection loss.
There are a number of strategies to use for client reconnection:

• Design clients which continue to try to reconnect to their failed server.


Put the work into the client application rather than relying on the user to reconnect. If the server is
back up and running in 5 minutes, and the client is continually retrying, then after 5 minutes, the client
application will reestablish the link with the server and either restart or continue the transaction. No
intervention from the user is required.

• Design clients to reconnect to a different server.


If you have a server design which includes multiple active servers, the client could connect to the
second server, and the user would only experience a brief delay.

The problem with this design is knowing when the client should switch to the second server. How long
does a client retry to the first server before giving up and going to the second server? There are no
definitive answers for this. The answer depends on the design of the server application. If the
application can be restarted on the same node after a failure (see Handling Application Failures
following), the retry to the current server should continue for the amount of time it takes to restart the
server locally. This will keep the client from having to switch to the second server in the event of an
application failure.

• Use a transaction processing monitor or message queueing software to increase robustness.


Use transaction processing monitors such as Tuxedo or DCE/Encina, which provide an interface
between the server and the client. Transaction processing monitors (TPMs) can be useful in creating a
more highly available application. Transactions can be queued such that the client does not detect a
server failure. Many TPMs provide for the optional automatic rerouting to alternate servers or for the
automatic retry of a transaction. TPMs also provide for ensuring the reliable completion of
transactions, although they are not the only mechanism for doing this. After the server is back online,
the transaction monitor reconnects to the new server and continues routing transactions to it.

• Queue Up Requests
As an alternative to using a TPM, queue up requests when the server is unavailable. Rather than
notifying the user when a server is unavailable, the user request is queued up and transmitted later
when the server becomes available again. Message queueing software ensures that messages of any
kind, not necessarily just transactions, are delivered and acknowledged.
Message queueing is useful only when the user does not need or expect a response confirming that the
request has been completed (that is, the application is not interactive).
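The first two reconnection strategies can be combined in a small client-side sketch: retry the current server for a bounded number of attempts, then move to the next server in priority order. The function name, the retry counts, and the delays below are assumptions for illustration; suitable values depend on how long the server takes to restart or fail over.

```python
import socket
import time


def connect_with_retry(servers, attempts_per_server=5,
                       delay_seconds=2.0, connect_timeout=3.0):
    """Try each server in priority order, retrying before moving on.

    Returns a connected socket, or raises ConnectionError if every
    attempt on every server fails.
    """
    last_error = None
    for address in servers:
        for attempt in range(attempts_per_server):
            try:
                return socket.create_connection(address,
                                                timeout=connect_timeout)
            except OSError as error:
                last_error = error
                if attempt + 1 < attempts_per_server:
                    time.sleep(delay_seconds)  # wait before the next retry
    raise ConnectionError(f"no server reachable: {last_error}")
```

Setting attempts_per_server multiplied by delay_seconds to roughly the local restart time of the server keeps the client from switching to the second server during a short application restart.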

Handling Application Failures


What happens if part or all of an application fails?
All of the preceding sections have assumed the failure in question was not a failure of the application, but
of another component of the cluster. This section deals specifically with application problems. For
instance, software bugs may cause an application to fail, or system resource issues (such as low swap/
memory space) may cause an application to die. The section deals with how to design your application to
recover after these types of failures.

Create Applications to be Failure Tolerant


An application should be tolerant to failure of a single component. Many applications have multiple
processes running on a single node. If one process fails, what happens to the other processes? Do they
also fail? Can the failed process be restarted on the same node without affecting the remaining pieces of
the application?
Ideally, if one process fails, the other processes can wait a period of time for that component to come
back online. This is true whether the component is on the same system or a remote system. The failed
component can be restarted automatically on the same system, rejoin the waiting processes, and
continue. This type of failure can be detected and the component restarted within a few seconds, so the
end user would never know a failure occurred.
Another alternative is for the failure of one component to still allow bringing down the other components
cleanly. If a database SQL server fails, the database should still be able to be brought down cleanly so
that no database recovery is necessary.
The worst case is for a failure of one component to cause the entire system to fail. If one component fails
and all other components need to be restarted, the downtime will be high.


Be Able to Monitor Applications
All components in a system, including applications, should be able to be monitored for their health. A
monitor might be as simple as a display command or as complicated as a SQL query. There must be a
way to ensure that the application is behaving correctly. If the application fails and it is not detected
automatically, it might take hours for a user to determine the cause of the downtime and recover from it.
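As a minimal sketch of the simple end of that range, the following Python function reports an application as healthy if it accepts a TCP connection within a timeout. The function name, host, port, and timeout are illustrative assumptions and are not part of Serviceguard; a real monitor would usually go further and issue an application-level request such as a SQL query.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable: treat as unhealthy.
        return False
```

A monitor built on a check like this could be run periodically so that a failure is detected automatically rather than reported by a user.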

Minimizing Planned Downtime


Planned downtime (as opposed to unplanned downtime) is scheduled; examples include backups,
systems upgrades to new operating system revisions, or hardware replacements. For planned downtime,
application designers should consider:

• Reducing the time needed for application upgrades/patches.


Can an administrator install a new version of the application without scheduling downtime? Can
different revisions of an application operate within a system? Can different revisions of a client and
server operate within a system?

• Providing for online application reconfiguration.


Can the configuration information used by the application be changed without bringing down the
application?

• Documenting maintenance operations.


Does an operator know how to handle maintenance operations?

When discussing highly available systems, unplanned failures are often the main point of discussion.
However, if it takes 2 weeks to upgrade a system to a new revision of software, there are bound to be a
large number of complaints.
The following sections discuss ways of handling the different types of planned downtime.

Reducing Time Needed for Application Upgrades and Patches


Once a year or so, a new revision of an application is released. How long does it take for the end-user to
upgrade to this new revision? The answer is the amount of planned downtime a user must take to
upgrade their application. The following guidelines reduce this time.

Provide for Rolling Upgrades


Provide for a “rolling upgrade” in a client/server environment. For a system with many components, the
typical scenario is to bring down the entire system, upgrade every node to the new version of the
software, and then restart the application on all the affected nodes. For large systems, this could result in
a long downtime. An alternative is to provide for a rolling upgrade. A rolling upgrade rolls out the new
software in a phased approach by upgrading only one component at a time. For example, the database
server is upgraded on Monday, causing a 15 minute downtime. Then on Tuesday, the application server
on two of the nodes is upgraded, which leaves the application servers on the remaining nodes online and
causes no downtime. On Wednesday, two more application servers are upgraded, and so on. With this
approach, you avoid the problem where everything changes at once, plus you minimize long outages.
The trade-off is that the application software must operate with different revisions of the software. In the
above example, the database server might be at revision 5.0 while some of the application servers
are at revision 4.0. The application must be designed to handle this type of situation.
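One way to make mixed revisions safe during a rolling upgrade is for each component to check its peer's revision before working with it. The sketch below is illustrative only: the function names and the rule they encode (a peer is accepted if its major revision is within one of the local major revision) are assumptions, not something Serviceguard or this guide prescribes.

```python
from typing import Tuple


def parse_revision(text: str) -> Tuple[int, int]:
    """Split a revision string such as "5.0" into (major, minor)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def is_compatible(local: str, peer: str, max_major_skew: int = 1) -> bool:
    """Accept a peer whose major revision is within max_major_skew of ours."""
    local_major, _ = parse_revision(local)
    peer_major, _ = parse_revision(peer)
    return abs(local_major - peer_major) <= max_major_skew
```

Under this assumed rule, a database server at revision 5.0 would accept application servers at revision 4.0 while the upgrade is still in progress, but would refuse a server that was left at an older major revision.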



Do Not Change the Data Layout Between Releases
Migration of the data to a new format can be very time intensive. It also almost guarantees that a rolling
upgrade will not be possible. For example, if a database is running on the first node, ideally, the second
node could be upgraded to the new revision of the database. When that upgrade is completed, a brief
downtime could be scheduled to move the database server from the first node to the newly upgraded
second node. The database server would then be restarted, while the first node is idle and ready to be
upgraded itself. However, if the new database revision requires a different database layout, the old data
will not be readable by the newly updated database. The downtime will be longer as the data is migrated
to the new layout.

Providing Online Application Reconfiguration


Most applications have some sort of configuration information that is read when the application is started.
If changing the configuration requires halting the application and reading a new configuration file,
downtime is incurred.
To avoid this downtime, use a configuration tool that interacts with the running application and makes
dynamic changes online, with little or no interruption to the end user. This tool must be able to do
everything online, such as expanding the size of the data, adding new users to the system, and adding
new users to the application. Every task that an administrator needs to do to the application system can
be made available online.
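The following sketch shows one simple way an application might pick up configuration changes without being halted: it re-reads its configuration file whenever the file's modification time changes. The class name, the JSON file format, and the polling approach are assumptions chosen for illustration; an application could equally reload on a signal or on a request from a configuration tool.

```python
import json
import os


class ReloadableConfig:
    """Re-read a JSON configuration file whenever its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns = None
        self.values = {}
        self.refresh()

    def refresh(self) -> bool:
        """Reload the file if it changed on disk; return True if values were reloaded."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        with open(self.path, encoding="utf-8") as handle:
            self.values = json.load(handle)
        self._mtime_ns = mtime_ns
        return True
```

The application would call `refresh()` at a convenient point in its main loop, so a changed setting takes effect while the application keeps running.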

Documenting Maintenance Operations


Standard procedures are important. An application designer should make every effort to make tasks
common for both the highly available environment and the normal environment. If an administrator is
accustomed to bringing down the entire system after a failure, he or she will continue to do so even if the
application has been redesigned to handle a single failure. It is important that application documentation
discuss alternatives with regard to high availability for typical maintenance operations.



Integrating HA Applications with Serviceguard
The following is a summary of the steps you should follow to integrate an application into the
Serviceguard environment:

1. Read the rest of this book, including the chapters on cluster and package configuration, and the
appendix “Designing Highly Available Cluster Applications.”
2. Define the cluster’s behavior for normal operations:

• What should the cluster look like during normal operation?


• What is the standard configuration most people will use? (Is there any data available about user
requirements?)
• Can you separate out functions such as database or application server onto separate machines, or
does everything run on one machine?

3. Define the cluster’s behavior for failover operations:

• Does everything fail over together to the adoptive node?


• Can separate applications fail over to the same node?
• Is there already a high availability mechanism within the application other than the features
provided by Serviceguard?

4. Identify problem areas:

• What does the application do today to handle a system reboot or panic?


• Does the application use any system-specific information such as uname() or gethostname(),
SPU_ID or MAC address which would prevent it from failing over to another system?

Checklist for Integrating HA Applications


This section contains a checklist for integrating HA applications in both single and multiple systems.

Defining Baseline Application Behavior on a Single System


1. Define a baseline behavior for the application on a standalone system:

• Install the application, database, and other required resources on one of the systems. Be sure to
follow Serviceguard rules in doing this:

◦ Install all shared data on separate external volume groups.


◦ Use a Journaled filesystem (JFS) as appropriate.

• Perform some sort of standard test to ensure the application is running correctly. This test can be
used later in testing with Serviceguard. If possible, try to connect to the application through a client.
• Crash the standalone system, reboot it, and test how the application starts up again. Note the
following:



◦ Are there any manual procedures? If so, document them.
◦ Can everything start up from rc scripts?

• Try to write a simple script which brings everything up without having to do any keyboard typing.
Figure out what the administrator would do at the keyboard, then put that into the script.
• Try to write a simple script to bring down the application. Again, figure out what the administrator
would do at the keyboard, then put that into the script.

Integrating HA Applications in Multiple Systems


1. Install the application on a second system.

• Create the LVM infrastructure on the second system.


• Add the appropriate users to the system.
• Install the appropriate executables.
• With the application not running on the first system, try to bring it up on the second system. You
might use the script you created in the step above. Is there anything different that you must do?
Does it run?
• Repeat this process until you can get the application to run on the second system.

2. Configure the Serviceguard cluster:

• Create the cluster configuration.


• Create a package.
• Create the package script.
• Use the simple scripts you created in earlier steps as the customer-defined functions in the
package control script.

3. Start the cluster and verify that applications run as planned.

Testing the Cluster


1. Test the cluster:

• Have clients connect.


• Provide a normal system load.
• Halt the package on the first node and move it to the second node:
cmhaltpkg pkg1
cmrunpkg -n node2 pkg1
cmmodpkg -e pkg1



• Move it back.
cmhaltpkg pkg1
cmrunpkg -n node1 pkg1
cmmodpkg -e pkg1

• Fail one of the systems. For example, turn off the power on node 1. Make sure the package starts
up on node 2.
• Repeat failover from node 2 back to node 1.

2. Be sure to test all combinations of application load during the testing. Repeat the failover processes
under different application states such as heavy user load versus no user load, batch jobs versus
online transactions, etc.
3. Record timelines of the amount of time spent during the failover for each application state. A sample
timeline might be 45 seconds to reconfigure the cluster, 15 seconds to run fsck on the filesystems, 30
seconds to start the application and 3 minutes to recover the database.
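As a small worked example, the sample timeline above can be recorded per phase and totaled. The dictionary keys and function names below are illustrative assumptions; the durations are the ones quoted in the sample.

```python
# Phase durations in seconds, taken from the sample timeline above.
sample_timeline = {
    "reconfigure cluster": 45,
    "fsck filesystems": 15,
    "start application": 30,
    "recover database": 3 * 60,
}


def total_failover_seconds(timeline: dict) -> int:
    """Sum the per-phase durations to get the end-to-end failover time."""
    return sum(timeline.values())


def longest_phase(timeline: dict) -> str:
    """Name the phase that contributes most to the outage."""
    return max(timeline, key=timeline.get)
```

For the sample figures the total is 270 seconds (4.5 minutes), and database recovery is the dominant phase, which suggests where tuning effort would pay off first.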



Blank Planning Worksheets
This appendix reprints blank versions of the planning worksheets described in the “Planning” chapter. You
can duplicate any of these worksheets that you find useful and fill them in as a part of the planning
process. The worksheets included in this appendix are as follows:

• Hardware Worksheet on page 380


• Power Supply Worksheet on page 381
• Quorum Server Worksheet on page 381
• Volume Group and Physical Volume Worksheet on page 382
• Cluster Configuration Worksheet on page 382
• Package Configuration Worksheet on page 383

Hardware Worksheet
=============================================================================
SPU Information:

Host Name ____________________ Server Series____________

Memory Capacity ____________ Number of I/O Slots ____________


=============================================================================
LAN Information:

Name of Name of Node IP Traffic


Master _________ Interface __________ Addr________________ Type ________

Name of Name of Node IP Traffic


Master __________ Interface __________ Addr________________ Type ________

Name of Name of Node IP Traffic


Master _________ Interface __________ Addr_______________ Type __________

===============================================================================

Quorum Server Name: __________________ IP Address: ____________________

=============================================================================

Disk I/O Information for Shared Disks:

Bus Type ______ Slot Number ____ Address ____ Disk Device File _________

Bus Type ______ Slot Number ___ Address ____ Disk Device File __________

Bus Type ______ Slot Number ___ Address ____ Disk Device File _________

Bus Type ______ Slot Number ___ Address ____ Disk Device File _________



Power Supply Worksheet
============================================================================
SPU Power:

Host Name ____________________ Power Supply _____________________

Host Name ____________________ Power Supply _____________________

============================================================================
Disk Power:

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

Disk Unit __________________________ Power Supply _______________________

============================================================================
Tape Backup Power:

Tape Unit __________________________ Power Supply _______________________

Tape Unit __________________________ Power Supply _______________________

============================================================================
Other Power:

Unit Name __________________________ Power Supply _______________________

Unit Name __________________________ Power Supply _______________________

Quorum Server Worksheet


Quorum Server Data:
==============================================================================

QS Hostname: _________________IP Address: ______________________

OR

Cluster Name: _________________

Package Name: ____________ Package IP Address: ___________________

Hostname Given to Package by Network Administrator: _________________

==============================================================================



Quorum Services are Provided for:

Cluster Name: ___________________________________________________________

Host Names ____________________________________________

Host Names ____________________________________________

Cluster Name: ___________________________________________________________

Host Names ____________________________________________

Host Names ____________________________________________

Cluster Name: ___________________________________________________________

Host Names ____________________________________________

Host Names ____________________________________________

Volume Group and Physical Volume Worksheet


==============================================================================

Volume Group Name: ___________________________________

Physical Volume Name: _________________

Physical Volume Name: _________________

Physical Volume Name: _________________

=============================================================================

Volume Group Name: ___________________________________

Physical Volume Name: _________________

Physical Volume Name: _________________

Physical Volume Name: _________________

Cluster Configuration Worksheet


===============================================================================
Name and Nodes:
===============================================================================
Cluster Name: ______________________________

Node Names: ________________________________________________

Maximum Configured Packages: ______________



===============================================================================
Cluster Lock Data:
================================================================================
If using a quorum server:
Quorum Server Host Name or IP Address: ____________________

Quorum Server Polling Interval: ______________ microseconds

Quorum Server Timeout Extension: _______________ microseconds


==============================================================================
If using a lock lun:
Lock LUN Name on Node 1: __________________
Lock LUN Name on Node 2: __________________
Lock LUN Name on Node 3: __________________
Lock LUN Name on Node 4: __________________
===============================================================================
Subnets:
===============================================================================
Heartbeat Subnet: __________________________

Monitored Non-heartbeat Subnet: __________________

Monitored Non-heartbeat Subnet: ___________________


===============================================================================
Timing Parameters:
===============================================================================
Heartbeat Interval: __________
===============================================================================
Node Timeout: ______________
===============================================================================
Network Polling Interval: __________
===============================================================================
Autostart Delay: _____________
Access Policies
User: ________ Host: ________ Role: ________
User: _________ Host: _________ Role: __________

Package Configuration Worksheet


=============================================================================
Package Configuration File Data:
==========================================================================
Package Name: __________________Package Type:______________
Primary Node: ____________________ First Failover Node:__________________
Additional Failover Nodes:__________________________________
Run Script Timeout: _____ Halt Script Timeout: _____________
Package AutoRun Enabled? ______
Node Failfast Enabled? ________
Failover Policy:_____________
Failback_policy:___________________________________
Access Policies:
User:_________________ From node:_______ Role:_____________________________
User:_________________ From node:_______ Role:_____________________________
Log level____ Log file:____________________________________________________



Priority_____________ Successor_halt_timeout____________
dependency_name _____ dependency_condition _____
dependency_location _______
==========================================================================
LVM Volume Groups:
vg____vg01___________vg________________vg________________vg________________
vgchange_cmd: _______________________________________________________________
Logical Volumes and File Systems:
fs_name___________________ fs_directory________________
fs_mount_opt_______________fs_umount_opt______________
fs_fsck_opt________________fs_type_________________
fs_name____________________fs_directory________________
fs_mount_opt_______________fs_umount_opt_____________
fs_fsck_opt________________fs_type_________________
fs_name____________________fs_directory________________
fs_mount_opt_______________fs_umount_opt_____________
fs_fsck_opt________________fs_type_________________
fs_mount_retry_count: ____________
fs_umount_retry_count:___________________
Concurrent mount/umount operations: ______________________________________
Concurrent fsck operations: ______________________________________________
=============================================================================
Network Information:
IP ________ IP__________IP___________subnet __________
IP__________IP__________IP___________subnet___________
Monitored subnet:____________________________________________________________
=============================================================================
Service Name: _______ Command: _________ Restart:___ Fail Fast enabled:_____
Halt on Maintenance:____
Service Name: _______ Command: _________ Restart: __ Fail Fast enabled:_____
Halt on Maintenance:____
Service Name: _______ Command: _________ Restart: __ Fail Fast enabled:_____
Halt on Maintenance:____
=============================================================================
Package environment variable:________________________________________________
Package environment variable:________________________________________________
External pre-script:_________________________________________________________
External script:_____________________________________________________________
=============================================================================



IPv6 Network Support
This appendix describes some of the characteristics of IPv6 network addresses, specifically:

• IPv6 Address Types


• Network Configuration Restrictions
• Configuring IPv6 on Linux on page 389

IPv6 Address Types


Several types of IPv6 addressing schemes are specified in RFC 2373 (IPv6 Addressing Architecture).
IPv6 addresses are 128-bit identifiers for interfaces and sets of interfaces. RFC 2373 defines various
address formats for IPv6. IPv6 addresses are broadly classified as unicast, anycast, and
multicast.
The following table explains the three types.

Table 22: IPv6 Address Types

Unicast An address for a single interface. A packet sent to a unicast address is delivered to
the interface identified by that address.

Anycast An address for a set of interfaces. In most cases these interfaces belong to different
nodes. A packet sent to an anycast address is delivered to one of these interfaces
identified by the address. Since the standards for using anycast addresses are still
evolving, they are not supported in Linux at present.

Multicast An address for a set of interfaces (typically belonging to different nodes). A packet
sent to a multicast address will be delivered to all interfaces identified by that address.

Unlike IPv4, IPv6 has no broadcast addresses; their functions are superseded by multicast.

Textual Representation of IPv6 Addresses


There are three conventional forms for representing IPv6 addresses as text strings:

• The first form is x:x:x:x:x:x:x:x, where the x’s are the hexadecimal values of the eight 16-bit pieces of
the 128-bit address. Example:
2001:fecd:ba23:cd1f:dcb1:1010:9234:4088.

• Some IPv6 addresses contain long strings of zero bits. To make such addresses easier to
represent textually, a special syntax is available. The use of “::” indicates that
there are multiple groups of 16 bits of zeros. The “::” can appear only once in an address, and it can
be used to compress leading, trailing, or contiguous 16-bit groups of zeros in an address. Example:
fec0:1:0:0:0:0:0:1234 can be represented as fec0:1::1234.

• In a mixed environment of IPv4 and IPv6 nodes, an alternative form of IPv6 address is used. It is
x:x:x:x:x:x:d.d.d.d, where the x’s are the hexadecimal values of the high-order 96 bits of the IPv6
address and the d’s are the decimal values of the low-order 32 bits. Typically, IPv4 Mapped IPv6
addresses and IPv4 Compatible IPv6 addresses are represented in this notation. These addresses
are discussed in later sections.
Examples:
0:0:0:0:0:0:10.1.2.3
and
::10.11.3.123
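The three textual forms can be checked with the `ipaddress` module from the Python standard library, which is used here only as a convenient illustration and is not part of Serviceguard. The variable names are arbitrary.

```python
import ipaddress

full = ipaddress.IPv6Address("fec0:1:0:0:0:0:0:1234")
short = ipaddress.IPv6Address("fec0:1::1234")

# Both spellings denote the same 128-bit value.
same_address = full == short

# .compressed applies the "::" shorthand; .exploded writes all eight groups.
compressed_text = full.compressed
exploded_text = short.exploded

# The mixed form carries an IPv4 address in the low-order 32 bits.
mixed = ipaddress.IPv6Address("0:0:0:0:0:0:10.1.2.3")
low_32_bits = ipaddress.IPv4Address(int(mixed) & 0xFFFFFFFF)
```

Comparing parsed addresses rather than strings avoids treating two spellings of the same address as different.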

IPv6 Address Prefix


An IPv6 address prefix is similar to a CIDR prefix in IPv4 and is written in CIDR notation. It is
represented by the notation:
IPv6-address/prefix-length
where IPv6-address is an IPv6 address in any notation listed above and prefix-length is a decimal
value representing how many of the leftmost contiguous bits of the address comprise the prefix.
Example:
fec0:0:0:1::1234/64
The first 64 bits of the address, fec0:0:0:1, form the address prefix. An address prefix is used in IPv6
addresses to denote how many bits in the IPv6 address represent the subnet.
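As an illustration, the example address can be split into its prefix and interface parts with Python's standard `ipaddress` module. This is a sketch for checking the arithmetic, with arbitrary variable names.

```python
import ipaddress

iface = ipaddress.IPv6Interface("fec0:0:0:1::1234/64")

prefix_length = iface.network.prefixlen   # number of leading bits in the prefix
subnet = iface.network                    # the address with the non-prefix bits cleared

# The remaining 128 - prefix_length bits identify the interface on that subnet.
interface_id = int(iface.ip) & ((1 << (128 - prefix_length)) - 1)
```

Here `subnet` is fec0:0:0:1::/64 and `interface_id` is 0x1234.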

Unicast Addresses
IPv6 unicast addresses are classified into different types: aggregatable global unicast addresses,
site-local addresses, and link-local addresses. Typically a unicast address is logically divided as follows:

n bits 128-n bits

Subnet prefix Interface ID

Interface identifiers in an IPv6 unicast address are used to identify the interfaces on a link. Interface
identifiers are required to be unique on that link. The link is generally identified by the subnet prefix.
A unicast address is called an unspecified address if all the bits in the address are zero. Textually it is
represented as “::”.
The unicast address ::1 or 0:0:0:0:0:0:0:1 is called the loopback address. It is used by a node to
send packets to itself.

IPv4 and IPv6 Compatibility


There are a number of techniques for using IPv4 addresses within the framework of IPv6 addressing.

IPv4 Compatible IPv6 Addresses


The IPv6 transition mechanisms use a technique for tunneling IPv6 packets over the existing IPv4
infrastructure. IPv6 nodes that support such mechanisms use a special kind of IPv6 address that carries
an IPv4 address in its low-order 32 bits. These addresses are called IPv4 Compatible IPv6 addresses.
They are represented as follows:

80 bits 16 bits 32 bits

zeros 0000 IPv4 address

Example:



::192.168.0.1

IPv4 Mapped IPv6 Address


There is a special type of IPv6 address that holds an embedded IPv4 address. This address is used to
represent the addresses of IPv4-only nodes as IPv6 addresses. These addresses are used especially by
applications that support both IPv6 and IPv4. They are called IPv4 Mapped IPv6
addresses. The format of these addresses is as follows:

80 bits 16 bits 32 bits

zeros FFFF IPv4 address

Example:
::ffff:192.168.0.1
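The difference between the two embedded-IPv4 formats is the 16-bit field just above the IPv4 address: 0000 for IPv4 Compatible addresses and FFFF for IPv4 Mapped addresses. The sketch below extracts both fields using Python's standard `ipaddress` module; the helper names are illustrative assumptions.

```python
import ipaddress

mapped = ipaddress.IPv6Address("::ffff:192.168.0.1")
compatible = ipaddress.IPv6Address("::192.168.0.1")


def embedded_ipv4(addr: ipaddress.IPv6Address) -> ipaddress.IPv4Address:
    """Return the IPv4 address held in the low-order 32 bits."""
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


def middle_16_bits(addr: ipaddress.IPv6Address) -> int:
    """Return the 16-bit field that sits just above the embedded IPv4 address."""
    return (int(addr) >> 32) & 0xFFFF
```

Both example addresses embed 192.168.0.1; only the mapped address has 0xFFFF in the 16-bit field.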

Aggregatable Global Unicast Addresses


The global unicast addresses are globally unique IPv6 addresses. This address format is
defined in RFC 2374 (An IPv6 Aggregatable Global Unicast Address Format). The format is:

3 bits   13 bits   8 bits   24 bits   16 bits   64 bits

FP       TLA ID    RES      NLA ID    SLA ID    Interface ID

where
FP = Format prefix. Value of this is “001” for Aggregatable Global unicast addresses.
TLA ID = Top-level Aggregation Identifier.
RES = Reserved for future use.
NLA ID = Next-Level Aggregation Identifier.
SLA ID = Site-Level Aggregation Identifier.
Interface ID = Interface Identifier.

Link-Local Addresses
Link-local addresses have the following format:

10 bits 54 bits 64 bits

1111111010 0 interface ID

Link-local addresses are intended for addressing nodes on a single link. Packets originating
from or destined to a link-local address will not be forwarded by a router.

Site-Local Addresses
Site-local addresses have the following format:

10 bits 38 bits 16 bits 64 bits

1111111011 0 subnet ID interface ID



Site-local addresses are intended for use within a site. Routers will not forward any packet with a
site-local source or destination address outside the site.

Multicast Addresses
A multicast address is an identifier for a group of nodes. Multicast addresses have the following format:

8 bits 4 bits 4 bits 112 bits

11111111 flags scop group ID

“FF” at the beginning of the address identifies the address as a multicast address.
The “flags” field is a set of 4 flags, “000T”. The high-order 3 bits are reserved and must be zero. The last
bit, “T”, indicates whether the address is permanently assigned. A value of zero indicates that it is
permanently assigned; a value of one indicates a temporary assignment.
The “scop” field is a 4-bit field which is used to limit the scope of the multicast group. For example, a
value of 1 indicates that it is a node-local multicast group, a value of 2 indicates that the scope is
link-local, and a value of 5 indicates that the scope is site-local.
The “group ID” field identifies the multicast group. Some frequently used multicast groups are the
following:
All Node Addresses = FF02:0:0:0:0:0:0:1 (link-local)
All Router Addresses = FF02:0:0:0:0:0:0:2 (link-local)
All Router Addresses = FF05:0:0:0:0:0:0:2 (site-local)
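The field layout above can be illustrated by decoding a multicast address with a few shifts and masks. This is a sketch using Python's standard `ipaddress` module; the function name and the dictionary it returns are assumptions made for the example.

```python
import ipaddress


def multicast_fields(text: str) -> dict:
    """Split an IPv6 multicast address into its flags, scope, and group ID fields."""
    value = int(ipaddress.IPv6Address(text))
    if value >> 120 != 0xFF:
        raise ValueError(f"{text} is not a multicast address")
    return {
        "flags": (value >> 116) & 0xF,          # 000T; T = 0 means permanently assigned
        "scope": (value >> 112) & 0xF,          # 1 node-local, 2 link-local, 5 site-local
        "group_id": value & ((1 << 112) - 1),   # low-order 112 bits
    }
```

Decoding FF02:0:0:0:0:0:0:1 gives flags 0, scope 2 (link-local), and group ID 1, matching the first entry in the list above.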

Network Configuration Restrictions


Serviceguard supports IPv6 for data and heartbeat IP.
The restrictions on support for IPv6 in Serviceguard for Linux are:

• Auto-configured IPv6 addresses are not supported in Serviceguard as HEARTBEAT_IP or
STATIONARY_IP addresses. IPv6 addresses that are part of a Serviceguard cluster configuration
must not be auto-configured through router advertisements. Instead, they must be manually
configured in /etc/sysconfig/network-scripts/ifcfg-<eth-ID> on Red Hat or
/etc/sysconfig/network/ifcfg-<eth-ID> on SUSE. See Configuring IPv6 on Linux on page 389
for instructions and examples.
• Link-local IP addresses are not supported as package IPs, HEARTBEAT_IPs, or STATIONARY_IPs.
Depending on the requirements, the package IP address could be of type site-local or global.
• Serviceguard supports only one IPv6 address belonging to each scope type (site-local and global) on
each network interface (that is, restricted multi-netting). This means that a maximum of two IPv6
HEARTBEAT_IP or STATIONARY_IP addresses can be listed in the cluster configuration file for a
NETWORK_INTERFACE, one being the site-local IPv6 address and the other being the global IPv6
address.

NOTE: This restriction applies to cluster configuration, not package configuration: it does not affect the
number of IPv6 relocatable addresses of the same scope type (site-local or global) that a package can
use on an interface.

• Bonding is supported for IPv6 addresses, but only in active-backup mode.


• Serviceguard supports IPv6 only on the Ethernet networks, including 10BT, 100BT, and Gigabit
Ethernet.
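The scope-related restrictions above (no link-local addresses, and at most one site-local and one global address per interface in the cluster configuration) can be expressed as a small pre-check. The sketch below is not a Serviceguard tool; the function names are hypothetical, and it assumes that any address that is neither link-local nor site-local counts as global.

```python
import ipaddress
from collections import Counter


def scope_of(text: str) -> str:
    """Classify an IPv6 address (with or without /prefix) by scope."""
    addr = ipaddress.IPv6Interface(text).ip
    if addr.is_link_local:
        return "link-local"
    if addr.is_site_local:
        return "site-local"
    # Assumption: anything else is treated as global for this check.
    return "global"


def check_interface_addresses(addresses: list) -> list:
    """Return a list of problems for the IPv6 addresses listed on one interface."""
    problems = []
    counts = Counter(scope_of(a) for a in addresses)
    if counts["link-local"]:
        problems.append("link-local addresses are not supported")
    for scope in ("site-local", "global"):
        if counts[scope] > 1:
            problems.append(f"more than one {scope} address")
    return problems
```

For example, an interface listing 3ffe:ffff:0000:f101::10/64 and fec0:0:0:1::10/64 passes this check, while an interface listing two fec0::/10 addresses does not.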



IMPORTANT: For important information, see also Cross-Subnet Configurations, the description
of the HOSTNAME_ADDRESS_FAMILY, QS_HOST and QS_ADDR parameters under Cluster
Configuration Parameters on page 111, Configuring Name Resolution, and the Release Notes
for your version of Serviceguard for Linux.
For special instructions that may apply to using IPv6 addresses to connect your version of
Serviceguard for Linux and the Quorum Server, see “Configuring Serviceguard to Use the Quorum
Server” in the latest version of the HPE Serviceguard Quorum Server Version Release Notes, at
http://www.hpe.com/info/linux-serviceguard-docs (Select HP Serviceguard Quorum Server
Software).

Configuring IPv6 on Linux


Red Hat Enterprise Linux and SUSE Linux Enterprise Server already have the proper IPv6 tools installed,
including the /sbin/ip command. This section explains how to configure IPv6 stationary IP addresses
on these systems.

Enabling IPv6 on Red Hat Linux


Add the following lines to /etc/sysconfig/network:
NETWORKING_IPV6=yes # Enable global IPv6 initialization
IPV6FORWARDING=no # Disable global IPv6 forwarding
IPV6_AUTOCONF=no # Disable global IPv6 autoconfiguration
IPV6_AUTOTUNNEL=no # Disable automatic IPv6 tunneling

Adding persistent IPv6 Addresses on Red Hat Linux


This can be done by modifying the system configuration script, for example, /etc/sysconfig/
network-scripts/ifcfg-eth1:
DEVICE=eth1
BOOTPROTO=static
BROADCAST=192.168.1.255
IPADDR=192.168.1.10
NETMASK=255.255.255.0
NETWORK=192.168.1.0
ONBOOT=yes
IPV6INIT=yes
IPV6ADDR=3ffe:ffff:0000:f101::10/64
IPV6ADDR_SECONDARIES=fec0:0:0:1::10/64
IPV6_MTU=1280

Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on Red Hat Linux
Configure the following parameters in /etc/sysconfig/network-scripts/ifcfg-bond0:
DEVICE=bond0
IPADDR=12.12.12.12
NETMASK=255.255.255.0
NETWORK=12.12.12.0
BROADCAST=12.12.12.255
IPV6INIT=yes
IPV6ADDR=3ffe:ffff:0000:f101::10/64
IPV6ADDR_SECONDARIES=fec0:0:0:1::10/64
IPV6_MTU=1280
ONBOOT=yes



BOOTPROTO=none
USERCTL=no
Add the following two lines to /etc/modprobe.conf to cause the bonding driver to be loaded on
reboot:
alias bond0 bonding
options bond0 miimon=100 mode=1 # active-backup mode

Adding Persistent IPv6 Addresses on SUSE


This can be done by modifying the system configuration script, for example, /etc/sysconfig/
network/ifcfg-eth1:
BOOTPROTO=static
BROADCAST=10.10.18.255
IPADDR=10.10.18.18
MTU=""
NETMASK=255.255.255.0
NETWORK=10.10.18.0
REMOTE_IPADDR=""
STARTMODE=onboot
IPADDR1=3ffe::f101:10/64
IPADDR2=fec0:0:0:1::10/64

Configuring a Channel Bonding Interface with Persistent IPv6 Addresses on SUSE
Configure the following parameters in /etc/sysconfig/network/ifcfg-bond0:
BOOTPROTO=static
BROADCAST=10.0.2.255
IPADDR=10.0.2.10
NETMASK=255.255.255.0
NETWORK=10.0.2.0
REMOTE_IPADDR=""
STARTMODE=onboot
IPADDR1=3ffe::f101:10/64
IPADDR2=fec0:0:0:1::10/64
BONDING_MASTER=yes
BONDING_MODULE_OPTS="mode=active-backup miimon=100"
BONDING_SLAVE0=eth1
BONDING_SLAVE1=eth2
For each additional IPv6 address, specify an additional parameter with IPADDR<num> in the
configuration file.
Bonding module options are specified in each of the bond device files, so nothing needs to be specified
in /etc/modprobe.conf.
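As an illustration of the IPADDR<num> numbering described above, the following sketch prints one parameter line per additional IPv6 address. The addresses are the ones from the example. The loop is only a convenience for generating the lines and assumes the numbering starts at 1, as it does in that example.

```shell
#!/bin/bash
# Sketch only: prints IPADDR<num> lines for pasting into an ifcfg file.
addresses="3ffe::f101:10/64 fec0:0:0:1::10/64"

n=1
for addr in $addresses; do
    echo "IPADDR${n}=${addr}"
    n=$((n + 1))
done
count=$((n - 1))   # number of lines generated
```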



Maximum and Minimum Values for Parameters
The table shows the range of possible values for cluster configuration parameters.

Table 23: Minimum and Maximum Values of Cluster Configuration Parameters

Cluster Parameter             Minimum Value              Maximum Value              Default Value             Notes

Member Timeout                See MEMBER_TIMEOUT under   See MEMBER_TIMEOUT under   14,000,000 microseconds
                              “Cluster Configuration     “Cluster Configuration
                              Parameters” in Chapter 4.  Parameters” in Chapter 4.

AutoStart Timeout             60,000,000 microseconds    No Limit (ULONG_MAX)       600,000,000 microseconds

Network Polling Interval      100,000 microseconds       No Limit (ULONG_MAX)       2,000,000 microseconds

Maximum Configured Packages   0                          300                        300

ULONG_MAX is a number equal to 4,294,967,295, which is therefore a practical limit.
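To put the ULONG_MAX limit into more familiar units, the sketch below converts the stated value from microseconds to whole seconds and minutes. It assumes the limit applies to parameters expressed in microseconds, as the timeouts in Table 23 are. The result, 4294 seconds, is also the maximum script timeout listed in Table 24; that correspondence is an observation here, not something the source states.

```shell
#!/bin/bash
# Sketch only: integer arithmetic on the ULONG_MAX value quoted above.
ULONG_MAX=4294967295                 # microseconds
seconds=$((ULONG_MAX / 1000000))     # whole seconds
minutes=$((seconds / 60))            # whole minutes
echo "$seconds seconds (about $minutes minutes)"   # prints: 4294 seconds (about 71 minutes)
```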


The table shows the range of possible values for package configuration parameters.

Table 24: Minimum and Maximum Values of Package Configuration Parameters

Package Parameter     Minimum Value  Maximum Value             Default Value               Notes

Run Script Timeout    10 seconds     4294 seconds if a         0 (NO_TIMEOUT)              This is a recommended value.
                                     non-zero value is
                                     specified

Halt Script Timeout   10 seconds     4294 seconds if a         0 (NO_TIMEOUT)              This is a recommended value, but
                                     non-zero value is                                     note that the Halt Timeout value
                                     specified                                             must be greater than the sum of
                                                                                           all Service Timeout values.

Service Halt Timeout  0 seconds      4294 seconds              0 (no time waited before
                                                               the service is terminated)



Monitoring Script for Generic Resources
Monitoring scripts are written by the end user and must contain the core logic to monitor a
resource and set the status of a generic resource. These scripts are started as part of the package
start.

• You can set the status of a simple resource, or the value of an extended resource, using the
cmsetresource(1m) command.

• You can define the monitoring interval in the script.


• The monitoring scripts can be launched within the Serviceguard environment by configuring them as
services, or outside of Serviceguard environment. It is recommended to launch the monitoring scripts
by configuring them as services.
For more information, see Launching Monitoring Scripts.

Template Scripts
Hewlett Packard Enterprise provides a monitoring script template. The template provided by Hewlett
Packard Enterprise is:

generic_resource_monitor.template

This is located in the $SGCONF/examples/ directory.


See the template to get an idea of how to write a monitoring script.
How to monitor a resource is at the discretion of the end user, and the script logic must be written
accordingly. Hewlett Packard Enterprise does not prescribe the content of the monitoring script.
However, the following recommendations might be useful:

• Choose the monitoring interval based on how quickly failures must be detected by the application
packages configured with a generic resource.
• Get the status/value of a generic resource using cmgetresource before setting its status/value.

• Set the status/value only if it has changed.

See Getting and Setting the Status/Value of a Simple/Extended Generic Resource on page 134 and
the cmgetresource(1m) and cmsetresource(1m) manpages.
See Using the Generic Resources Monitoring Service.
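The recommendation to set the status only when it has changed can be sketched as follows. This is not taken from the HPE template. The functions get_status and set_status are hypothetical wrappers: in a real monitor they would call cmgetresource and cmsetresource, whose options are described in their manpages. Here they read and write a shell variable so that the logic can be run without a cluster.

```shell
#!/bin/bash
# Sketch only: get_status and set_status are hypothetical stand-ins for
# calls to cmgetresource and cmsetresource.

CURRENT_STATUS="unknown"   # stand-in for the status held by Serviceguard
SET_CALLS=0                # counts how often the status is actually written

get_status() {
    echo "$CURRENT_STATUS"
}

set_status() {
    CURRENT_STATUS="$1"
    SET_CALLS=$((SET_CALLS + 1))
}

# Write the new status only when it differs from the stored one.
update_if_changed() {
    local new_status="$1"
    local old_status
    old_status=$(get_status)
    if [ "$old_status" != "$new_status" ]; then
        set_status "$new_status"
    fi
}

update_if_changed up
update_if_changed up      # unchanged, so no second write
update_if_changed down
echo "status=$CURRENT_STATUS writes=$SET_CALLS"   # prints: status=down writes=2
```

Comparing before writing keeps the monitor from issuing a set operation on every polling interval when nothing has changed.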

Launching Monitoring Scripts


Monitoring scripts can be launched in the following ways:
For resources of evaluation_type: during_package_start

• Monitoring scripts can be launched through the services functionality that is available in packages, as
indicated by service_name, service_cmd, and service_halt_timeout. This is the recommended approach,
because Serviceguard monitors the scripts and so makes them highly available.
• Monitoring scripts can also be launched through external_script or external_pre_script as part of the
package.
• Monitoring scripts can also be launched outside of the Serviceguard environment, for example from init
or rc scripts. (Serviceguard does not monitor them.)
• It is not mandatory to have the same name for a generic resource and its monitoring script, i.e.,
service_name and generic_resource_name. However, it is good practice to have the same name, so
that it is easier to identify the monitor.
• A common resource specified across multiple packages can be monitored using one monitoring script.

For resources of evaluation_type: before_package_start

• Monitoring scripts can also be launched outside of the Serviceguard environment, for example from init
or rc scripts. (Serviceguard does not monitor them.)
• The monitoring scripts for all the resources in a cluster of type before_package_start can be
configured in a single multi-node package by using the services functionality, and any packages that
require the resources can mention the generic resource name in their package configuration file.
This is the recommended approach, because Serviceguard monitors the scripts and so makes them
highly available. The monitoring script has to be configured to run on all the nodes the package is
configured to run on. See the recommendation and an example below.

For an explanation of the generic resource parameters, see Package Parameter Explanations.
Hewlett Packard Enterprise recommends that you:

• Create a single multi-node package and configure all the monitoring scripts for generic resources of
type before_package_start in this multi-node package using the services functionality.

• Mention the generic resource name in the application package and configure the generic resource as
before_package_start.

• Configure a dependency for better readability, where the application package is dependent on this
multi-node package.

For example:
package_name generic_resource_monitors
package_type multi_node

service_name lan1_monitor
service_cmd $SGCONF/generic_resource_monitors/lan1.sh

service_name cpu_monitor
service_cmd $SGCONF/generic_resource_monitors/cpu_monitor.sh
The above example shows a sample multi-node package named generic_resource_monitors that
has two monitoring scripts configured, one to monitor a LAN and one to monitor a CPU. These monitoring
scripts monitor the LAN interface and the CPU and set the status of the generic resources defined in
them accordingly.
Consider a package pkg1 that has the LAN resource configured as before_package_start, with the
monitoring script for this resource running in the multi-node package generic_resource_monitors. A
dependency is created such that the multi-node package must be UP in order to start the package pkg1.
Once the multi-node package is started, the monitoring of resource 'lan1' is started as part of the
monitoring script 'lan1.sh'. The script sets the status of the generic resource 'lan1', and once the
resource is UP, the package pkg1 is eligible to be started.
package_name pkg1
package_type failover

generic_resource_name lan1
generic_resource_evaluation_type before_package_start

dependency_name generic_resource_monitors
dependency_condition generic_resource_monitors = up
dependency_location same_node
Similarly, consider another package pkg2 that requires the 'CPU' to be configured as
before_package_start.
package_name pkg2
package_type failover

generic_resource_name cpu
generic_resource_evaluation_type before_package_start

generic_resource_name lan1
generic_resource_evaluation_type before_package_start

dependency_name generic_resource_monitors
dependency_condition generic_resource_monitors = up
dependency_location same_node
Thus, the monitoring scripts for all the generic resources of type before_package_start are
configured in one single multi-node package and any package that requires this generic resource can just
configure the generic resource name.
If a common resource has to be monitored in multiple packages, the monitoring scripts can be configured
in the multi-node package described above and multiple packages can define the same generic resource
name in their package configuration files as seen for the generic resource 'lan1' in the above example.
Figure 46 depicts a multi-node package containing two monitoring scripts, one to monitor a
LAN and the other to monitor a CPU. The two packages are configured with the generic resource names and
are dependent on the multi-node package.

Figure 46: Multi-node package configured with all the monitoring scripts for generic resources of
type before_package_start

Template of a Monitoring Script


Monitoring Script Template — $SGCONF/examples/generic_resource_monitor.template

# **********************************************************************
# * *
# * This script is a template that can be used as a service when *
# * creating a customer defined sample monitor script for *
# * generic resource(s). *
# * *
# * Once created, this script can be configured into the package *
# * configuration file as a service with the "service_name", *
# * "service_cmd" and "service_halt_timeout" parameters. *
# * Note that the respective "sg/service" and the *
# * "sg/generic_resource" modules need to be specified in the package *
# * configuration file in order to configure these parameters. *
# * *
# * *
# * --------------------------------- *
# * U T I L I T Y F U N C T I O N S *
# * --------------------------------- *
# * The following utility functions are sourced in from $SG_UTILS *
# * ($SGCONF/scripts/mscripts/utils.sh) and available for use: *
# * *
# * sg_log <log level> <log msg> *
# * *
# * By default, only log messages with a log level of 0 will *
# * be output to the log file. If parameter "log_level" is *
# * configured in the package configuration file, then log *
# * messages that have a log level that is equal to or *
# * greater than the configured log level will be output. *
# * *
# * In addition, the format of the time stamp is prefixed in *
# * front of the log message. *
# * *
# * *
# **********************************************************************
#
###################################################################

###########################################
# Initialize the variables & command paths
###########################################
#set the path for the command rm
RM=<PATH>
###########################
# Source utility functions.
###########################

if [[ -z $SG_UTILS ]]
then
. /etc/cmcluster.conf
SG_UTILS=$SGCONF/scripts/mscripts/utils.sh
fi

if [[ -f ${SG_UTILS} ]]
then
. ${SG_UTILS}
if (( $? != 0 ))
then
echo "ERROR: Unable to source package utility functions file: ${SG_UTILS}"
exit 1
fi
else
echo "ERROR: Unable to find package utility functions file: ${SG_UTILS}"
exit 1
fi

###########################################
# Source the package environment variables.
###########################################

typeset postfix=$(date +"%H.%M.%S")


SG_ENV_FILE=/var/tmp/${SG_PACKAGE}.$postfix.$$.tmp
$SGSBIN/cmgetpkgenv $SG_PACKAGE > $SG_ENV_FILE
if (( $? != 0 ))
then
echo "ERROR: Unable to retrieve package attributes."
exit 1
fi
. $SG_ENV_FILE
$RM -f $SG_ENV_FILE

#########################################################################
#
# start_command
#
# This function should define actions to take when the package starts
#
#########################################################################

function start_command
{

sg_log 5 "start_command"

# ADD your service start steps here

return 0
}

#########################################################################
#
# stop_command
#
# This function should define actions to take when the package halts
#
#

#########################################################################

function stop_command
{

sg_log 5 "stop_command"

# ADD your halt steps here

exit 1
}

################
# main routine
################

sg_log 5 "customer defined monitor script"

#########################################################################
#
# Customer defined monitor script should be doing following
# functionality.
#
# When the package is halting, cmhaltpkg will issue a SIGTERM signal to
# the service(s) configured in package. Use SIGTERM handler to stop
# the monitor script.
#
# Monitor the generic resource configured in package using customer
# defined tools and set the status or value to generic resource by using
# "cmsetresource" command. When setting the status or value get the current
# status or value using "cmgetresource" and set only if they are different.
#
#########################################################################

start_command $*

# SIGTERM signal handler to stop the monitor script
trap "stop_command" SIGTERM

while [ 1 ]
do

# Using customer defined tools get the status or value
# of generic resource(s) which are configured in package.

# Set the status or value of the generic resource using
# "cmsetresource" command. Before setting the status or value
# compare the new status or value by getting the existing status or
# value using "cmgetresource" and set only if they are different.

# Wait for customer defined interval to check the status or value
# for next time.
done
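The template leaves the loop body and the wait between checks to the script author. The sketch below shows one way a monitor loop might react promptly to SIGTERM. It is an illustration under stated assumptions, not the HPE template: the one-second interval, the function name monitor_loop, and the exit code 0 in the handler are choices made for this demo (the template's stop_command exits with 1).

```shell
#!/bin/bash
# Sketch only: the loop body is a placeholder for real resource checks.
monitor_loop() {
    # Handler runs when SIGTERM arrives; exit code 0 is an assumption of this demo.
    trap 'echo "monitor stopping"; exit 0' TERM
    while true; do
        # ... check the resource and update its status here ...
        sleep 1 &
        wait $!      # wait is interruptible, so the trap runs without delay
    done
}

monitor_loop &       # run the monitor in the background for the demo
monitor_pid=$!
sleep 1              # give the monitor time to install its handler
kill -TERM "$monitor_pid"
wait "$monitor_pid"
monitor_rc=$?
echo "monitor exit code: $monitor_rc"
```

In bash, a trap runs only after the current foreground command finishes, so a plain sleep with a long interval would delay the handler. Sleeping in the background and calling wait avoids that delay.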


Monitoring Script for Cluster Generic Resources
Monitoring scripts are written by end users. The scripts must contain the core logic to
monitor a cluster generic resource and set the status of a generic resource. These scripts are started as
part of the cluster start.

• You can set the status of a simple resource, or the value of an extended resource, using the
cmsetresource(1m) command.

• You can define the monitoring interval in the script.

• The monitoring scripts can be launched within the Serviceguard environment by configuring them as
cluster generic resource commands in the cluster configuration.

Cluster Generic Resources template scripts


Hewlett Packard Enterprise provides a monitoring script template. The template provided by Hewlett
Packard Enterprise is:
cluster_generic_resource_monitor.template
The template is in the $SGCONF/examples/ directory.
See the template to get an idea of how to write a monitoring script for a cluster generic resource. How
to monitor a resource is at the discretion of the end user, and the script logic must be written accordingly.
Hewlett Packard Enterprise does not prescribe the content of the monitoring script. However,
the following recommendations might be useful:

• Choose the monitoring interval based on how quickly failures must be detected by the application
packages configured with a cluster generic resource.
• Get the status or the value of a generic resource using cmgetresource before setting its status or
value.
• Set the status or value only if it has changed.

For more information, see Getting and Setting the Status/Value of a Simple/Extended Generic
Resource, Using the Cluster Generic Resources Monitoring Service, and the cmgetresource(1m)
and cmsetresource(1m) manpages.

Template of a Monitoring Script
Monitoring Script Template — $SGCONF/cluster_generic_resource_monitor.template

# **********************************************************************
# * *
# * This script is a template that can be used as a cluster generic *
# * resource command when creating a customer defined sample monitor *
# * script for cluster generic resource(s). *
# * *
# * Once created, this script can be specified in the cluster *
# * configuration file as a generic resource command with the *
# * "GENERIC_RESOURCE_NAME", "GENERIC_RESOURCE_TYPE" and *
# * "GENERIC_RESOURCE_CMD" parameters. *
# * *

# * Information pertaining to the generic resource can be accessed in *
# * monitoring script, by means of below mentioned environmental *
# * variables. *
# * *
# * Generic resource name : GENERIC_RESOURCE_NAME. *
# * The name of the generic resource *
# * configured in this cluster that corresponds to this *
# * monitoring script. *
# * *
# * Generic resource type : GENERIC_RESOURCE_TYPE (simple/extended) *
# * It is the type of cluster generic resource. *
# * *
# * Generic resource scope : GENERIC_RESOURCE_SCOPE (node) *
# * It specifies the visibility of the cluster generic *
# * resource. Node scope generic resource status or values *
# * are unique across all nodes in a cluster. *
# * *
# * Generic resource halt timeout : GENERIC_RESOURCE_HALT_TIMEOUT *
# * It is time in micro seconds used to determine *
# * the duration Serviceguard will wait for the command *
# * specified in generic resource to halt. *
# * *
# * Generic resource restart count : SG_RESTART_COUNT *
# * The number of times that the generic resource command *
# * has failed and was restarted. *
# * *
# * Below example shows how to get the environment variables *
# * into the script, which will be written by user or admin. *
# * RES_NAME=$GENERIC_RESOURCE_NAME *
# * RES_SCOPE=$GENERIC_RESOURCE_SCOPE *
# **********************************************************************
#
###################################################################

###########################################
# Initialize the variables & command paths
###########################################
#set the path for the command rm
RM=<PATH>

#set the values from environment variables
#(get the value for whichever environment variable is required).

RES_NAME=$GENERIC_RESOURCE_NAME
RES_SCOPE=$GENERIC_RESOURCE_SCOPE
RES_TYPE=$GENERIC_RESOURCE_TYPE
RES_RESTART_COUNT=$SG_RESTART_COUNT
RES_HALT_TIMEOUT=$GENERIC_RESOURCE_HALT_TIMEOUT

###########################
# Source utility functions.
###########################

if [[ -z $SG_UTILS ]]
then
. /etc/cmcluster.conf
SG_UTILS=$SGCONF/scripts/mscripts/utils.sh
fi

if [[ -f ${SG_UTILS} ]]
then
. ${SG_UTILS}
if (( $? != 0 ))
then
echo "ERROR: Unable to source package utility functions file: ${SG_UTILS}"
exit 1
fi
else
echo "ERROR: Unable to find package utility functions file: ${SG_UTILS}"
exit 1
fi

#########################################################################
#
# start_command
#
# This function should define actions to take when the cluster starts
#
#########################################################################

function start_command
{

echo "start_command"

# ADD your generic resource command start steps here

return 0
}

#########################################################################
#
# stop_command
#
# This function should define actions to take when the cluster halts
#
#
#########################################################################

function stop_command
{

echo "stop_command"

# ADD your halt steps here

exit 1
}

################
# main routine
################

echo "customer defined monitor script"

###########################################################################

#
# Customer defined monitor script should be doing following
# functionality.
#
# When the cluster or node is halting, cmhaltcl/cmhaltnode will issue a
# SIGTERM signal to the cluster generic resource commands configured in
# cluster. Use SIGTERM handler to stop the resource commands.
#
# Monitor the generic resource configured in cluster using customer
# defined tools and set the status or value to generic resource by using
# "cmsetresource" command. When setting the status or value get the current
# status or value using "cmgetresource" and set only if they are different.
#
###########################################################################

start_command $*

# SIGTERM signal handler to stop the generic resource command script.
trap "stop_command" SIGTERM

while [ 1 ]
do

# Using customer defined tools get the status or value
# of generic resource(s) which are configured in cluster.

# Set the status or value of the cluster generic resource using
# "cmsetresource" command. Before setting the status or value
# compare the new status or value by getting the existing status or
# value using "cmgetresource" and set only if they are different.

# Wait for customer defined interval to check the status or value
# for next time.
done
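The template header describes environment variables that carry the generic resource details into the script. The sketch below shows how a script might check them before entering its loop. The default values are demo-only assumptions so that the sketch runs outside a cluster, and the variable halt_timeout_seconds is a hypothetical name. The conversion relies on the header's statement that the halt timeout is given in microseconds.

```shell
#!/bin/bash
# Sketch only: the defaults below are demo values, used when the variables
# are not already set in the environment.
: "${GENERIC_RESOURCE_NAME:=demo_resource}"
: "${GENERIC_RESOURCE_TYPE:=simple}"
: "${GENERIC_RESOURCE_HALT_TIMEOUT:=10000000}"

# The template header lists simple and extended as the resource types.
case "$GENERIC_RESOURCE_TYPE" in
    simple|extended) ;;
    *) echo "ERROR: unexpected resource type: $GENERIC_RESOURCE_TYPE" >&2
       exit 1 ;;
esac

# Convert the halt timeout from microseconds to whole seconds.
halt_timeout_seconds=$((GENERIC_RESOURCE_HALT_TIMEOUT / 1000000))
echo "Monitoring $GENERIC_RESOURCE_NAME ($GENERIC_RESOURCE_TYPE), halt timeout ${halt_timeout_seconds}s"
```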



Using Serviceguard RESTful Application Programming Interface
Starting with Serviceguard for Linux A.12.10.00, you can use the RESTful Application Programming
Interface (API), which is based on REST. In addition to the Command Line Interface (CLI) and
Serviceguard Manager (Graphical User Interface), you can use the RESTful APIs to program your
infrastructure to configure, monitor, and manage Serviceguard clusters, packages, and workloads.
You can use the Serviceguard REST APIs to accomplish the following:

• Attain high availability (HA) or disaster recovery (DR) protection for critical workloads.
• Automate or program the cluster, package, or workload deployment operations.
• Call the REST APIs from datacenter-level orchestration and automation tools for infrastructure
management.
• Manage Serviceguard clusters through custom Graphical User Interfaces (GUIs). You can program
the APIs into a custom UI and use them to configure, monitor, or manage clusters, packages, and
workloads.

For more information on the REST API, see the HPE Serviceguard REST API Reference Guide available
at http://www.hpe.com/info/linux-serviceguard-docs.

Launching Serviceguard RESTful Application Programming Interface
The prerequisites for using the Serviceguard RESTful Application Programming Interface are the same as
those for using Serviceguard Manager. For more information, see Launching Serviceguard Manager.



Serviceguard Toolkit for Linux
The Serviceguard Toolkits, such as the Contributed Toolkit and the NFS, EDB PPAS, Sybase, and Oracle
Toolkits, are used to integrate applications such as Apache, MySQL, NFS, Oracle database, EDB PPAS,
and Sybase with the Serviceguard for Linux environment. The Toolkit documentation describes
how to customize the package for your needs. For more information, see the Release Notes of these
toolkits (Contributed Toolkit, NFS, and EDB PPAS) at http://www.hpe.com/info/linux-serviceguard-docs.

NOTE:
For details about Sybase and Oracle toolkits, see HPE Serviceguard for Linux A.12.00.40 Release Notes
at http://www.hpe.com/info/linux-serviceguard-docs.



Serviceguard Manager for Linux
Serviceguard Manager is a web-based GUI management application for Serviceguard. Serviceguard
Manager is used to configure, monitor, and administer Serviceguard clusters, and Serviceguard Disaster
Recovery clusters like Extended Distance Cluster and Metrocluster. Serviceguard Manager simplifies the
administration tasks of managing critical applications integrated with Serviceguard. It provides protection
against planned and unplanned downtime.
This chapter provides an overview of the tasks that you can perform from Serviceguard
Manager. All the features are available from Serviceguard Manager Version 12.10.00. For more
information about the features and how to configure and use them, see the Serviceguard Manager online
help.

Disaster recovery rehearsal overview


Use the Disaster Recovery solution to ensure that your applications are not impacted in case of site
failure. This feature is available with the Serviceguard Enterprise license. To test the disaster recovery
preparedness of your site, it is prudent to simulate a failure in your cluster and validate your DR
site, instead of waiting for an actual disaster to strike before beginning the disaster recovery
process. This enables you to evaluate the disaster recovery preparedness of the DR site and to confirm
that it is set up correctly for a recovery to complete if a disaster strikes.
You can simulate a failure of your primary site using the Rehearsal feature. The Disaster Recovery
Rehearsal helps you simulate a failure at the package, workload, node, or site level. The rehearsal
process does not affect the regular functioning of either your primary or secondary site or their
associated resources, such as nodes, storage, network, or applications.
If an actual disaster occurs after you start the disaster rehearsal process, the
rehearsal process rolls back and frees up all the resources to enable the actual disaster recovery to
begin.
When the rehearsal completes successfully, or fails before the whole process is complete, a report is
generated that you can save for reference and troubleshooting purposes.

Prerequisites to deploy DRR on VMware environment


You can implement Disaster Recovery Rehearsal (DRR) in a VMware environment. Virtual machines
(VMs) can be used as Serviceguard cluster nodes to deploy DRR. When you implement DRR on VMs,
you must ensure that the following prerequisites are met.
The SLS environment must be configured on vCenter to support DRR. You must configure either vCenter
or ESXi as a prerequisite to implement DRR. To configure vCenter or ESXi in a Serviceguard cluster
using the Serviceguard Manager GUI, see the Creating a cluster section in the Serviceguard Manager
online help.
If the package is configured for Dynamically Linked Storage (DLS) or Static Linked Storage (SLS),
the ESXi or vCenter user must have additional privileges or must be a root user to start the rehearsal. If
the user is created on vCenter, the same user with the same privileges must also be created on ESXi. If
you do not want to use the administrator user account or the root user, create a role with the required
privileges for VMware disks resource functionality and assign this role to the user. The role assigned to
the user account must have the following privileges:

• Low level file operations on datastore


• Browse datastore on datastore
• Add existing disk on virtual machine
• Change resource on virtual machine

• Remove disk on virtual machine
• Remove file on datastore
• Storage partition configuration on host
• Raw device configuration on virtual machine
• Add or remove device on virtual machine

Cluster configuration properties


When you run the cluster configuration reports on a cluster to verify the status of the cluster and package
configuration parameters, the following configuration properties are verified and a report is generated to
reflect their status.

Table 25: Cluster configuration properties

Cluster                       Cluster Parameters

VMFS Enabler                  User validation, Host

Node Capacities And Weights   Resource, Miscellaneous, Capacity

Network                       Heartbeat, Stationary, Subnet, IP monitor, Polling target,
                              Miscellaneous, Interface

ClusterArbitration            Lock Lun, Quorum server, Miscellaneous

TimeoutAndOtherParameters     Configuration parameters, Miscellaneous

AccessControlPolicy           Access control policy, Miscellaneous

General                       Cmd, Service, Miscellaneous, License

Storage                       Volume group, Miscellaneous, Physical volume

Table 26: Package configuration properties

Package properties            Error types

Package Startup Parameters    Startup Parameters, Package Weights, Failover Parameters,
                              Timeout and Log Parameters, Miscellaneous

Event Notifications           Email

3PAR Replication Parameters   3PAR Replication Parameters

XDC Replication Parameters    Host Based Replication Parameters

Toolkit Parameters            Oracle Toolkit, NFS Toolkit, Apache Toolkit, SGeSAP Toolkit

Resource Parameters           Service, Storage, Generic Resource, Network, External Script,
                              External Pre Script, File System, Persistent Reservation,
                              VMFS Parameters

Site Controller Parameters    Site Controller Parameters

Package Dependencies          Package Dependency

Access Control Policy         Access Control Policy

General                       Cmd

Miscellaneous                 Miscellaneous

Toolkit Studio overview


Serviceguard provides standard and ready-to-use toolkits to integrate applications in the Serviceguard
environment to provide high availability. For applications that do not have a toolkit, use the Toolkit Studio
to quickly and efficiently create a custom toolkit for your application and to deploy it in the Serviceguard
environment. The Toolkit Studio provides a user interface environment to create a toolkit for any
application. You can design a toolkit for your application and deploy it on several Serviceguard
environments to enable monitoring and high availability for your application.

Workload overview
A workload is a set of packages in a cluster that are logically grouped to form a single resource. A workload
provides a simplified view of multiple packages under a single resource pane in Serviceguard Manager.
You can create and deploy the following workload types:
Oracle Single Instance DB
Workload for single instance of database running on Oracle.
Oracle Single Instance DB using ASM
Workload for single instance of database running on Oracle using Automatic Storage Management
(ASM).
Oracle Single Instance DB using Data Guard
Workload for single instance of database running on Oracle and deployed on Oracle Data Guard.
Oracle Single Instance DB using ASM & Data Guard
Workload for single instance of database running on Oracle using Automatic Storage Management
(ASM) and deployed on Oracle Data Guard.
For more information about Oracle single instance, Automatic Storage Management (ASM), and Oracle
Data Guard, see https://docs.oracle.com/.
From the workload page, you can complete the following tasks:

• Monitor the health of multiple packages in a cluster from a single view.


• Deploy workloads to create packages.
