H.2.3 Building a Cluster

For details on survival priority, see "5.1.2.2 Survival priority."

In VMware environments, only the SA_icmp shutdown agent is available for setup.

This section explains the method for setting up the SA_icmp shutdown agent as the shutdown facility.

Note

Set up the SA_icmp shutdown agent when using the I/O fencing function.
Be sure to perform the following operations on all guest OSes (nodes).

Setting up the shutdown facility

Specify the shutdown agent.

Create /etc/opt/SMAW/SMAWsf/SA_icmp.cfg with the following contents on all guest OSes (nodes) of the cluster:

TIME_OUT=value
cfname:ip-address-of-node:NIC-name1,NIC-name2

value              : Specify the interval (in seconds) for checking whether the node is
                     alive. The recommended value is "5" (s).
cfname             : Specify the name of the CF node.
ip-address-of-node : Specify the IP address of the cfname node on one of the following
                     networks, which is used for checking whether the node is alive.
                     - Cluster interconnect (IP address of CIP)
                     - Administrative LAN
                     - Public LAN
                     Checking via multiple networks is also possible. In that case,
                     add one line for each network used.
                     We recommend checking multiple LAN paths so that an error can be
                     determined reliably.
                     However, if you prioritize automatic switchover over reliable
                     error determination, set only the cluster interconnects as the
                     LAN paths.
                     If only the cluster interconnects are set as the LAN paths,
                     automatic switchover takes place even when communication over the
                     cluster interconnects is disabled but is still possible over
                     another LAN (that is, even when the destination node could
                     otherwise be determined to be alive).
                     Both IPv4 and IPv6 addresses can be used.
                     IPv6 link-local addresses cannot be used.
                     When specifying an IPv6 address, enclose it in brackets "[ ]".
                     (Example: [1080:2090:30a0:40b0:50c0:60d0:70e0:80f0])
                     Enter the IP addresses for all guest OSes (nodes) that make up the
                     cluster system.
NIC-nameX          : Specify the network interface of the local guest OS (node) utilized 
                     for checking whether the node defined by ip-address-of-node is alive. 
                     If there is more than one, delimit them with commas (",").
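
The address rules above (IPv4 or IPv6 allowed, IPv6 link-local addresses rejected, IPv6 enclosed in brackets) can be sketched as a small check. This is a minimal illustration only; the function name format_check_address is hypothetical and is not part of the product.

```python
import ipaddress


def format_check_address(addr: str) -> str:
    """Return addr in the form described for SA_icmp.cfg, or raise ValueError.

    Hypothetical helper: IPv4 is returned unchanged, IPv6 is wrapped in
    brackets, and IPv6 link-local addresses are rejected.
    """
    ip = ipaddress.ip_address(addr)  # raises ValueError on malformed input
    if ip.version == 6:
        if ip.is_link_local:
            raise ValueError(f"IPv6 link-local address cannot be used: {addr}")
        return f"[{addr}]"
    return addr
```

For example, format_check_address("192.168.1.1") returns the address unchanged, while an IPv6 address is returned enclosed in brackets.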

Note

Registering network interfaces

When duplicating with GLS, define all redundant network interfaces. (Example: eth0,eth1)
When bonding NICs, define the bonding device after the IP address. (Example: bond0)
When registering the cluster interconnect, define all network interfaces used on all paths of the cluster interconnect. (Example: eth2,eth3)
Do not use the takeover IP address (takeover virtual interface).

Example

The following are setting examples for a cluster (consisting of two nodes) between guest OSes on multiple ESXi hosts.

When cluster interconnects (eth2,eth3) are set

TIME_OUT=5
node1:192.168.1.1:eth2,eth3
node2:192.168.1.2:eth2,eth3

When the public LAN (duplicated (eth0,eth1) by GLS) and the administrative LAN (eth4) are set

TIME_OUT=5
node1:10.20.30.100:eth0,eth1
node1:10.20.40.200:eth4
node2:10.20.30.101:eth0,eth1
node2:10.20.40.201:eth4
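
Because the file must be created on every guest OS (node), it can help to generate its contents from one definition. The following is a minimal sketch under that assumption; the function render_sa_icmp_cfg and its argument layout are hypothetical and only reproduce the file format described above.

```python
def render_sa_icmp_cfg(time_out, checks):
    """Build SA_icmp.cfg contents as a string.

    checks is a list of (cfname, ip_address, nic_names) tuples, one tuple
    per network used for checking a node (hypothetical layout).
    """
    lines = [f"TIME_OUT={time_out}"]
    for cfname, address, nic_names in checks:
        if not nic_names:
            raise ValueError(f"no network interface given for {cfname}")
        # Multiple interfaces are delimited with commas.
        lines.append(f"{cfname}:{address}:{','.join(nic_names)}")
    return "\n".join(lines) + "\n"


# Public LAN duplicated by GLS (eth0,eth1) plus administrative LAN (eth4).
print(render_sa_icmp_cfg(5, [
    ("node1", "10.20.30.100", ["eth0", "eth1"]),
    ("node1", "10.20.40.200", ["eth4"]),
    ("node2", "10.20.30.101", ["eth0", "eth1"]),
    ("node2", "10.20.40.201", ["eth4"]),
]), end="")
```

The printed text matches the second example above and could be written to /etc/opt/SMAW/SMAWsf/SA_icmp.cfg on each node.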

Setting up the shutdown daemon

Create /etc/opt/SMAW/SMAWsf/rcsd.cfg with the following contents on all guest OSes (nodes) of the cluster:

CFNameX,weight=weight,admIP=myadmIP:agent=SA_icmp,timeout=timeout
CFNameX,weight=weight,admIP=myadmIP:agent=SA_icmp,timeout=timeout

CFNameX        : CF node name of the cluster host. 
weight         : Weight of the SF node. 
myadmIP        : Specify the IP address of the administrative LAN for CFNameX. 
                 Available IP addresses are IPv4 and IPv6 addresses.
                 IPv6 link local addresses are not available.
                 When specifying the IPv6 address, enclose it in brackets "[ ]".
                 (Example: [1080:2090:30a0:40b0:50c0:60d0:70e0:80f0])
                 If you specify a host name, please make sure it is listed in /etc/hosts.
timeout        : Specify the timeout duration (in seconds) of the shutdown agent.
                 Specify the larger of the following two values:
                 - (TIME_OUT + 2) x (number of paths used for checking the survival
                   of a node)
                 - 20
                 TIME_OUT is the TIME_OUT value described in SA_icmp.cfg.

                 - When checking the survival of a node on one path
                   (any one of the administrative LAN, public LAN, or cluster
                    interconnects)
                   (1) TIME_OUT is 18 or larger
                       TIME_OUT + 2
                   (2) TIME_OUT is less than 18
                       20

                 - When checking the survival of a node on two paths
                   (any two of the administrative LAN, public LAN, or cluster
                    interconnects)
                   (1) TIME_OUT is 8 or larger
                       (TIME_OUT + 2) x 2
                   (2) TIME_OUT is less than 8
                       20

                 - When checking the survival of a node on three paths
                   (three of the administrative LAN, public LAN (one or more), or
                    cluster interconnects)
                   (1) TIME_OUT is 5 or larger
                       (TIME_OUT + 2) x 3
                   (2) TIME_OUT is less than 5
                       20
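
The cases above all follow from taking the larger of (TIME_OUT + 2) multiplied by the number of paths, and 20. As a minimal sketch, the calculation can be written as follows; the function name sa_icmp_timeout is hypothetical.

```python
def sa_icmp_timeout(time_out: int, paths: int) -> int:
    """Return the timeout value for rcsd.cfg.

    time_out: TIME_OUT value from SA_icmp.cfg (seconds)
    paths:    number of paths used for checking the survival of a node
    """
    if time_out <= 0 or paths <= 0:
        raise ValueError("time_out and paths must be positive")
    # Larger of (TIME_OUT + 2) x paths and 20.
    return max((time_out + 2) * paths, 20)


print(sa_icmp_timeout(10, 2))  # TIME_OUT=10 checked on two paths
```

With TIME_OUT set to 10 and two paths, the result is (10 + 2) x 2 = 24. With the recommended TIME_OUT of 5 on a single path, the product is only 7, so 20 is used.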

Note

The rcsd.cfg file must be the same on all guest OSes (nodes). Otherwise, operation errors might occur.

Example

The following is a setting example for a two-node configuration in which the survival of a node is checked by using the administrative LAN and the public LAN, and the TIME_OUT value described in SA_icmp.cfg is 10.

node1,weight=1,admIP=192.168.100.1:agent=SA_icmp,timeout=24 (*)
node2,weight=1,admIP=192.168.100.2:agent=SA_icmp,timeout=24 (*)
(*) timeout = (10 (TIME_OUT value) + 2) x 2 (administrative LAN, public LAN) = 24
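
Since rcsd.cfg must be identical on all nodes, generating it from a single node list is one way to avoid differences. The following is a minimal sketch; render_rcsd_cfg and its argument layout are hypothetical, and the "(*)" marker in the example above is an annotation that is not written to the file.

```python
def render_rcsd_cfg(nodes, timeout):
    """Build rcsd.cfg contents for the SA_icmp shutdown agent.

    nodes is a list of (cfname, weight, adm_ip) tuples (hypothetical
    layout). An IPv6 adm_ip is expected to be already enclosed in brackets.
    """
    return "".join(
        f"{cfname},weight={weight},admIP={adm_ip}:agent=SA_icmp,timeout={timeout}\n"
        for cfname, weight, adm_ip in nodes
    )


print(render_rcsd_cfg(
    [("node1", 1, "192.168.100.1"), ("node2", 1, "192.168.100.2")],
    timeout=24,
), end="")
```

The same generated text would then be placed in /etc/opt/SMAW/SMAWsf/rcsd.cfg on every guest OS (node).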

Starting the shutdown facility

Check that the shutdown facility has started.
```
# sdtool -s
```
If the shutdown facility has already started, execute the following command to restart the shutdown facility.
```
# sdtool -r
```
If the shutdown facility has not started, execute the following command to start it.
```
# sdtool -b
```
Checking the status of the shutdown facility

Check that the status of the shutdown facility is either "InitWorked" or "TestWorked." If the displayed status is "TestFailed" or "InitFailed," check the shutdown daemon settings for any mistakes.
```
# sdtool -s
```
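
When this check is scripted, the decision can be sketched as below. This is a rough illustration that assumes the status words appear verbatim, separated by whitespace, somewhere in the captured output of sdtool -s; the exact output layout is not assumed, and the function name shutdown_facility_ok is hypothetical.

```python
OK_STATES = {"InitWorked", "TestWorked"}
FAILED_STATES = {"InitFailed", "TestFailed"}


def shutdown_facility_ok(sdtool_output: str) -> bool:
    """Return True if only working states are reported.

    sdtool_output is the text captured from "sdtool -s". The text is split
    on whitespace and searched for the status words (assumption: they
    appear as separate tokens).
    """
    tokens = sdtool_output.split()
    has_ok = any(token in OK_STATES for token in tokens)
    has_failed = any(token in FAILED_STATES for token in tokens)
    return has_ok and not has_failed
```

If the function returns False, review the settings in rcsd.cfg and SA_icmp.cfg as described above.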
