This chapter describes points you should be aware of when building a PRIMECLUSTER system. Be sure to read through them before you start operation.
Synchronize time on all the nodes that make up the cluster system
Connect to the NTP server and synchronize time on all the nodes.
If the time is not synchronized on all the nodes, the cluster may not operate properly.
For example, if the following messages are output, or if the OnlinePriority attribute of the cluster application is set, the cluster application may not become Online on the intended node, because RMS cannot correctly recognize the last online node at startup.
(WRP, 34) Cluster host host is no longer in time sync with local node. Sane operation of RMS can no longer be guaranteed. Further out-of-sync messages will appear in the syslog.
(WRP, 35) Cluster host host is no longer in time sync with local node. Sane operation of RMS can no longer be guaranteed.
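As a minimal sketch, the two warnings above can be detected by scanning log text for their message IDs. The function name and the sample lines are illustrative assumptions, not part of PRIMECLUSTER; only the IDs (WRP, 34) and (WRP, 35) come from the messages quoted above.

```python
import re

# Matches the RMS warning IDs quoted above: (WRP, 34) and (WRP, 35).
TIME_SYNC_WARNING = re.compile(r"\(WRP,\s*(34|35)\)")


def find_time_sync_warnings(lines):
    """Return the lines that carry an RMS out-of-time-sync warning.

    `lines` is any iterable of log lines, for example an open log file.
    """
    return [line.rstrip("\n") for line in lines if TIME_SYNC_WARNING.search(line)]


if __name__ == "__main__":
    # Illustrative input only; real log lines carry extra prefixes.
    sample = [
        "(WRP, 35) Cluster host host is no longer in time sync with local node.",
        "an unrelated log line",
    ]
    for hit in find_time_sync_warnings(sample):
        print(hit)
```

If such a scan reports any hits, check the NTP synchronization of the nodes before relying on the OnlinePriority behavior.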
Do not set Spanning Tree Protocol on cluster interconnects
If you set Spanning Tree Protocol on cluster interconnects, access over the cluster interconnects is suspended, and heartbeat communication may fail as a result.
Do not set a filtering function on the routes of cluster interconnects
The cluster interconnects in PRIMECLUSTER bundle multiple lines and communicate using PRIMECLUSTER's own protocol (the ICF protocol). As a result, they cannot communicate with devices other than the cluster nodes connected to the cluster interconnects, so do not set a filtering function on the routes of the cluster interconnects.
Set the kernel parameters required for the cluster
PRIMECLUSTER uses system resources. If these resources are insufficient, PRIMECLUSTER may not operate properly.
The amount of resources available in a system is set through kernel parameters.
The required amount varies with the environment in which your system runs. Estimate it based on the operating environment.
Change the kernel parameters before building PRIMECLUSTER.
After changing kernel parameters, be sure to restart the OS.
See
For details on parameter values, see "Setup (initial configuration)" of PRIMECLUSTER Designsheets.
Enable the system to collect a system dump or a crash dump
If a system dump or a crash dump cannot be collected, investigating the cause of a problem may take time, and it may not be possible to identify the root cause.
Check that you can collect a system dump and a crash dump before building PRIMECLUSTER.
Synchronize time in the slew mode
To synchronize time on each node with NTP, use the slew mode so that the time is always adjusted slowly. Do not choose the step mode, which adjusts the time rapidly.
For details, see the OS manual and related documents. Rapid time adjustment by NTP, or time adjustment by running the date command, causes time inconsistency between nodes, which leads to incorrect operation of the cluster system.
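As a hedged illustration, the following sketch checks whether an ntpd options line requests slewing. It assumes that ntpd is the NTP daemon in use, that its startup options are kept in an OPTIONS="..." line (as in /etc/sysconfig/ntpd on Red Hat style systems), and that slewing is requested with ntpd's -x (--slew) option. None of these details come from this manual, so confirm them against your OS manual. The function name is hypothetical.

```python
import shlex


def ntpd_options_request_slew(sysconfig_text):
    """Return True if the last OPTIONS= line contains -x or --slew.

    Only stand-alone -x / --slew tokens are recognized; flags combined
    into one token (for example -gx) are deliberately not interpreted.
    """
    result = False
    for raw in sysconfig_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or not line.startswith("OPTIONS="):
            continue
        value = line[len("OPTIONS="):]
        # shlex removes the surrounding quotes; split again on whitespace
        # to obtain the individual option tokens.
        tokens = " ".join(shlex.split(value)).split()
        result = "-x" in tokens or "--slew" in tokens
    return result


if __name__ == "__main__":
    text = 'OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"'
    print(ntpd_options_request_slew(text))
```

The check only reports how the daemon is started. It does not verify that the clocks on the nodes actually agree.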
Configure the Shutdown Facility required for the server in use
The required Shutdown Facility varies depending on the server in use. See "5.1.2 Setting up the Shutdown Facility" to check which Shutdown Facility your server requires, and then configure it.
Set the time to detect CF heartbeat timeout as necessary
When setting the time to detect CF heartbeat timeout, consider the operational load at peak hours and choose a value that suits the customer's environment. The value should be roughly 10 seconds to 1 minute. The default value is 10 seconds.
See
For the method of setting the time to detect CF heartbeat timeout, see "11.3.1 Changing Time to Detect CF Heartbeat Timeout."
Be sure to set the environment variable RELIANT_SHUT_MIN_WAIT, which specifies the RMS shutdown wait time
The time required to stop RMS and cluster applications varies depending on the environment. Be sure to estimate a value appropriate to your configuration, and then set it.
See
For details on RELIANT_SHUT_MIN_WAIT, see "E.2 Global environment variables" in "PRIMECLUSTER Reliant Monitor Services (RMS) with Wizard Tools Configuration and Administration Guide."
For the method of referring to and changing RMS environment variables, see "E.1 Setting environment variables" in "PRIMECLUSTER Reliant Monitor Services (RMS) with Wizard Tools Configuration and Administration Guide."
Do not use DHCP when configuring CF
A node may panic if CF is configured while DHCP is set on a network interface.
Before configuring CF, unset DHCP on all network interfaces of the nodes.
Example
Before the change (DHCP is set)
<Contents of /etc/sysconfig/network-scripts/ifcfg-ethX>
DEVICE=ethX
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
DHCP_HOSTNAME=Node1
After the change (DHCP is unset)
<Contents of /etc/sysconfig/network-scripts/ifcfg-ethX>
DEVICE=ethX
BOOTPROTO=static
ONBOOT=yes
IPADDR=xxx.xxx.xxx.xxx
NETMASK=xxx.xxx.xxx.x
TYPE=Ethernet
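A small check along the following lines could flag interface files that still use DHCP before CF is configured. It is only a sketch: the function names are hypothetical, and the parser handles just the simple KEY=value form shown in the example above, one setting per line.

```python
def parse_ifcfg(text):
    """Parse simple KEY=value lines from ifcfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def uses_dhcp(text):
    """Return True if the interface file sets BOOTPROTO=dhcp."""
    return parse_ifcfg(text).get("BOOTPROTO", "").lower() == "dhcp"


if __name__ == "__main__":
    before = "DEVICE=ethX\nBOOTPROTO=dhcp\nONBOOT=yes\nTYPE=Ethernet\n"
    after = "DEVICE=ethX\nBOOTPROTO=static\nONBOOT=yes\nTYPE=Ethernet\n"
    print(uses_dhcp(before), uses_dhcp(after))
```

Running such a check over every ifcfg-ethX file on each node is one way to confirm that no interface is left with DHCP set.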
When using Global Link Services (hereinafter GLS), set up the configuration file (ifcfg-ethX) of the network interface according to the redundant line control method in use.
The setting items differ for each redundant line control method of GLS. For details, refer to "PRIMECLUSTER Global Link Services Configuration and Administration Guide: Redundant Line Control Function."
To use iptables or ip6tables as a firewall on a cluster node, see "Appendix L Using Firewall."
If the firewall is not set up correctly, PRIMECLUSTER may not operate properly.
Do not disable the IPv6 function of the operating system in a RHEL6 environment.
Even if you do not use IPv6 addresses, configure the operating system so that its IPv6 module is loaded.
Do not enable the NetworkManager service.
PRIMECLUSTER cannot perform any setup or operation while the NetworkManager service is enabled.
Make sure that the NetworkManager service is disabled. For how to change the setup of the NetworkManager service, refer to the OS manual.
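As an illustration only, the sketch below interprets one line of service status in the form printed by chkconfig --list on SysV-init systems such as RHEL6 (service name followed by level:state pairs). The use of chkconfig is an assumption that does not come from this manual, and the function name is hypothetical. On other OS versions the check differs, so follow the OS manual.

```python
def enabled_runlevels(chkconfig_line):
    """Return the runlevels marked 'on' in one `chkconfig --list` line.

    Expected form (assumed): "NetworkManager 0:off 1:off 2:on ... 6:off"
    """
    fields = chkconfig_line.split()
    levels = []
    for field in fields[1:]:  # fields[0] is the service name
        level, _, state = field.partition(":")
        if state == "on":
            levels.append(level)
    return levels


if __name__ == "__main__":
    line = "NetworkManager 0:off 1:off 2:on 3:on 4:on 5:on 6:off"
    levels = enabled_runlevels(line)
    if levels:
        print("NetworkManager is enabled in runlevels:", ",".join(levels))
    else:
        print("NetworkManager is disabled in all runlevels")
```

An empty result means that the service is off in every runlevel listed on that line.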
If CF is running, do not restart network services or delete network interfaces.
If CF is running, any of the following operations may panic a node.
Restarting network services
Stopping and starting GLS (when CF uses network interfaces of GLS)
Deleting network interfaces used by CF
Stop CF before performing any of these operations.
When CF is not configured, CF uses all the network interfaces on the OS. When CF is configured, CF uses the network interfaces set as interconnects.
See
For details on how to start and stop CF, see "4.6 Starting and stopping CF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."