FUJITSU Software PRIMECLUSTER Cluster Foundation Configuration and Administration Guide 4.3

1.1 CF, CIP, and CIM configuration

You must configure CF before any other cluster services, such as Reliant Monitor Services (RMS). CF defines which nodes are in a given cluster. In addition, after you configure CF and CIP, the Shutdown Facility (SF) and RMS can be run on the nodes.

The Shutdown Facility (SF) is responsible for node elimination. This means that even if RMS is not installed or running in the cluster, missing CF heartbeats will cause SF to eliminate nodes.

You can use the Cluster Admin CF Wizard to easily configure CF, CIP, and CIM for all the nodes in the cluster.

A CF configuration consists of a few main attributes: the cluster name, the nodes that make up the cluster, and the interconnects over which those nodes communicate.

The dedicated network connections used by CF are known as interconnects. They typically consist of some form of high-speed networking, such as 100 Mbps or Gigabit Ethernet links. To be used for CF, an interconnect must connect all of the nodes in the cluster and must be dedicated to cluster traffic.

Since CF automatically attempts to bring up downed interconnects, a split cluster can occur only if all interconnects experience a 10-second outage simultaneously. Nevertheless, CF expects highly reliable interconnects.

You should carefully choose the number of interconnects you want in the cluster before you start the configuration process. If you decide to change the number of interconnects after you have configured CF across the cluster, you can either bring down CF on each node to do the reconfiguration or use the cfrecon command. Stopping CF requires first stopping the higher-level services (such as RMS, SF, and Global File Services (hereinafter GFS)) on each node, so reconfiguring this way disrupts other operations. Using the cfrecon command instead results in a temporarily asymmetrical CF configuration.

Note

Your configuration should specify at least two interconnects to avoid a single point of failure in the cluster.

Before you begin the CF configuration process, ensure that all of the nodes are connected to the interconnects you have chosen and that all of the nodes can communicate with each other over those interconnects. For proper CF configuration using Cluster Admin, all of the interconnects should be working during the configuration process.

CIP configuration involves defining virtual CIP interfaces and assigning IP addresses to them. Up to eight CIP interfaces can be defined per node. These virtual interfaces act like normal TCP/IP interfaces except that the IP traffic is carried over the CF interconnects. Because CF is typically configured with multiple interconnects, the CIP traffic will continue to flow even if an interconnect fails. This helps eliminate single points of failure as far as physical networking connections are concerned for intracluster TCP/IP traffic.

Except for their IP configuration, the eight possible CIP interfaces per node are all treated identically. There is no special priority for any interface, and each interface uses all of the CF interconnects equally. For this reason, many system administrators may choose to define only one CIP interface per node.

To ensure that nodes can communicate with each other using CIP, the address that each node assigns to a given CIP interface should be on the same subnet. In addition, if you use IPv6, use the IPv6 address assigned to the CIP interface for communications; communications using the link-local address are not available.

CIP traffic is really intended only to be routed within the cluster. The CIP addresses should not be used outside of the cluster. Because of this, you should use addresses from the non-routable reserved IP address range.

For IPv4 addresses, Address Allocation for Private Internets (RFC 1918) defines the following address ranges that are set aside for private subnets:

Subnet(s)                      Class    Subnet mask
10.0.0.0                         A       255.0.0.0
172.16.0.0 ... 172.31.0.0        B       255.255.0.0
192.168.0.0 ... 192.168.255.0    C       255.255.255.0

For IPv6 addresses, use Unique Local IPv6 Unicast Addresses (RFC 4193), which are defined by the prefix FC00::/7 and can be allocated freely within a private network.
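
As an illustration (all addresses below are hypothetical), a two-node cluster could take its CIP addresses from these private ranges; each node's address for a given CIP interface simply needs to be on the same subnet:

  IPv4:  192.168.1.1 and 192.168.1.2    (netmask 255.255.255.0)
  IPv6:  fd00:1::1 and fd00:1::2        (a Unique Local Address prefix within FC00::/7)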

For CIP nodenames, it is strongly recommended that you use the following convention for RMS:

cfnameRMS

cfname is the CF name of the node and RMS is a literal suffix. This will be used for one of the CIP interfaces on a node. This naming convention is used in the Cluster Admin GUI to help map between normal node names and CIP names. In general, you only need to configure one CIP interface per node.

Note

In the CIP configuration, CIP names are stored in /etc/hosts. /etc/nsswitch.conf(4) should be set to use files as the first criterion when looking up nodes.
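
For illustration, assuming two hypothetical nodes with CF names fuji2 and fuji3 and CIP addresses on the private subnet 192.168.1.0/24, the corresponding entries might look as follows:

  /etc/hosts:
    192.168.1.1    fuji2RMS
    192.168.1.2    fuji3RMS

  /etc/nsswitch.conf (files first for host lookups):
    hosts: files dns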

The recommended way to configure CF, CIP, and CIM is to use the Cluster Admin GUI. You can use the CF/CIP Wizard in the GUI to configure CF, CIP, and CIM on all the nodes in the cluster in just a few screens. Before running the wizard, however, you must complete the following steps:

  1. CF/CIP, Web-Based Admin View, and Cluster Admin should be installed on all the nodes in the cluster.

  2. If you are running CF over Ethernet, then all of the interconnects in the cluster should be physically attached to their proper hubs or networking equipment and should be working.

  3. Web-Based Admin View configuration must be done. Refer to "2.4.1 Management server configuration" in "PRIMECLUSTER Web-Based Admin View Operation Guide" for details.

In the cf tab in Cluster Admin, make sure that the CF driver is loaded on that node. Press the Load Driver button if necessary to load the driver. Then press the Configure button to start the CF Wizard.

The CF/CIP Wizard is invoked by starting the GUI on a node where CF has not yet been configured. When this is done, the GUI automatically brings up the CF/CIP Wizard in the cf tab of the GUI. You can start the GUI by entering the following URL with a browser running the correct version of the Java plug-in:

http://management_server:8081/Plugin.cgi

management_server is the primary or secondary management server you configured for this cluster. Refer to "4.3.3.1 Initial setup of the operation management server" in "PRIMECLUSTER Installation and Administration Guide" for details on configuring the primary and secondary management servers. Refer to "3.1.2 Prerequisite client environment" in "PRIMECLUSTER Web-Based Admin View Operation Guide" on which browsers and Java plug-ins are required for the Cluster Admin GUI.
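
For example, if the primary management server were a hypothetical host named fuji2, you would enter:

http://fuji2:8081/Plugin.cgi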

In PRIMECLUSTER, it is recommended that you configure the administrative LAN and the cluster interconnects on different NICs. However, if hardware restrictions in a KVM or VMware environment prevent such a configuration, a configuration that shares the same NIC between the administrative LAN and the cluster interconnects is also supported.

KVM environment

In a configuration that shares the same NIC between the administrative LAN and the cluster interconnects, you must satisfy all of the following conditions for the network and Global Link Services (hereinafter GLS):

  • Make two NICs redundant using GLS Virtual NIC mode on the Host OS.

  • On the virtual interface, create the necessary number of VLAN interfaces for the administrative LAN for the Host OS, the public LAN, and the cluster interconnects.

  • Create the cluster interconnects for the Host OS and guest OS on their respective VLAN interfaces. They are not made redundant on the cluster interconnect side.

  • For the public LAN, create Gls resources on the guest OS so that RMS on the guest OS monitors them.

This configuration requires configuring CF with the CLI. For the configuration method, see "1.1.7 Example of CF configuration by CLI".

Note

In this configuration, there are the following notes:

  • Availability in the event of a double failure of network switches
    If both network switches to which the two NICs are connected fail, the administrative LAN, public LAN, and cluster interconnects all enter the fault state. In this state, the Host OS and guest OS cannot be forcibly stopped, and no switchover of applications occurs.
    Note that if a double failure occurs on the NICs of a server, switchover of applications occurs because the node can be forcibly stopped from the other server.

  • Restriction on the timeout value of cluster interconnects
    In GLS Virtual NIC mode, it takes 20 seconds to switch a path, while a failure of the cluster interconnects is detected after 10 seconds (the default value). Therefore, with the default value, a single NIC failure is first detected as a failure of the cluster interconnects.
    To avoid this, change the timeout value (CLUSTER_TIMEOUT) to 40 seconds for the Host OS and 30 seconds for the guest OS; a sketch of this change appears after this list.
    Note that this change lengthens the time to detect failures of the cluster interconnects (from 10 seconds to 40 seconds).

  • Cluster switchover due to overload of the public LAN
    If a communication timeout of more than 30 seconds occurs, PRIMECLUSTER detects a failure of the cluster interconnects, forcibly stops the Host OS or guest OS, and a cluster switchover may occur.

  • Restriction on starting and stopping GLS and on restarting the system's network service
    Before stopping or starting GLS, or restarting the network service of the system, stop CF. For instructions on stopping CF, refer to Section "4.6 Starting and stopping CF".
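
The following is a minimal sketch of the CLUSTER_TIMEOUT change described above. It assumes the tunable is kept in /etc/default/cluster.config and reloaded with the cfset command; verify the file location and value syntax in your environment before applying it.

  /etc/default/cluster.config (on the Host OS; use 30 for the guest OS):
    CLUSTER_TIMEOUT "40"

  Reload and verify the tunables:
    # cfset -r
    # cfset -a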

VMware environment

When sharing a NIC between the administrative LAN and the cluster interconnects in a VMware environment, separate the networks allocated to the virtual machine by using VMware functions. In this configuration, CF can be configured from the GUI.