This section describes the initial cluster setup for PRIMECLUSTER.
For details on the setup methods, refer to the reference locations indicated in the table below.
No. | Details | Manual reference location* |
---|---|---|
1 | 15.8.1.1 Initial Setup of CF and CIP (setting up cluster configuration information and IP addresses) | CF "1.1 CF, CIP, and CIM configuration" |
2 | Setting up the shutdown facility | CF "7 Shutdown Facility (SF)" |
3 | 15.8.1.3 Initial Setup of the Cluster Resource Management Facility | CF "3.1 Resource Database configuration" |
*The PRIMECLUSTER manual name is abbreviated as follows:
CF: PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide
Refer to "5.1.1 Setting Up CF and CIP" in "PRIMECLUSTER Installation and Administration Guide" to set up CF and CIP.
For the IP interconnect, use the tagged VLAN interface created in "15.7.1 Initial GLS Setup."
However, in an FJcloud-Baremetal environment, CF cannot be built with Cluster Admin. For details on how to set up CF, refer to the configuration procedure in a cloud environment described in "1.1.6 Example of CF configuration by CLI" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."
In an FJcloud-Baremetal environment, only the SA_vmk5r shutdown agent is available for setup.
This section describes the method for setting up the SA_vmk5r shutdown agent as the shutdown facility.
For details on the survival priority, refer to "5.1.2.1 Survival Priority" in "PRIMECLUSTER Installation and Administration Guide."
Note
After setting up the shutdown agent, conduct a test for the forced stop of cluster nodes to make sure that the correct nodes can be forcibly stopped. For details of the test for the forced stop of cluster nodes, refer to "1.4 Test" in "PRIMECLUSTER Installation and Administration Guide."
After the node has been forcibly stopped successfully, make sure that the following messages are not output to /var/log/messages on that node.
systemd-logind: Power key pressed.
systemd-logind: Powering Off...
systemd-logind: System is powering down.
The contents of the SA_vmk5r.cfg file and the rcsd.cfg file must be identical on all nodes. If they differ, the shutdown facility will malfunction.
If you changed the password of the user created in "15.1.1 Creating the User for the Forced Stop", perform this step again with the new password.
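The log check described in the note above can be sketched as a small shell function. This is a minimal sketch, not part of PRIMECLUSTER: the function name has_power_key_messages is hypothetical, and the pattern assumes that each log line contains "systemd-logind", optionally followed by a process ID in brackets, before the message text.

```shell
# Hypothetical helper: succeed (exit status 0) if the given log file contains
# any of the three systemd-logind messages listed in the note.
has_power_key_messages() {
    # $1: path to the log file, for example /var/log/messages
    grep -E -q 'systemd-logind.*: (Power key pressed\.|Powering Off\.\.\.|System is powering down\.)' "$1"
}

# Possible use after the forced-stop test (commented out; reading
# /var/log/messages normally requires root):
#   if has_power_key_messages /var/log/messages; then
#       echo "Unexpected power key messages found" >&2
#   fi
```

A zero exit status means that at least one of the messages was found, which is the case the note asks you to rule out.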
Be sure to perform the following operations on all nodes.
1. Set the shutdown daemon.
Create /etc/opt/SMAW/SMAWsf/rcsd.cfg with the following contents on all nodes in the cluster system.
CFNameX,weight=weight,admIP=myadmIP:agent=SA_vmk5r,timeout=90
CFNameX,weight=weight,admIP=myadmIP:agent=SA_vmk5r,timeout=90
CFNameX : Specify the CF node name of the cluster host.
weight : Specify the weight of the SF node.
myadmIP : Specify the IP address of the administrative LAN used in the shutdown facility of the cluster host. Only IPv4 addresses can be used. When specifying a host name, make sure that it is listed in /etc/hosts.
Example) The following is a setup example.
# cat /etc/opt/SMAW/SMAWsf/rcsd.cfg
node1,weight=1,admIP=192.168.1.1:agent=SA_vmk5r,timeout=90
node2,weight=1,admIP=192.168.1.2:agent=SA_vmk5r,timeout=90
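A quick format check of the file can help catch typing mistakes before the shutdown facility is started. The following is a minimal sketch under stated assumptions: the function name check_rcsd_cfg is hypothetical, the weight and timeout values are assumed to be plain integers, and blank lines are reported as malformed.

```shell
# Hypothetical helper: print every line (with its line number) that does not
# match the SA_vmk5r line format shown above.
# Exit status 0 means that no malformed line was found.
check_rcsd_cfg() {
    # $1: path to the file to check
    pattern='^[^,[:space:]]+,weight=[0-9]+,admIP=[^:[:space:]]+:agent=SA_vmk5r,timeout=[0-9]+$'
    # grep -v selects the lines that do NOT match; -n prefixes line numbers.
    if grep -n -E -v "$pattern" "$1"; then
        return 1    # at least one malformed line was printed
    fi
    return 0
}

# Possible use:
#   check_rcsd_cfg /etc/opt/SMAW/SMAWsf/rcsd.cfg && echo "format looks fine"
```

The check covers only the line syntax. It does not verify that the CF node names or IP addresses are the right ones for the cluster.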
Create /etc/opt/SMAW/SMAWsf/rcsd.cfg and then set the owner, group, and access rights as follows.
# chown root:root /etc/opt/SMAW/SMAWsf/rcsd.cfg
# chmod 600 /etc/opt/SMAW/SMAWsf/rcsd.cfg
Information
When creating the /etc/opt/SMAW/SMAWsf/rcsd.cfg file, the /etc/opt/SMAW/SMAWsf/rcsd.cfg.template file can be used as a template.
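The owner, group, and access rights set in this step can be verified afterwards. The sketch below assumes GNU stat (the -c option), which is typical on Linux. The function name check_owner_mode is hypothetical.

```shell
# Hypothetical helper: succeed only if the file has the expected
# owner:group and octal access mode. Relies on GNU stat (-c).
check_owner_mode() {
    # $1: file, $2: expected owner:group, $3: expected octal mode
    actual=$(stat -c '%U:%G %a' "$1") || return 2   # 2: stat itself failed
    [ "$actual" = "$2 $3" ]
}

# Possible use for the file created in this step:
#   check_owner_mode /etc/opt/SMAW/SMAWsf/rcsd.cfg root:root 600 \
#       || echo "owner, group, or mode differs from root:root 600" >&2
```

The same function can be applied to any other file that must be root:root with mode 600.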
2. Encrypt the password.
Execute the sfcipher command to encrypt the password of the user for forcibly stopping the Bare Metal server of FJcloud-Baremetal. For details on how to use the sfcipher command, refer to the manual page of "sfcipher."
# sfcipher -c
Example) The following is a setup example.
If the password is "k5admin$":
# sfcipher -c
Enter Password: <- Enter k5admin$
Re-Enter Password: <- Enter k5admin$
O/gm+AYuWwE7ow3dgVG/Nw==
3. Set the shutdown agent.
Create /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg with the following contents on all nodes in the cluster system.
Delimit each item with a single space.
CFNameX ServerName user passwd {cycle | leave-off}
CFNameX ServerName user passwd {cycle | leave-off}
CFNameX : Specify the CF node name of the cluster host.
ServerName : Specify the Bare Metal server name in FJcloud-Baremetal on which the cluster host is running. For the names of Bare Metal servers that use PRIMECLUSTER, only the following ASCII characters can be used. Do not use other characters.
- Uppercase letters
- Lowercase letters
- Numbers
- "_" (Underscore)
- "-" (Hyphen)
user : Specify a user name for forcibly stopping the Bare Metal server.
passwd : Specify the password encrypted in step 2.
cycle : Restart the node after forcibly stopping the node.
leave-off : Power off the node after forcibly stopping the node.
Example) The following is a setup example.
This example shows the following settings:
- The CF node names of the cluster host are node1 and node2.
- The Bare Metal server names are vm1 and vm2.
- The user name to forcibly stop the Bare Metal server is pcl.
- The node will be restarted when it is forcibly stopped.
# cat /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
node1 vm1 pcl O/gm+AYuWwE7ow3dgVG/Nw== cycle
node2 vm2 pcl O/gm+AYuWwE7ow3dgVG/Nw== cycle
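Because the items must be delimited with a single space and the Bare Metal server name is limited to a specific character set, a line-format check can be useful. The following is a minimal sketch: the function name check_sa_vmk5r_cfg is hypothetical, and blank lines are reported as malformed.

```shell
# Hypothetical helper: print every line (with its line number) that is not of
# the form "CFNameX ServerName user passwd cycle|leave-off" with exactly one
# space between items. Exit status 0 means that no malformed line was found.
check_sa_vmk5r_cfg() {
    # $1: path to the file to check
    # ServerName may contain only letters, numbers, "_" and "-".
    pattern='^[^[:space:]]+ [A-Za-z0-9_-]+ [^[:space:]]+ [^[:space:]]+ (cycle|leave-off)$'
    if grep -n -E -v "$pattern" "$1"; then
        return 1    # at least one malformed line was printed
    fi
    return 0
}

# Possible use:
#   check_sa_vmk5r_cfg /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
```

This checks only the syntax. Whether each CF node name is paired with the correct Bare Metal server still has to be confirmed by hand.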
Create /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg and then set the owner, group, and access rights as follows.
# chown root:root /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
# chmod 600 /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
Note
Make sure that the /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg file is set correctly. If the setting is incorrect, the shutdown facility cannot operate normally.
Make sure that each CF node name (CFNameX) in the /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg file is paired with the Bare Metal server name (ServerName) on which that cluster host is running. If the pairing is incorrect, the wrong node will be forcibly stopped.
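Since the rcsd.cfg and SA_vmk5r.cfg files must have the same contents on every node, one way to compare them is to compute a digest on each node and compare the results. The following is a minimal sketch that assumes sha256sum from GNU coreutils is installed. The function name config_fingerprint is hypothetical.

```shell
# Hypothetical helper: print a single SHA-256 digest covering the contents of
# all given files, in the order given.
config_fingerprint() {
    # Refuse to continue if any file cannot be read, so that a missing file
    # does not silently produce a digest of partial input.
    for f in "$@"; do
        [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    done
    cat "$@" | sha256sum | cut -d ' ' -f 1
}

# Possible use, run on each node and compared by eye:
#   config_fingerprint /etc/opt/SMAW/SMAWsf/rcsd.cfg \
#                      /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
```

If the printed digests differ between nodes, at least one of the two files differs and should be corrected before starting the shutdown facility.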
4. Start the shutdown facility.
Check if the shutdown facility has been started on all nodes in the cluster system.
# sdtool -s
On a node where the shutdown facility has already been started, execute the following commands to restart the shutdown facility.
# sdtool -e
# sdtool -b
On a node where the shutdown facility has not been started, execute the following command to start the shutdown facility.
# sdtool -b
Information
You can check if the shutdown facility has already been started with the sdtool -s command. If "The RCSD is not running" is displayed, the shutdown facility is not started.
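The choice between starting and restarting can be sketched as a small decision function that reads the output of sdtool -s. This is a sketch under assumptions, not a supported tool: the function name sf_action is hypothetical, and it is assumed that the "The RCSD is not running" message may appear on either standard output or standard error, which is why the usage example merges both streams.

```shell
# Hypothetical helper: read `sdtool -s` output on standard input and print
# the action that this step calls for.
sf_action() {
    if grep -q 'The RCSD is not running'; then
        echo start      # not started yet: sdtool -b
    else
        echo restart    # already started: sdtool -e, then sdtool -b
    fi
}

# Possible use on a cluster node (commented out so that the sketch itself
# has no side effects):
#   case "$(sdtool -s 2>&1 | sf_action)" in
#       start)   sdtool -b ;;
#       restart) sdtool -e && sdtool -b ;;
#   esac
```

Note that any output other than the "not running" message is treated as "already started", so the function should only be fed real sdtool -s output.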
5. Check the status of the shutdown facility.
Execute the following command on all nodes in the cluster system to check the status of the shutdown facility.
# sdtool -s
Note
If "The RCSD is not running" is displayed, there is an error in the shutdown daemon or shutdown agent settings. Perform steps 1 to 4 again.
The password of the user created in "15.1.1 Creating the User for the Forced Stop" must be changed periodically (every 90 days). For the procedure on changing a password, refer to "18.1 Changing a Password Periodically."
If you changed the Bare Metal server name created in "15.1.2 Building the Bare Metal Server", perform steps 3 to 5 again.
Information
Display results of the sdtool -s command
If Unknown or Init-ing is displayed in Init State, wait for about one minute, and then check the status again.
If Unknown is displayed in Shut State, it means that SF has not yet stopped the node. If Unknown is displayed in Init State, it means that SF has not yet initialized SA or tested the route. Unknown is displayed temporarily in Test State or Init State until the actual status can be confirmed.
If TestFailed is displayed in Test State, it means that a problem occurred while the agent was testing whether or not the node displayed in the Cluster Host field could be stopped. Some sort of problem probably occurred in the software, hardware, or network resources being used by that agent.
If InitFailed is displayed in Init State, communication with the endpoint of the regional user management or the compute (standard service) in FJcloud-Baremetal may not be possible, or the settings may be incorrect. Check the following items and correct the settings as needed.
After the failure-causing problem is resolved and SF is restarted, the status display changes to InitWorked or TestWorked.
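The state names described above can be picked out of the sdtool -s output with a simple text search. The sketch below does not depend on the column layout of the output, which is not reproduced here. The function name sf_state_summary and the three result words are hypothetical. Unknown is deliberately not searched for, because it can legitimately appear in Shut State.

```shell
# Hypothetical helper: read `sdtool -s` output on standard input and print
# a one-word summary based on the state names described above.
sf_state_summary() {
    out=$(cat)
    if printf '%s\n' "$out" | grep -q -E 'InitFailed|TestFailed'; then
        echo failed             # investigate the agent settings and network
    elif printf '%s\n' "$out" | grep -q 'Init-ing'; then
        echo initializing       # wait about one minute and check again
    else
        echo no-failure-found   # no InitFailed, TestFailed, or Init-ing seen
    fi
}

# Possible use on a cluster node:
#   sdtool -s | sf_state_summary
```

A result of no-failure-found only means that none of the searched state names appeared. The full output should still be read to confirm InitWorked and TestWorked for every cluster host.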
Execute the following command and check if the Bare Metal server on which the cluster host is running can communicate with the endpoint of the regional user management.
# curl -k -s -X GET <URL of the endpoint of the regional user management>/v3/
If an error occurs, check the following.
- The security groups, the firewall service, and the OS firewall in FJcloud-Baremetal must be set properly.
- The virtual router of FJcloud-Baremetal must be created.
- The default router of the cluster host must be set in the virtual router.
- The URL of the endpoint of the regional user management must be correct.
- The DNS server used in the cluster host must be set.
Execute the following command and check if the Bare Metal server on which the cluster host is running can communicate with the endpoint of the compute (standard service).
# curl -k -s -X GET <URL of the endpoint of the compute (standard service)>/v2/
If the following message is displayed, communication is working normally.
{"error": {"message": "The request you have made requires authentication.", "code": 401, "title": "Unauthorized"}}
If a message other than the above message was displayed, check the following.
- The security groups, the firewall service, and the OS firewall in FJcloud-Baremetal must be set properly.
- The virtual router of FJcloud-Baremetal must be created.
- The default router of the cluster host must be set in the virtual router.
- The URL of the endpoint of the compute (standard service) must be correct.
- The DNS server used in the cluster host must be set.
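The check of the compute (standard service) response can be sketched as a filter on the response body. This is a minimal sketch that only looks for the 401 code shown in the normal message above. The function name is_expected_401 and the COMPUTE_URL variable are hypothetical.

```shell
# Hypothetical helper: read the response body on standard input and succeed
# if it contains the "code": 401 field shown in the normal message.
is_expected_401() {
    grep -q -E '"code":[[:space:]]*401'
}

# Possible use (COMPUTE_URL is a placeholder for the URL of the endpoint of
# the compute (standard service)):
#   if curl -k -s -X GET "$COMPUTE_URL/v2/" | is_expected_401; then
#       echo "endpoint reachable"
#   else
#       echo "unexpected response: review the items in the list above" >&2
#   fi
```

An empty response, for example when curl cannot connect at all, is treated as unexpected, which matches the guidance to review the network settings.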
Make sure that the following settings are correct:
- The domain name (contractor number), project name, URL of the endpoint for the regional user management, and URL of the endpoint for the compute (standard service) for the FJcloud-Baremetal environment information file (/opt/SMAW/SMAWRrms/etc/k5_endpoint.cfg)
- The CF node name, Bare Metal server name, user name, and encrypted password in the configuration file of the shutdown agent (/etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg)
Refer to "5.1.3 Initial Setup of the Cluster Resource Management Facility" in "PRIMECLUSTER Installation and Administration Guide" to set up the resource database managed by the cluster resource management facility.