This section explains the initial cluster setup for PRIMECLUSTER.
For details on the setup methods, refer to the reference locations indicated in the table below.
No. | Details | Manual reference location*
---|---|---
1 | 3.7.1.1 Initial Setup of CF and CIP (setting up cluster configuration information and IP addresses) | CF "1.1 CF, CIP, and CIM configuration"
2 | | CF "7 Shutdown Facility (SF)"
3 | 3.7.1.3 Initial Setup of the Cluster Resource Management Facility | CF "3.1 Resource Database configuration"
*The PRIMECLUSTER manual name is abbreviated as follows:
CF: PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide
Refer to "5.1.1 Setting Up CF and CIP" in "PRIMECLUSTER Installation and Administration Guide" to set up CF and CIP.
In an FJcloud-O environment, only the SA_vmk5r shutdown agent is available for setup.
This section explains the method for setting up the SA_vmk5r shutdown agent as the shutdown facility.
For details on the survival priority, refer to "5.1.2.1 Survival Priority" in "PRIMECLUSTER Installation and Administration Guide."
Note
This setting is not necessary in a single-node cluster.
After setting up the shutdown agent, conduct a test for the forced stop of cluster nodes to make sure that the correct nodes can be forcibly stopped. For details of the test for the forced stop of cluster nodes, refer to "1.4 Test" in "PRIMECLUSTER Installation and Administration Guide."
The contents of the SA_vmk5r.cfg file and the rcsd.cfg file must be identical on all nodes. Otherwise, the shutdown facility will malfunction.
If you change the password of the user created in "3.1.1 Creating the User for the Forced Stop", perform this procedure again with the new password.
Be sure to perform the following operations on all nodes.
1. Set the shutdown daemon.
Create /etc/opt/SMAW/SMAWsf/rcsd.cfg with the following contents on all nodes in the cluster system.
CFNameX,weight=weight,admIP=myadmIP:agent=SA_vmk5r,timeout=125
CFNameX,weight=weight,admIP=myadmIP:agent=SA_vmk5r,timeout=125
CFNameX : Specify the CF node name of the cluster host.
weight : Specify the weight of the SF node.
myadmIP : Specify the IP address of the administrative LAN used in the shutdown facility of the cluster host. Only IPv4 addresses can be used. When specifying a host name, make sure that it is listed in /etc/hosts.
Example) The following is a setup example.
# cat /etc/opt/SMAW/SMAWsf/rcsd.cfg
node1,weight=1,admIP=192.168.1.1:agent=SA_vmk5r,timeout=125
node2,weight=1,admIP=192.168.1.2:agent=SA_vmk5r,timeout=125
After creating /etc/opt/SMAW/SMAWsf/rcsd.cfg, set the owner, group, and access rights as follows.
# chown root:root /etc/opt/SMAW/SMAWsf/rcsd.cfg
# chmod 600 /etc/opt/SMAW/SMAWsf/rcsd.cfg
Information
When creating the /etc/opt/SMAW/SMAWsf/rcsd.cfg file, the /etc/opt/SMAW/SMAWsf/rcsd.cfg.template file can be used as a template.
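The rcsd.cfg layout described above can be checked mechanically before the shutdown facility is started. The following is a minimal sketch, not a PRIMECLUSTER tool: the function name check_rcsd_cfg is hypothetical, and the pattern assumes that the weight is an integer, that every line uses the SA_vmk5r agent with timeout=125 exactly as in the example, and that the file contains no blank or comment lines.

```shell
# Hypothetical helper (not part of PRIMECLUSTER): list the lines of an
# rcsd.cfg-style file that do not follow the layout shown in this section:
#   CFNameX,weight=<integer>,admIP=<address or host>:agent=SA_vmk5r,timeout=125
# Returns 0 if every line matches, 1 if any line differs, 2 if unreadable.
check_rcsd_cfg() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    # grep -v selects the non-matching lines; -n prefixes their line numbers.
    # grep exits with 0 when it selected at least one (bad) line.
    if grep -n -v -E '^[^,:[:space:]]+,weight=[0-9]+,admIP=[^,:[:space:]]+:agent=SA_vmk5r,timeout=125$' "$1"; then
        return 1
    fi
    return 0
}
```

For example, check_rcsd_cfg /etc/opt/SMAW/SMAWsf/rcsd.cfg prints nothing and returns 0 for the setup example shown above.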
2. Encrypt the password.
Execute the sfcipher command to encrypt the password of the user for forcibly stopping the virtual server of FJcloud-O. For details on how to use the sfcipher command, refer to the manual page of "sfcipher."
# sfcipher -c
Example) The following is a setup example.
If the password is "k5admin$":
# sfcipher -c
Enter Password: <- Enter k5admin$
Re-Enter Password: <- Enter k5admin$
O/gm+AYuWwE7ow3dgVG/Nw==
3. Set the shutdown agent.
Create /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg with the following contents on all nodes in the cluster system.
Delimit each item with a single space.
CFNameX ServerName user passwd {cycle | leave-off}
CFNameX ServerName user passwd {cycle | leave-off}
CFNameX : Specify the CF node name of the cluster host.
ServerName : Specify the name of the virtual server in FJcloud-O on which the cluster host is running.
user : Specify the user name for forcibly stopping the virtual server in FJcloud-O.
passwd : Specify the password encrypted in step 2.
cycle : Restart the node after forcibly stopping it.
leave-off : Power off the node after forcibly stopping it.
Example) The following is a setup example.
This example shows the following settings:
- The CF node names of the cluster host are node1 and node2.
- The virtual server names are vm1 and vm2.
- The user name to forcibly stop the virtual servers is pcl.
- The node will be restarted when it is forcibly stopped.
# cat /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
node1 vm1 pcl O/gm+AYuWwE7ow3dgVG/Nw== cycle
node2 vm2 pcl O/gm+AYuWwE7ow3dgVG/Nw== cycle
After creating /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg, set the owner, group, and access rights as follows.
# chown root:root /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
# chmod 600 /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg
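The owner, group, and access rights set above can be confirmed with a short check, which applies equally to rcsd.cfg. This is only a sketch: check_cfg_perms is a hypothetical name, and it assumes GNU stat (the -c option), as found on typical Linux distributions.

```shell
# Hypothetical helper (not a PRIMECLUSTER command): verify that a
# configuration file has the owner, group, and mode that this procedure sets.
# Assumes GNU stat, whose -c option prints a custom format string.
check_cfg_perms() {
    file=$1
    want_owner=${2:-root:root}   # expected owner:group (default from this procedure)
    want_mode=${3:-600}          # expected octal access rights
    actual=$(stat -c '%U:%G %a' "$file") || return 2
    if [ "$actual" != "$want_owner $want_mode" ]; then
        echo "$file: found '$actual', expected '$want_owner $want_mode'" >&2
        return 1
    fi
}
```

For example, check_cfg_perms /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg returns 0 only when the file is owned by root:root with mode 600.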
Note
Make sure that the /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg file is set correctly. If the setting is incorrect, the shutdown facility cannot operate normally.
Make sure that, in the /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg file, each CF node name (CFNameX) is paired with the virtual server name (ServerName) on which that cluster host is running. If the pairing is incorrect, the wrong node will be forcibly stopped.
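Part of this check can be automated. The sketch below verifies only the five-field, single-space layout and flags CF node names or virtual server names that appear more than once. The function name check_sa_vmk5r_cfg is hypothetical, the duplicate check is an assumption about what a sensible file looks like, and the script cannot confirm that a node name is paired with the right virtual server. That still has to be verified against the FJcloud-O environment.

```shell
# Hypothetical helper (not part of PRIMECLUSTER): check the layout of an
# SA_vmk5r.cfg-style file:
#   CFNameX ServerName user passwd {cycle|leave-off}
# with exactly one space between items.
# Returns 0 if no problem is found, 1 otherwise, 2 if the file is unreadable.
check_sa_vmk5r_cfg() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    awk '
        # Exactly five items, single spaces only, valid final keyword
        !/^[^ \t]+ [^ \t]+ [^ \t]+ [^ \t]+ (cycle|leave-off)$/ {
            printf "line %d: expected 5 items delimited by single spaces\n", NR
            bad = 1
            next
        }
        # A repeated node or server name suggests a copy-and-paste mistake
        seen_node[$1]++ { printf "line %d: duplicate CF node name %s\n", NR, $1; bad = 1 }
        seen_vm[$2]++   { printf "line %d: duplicate virtual server name %s\n", NR, $2; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_sa_vmk5r_cfg /etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg on the setup example above prints nothing and returns 0.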
4. Start the shutdown facility.
Check if the shutdown facility has been started on all nodes in the cluster system.
# sdtool -s
On a node where the shutdown facility has already been started, execute the following commands to restart the shutdown facility.
# sdtool -e
# sdtool -b
On a node where the shutdown facility has not been started, execute the following command to start the shutdown facility.
# sdtool -b
Information
You can check whether the shutdown facility has already been started with the sdtool -s command. If "The RCSD is not running" is displayed, the shutdown facility has not been started.
5. Check the status of the shutdown facility.
Execute the following command on all nodes in the cluster system to check the status of the shutdown facility.
# sdtool -s
Note
If "The RCSD is not running" is displayed, there is a failure in the shutdown daemon or shutdown agent settings. Perform steps 1 to 4 again.
The password of the user created in "3.1.1 Creating the User for the Forced Stop" must be changed periodically (every 90 days). For the procedure on changing a password, refer to "6.1 Changing a Password Periodically."
If you change the name of the virtual server created in "3.1.4 Creating the Virtual Server for the Cluster Node", perform steps 3 to 5 again.
Information
Display results of the sdtool -s command
If Unknown or Init-ing is displayed in Init State, wait for about one minute, and then check the status again.
If Unknown is displayed in Shut State, it means that SF has not yet stopped the node. If Unknown is displayed in Init State, it means that SF has not yet initialized SA or tested the route. Unknown is displayed temporarily in Test State or Init State until the actual status can be confirmed.
If TestFailed is displayed in Test State, it means that a problem occurred while the agent was testing whether or not the node displayed in the Cluster Host field could be stopped. Some sort of problem probably occurred in the software, hardware, or network resources being used by that agent.
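The states described in this Information block can be summarized by a small filter over the sdtool -s output. This is a sketch under a stated assumption: the column layout of sdtool -s is not shown in this section, so the code assumes that each node line ends with the Test State and Init State values as its last two whitespace-separated fields. The function name classify_sf_status and the four summary words are hypothetical. Unknown in Shut State is deliberately ignored, because it only means that SF has not yet stopped the node.

```shell
# Hypothetical filter: read "sdtool -s" output on standard input and print
# one summary word.
# ASSUMPTION: the last two fields of each node line are Test State and
# Init State, in that order.
classify_sf_status() {
    awk '
        /The RCSD is not running/ { notrunning = 1 }
        NF < 2 { next }    # skip blank or one-word lines
        $NF == "InitFailed" || $(NF-1) == "TestFailed" { failed = 1 }
        $NF == "Unknown" || $NF == "Init-ing"          { pending = 1 }
        END {
            if (notrunning)   print "not-running"
            else if (failed)  print "failed"
            else if (pending) print "pending"   # wait about one minute, then check again
            else              print "ok"
        }
    '
}
```

It would be used as sdtool -s 2>&1 | classify_sf_status. A result of pending corresponds to waiting for about one minute and checking the status again.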
If InitFailed is displayed in Init State, communication with the endpoint of the regional user management or the compute (standard service) in FJcloud-O may be unavailable, or the settings may be incorrect. Check the following items and then configure the settings again.
After the failure-causing problem is resolved and SF is restarted, the status display changes to InitWorked or TestWorked.
Execute the following command and check if the virtual server on which the cluster host is running can communicate with the endpoint of the regional user management.
# curl -k -s -X GET <URL of the endpoint of the regional user management>/v3/
If an error occurs, check the following.
- The necessary OS patch must be applied.
If the curl version displayed by executing rpm -q curl is 7.19.7-43 or earlier, the necessary OS patch has not been applied. Perform "3.1.4.6 Application of the Necessary OS Patch."
- .curlrc must be created.
Refer to "3.1.4.7 Creating .curlrc" to make sure that .curlrc is created according to the procedure.
- The security groups and the firewall service in FJcloud-O must be set properly.
- The virtual router of FJcloud-O must be created.
- The default router of the cluster host must be set in the virtual router.
- The URL of the endpoint of the regional user management must be correct.
- The DNS server used in the cluster host must be set.
Execute the following command and check if the virtual server on which the cluster host is running can communicate with the endpoint of the compute (standard service).
# curl -k -s -X GET <URL of the endpoint of the compute (standard service)>/v2/
If the following 404 message is displayed, communication is working normally.
{"nova_error":{"message":"{\"error\": {\"message\": \"Could not find token, .\" , \"code\": 404, \"title\": \"Not Found\"}}","request_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"}}
If the following 401 message is displayed, check the following.
{"error": {"message": "The request you have made requires authentication.", "code": 401, "title": "Unauthorized"}}
- The user name and password for forcibly stopping the virtual server must be correct.
- The password of the user for forcibly stopping the virtual server must not have expired (90 days).
If it has expired, refer to "6.1 Changing a Password Periodically" and change the password.
- An appropriate role must be set for the user for forcibly stopping the virtual server.
Refer to "3.1.1 Creating the User for the Forced Stop" and make sure that the role is set.
If a message other than the above 404 message or the above 401 message is displayed, check the following.
- The security groups and the firewall service in FJcloud-O must be set properly.
- The virtual router of FJcloud-O must be created.
- The default router of the cluster host must be set in the virtual router.
- The URL of the endpoint of the compute (standard service) must be correct.
- The DNS server used in the cluster host must be set.
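The three cases for the compute (standard service) endpoint can be told apart by a small filter over the response body. This is a sketch only: classify_compute_response and its output words are hypothetical, and the matching relies on the "code" values appearing as in the two messages shown above. Backslashes and spaces are stripped first, because the 404 body carries its error as an escaped string.

```shell
# Hypothetical filter: read the response body of the compute endpoint check
# on standard input and print which case of this procedure applies.
classify_compute_response() {
    # Remove backslashes and spaces so that both \"code\": 404 and
    # "code": 401 reduce to the form "code":NNN
    body=$(tr -d '\\ ')
    case "$body" in
        *'"code":404'*) echo "normal" ;;           # expected 404 message
        *'"code":401'*) echo "check-user" ;;       # user name, password, expiry, role
        *)              echo "check-network" ;;    # security groups, router, URL, DNS
    esac
}
```

It would be combined with the command above, for example: curl -k -s -X GET <URL of the endpoint of the compute (standard service)>/v2/ | classify_compute_response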
Make sure that the following settings are correct:
- The domain name, project name, URL of the endpoint of the regional user management, and URL of the endpoint of the compute (standard service) in the FJcloud-O environment information file (/opt/SMAW/SMAWRrms/etc/k5_endpoint.cfg)
- The CF node name, virtual server name, user name, and encrypted password in the configuration file of the shutdown agent (/etc/opt/SMAW/SMAWsf/SA_vmk5r.cfg)