
5.1.2 Configuring the Shutdown Facility

This section explains the procedure for configuring the shutdown facility with the shutdown configuration wizard.

The configuration procedure for the shutdown facility varies depending on the machine type. Check the machine type of hardware and set an appropriate shutdown agent.

The following table shows the shutdown agent necessary by machine type.

Table 5.1 Shutdown agents necessary by machine type

Server machine type name                           | XSCF SNMP              | RCI         | XSCF              | ALOM  | ILOM
                                                   | Panic/Reset/PPAR Reset | Panic/Reset | Panic/Reset/Break | Break | Panic/Reset
---------------------------------------------------+------------------------+-------------+-------------------+-------+------------
SPARC Servers: SPARC M10, SPARC M12                | Y                      | N           | N                 | N     | N
SPARC Enterprise M3000/M4000/M5000/M8000/M9000:    |                        |             |                   |       |
  Japan, Fujitsu                                   | N                      | Y           | Y                 | N     | N
  Japan, other than Fujitsu                        | N                      | N           | Y                 | N     | N
  Other than Japan                                 | N                      | N           | Y                 | N     | N
SPARC Enterprise T1000, T2000                      | N                      | N           | N                 | Y     | N
SPARC Enterprise T5120, T5220, T5140, T5240, T5440 | N                      | N           | N                 | N     | Y (*1)
SPARC T3 series                                    | N                      | N           | N                 | N     | Y
SPARC T4 series                                    | N                      | N           | N                 | N     | Y
SPARC T5 series                                    | N                      | N           | N                 | N     | Y
SPARC T7 series                                    | N                      | N           | N                 | N     | Y
SPARC S7 series                                    | N                      | N           | N                 | N     | Y

Y: Required  N: Not required
(*1) When using ILOM Reset, you need the firmware for SPARC Enterprise servers (System Firmware 7.1.6.d or later).


The following tables show the shutdown agents necessary for virtualized environments.

Table 5.2 Shutdown agents necessary for virtualized environments (Oracle VM Server for SPARC environment)

Server machine type name            | XSCF SNMP                                       | ILOM           | ICMP
                                    | Control domain               | Guest domain     | Control domain | Guest domain
                                    | Panic  | Reset  | PPAR Reset | Panic  | Reset   | Panic | Reset  | Check
------------------------------------+--------+--------+------------+--------+---------+-------+--------+-------
SPARC Servers: SPARC M10, SPARC M12 | Y (*1) | Y (*1) | Y (*1)     | Y (*2) | Y (*2)  | N     | N      | Y (*3)
SPARC T3, T4, T5, T7, S7 series     | N      | N      | N          | N      | N       | Y     | Y      | N
Y: Required N: Not required
(*1) Required if used in the cluster between control domains.
(*2) Required if used in the cluster between guest domains, or if the I/O fencing function and the XSCF SNMP shutdown agent are used in combination.
(*3) Required if the I/O fencing function and the ICMP shutdown agent are used in combination.

Table 5.3 Shutdown agents necessary for virtualized environments (Oracle Solaris Kernel Zones environment)

Server machine type name            | KZONE                 | XSCF SNMP                                        | ILOM
                                    | Kernel Zone           | Control domain               | Guest domain      | Control domain
                                    | Panic | Reset | Check | Panic  | Reset  | PPAR Reset | Panic | Reset     | Panic  | Reset
------------------------------------+-------+-------+-------+--------+--------+------------+-------+-----------+--------+--------
SPARC Servers: SPARC M10, SPARC M12 | Y     | Y     | Y     | Y (*1) | Y (*1) | Y (*1)     | Y     | Y         | N      | N
SPARC T4, T5, T7, S7 series         | Y     | Y     | Y     | N      | N      | N          | N     | N         | Y (*1) | Y (*1)

Y: Required N: Not required
(*1) The shutdown facility does not need to be configured if you configure the cluster system between Kernel Zones within the same physical partition.

Note

  • When you are operating the shutdown facility by using one of the following shutdown agents, do not use the console.

    • XSCF Panic

    • XSCF Reset

    • XSCF Break

    • ILOM Panic

    • ILOM Reset

    If you cannot avoid using the console, stop the shutdown facility on all nodes beforehand. After using the console, disconnect the console, start the shutdown facility on all nodes, and then check that the status is normal. For details on stopping, starting, and checking the status of the shutdown facility, see the sdtool(1M) manual page (a sketch follows these notes).

  • In the /etc/inet/hosts file, you must define the IP addresses and host names of the administrative LAN used by the shutdown facility for all nodes. Check that the IP addresses and host names of all nodes are defined (see the example after these notes).

  • When you set up asynchronous RCI monitoring, you must specify the timeout interval (kernel parameter) in /etc/system for monitoring via SCF/RCI. For kernel parameter settings, see the section "3.2.3 Checking and Setting the Kernel Parameters."

  • If a node's AC power supply is suddenly disconnected during operation of the cluster system, PRIMECLUSTER may disconnect the console after putting the node whose power supply was cut into the LEFTCLUSTER status. In this case, after confirming that the node's power supply is in fact disconnected, cancel the LEFTCLUSTER status with the cftool -k command. Afterwards, reconnect the console and switch on the power supply to the node.

  • If the SCF/RCI is malfunctioning, or if a hardware error such as a disconnected RCI cable or duplicate RCI address settings is detected, it takes a maximum of 10 minutes (from the time that the error is detected or the shutdown facility is started) until those statuses are reflected in the sdtool -s display or the shutdown facility status display screen.

  • After setting the shutdown agent, conduct the cluster node forced stop test to check that the cluster nodes have undergone a forced stop correctly. For details on the cluster node forced stop test, see "1.4 Test."

  • For using the Migration function of Oracle VM Server for SPARC, see "Chapter 17 When Using the Migration Function in Oracle VM Server for SPARC Environment."

  • To make the administrative LAN used by the shutdown facility redundant with GLS, use the logical IP address takeover function of NIC switching mode, and configure the physical IP address for the administrative LAN of the shutdown facility.
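
For reference, two hedged sketches for the notes above; all host names and addresses are hypothetical examples, not values from this manual. A minimal /etc/inet/hosts fragment for a two-node cluster:

    192.168.1.1     node1
    192.168.1.2     node2

Stopping, starting, and checking the shutdown facility with sdtool(1M) (option letters as commonly documented for PRIMECLUSTER; confirm against the sdtool(1M) manual page):

    # /opt/SMAW/bin/sdtool -e     <- stop the shutdown facility on the local node
    # /opt/SMAW/bin/sdtool -b     <- start the shutdown facility on the local node
    # /opt/SMAW/bin/sdtool -s     <- display the status of the shutdown facility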

See

For details on the shutdown facility and the asynchronous monitoring function, refer to the following manuals:

  • "2.3.5 PRIMECLUSTER SF" in "PRIMECLUSTER Concepts Guide."

  • "Chapter 7 Shutdown Facility" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide".

5.1.2.1 For SPARC M10 and M12

When using the I/O fencing function and the ICMP shutdown agent in combination, refer to "5.1.2.6 Using ICMP Shutdown Agent in SPARC M10 and M12."

5.1.2.1.1 Checking XSCF Information

The SNMP asynchronous monitoring function of the shutdown facility uses XSCF.

The connection method to XSCF can be either SSH or telnet. The default is SSH.

Create the login user account for the shutdown facility in XSCF before setting the shutdown facility.
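
For reference, a hypothetical sketch of creating such an account on the XSCF shell; the user name "sfuser" and the platadm privilege are examples, not values prescribed by this manual. Confirm the commands and the required privileges against the XSCF documentation.

    XSCF> adduser sfuser
    XSCF> password sfuser
    XSCF> setprivileges sfuser platadm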

After that, make sure the following settings concerning XSCF are correctly set:

Note

A serial port connection alone to XSCF is not supported by the shutdown facility. Connect to XSCF via SSH or telnet by using XSCF-LAN.

Record the following information that is necessary for setting the shutdown facility.

(1) PPAR-ID

    Identification ID of the physical partition (PPAR) to which the logical domain of the cluster node belongs.
    In SPARC M10-1, M10-4, and M12-2, the value is "0".
    In SPARC M10-4S and M12-2S, the value is an integer ranging from 0 to 15.

    Execute the showpparstatus -a command on XSCF to display the status of all PPAR-IDs.
    In the output of the showpparstatus -a command, a one-digit PPAR-ID is displayed as two digits with a leading zero. In this case, record the one-digit value without the leading zero.

    Example: In the output of the showpparstatus -a command below, record "0" if the PPAR-ID is displayed as "00", and "1" if it is displayed as "01".

    XSCF> showpparstatus -a
    PPAR-ID        PPAR Status
    00             Running
    01             Running
    XSCF>

(2) Domain-name

    Logical domain name of the cluster node.
    Execute the virtinfo -a command on each node, and record the logical domain name.

    # virtinfo -a
    Domain role: LDoms control I/O service root
    Domain name: primary
                 ^^^^^^^ logical domain name
    Domain UUID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    Control domain: xxxxx
    Chassis serial#: xxxxxxxxxx
    #

(3) XSCF-name1

    Host name or IP address for XSCF-LAN#0 of the unit in which the logical domain of the cluster node exists. (*1, *2, *3)

(4) XSCF-name2

    Host name or IP address for XSCF-LAN#1 of the unit in which the logical domain of the cluster node exists. (*1, *2, *3)

(5) User-Name

    User name to log in to the unit in which the logical domain of the cluster node exists. (*4, *5)

(6) Password

    Password to log in to the unit in which the logical domain of the cluster node exists. (*4, *5)

(7) Administrative LAN

    Administrative LAN of the cluster node used by the shutdown facility.

(8) Asynchronous monitoring sub-LAN

    Asynchronous monitoring sub-LAN of the cluster node used by the shutdown facility. (*3)

*1) When network routing is set, the IP address of XSCF does not need to be in the same segment as the administrative LAN of the cluster node.

*2) For SPARC M10-4S and M12-2S, specify the XSCF takeover IP address.

*3) In a configuration where the asynchronous monitoring sub-LAN is not used, record the IP address of XSCF-LAN#0 as "XSCF-name1" and the host name corresponding to that IP address as "XSCF-name2." For such a configuration, see "2.2.2 XSCF Configuration in SPARC M10 and M12."

*4) In an environment where XSCF is duplexed, the combination of user name and password must be the same for both XSCFs.

*5) To use the Migration function, the combination of user name and password for the XSCF and the connection method to the XSCF must be consistent on all nodes.

See the figure below to check the information used to set the shutdown facility.

Figure 5.1 Information used to set the shutdown facility when configuring the cluster on the control domain (reference)

See

For information on how to configure and confirm XSCF, see the "Fujitsu SPARC M12 and Fujitsu M10/SPARC M10 System Operation and Administration Guide."

5.1.2.1.2 Setting SNMP

Make settings for SNMP to use the SNMP asynchronous monitoring function.

Note

Port numbers for SNMP need to be changed in the following case. For details, see "9.2.4 Changing Port Numbers for SNMP".

  • When the port number 9385 used by the shutdown facility overlaps with a port number of another product.

Setting up information related to the SNMP agent of XSCF

Set up the SNMP agent on all XSCFs in the cluster.

  1. Execute the showsnmp command to display SNMP settings.

    XSCF> showsnmp
  2. Execute the setsnmp command to set up the trap transmission for all nodes that make up the cluster.

    XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the administrative LAN on node1]
    XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the asynchronous monitoring sub-LAN on node1]
    XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the administrative LAN on node2]
    XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the asynchronous monitoring sub-LAN on node2]

    Example

    • XSCF on node1

      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the administrative LAN on node1]
      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the asynchronous monitoring sub-LAN on node1]
      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the administrative LAN on node2]
      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the asynchronous monitoring sub-LAN on node2]
    • XSCF on node2

      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the administrative LAN on node1]
      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the asynchronous monitoring sub-LAN on node1]
      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the administrative LAN on node2]
      XSCF> setsnmp addtraphost -t v2 -s FJSVcldev -p 9385 [IP address of the asynchronous monitoring sub-LAN on node2]
  3. Execute the setsnmp command to enable the SNMP agent.

    XSCF> setsnmp enable
  4. Execute the showsnmp command to check that the settings are enabled.

    XSCF> showsnmp

See

For information on how to configure and confirm XSCF related to SNMP agents, see the "Fujitsu SPARC M12 and Fujitsu M10/SPARC M10 System Operation and Administration Guide."

5.1.2.1.3 Using the Shutdown Configuration Wizard

Starting up the shutdown configuration wizard

From the CF main window of the Cluster Admin screen, select the Tool menu and then Shutdown Facility -> Configuration Wizard. The shutdown configuration wizard will start.

Note

You can also configure the shutdown facility immediately after you complete the CF configuration with the CF wizard.

The following confirmation popup screen will appear. Click Yes to start the shutdown configuration wizard.

Selecting a shutdown agent

The selection screen for the shutdown agent will appear.

Confirm the hardware machine type and select the appropriate shutdown agent.

After selection, click Next.

Information

If you select a shutdown agent, the timeout value is automatically set. For details on the timeout value of the shutdown agent, see "7.2.2 Configuration file of SF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."

  • For XSCF Domain Panic/XSCF Domain Reset/XSCF PPAR Reset

    Timeout value = 20 (seconds)

Configuring XSCF

The screen for entering the information of XSCF will appear.

Enter the settings for XSCF that you recorded in "5.1.2.1.1 Checking XSCF Information".

PPAR-ID

Specify the identification ID of the physical partition (PPAR) to which the logical domain of the cluster node belongs.

Make sure to enter "0" for SPARC M10-1, M10-4, and M12-2.

For SPARC M10-4S and M12-2S, specify an integer ranging from 0 to 15.

Domain-name

Specify the logical domain name of the cluster node.

During initial setup, the logical domain name acquired from each node will be displayed as the initial value.

When changing the settings, the previously set value will be displayed on the screen.

Check that the logical domain name displayed is correct.

Change the logical domain name if it is wrong.

Execute the virtinfo -a command on each node and enter the logical domain name displayed.

Specify a character string of up to 255 characters that starts with an alphabetic character and consists only of alphanumeric characters, "-" (hyphen), and "." (period).

XSCF-name1

Specify the host name or IP address for XSCF-LAN#0 of the cabinet in which the logical domain of the cluster node exists.

Available IP addresses are IPv4 addresses.

For SPARC M10-4S and M12-2S environments, specify the XSCF takeover IP address.

XSCF-name2

Specify the host name or IP address for XSCF-LAN#1 of the cabinet in which the logical domain of the cluster node exists.

Available IP addresses are IPv4 addresses.

For SPARC M10-4S and M12-2S environments, specify the XSCF takeover IP address.

User-Name

Enter the user name to log in to the XSCF of the cabinet where the logical domain of the cluster node exists.

Password

Enter the password to log in to the XSCF of the cabinet where the logical domain of the cluster node exists.

Note

  • In an environment where XSCF is duplexed, the combination of user name and password must be the same for both XSCFs.

  • To use the Migration function, set a combination of a user name and password for the XSCF and the connection method to the XSCF to be consistent on all nodes.

  • In a configuration where the asynchronous monitoring sub-LAN is not used, specify the IP address of XSCF-LAN#0 for "XSCF-name1" and the host name corresponding to that IP address for "XSCF-name2." For such a configuration, see "2.2.2 XSCF Configuration in SPARC M10 and M12."

Upon the completion of configuration, click Next.

Entering node weights and administrative IP addresses

The screen for entering the weights of the nodes and the IP addresses for the administrative LAN will appear.

Enter the weights of the nodes and the IP addresses for the administrative LAN.

Weight

Enter the weight of the node that constitutes the cluster. Weight is used to identify the survival priority of the node group that constitutes the cluster. Possible values for each node range from 1 to 300.
For details on survival priority and weight, refer to the explanations below.

Admin IP

Enter an IP address directly or click the tab to select the host name that is assigned to the administrative IP address.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

Upon the completion of configuration, click Next.

Survival priority

Even if a cluster partition occurs due to a failure in the cluster interconnect, all the nodes will still be able to access the user resources. For details on the cluster partition, see "1.2.2.1 Protecting data integrity" in "PRIMECLUSTER Concepts Guide".
To guarantee the consistency of the data constituting user resources, you have to determine the node groups to survive and those that are to be forcibly stopped.
The weight assigned to each node group is referred to as a "Survival priority" under PRIMECLUSTER.
The greater the weight of the node, the higher the survival priority. Conversely, the less the weight of the node, the lower the survival priority. If multiple node groups have the same survival priority, the node group that includes a node with the name that is first in alphabetical order will survive.

Survival priority can be calculated based on the following formula:

Survival priority = SF node weight + ShutdownPriority of userApplication
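
As a worked example with hypothetical values: if node1 has an SF node weight of 2 and runs a userApplication with ShutdownPriority 10, its survival priority is 2 + 10 = 12; if node2 has an SF node weight of 1 and runs no userApplication with a ShutdownPriority, its survival priority is 1 + 0 = 1, so the node group containing node1 survives.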

Note

When SF calculates the survival priority, each node sends its survival priority to the remote nodes via the administrative LAN. If a communication problem occurs on the administrative LAN, the survival priority may not reach the remote nodes. In this case, the survival priority is calculated only from the SF node weight.

SF node weight (Weight):

Weight of node. Default value = 1. Set this value while configuring the shutdown facility.

userApplication ShutdownPriority:

Set this attribute when userApplication is created. For details on how to change the settings, see "11.1 Changing the Operation Attributes of a Cluster Application".

See

For details on the ShutdownPriority attribute of userApplication, see "Attributes".

Survival scenarios

The typical scenarios that are implemented are shown below:

[Largest node group survival]
  • Set the weight of all nodes to 1 (default).

  • Set the attribute of ShutdownPriority of all user applications to 0 (default).

[Specific node survival]
  • Set the "weight" of the node to survive to a value more than double the total weight of the other nodes.

  • Set the ShutdownPriority attribute of all user applications to 0 (default).

    In the following example, node1 is to survive:

[Specific application survival]
  • Set the "weight" of all nodes to 1 (default).

  • Set the ShutdownPriority attribute of the user application whose operation is to continue to a value more than double the total of the ShutdownPriority attributes of the other user applications and the weights of all nodes.

    In the following example, the node for which app1 is operating is to survive:

[Combination of the cluster system between control domains and the cluster system between guest domains for specific control domain survival (recommended)]
  • Set the "weight" the nodes to a power of 2 (1,2,4,8,16,...) in ascending order of the survival priority on each cluster system..

  • The order relation of "weight" set for guest domains must be the same as the corresponding control domains.

    For example, if the survival priority of host1 is higher than that of host2 between control domains, the survival priority of node1 (corresponding to host1) must be higher than those of node2 to 4 (corresponding to host2) between guest domains.

  • Set the ShutdownPriority attribute of all user applications to 0 (default).

    In the following example, nodes are to survive in the order of node1, node2, node3, and node4.
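
    The figure for this example is not reproduced here. One hypothetical weight assignment that satisfies the rules above: between the control domains, host1 = 2 and host2 = 1; between the guest domains, node1 = 8, node2 = 4, node3 = 2, and node4 = 1.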

[Combination of the cluster system between control domains and the cluster system between guest domains for the largest control domain survival]

Note

  • If the physical partition is reset, note that operations in the cluster system between guest domains may stop.

  • Create the I/O root domain for this setting.

In the following example, "Specific node survival" is set for the guest domain.

In this case, in cluster 1 between guest domains, node 11 survives and node 12 is forcibly stopped, while in the cluster between control domains, node 2 and node 3 survive and node 1 is forcibly stopped. Note that if the physical partition of unit 0 is reset, operations in cluster 1 between guest domains will stop.

Saving the configuration

Confirm and then save the configuration.

In the left-hand panel of the window, those nodes that constitute the cluster are displayed. The SF node weight is displayed in brackets after those nodes. The timeout value of the shutdown agent is displayed in brackets after the shutdown agent.

Click Next. A popup screen will appear for confirmation.

Select Yes to save the setting.

Displaying the configuration of the shutdown facility

If you save the setting, a screen displaying the configuration of the shutdown facility will appear. On this screen, you can confirm the configuration of the shutdown facility on each node by selecting each node in turn.

Information

You can also view the configuration of the shutdown facility by selecting Shutdown Facility -> Show Status from the Tool menu.

Shut State

"Unknown" is shown during normal system operation. If an error occurs and the shutdown facility stops the relevant node successfully, "Unknown" will change to "KillWorked".

Test State

Indicates the state in which the path to shut down the node is tested when a node error occurs. If the test of the path has not been completed, "Unknown" will be displayed. If the configured shutdown agent operates normally, "Unknown" will be changed to "TestWorked".

Init State

Indicates the state in which the shutdown agent is initialized.

To exit the configuration wizard, click Finish. Click Yes in the confirmation popup screen that appears.

Note

On this screen, confirm that the shutdown facility is operating normally.

  • If "TestFailed" is displayed in the test state and the message number 7245 is logged in the /var/adm/messages file, the first SSH connection to XSCF has not completed yet. Connect to XSCF from all the cluster nodes via SSH, and complete the user inquiry (such as generation of RSA key) at the first SSH connection.

  • If "TestFailed" is displayed in the test state, the configuration information of the logical domains may not be saved. Use the ldm add-spconfig command to save the information when it is not saved.

  • If "InitFailed" is displayed in the Initial state even when the configuration of the shutdown facility has been completed or if "Unknown" is displayed in the Test state or "TestFailed" is highlighted in red, the agent or hardware configuration may contain an error. Check the /var/adm/messages file and the console for an error message. Then, apply appropriate countermeasures as instructed the message that is output.

  • If the shutdown facility is to connect to XSCF via telnet but the connection setting to XSCF is still SSH, the test state becomes TestFailed at this point. Confirm that the shutdown facility is operating normally after performing "5.1.2.1.4 Setting of the connection method to the XSCF".
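
For reference, hedged sketches for the notes above; the address, user name, and configuration name are hypothetical examples. Completing the first SSH connection to XSCF from a cluster node:

    # ssh -l sfuser 192.168.10.10
    The authenticity of host '192.168.10.10 (192.168.10.10)' can't be established.
    RSA key fingerprint is xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx.
    Are you sure you want to continue connecting (yes/no)? yes   <- Enter yes.

Saving the logical domain configuration with the ldm add-spconfig command:

    # ldm add-spconfig config_initial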

See

For details on how to respond to the error messages that may be output, see "PRIMECLUSTER Messages."

Checking if the SNMP trap is received

Check if the SNMP trap is received in all XSCFs that constitute a cluster system.

  1. Execute the following command in XSCF.

    The pseudo error notification trap is issued.

    XSCF> rastest -c test
  2. For the two IP addresses of the administrative LAN and the asynchronous monitoring sub-LAN of XSCF, check that the pseudo error notification trap is output to the /var/adm/messages file on all nodes. In an environment where the asynchronous monitoring sub-LAN is not used, the pseudo error notification trap is output only for the IP address of the administrative LAN. If the output message contains the word "FF020001" and "M10-Testalert" or "M12-Testalert", the SNMP trap has been received successfully.

    Example: When the IP address of XSCF is "192.168.10.10"

    snmptrapd[Process ID]: [ID 702911 daemon.warning] 192.168.10.10 [192.168.10.10]: Trap 
    DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (3557) 0:00:35.57, SNMPv2-MIB::snmpTrapOID.0 = 
    OID: SNMPv2-SMI::enterprises.211.1.15.4.1.2.0.1, 
    SNMPv2-SMI::enterprises.211.1.15.4.1.1.12.2.1.13.100.0.254.0.254.0 = INTEGER: 3, 
    SNMPv2-SMI::enterprises.211.1.15.4.1.2.1.2.0 = INTEGER: 1, 
    SNMPv2-SMI::enterprises.211.1.15.4.1.1.4.3.0 = STRING: "PZ31426053", 
    SNMPv2-SMI::enterprises.211.1.15.4.1.1.4.2.0 = STRING: "SPARC M10-1", 
    SNMPv2-SMI::enterprises.211.1.15.4.1.1.4.1.0 = "", SNMPv2-SMI::enterprises.211.1.15.4.1.2.1.14.0 
    = STRING: "FF020001", SNMPv2-SMI::enterprises.211.1.15.4.1.2.1.15.0 = STRING: "Oct 27 10:54:34.288 
    JST 2014", SNMPv2-SMI::enterprises.211.1.15.4.1.2.1.16.0 = 
    STRING: "https://support.oracle.com/msg/M10-Testalert 
    <https://support.oracle.com/msg/M10-Testalert>", SNMPv2-SMI::enterprises.211.1.15.4.1.2.1.17.0 = 
    STRING: "TZ1422A010  ", SNMPv2-SMI::enterprises.211.1.15.4.1.2.1.18.0 = STRING: "CA07363-D011

    If the pseudo error notification trap is not output, there may be an error in the SNMP settings. See "5.1.2.1.2 Setting SNMP" and correct the settings.

    See

    For the content of the pseudo error notification trap, see "Fujitsu SPARC M12 and Fujitsu M10/SPARC M10 XSCF MIB and Trap Lists."

5.1.2.1.4 Setting of the connection method to the XSCF

For SPARC M10 and M12, the default connection method to the XSCF is SSH.

The procedure for changing to a telnet connection is as follows.

Change of the connection method

Execute the following command on all nodes to change the connection method.

# /etc/opt/FJSVcluster/bin/clsnmpsetup -m -t telnet

After changing the connection method, execute the clsnmpsetup -l command to check that "telnet" is displayed in the "connection-type" field.

# /etc/opt/FJSVcluster/bin/clsnmpsetup -l
device-name cluster-host-name PPAR-ID domain-name IP-address1 IP-address2 user-name connection-type
-------------------------------------------------------------------------------------------------
xscf        node1             1       primary     xscf11      xscf12      xuser     telnet
xscf        node2             2       primary     xscf21      xscf22      xuser     telnet

Note

To use the Migration function, set a combination of a user name and password for the XSCF and the connection method to the XSCF to be consistent on all nodes.

Starting up the shutdown facility

Execute the following command on each node, and confirm that the shutdown facility has started.

# /opt/SMAW/bin/sdtool -s

If the configuration state of the shutdown facility is displayed, the shutdown facility has started.

If "The RCSD is not running" is displayed, the shutdown facility has not started.

If the shutdown facility has started, execute the following command to restart it:

# /opt/SMAW/bin/sdtool -r

If the shutdown facility has not started, execute the following command to start it:

# /opt/SMAW/bin/sdtool -b

5.1.2.2 For SPARC Enterprise M3000, M4000, M5000, M8000, or M9000

5.1.2.2.1 Checking Console Configuration

In SPARC Enterprise M3000, M4000, M5000, M8000, and M9000, XSCF is used. The connection method to XSCF for the shutdown facility can be either SSH or telnet.

The default is SSH.

Create the login user account for the shutdown facility in XSCF before setting the shutdown facility.

After that, make sure the following settings concerning XSCF are correctly set:

Note

A serial port connection alone to XSCF is not supported by the shutdown facility. Use XSCF-LAN.

Moreover, record the following information on XSCF.

See

For information on how to configure and confirm XSCF, see the "XSCF User's Guide".

5.1.2.2.2 Using the Shutdown Configuration Wizard

The required shutdown agent varies depending on the hardware machine type.

Check the following combinations of the hardware machine types and shutdown agents.

Setting up the operation environment for the asynchronous RCI monitoring

This setting is required only in the following case:

When you set up asynchronous RCI monitoring, you must specify the timeout interval (kernel parameter) in /etc/system for monitoring via SCF/RCI.

See

For kernel parameter settings, see "3.2.3 Checking and Setting the Kernel Parameters."

Note

You need to reboot the system to enable the changed value.

Starting up the shutdown configuration wizard

From the CF main window of the Cluster Admin screen, select the Tool menu and then Shutdown Facility -> Configuration Wizard. The shutdown configuration wizard will start.

Note

You can also configure the shutdown facility immediately after you complete the CF configuration with the CF wizard.

The following confirmation popup screen will appear. Click Yes to start the shutdown configuration wizard.

Selecting a shutdown agent

The selection screen for the shutdown agent will appear.

Figure 5.2 Selecting a shutdown agent

Confirm the hardware machine type and select the appropriate shutdown agent.

  1. SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 provided by companies other than Fujitsu in Japan, or SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 with logos of both Fujitsu and Oracle provided outside Japan

  2. SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 (other than above) provided by Fujitsu in Japan

a) SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 provided by companies other than Fujitsu in Japan, or SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 with logos of both Fujitsu and Oracle provided outside Japan

Select XSCF (SPARC Enterprise M-series).

If you select XSCF (SPARC Enterprise M-series), Use RCI is displayed. Clear the Use RCI checkbox.

The following shutdown agents are automatically set:

After selection, click Next.

b) SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 (other than above) provided by Fujitsu in Japan

Select XSCF (SPARC Enterprise M-series).

If you select XSCF (SPARC Enterprise M-series), Use RCI is displayed; do not clear the Use RCI checkbox.

The following shutdown agents are automatically set:

After selection, click Next.

Information

If you select a shutdown agent, the timeout value is automatically set. For details on the timeout value of the shutdown agent, see "7.2.2 Configuration file of SF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."

  • For XSCF Panic/XSCF Break

    • 4 or fewer nodes

      Timeout value = 20 (seconds)

    • 5 or more nodes

      Timeout value = 6 x number of cluster nodes + 2 (seconds)

      Example for 5 nodes: 6 x 5 + 2 = 32 (seconds)

  • For XSCF Reset

    • 4 or fewer nodes

      Timeout value = 40 (seconds)

    • 5 or more nodes

      Timeout value = 6 x number of cluster nodes + 22 (seconds)

      Example for 5 nodes: 6 x 5 + 22 = 52 (seconds)

  • For RCI Panic/RCI Reset

    Timeout value = 20 (seconds)

Configuring XSCF

The screen for entering the information of XSCF will appear.

Figure 5.3 Selecting the number of XSCF IP addresses

Select the number of XSCF IP addresses to use in the shutdown facility.

Note

If the XSCF unit is duplexed but XSCF-LAN is not duplexed, the number of XSCF IP addresses is 1.

In this case, specify the virtual IP (takeover IP address) for the XSCF IP addresses.

Select the number of XSCF IP addresses, and click Next.

The screen to set the information of XSCF will appear.


"a) For selecting [1] for the number of XSCF IP addresses" and "b) For selecting [2] for the number of XSCF IP addresses" are respectively explained below.

a) For selecting [1] for the number of XSCF IP addresses

Enter the settings for XSCF that you recorded in "5.1.2.2.1 Checking Console Configuration".

XSCF-name

Enter the IP address of XSCF or the host name of XSCF that is registered in the /etc/inet/hosts file.

Available IP addresses are IPv4 addresses.

User-Name

Enter a user name to log in to XSCF.

Password

Enter a password to log in to XSCF.

Upon the completion of configuration, click Next.

b) For selecting [2] for the number of XSCF IP addresses

Enter the settings for XSCF that you recorded in "5.1.2.2.1 Checking Console Configuration".

XSCF-name1

Enter the IP address of XSCF-LAN#0 or the host name that is registered in the /etc/inet/hosts file.

Available IP addresses are IPv4 addresses.

XSCF-name2

Enter the IP address of XSCF-LAN#1 or the host name that is registered in the /etc/inet/hosts file.

Available IP addresses are IPv4 addresses.

User-Name

Enter a user name to log in to XSCF.

Password

Enter a password to log in to XSCF.

Note

The combination of user name and password must be the same for both XSCFs.

Upon the completion of configuration, click Next.

Configuring Wait for PROM

Note

Wait for PROM is currently not supported.
Leave the checkbox unselected, and then click Next.

Figure 5.4 Configure Wait for PROM

Configuring hardware selection

If you select XSCF (SPARC Enterprise M-series) as the shutdown agent, the screen for selecting hardware will appear.

Figure 5.5 Configuring hardware selection

Upon the completion of configuration, click Next.

Entering node weights and administrative IP addresses

The screen for entering the weights of the nodes and the IP addresses for the administrative LAN will appear.

Figure 5.6 Entering node weights and administrative IP addresses

Enter the weights of the nodes and the IP addresses for the administrative LAN.

Weight

Enter the weight of the node that constitutes the cluster. Weight is used to identify the survival priority of the node group that constitutes the cluster. Possible values for each node range from 1 to 300.
For details on survival priority and weight, refer to the explanations below.

Admin IP

Enter an IP address directly or click the tab to select the host name that is assigned to the administrative IP address.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

Upon the completion of configuration, click Next.

Survival priority

Even if a cluster partition occurs due to a failure in the cluster interconnect, all the nodes will still be able to access the user resources. For details on the cluster partition, see "1.2.2.1 Protecting data integrity" in "PRIMECLUSTER Concepts Guide".
To guarantee the consistency of the data constituting user resources, you have to determine the node groups to survive and those that are to be forcibly stopped.
The weight assigned to each node group is referred to as a "Survival priority" under PRIMECLUSTER.
The greater the weight of the node, the higher the survival priority. Conversely, the less the weight of the node, the lower the survival priority. If multiple node groups have the same survival priority, the node group that includes a node with the name that is first in alphabetical order will survive.

Survival priority can be calculated based on the following formula:

Survival priority = SF node weight + ShutdownPriority of userApplication

Note

When SF calculates the survival priority, each node sends its survival priority to the remote nodes via the administrative LAN. If a communication problem occurs on the administrative LAN, the survival priority may not reach the remote nodes. In this case, the survival priority is calculated only from the SF node weight.

SF node weight (Weight):

Weight of node. Default value = 1. Set this value while configuring the shutdown facility.

userApplication ShutdownPriority:

Set this attribute when userApplication is created. For details on how to change the settings, see "11.1 Changing the Operation Attributes of a Cluster Application".

See

For details on the ShutdownPriority attribute of userApplication, see "Attributes".

Survival scenarios

The typical scenarios that are implemented are shown below:

[Largest node group survival]
  • Set the weight of all nodes to 1 (default).

  • Set the attribute of ShutdownPriority of all user applications to 0 (default).

[Specific node survival]
  • Set the "weight" of the node to survive to a value more than double the total weight of the other nodes.

  • Set the ShutdownPriority attribute of all user applications to 0 (default).

    In the following example, node1 is to survive:

[Specific application survival]
  • Set the "weight" of all nodes to 1 (default).

  • Set the ShutdownPriority attribute of the user application whose operation is to continue to a value more than double the total of the ShutdownPriority attributes of the other user applications and the weights of all nodes.

    In the following example, the node for which app1 is operating is to survive:

Saving the configuration

Confirm and then save the configuration.

In the left-hand panel of the window, those nodes that constitute the cluster are displayed. The SF node weight is displayed in brackets after those nodes. The timeout value of the shutdown agent is displayed in brackets after the shutdown agent.

Figure 5.7 Saving the configuration

Click Next. A popup screen will appear for confirmation.

Select Yes to save the setting.

Displaying the configuration of the shutdown facility

If you save the setting, a screen displaying the configuration of the shutdown facility will appear. On this screen, you can confirm the configuration of the shutdown facility on each node by selecting each node in turn.

Information

You can also view the configuration of the shutdown facility by selecting Shutdown Facility -> Show Status from the Tool menu.

Figure 5.8 Show Status

Shut State

"Unknown" is shown during normal system operation. If an error occurs and the shutdown facility stops the relevant node successfully, "Unknown" will change to "KillWorked".

Test State

Indicates the state in which the path to shut down the node is tested when a node error occurs. If the test of the path has not been completed, "Unknown" will be displayed. If the configured shutdown agent operates normally, "Unknown" will be changed to "TestWorked".

Init State

Indicates the state in which the shutdown agent is initialized.

To exit the configuration wizard, click Finish. Click Yes in the confirmation popup screen that appears.

Note

On this screen, confirm that the shutdown facility is operating normally.

  • If "InitFailed" is displayed in the Initial state even when the configuration of the shutdown facility has been completed or if "Unknown" is displayed in the Test state or "TestFailed" is highlighted in red, the agent or hardware configuration may contain an error. Check the /var/adm/messages file and the console for an error message. Then, apply appropriate countermeasures as instructed the message that is output.

  • If the shutdown facility is to connect to XSCF via telnet but the connection setting to XSCF is still SSH, the test state becomes TestFailed at this point. Confirm that the shutdown facility is operating normally after performing "5.1.2.2.3 Setting of the connection method to the XSCF".

See

For details on how to respond to the error messages that may be output, see "PRIMECLUSTER Messages."

5.1.2.2.3 Setting of the connection method to the XSCF

For SPARC Enterprise M3000, M4000, M5000, M8000, and M9000, the default connection method to the XSCF is SSH. The procedure for changing to a telnet connection is as follows.

Change of the connection method

Execute the following command on all nodes to change the connection method.

# /etc/opt/FJSVcluster/bin/clrccusetup -m -t telnet

After changing the connection method, execute the clrccusetup -l command to check that "telnet" is displayed in the "connection-type" field.

# /etc/opt/FJSVcluster/bin/clrccusetup -l
Device-name cluster-host-name IP-address host-name user-name connection-type
-------------------------------------------------------------------------------
xscf        fuji2             xscf2      1         xuser     telnet
xscf        fuji3             xscf3      1         xuser     telnet

Starting up the shutdown facility

Execute the following command on each node, and confirm that the shutdown facility has started.

# /opt/SMAW/bin/sdtool -s

If the configuration state of the shutdown facility is displayed, the shutdown facility has started.

If "The RCSD is not running" is displayed, the shutdown facility has not started.

If the shutdown facility has started, execute the following command to restart it:

# /opt/SMAW/bin/sdtool -r

If the shutdown facility has not started, execute the following command to start it:

# /opt/SMAW/bin/sdtool -b

5.1.2.3 For SPARC Enterprise T5120, T5220, T5140, T5240, T5440, SPARC T3, T4, T5, T7, S7 series

5.1.2.3.1 Checking Console Configuration

In SPARC Enterprise T5120, T5220, T5140, T5240, T5440, SPARC T3, T4, T5, T7, S7 series, ILOM is used.

Create the login user account for the shutdown facility in ILOM before setting the shutdown facility.

After that, make sure the following settings concerning ILOM are correctly set:

If you are using ILOM 3.0, please check the following settings as well.

Moreover, record the following information on ILOM.

*1) You can check whether the CLI mode of the login user account is set to the default mode by the following procedure:

  1. Log in to the ILOM CLI.

  2. Check the prompt.
    Prompt when the account is set to the default mode:
    ->
    Prompt when the account is set to alom mode:
    sc>

*2) Due to compatibility of ILOM 3.0 with ILOM 2.x, this operation is also available for users with administrator or operator privileges from ILOM 2.x.

*3) When network routing is set, the IP address of ILOM does not need to be in the same segment as the administrative LAN of the cluster node.

See

For details on how to make and check ILOM settings, please refer to the following documentation.

  • For ILOM 2.x:

    • "Integrated Lights Out Manager User's Guide"

  • For ILOM 3.0:

    • "Integrated Lights Out Manager (ILOM) 3.0 Concepts Guide"

    • "Integrated Lights Out Manager (ILOM) 3.0 Web Interface Procedures Guide"

    • "Integrated Lights Out Manager (ILOM) 3.0 CLI Procedures Guide"

    • "Integrated Lights Out Manager (ILOM) 3.0 Getting Started Guide"

5.1.2.3.2 Using the Shutdown Configuration Wizard

The required shutdown agent varies depending on the hardware machine type.

Check the following combinations of the hardware machine types and shutdown agents.

Starting up the shutdown configuration wizard

From the CF main window of the Cluster Admin screen, select the Tool menu and then Shutdown Facility -> Configuration Wizard. The shutdown configuration wizard will start.

Note

You can also configure the shutdown facility immediately after you complete the CF configuration with the CF wizard.

The following confirmation popup screen will appear. Click Yes to start the shutdown configuration wizard.

Selecting a shutdown agent

The selection screen for the shutdown agent will appear.

Figure 5.9 Selecting a shutdown agent

Confirm the hardware machine type and select the appropriate shutdown agent.

Select ILOM, and then click Next.

Information

If you select a shutdown agent, the timeout value is automatically set. For details on the timeout value of the shutdown agent, see "7.2.2 Configuration file of SF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."

  • For ILOM Panic/ILOM Reset

    Timeout value = 70 (seconds)

Configuring ILOM

The screen for entering the information of ILOM will appear.

Figure 5.10 Configuring ILOM

Enter the settings for ILOM that you recorded in "5.1.2.3.1 Checking Console Configuration".

ILOM-Name

Enter the IP address of ILOM or the host name of ILOM that is registered in the /etc/inet/hosts file.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

User-Name

Enter a user name to log in to ILOM.

Password

Enter a password to log in to ILOM.

Upon the completion of configuration, click Next.

Entering node weights and administrative IP addresses

The screen for entering the weights of the nodes and the IP addresses for the administrative LAN will appear.

Figure 5.11 Entering node weights and administrative IP addresses

Enter the weights of the nodes and the IP addresses for the administrative LAN.

Weight

Enter the weight of the node that constitutes the cluster. Weight is used to identify the survival priority of the node group that constitutes the cluster. Possible values for each node range from 1 to 300.
For details on survival priority and weight, refer to the explanations below.

Admin IP

Enter an IP address directly or click the tab to select the host name that is assigned to the administrative IP address.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

Upon the completion of configuration, click Next.

Survival priority

Even if a cluster partition occurs due to a failure in the cluster interconnect, all the nodes will still be able to access the user resources. For details on the cluster partition, see "1.2.2.1 Protecting data integrity" in "PRIMECLUSTER Concepts Guide".
To guarantee the consistency of the data constituting user resources, you have to determine the node groups to survive and those that are to be forcibly stopped.
The weight assigned to each node group is referred to as a "Survival priority" under PRIMECLUSTER.
The greater the weight of the node, the higher the survival priority. Conversely, the less the weight of the node, the lower the survival priority. If multiple node groups have the same survival priority, the node group that includes a node with the name that is first in alphabetical order will survive.

Survival priority can be calculated based on the following formula:

Survival priority = SF node weight + ShutdownPriority of userApplication

Note

When SF calculates the survival priority, each node sends its survival priority to the remote nodes via the administrative LAN. If a communication problem occurs on the administrative LAN, the survival priority may not reach the remote nodes. In this case, the survival priority is calculated only from the SF node weight.

SF node weight (Weight):

Weight of node. Default value = 1. Set this value while configuring the shutdown facility.

userApplication ShutdownPriority:

Set this attribute when userApplication is created. For details on how to change the settings, see "11.1 Changing the Operation Attributes of a Cluster Application".

See

For details on the ShutdownPriority attribute of userApplication, see "Attributes".

Survival scenarios

The typical scenarios that are implemented are shown below:

[Largest node group survival]
  • Set the weight of all nodes to 1 (default).

  • Set the attribute of ShutdownPriority of all user applications to 0 (default).

[Specific node survival]
  • Set the "weight" of the node to survive to a value more than double the total weight of the other nodes.

  • Set the ShutdownPriority attribute of all user applications to 0 (default).

    In the following example, node1 is to survive:

[Specific application survival]
  • Set the "weight" of all nodes to 1 (default).

  • Set the ShutdownPriority attribute of the user application whose operation is to continue to a value more than double the total of the ShutdownPriority attributes of the other user applications and the weights of all nodes.

    In the following example, the node for which app1 is operating is to survive:

Saving the configuration

Confirm and then save the configuration.

In the left-hand panel of the window, those nodes that constitute the cluster are displayed. The SF node weight is displayed in brackets after those nodes. The timeout value of the shutdown agent is displayed in brackets after the shutdown agent.

Figure 5.12 Saving the configuration

Click Next. A popup screen will appear for confirmation.

Select Yes to save the setting.

Displaying the configuration of the shutdown facility

If you save the setting, a screen displaying the configuration of the shutdown facility will appear. On this screen, you can confirm the configuration of the shutdown facility on each node by selecting each node in turn.

Information

You can also view the configuration of the shutdown facility by selecting Shutdown Facility -> Show Status from the Tool menu.

Figure 5.13 Show Status

Shut State

"Unknown" is shown during normal system operation. If an error occurs and the shutdown facility stops the relevant node successfully, "Unknown" will change to "KillWorked".

Test State

Indicates the state in which the path to shut down the node is tested when a node error occurs. If the test of the path has not been completed, "Unknown" will be displayed. If the configured shutdown agent operates normally, "Unknown" will be changed to "TestWorked".

Init State

Indicates the state in which the shutdown agent is initialized.

To exit the configuration wizard, click Finish. Click Yes in the confirmation popup screen that appears.

Note

On this screen, confirm that the shutdown facility is operating normally.

  • If "TestFailed" is displayed in the test state and the message number 7043 is logged in the /var/adm/messages file, the first SSH connection to ILOM has not completed yet. Connect to ILOM from all the cluster nodes via SSH, and complete the user inquiry (such as generation of RSA key) at the first SSH connection.

  • If "InitFailed" is displayed in the Initial state even when the configuration of the shutdown facility has been completed or if "Unknown" is displayed in the Test state or "TestFailed" is highlighted in red, the agent or hardware configuration may contain an error. Check the /var/adm/messages file and the console for an error message. Then, apply appropriate countermeasures as instructed the message that is output.

5.1.2.4 For SPARC Enterprise T1000, T2000

5.1.2.4.1 Checking Console Configuration

SPARC Enterprise T1000 and T2000 use ALOM as the console.

Create the login user account for the shutdown facility in ALOM before setting the shutdown facility.

After that, make sure the following settings concerning ALOM are correctly set:

Note

  • By default, external connections to ALOM are permitted via SSH only; SSH connections are not supported by the shutdown facility.

  • A serial port connection alone to ALOM is not supported by the shutdown facility.

Moreover, record the following information on ALOM.

*1) When network routing is set, the IP address of ALOM does not need to be in the same segment as the administrative LAN of the cluster node.

See

For information on how to configure and confirm ALOM, see the "Advanced Lights out Management (ALOM) CMT Guide".

5.1.2.4.2 Using the Shutdown Configuration Wizard

The required shutdown agent varies depending on the hardware machine type.

Check the following combinations of the hardware machine types and shutdown agents.

Starting up the shutdown configuration wizard

From the CF main window of the Cluster Admin screen, select the Tool menu and then Shutdown Facility -> Configuration Wizard. The shutdown configuration wizard will start.

Note

You can also configure the shutdown facility immediately after you complete the CF configuration with the CF wizard.

The following confirmation popup screen will appear. Click Yes to start the shutdown configuration wizard.

Selecting a shutdown agent

The selection screen for the shutdown agent will appear.

Figure 5.14 Selecting a shutdown agent

Confirm the hardware machine type and select the appropriate shutdown agent.

Select ALOM, and then click Next.

Information

If you select a shutdown agent, the timeout value is automatically set. For details on the timeout value of the shutdown agent, see "7.2.2 Configuration file of SF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."

  • For ALOM Break

    Timeout value = 40 (seconds)

Configuring ALOM

The screen for entering the information of ALOM will appear.

Figure 5.15 Configuring ALOM

Enter the settings for ALOM that you recorded in "5.1.2.4.1 Checking Console Configuration".

ALOM-Name

Enter the IP address of ALOM.

Available IP addresses are IPv4 addresses.

User-Name

Enter a user name to log in to ALOM.

Password

Enter a password to log in to ALOM.

Upon the completion of configuration, click Next.

Entering node weights and administrative IP addresses

The screen for entering the weights of the nodes and the IP addresses for the administrative LAN will appear.

Figure 5.16 Entering node weights and administrative IP addresses

Enter the weights of the nodes and the IP addresses for the administrative LAN.

Weight

Enter the weight of the node that constitutes the cluster. Weight is used to identify the survival priority of the node group that constitutes the cluster. Possible values for each node range from 1 to 300.
For details on survival priority and weight, refer to the explanations below.

Admin IP

Enter an IP address directly or click the tab to select the host name that is assigned to the administrative IP address.

Available IP addresses are IPv4 addresses.

Upon the completion of configuration, click Next.

Survival priority

Even if a cluster partition occurs due to a failure in the cluster interconnect, all the nodes will still be able to access the user resources. For details on the cluster partition, see "1.2.2.1 Protecting data integrity" in "PRIMECLUSTER Concepts Guide".
To guarantee the consistency of the data constituting user resources, you have to determine the node groups to survive and those that are to be forcibly stopped.
The weight assigned to each node group is referred to as a "Survival priority" under PRIMECLUSTER.
The greater the weight of the node, the higher the survival priority. Conversely, the less the weight of the node, the lower the survival priority. If multiple node groups have the same survival priority, the node group that includes a node with the name that is first in alphabetical order will survive.

Survival priority can be calculated based on the following formula:

Survival priority = SF node weight + ShutdownPriority of userApplication

Note

When SF calculates the survival priority, each node sends its survival priority to the remote nodes via the administrative LAN. If a communication problem occurs on the administrative LAN, the survival priority may not reach the remote nodes. In this case, the survival priority is calculated only from the SF node weight.

SF node weight (Weight):

Weight of node. Default value = 1. Set this value while configuring the shutdown facility.

userApplication ShutdownPriority:

Set this attribute when userApplication is created. For details on how to change the settings, see "11.1 Changing the Operation Attributes of a Cluster Application".

See

For details on the ShutdownPriority attribute of userApplication, see "Attributes".

Survival scenarios

The typical scenarios that are implemented are shown below:

[Largest node group survival]
  • Set the weight of all nodes to 1 (default).

  • Set the attribute of ShutdownPriority of all user applications to 0 (default).

[Specific node survival]
  • Set the "weight" of the node to survive to a value more than double the total weight of the other nodes.

  • Set the ShutdownPriority attribute of all user applications to 0 (default).

    In the following example, node1 is to survive:

[Specific application survival]
  • Set the "weight" of all nodes to 1 (default).

  • Set the ShutdownPriority attribute of the user application whose operation is to continue to a value more than double the total of the ShutdownPriority attributes of the other user applications and the weights of all nodes.

    In the following example, the node for which app1 is operating is to survive:

Saving the configuration

Confirm and then save the configuration.

In the left-hand panel of the window, those nodes that constitute the cluster are displayed. The SF node weight is displayed in brackets after those nodes. The timeout value of the shutdown agent is displayed in brackets after the shutdown agent.

Figure 5.17 Saving the configuration

Click Next. A popup screen will appear for confirmation.

Select Yes to save the setting.

Displaying the configuration of the shutdown facility

If you save the setting, a screen displaying the configuration of the shutdown facility will appear. On this screen, you can confirm the configuration of the shutdown facility on each node by selecting each node in turn.

Information

You can also view the configuration of the shutdown facility by selecting Shutdown Facility -> Show Status from the Tool menu.

Figure 5.18 Show Status

Shut State

"Unknown" is shown during normal system operation. If an error occurs and the shutdown facility stops the relevant node successfully, "Unknown" will change to "KillWorked".

Test State

Indicates the state in which the path to shut down the node is tested when a node error occurs. If the test of the path has not been completed, "Unknown" will be displayed. If the configured shutdown agent operates normally, "Unknown" will be changed to "TestWorked".

Init State

Indicates the state in which the shutdown agent is initialized.
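
The same states can also be checked from the command line with sdtool(1M). An illustrative example of the output follows; the host names are hypothetical, and the agent name shown depends on the shutdown agent actually configured:

# sdtool -s
Cluster Host    Agent           SA State   Shut State   Test State   Init State
------------    -----           --------   ----------   ----------   ----------
node1           SA_xxxx.so      Idle       Unknown      TestWorked   InitWorked
node2           SA_xxxx.so      Idle       Unknown      TestWorked   InitWorked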

To exit the configuration wizard, click Finish. Click Yes in the confirmation popup screen that appears.

Note

On this screen, confirm that the shutdown facility is operating normally.

If "InitFailed" is displayed in the Initial state even when the configuration of the shutdown facility has been completed or if "Unknown" is displayed in the Test state or "TestFailed" is highlighted in red, the agent or hardware configuration may contain an error. Check the /var/adm/messages file and the console for an error message. Then, apply appropriate countermeasures as instructed the message that is output.

5.1.2.5 For Oracle Solaris Kernel Zones

5.1.2.5.1 Checking XSCF Information

When building Kernel Zones in SPARC M10 and M12, the KZONE shutdown agent uses XSCF.

To build the cluster system in the control domain, you can use the same user account for the shutdown facility as in the settings described in "5.1.2.1.1 Checking XSCF Information."

See

For checking the XSCF information, refer to "5.1.2.1.1 Checking XSCF Information."

5.1.2.5.2 Checking ILOM Information

When building Kernel Zones on SPARC T4, T5, T7, or S7 series servers, the KZONE shutdown agent uses ILOM.

See

For checking the ILOM information, refer to "5.1.2.3.1 Checking Console Configuration."

5.1.2.5.3 Logging in to Global Zone Host

To access a host (control domain or guest domain) where a Kernel Zone operates from the Kernel Zone via SSH, you need to complete the user inquiry of the first SSH connection (RSA key generation).

On all Kernel Zones (cluster nodes), log in to the hosts where all Kernel Zones operate with the user for the shutdown facility set in "Creating a user for the shutdown facility" of "15.1.1 Software Installation and Configuration of Cluster Environment."

When using a host name to specify the globalzone hostname, log in with that host name.

Example: when the user for the shutdown facility is "user1"

# ssh -l user1 XXX.XXX.XXX.XXX
The authenticity of host 'XXX.XXX.XXX.XXX (XXX.XXX.XXX.XXX)' can't be established.
RSA key fingerprint is xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx.
Are you sure you want to continue connecting (yes/no)? yes	<- Enter "yes".

5.1.2.5.4 Using the Shutdown Configuration Wizard

Starting up the shutdown configuration wizard

From the CF main window of the Cluster Admin screen, select the Tool menu and then Shutdown Facility -> Configuration Wizard. The shutdown configuration wizard will start.

Note

You can also configure the shutdown facility immediately after you complete the CF configuration with the CF wizard.

The following confirmation popup screen will appear. Click Yes to start the shutdown configuration wizard.

Selecting a shutdown agent

The selection screen for the shutdown agent will appear.

Figure 5.19 Selecting a shutdown agent

Confirm the hardware machine type and select the appropriate shutdown agent.

Information

If you select a shutdown agent, the timeout value is automatically set. For details on the timeout value of the shutdown agent, see "7.2.2 Configuration file of SF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."

  • For KZONE Panic

    Timeout value = 45 (seconds)

  • For KZONE Reset

    Timeout value = 70 (seconds)

  • For KZONE Check

    • For SPARC M10 and M12

      Timeout value = 20 (seconds)

    • For SPARC T4, T5, T7, S7 series

      Timeout value = 30 (seconds)

Configuring Kernel Zones

The screen for entering the information of Kernel Zones will appear.

Figure 5.20 Configuring Kernel Zones

zone name

Enter a Kernel Zone name.

Before entering a Kernel Zone name, execute the zoneadm list -cv command on the global zone host and check that "solaris-kz" is displayed in the BRAND column for the zone to be entered.
If a brand other than "solaris-kz" is displayed, the zone is not a Kernel Zone and cannot be used.
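
An illustrative output is shown below (the Kernel Zone name kzone1 is hypothetical); the entry whose BRAND column shows "solaris-kz" is a Kernel Zone:

# zoneadm list -cv
  ID NAME             STATUS      PATH                 BRAND      IP
   0 global           running     /                    solaris    shared
   1 kzone1           running     -                    solaris-kz excl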

globalzone hostname

Enter the IP address of the global zone host (control domain or guest domain) where a Kernel Zone operates, or the host name that is registered in the /etc/inet/hosts file.

If the Kernel Zone was built on the control domain, enter the IP address or host name of the control domain.

If the Kernel Zone was built on the guest domain, enter the IP address or host name of the guest domain.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.
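
When a host name is specified, it must be registered in the /etc/inet/hosts file on each node. A hypothetical entry:

192.168.10.10   gdom1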

User-Name

Enter a user name to log in to the global zone host.

Specify the user for the shutdown facility that was created in "Creating a user for the shutdown facility" of "15.1.1 Software Installation and Configuration of Cluster Environment."

Password

Enter a password to log in to the global zone host.

Upon the completion of configuration, click Next.

Configuring XSCF

When building a Kernel Zone in SPARC M10 and M12, the screen for entering the information of the XSCF will appear.

Figure 5.21 Configuring XSCF

Enter the settings for XSCF that you recorded in "5.1.2.1.1 Checking XSCF Information."

PPAR-ID

Specify the identification ID of the physical partition (PPAR) to which the logical domain of the cluster node belongs.

Make sure to enter "0" for SPARC M10-1, M10-4, and M12-2.

For SPARC M10-4S and M12-2S, specify an integer from 0 to 15.

domain-name

Specify the logical domain (control domain or guest domain) where Kernel Zones operate.

During initial setup, the logical domain name acquired from each node will be displayed as the initial value.

When changing the settings, the previously set value will be displayed on the screen.

Make sure that the displayed logical domain name is correct. If it is wrong, execute the virtinfo -a command on each node and enter the logical domain name that is displayed.

The input character string can be up to 255 characters long, must start with an alphabetic letter, and can consist only of alphanumeric characters, "-" (hyphen), and "." (period).
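
An illustrative run of the command is shown below (the domain name is hypothetical, and the output format depends on the Oracle Solaris version); enter the value shown in the "Domain name" line:

# virtinfo -a
Domain role: LDoms guest
Domain name: guest0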

XSCF-name1

Specify the XSCF-LAN#0 host name of the cabinet where the logical domain with running Kernel Zones exists, or specify the IP address.

Available IP addresses are IPv4 addresses.

For SPARC M10-4S and M12-2S environments, specify the XSCF takeover IP address.

XSCF-name2

Specify the XSCF-LAN#1 host name of the cabinet where the logical domain with running Kernel Zones exists, or specify the IP address.

Available IP addresses are IPv4 addresses.

For SPARC M10-4S and M12-2S environments, specify the XSCF takeover IP address.

User-Name

Specify the user name to log in to the XSCF of the cabinet where the logical domain with running Kernel Zones exists.

Password

Specify the password to log in to the XSCF of the cabinet where the logical domain with running Kernel Zones exists.

connection method

Select a connection method for the XSCF.

Select "ssh" or "telnet."

Note

  • In an environment where the XSCF is duplexed, the user name and password combination must be the same for both XSCFs.

  • To change the connection method after setting up the shutdown facility, perform the setting again from the beginning using the shutdown configuration wizard.

  • In a configuration where the asynchronous monitoring sub-LAN is not used, specify the IP address of XSCF-LAN#0 for "XSCF-name1" and the host name corresponding to that IP address for "XSCF-name2." For details on the configuration where the asynchronous monitoring sub-LAN is not used, see "2.2.2 XSCF Configuration in SPARC M10 and M12."

  • If an incorrect value has been entered, the cluster system on Kernel Zones cannot be switched over even if an error occurs in the logical domain where the Kernel Zones operate.

Upon the completion of configuration, click Next.

Configuring ILOM

When building Kernel Zones in SPARC T4, T5, T7, S7 series, the screen for entering the information of ILOM will appear.

Figure 5.22 Configuring ILOM

Enter the settings for ILOM that you recorded in "5.1.2.5.2 Checking ILOM Information."

ILOM-Name

Enter the IP address for ILOM of the host where Kernel Zones operate or the host name that is registered in the /etc/inet/hosts file.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

User-Name

Enter a user name to log in to ILOM.

Password

Enter a password to log in to ILOM.

Upon the completion of configuration, click Next.

Entering node weights and administrative IP addresses

The screen for entering the weights of the nodes and the IP addresses for the administrative LAN will appear.

Figure 5.23 Entering node weights and administrative IP addresses

Enter the weights of the nodes and the IP addresses for the administrative LAN.

Weight

Enter the weight of the node that constitutes the cluster. Weight is used to identify the survival priority of the node group that constitutes the cluster. Possible values for each node range from 1 to 300.

For details on survival priority and weight, refer to the explanations below.

Admin IP

Enter an IP address directly or click the tab to select the host name that is assigned to the administrative IP address.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

Upon the completion of configuration, click Next.

Survival priority

The typical scenarios that are implemented are shown below:

[Largest node group survival]

[Specific node survival]

In the following example, node1 is to survive:

[Specific application survival]

In the following example, the node for which app1 is operating is to survive:

[Combination of the cluster system between control domains or guest domains and the cluster system among Kernel Zones for specific control domain survival]

In the following example, nodes are to survive in the order of node1, node2, node3, and node4.

Figure 5.24 Building Kernel Zones on the control domain

Figure 5.25 Building Kernel Zones on the guest domains

Saving the configuration

Confirm and then save the configuration.

In the left-hand panel of the window, those nodes that constitute the cluster are displayed. The SF node weight is displayed in brackets after those nodes. The timeout value of the shutdown agent is displayed in brackets after the shutdown agent.

Figure 5.26 Saving the configuration

Click Next. A popup screen will appear for confirmation.

Select Yes to save the setting.

Displaying the configuration of the shutdown facility

If you save the setting, a screen displaying the configuration of the shutdown facility will appear. On this screen, you can confirm the configuration of the shutdown facility on each node by selecting each node in turn.

Information

You can also view the configuration of the shutdown facility by selecting Shutdown Facility -> Show Status from the Tool menu.

Figure 5.27 Displaying the state of configuration of shutdown facility

Shut State

"Unknown" is shown during normal system operation. If an error occurs and the shutdown facility stops the relevant node successfully, "Unknown" will change to "KillWorked".

Test State

Indicates the state in which the path to shut down the node is tested when a node error occurs. If the test of the path has not been completed, "Unknown" will be displayed. If the configured shutdown agent operates normally, "Unknown" will be changed to "TestWorked".

Init State

Indicates the state in which the shutdown agent is initialized.

To exit the configuration wizard, click Finish. Click Yes in the confirmation popup screen that appears.

Note

On this screen, confirm that the shutdown facility is operating normally.

If "InitFailed" is displayed in the Initial state even when the configuration of the shutdown facility has been completed or if "Unknown" is displayed in the Test state or "TestFailed" is highlighted in red, the agent or hardware configuration may contain an error. Check the /var/adm/messages file and the console for an error message. Then, apply appropriate countermeasures as instructed the message that is output.

5.1.2.6 Using ICMP Shutdown Agent in SPARC M10 and M12

When using the I/O fencing function in a SPARC M10 or M12 environment where PRIMECLUSTER is not configured in the control domain, use the ICMP shutdown agent.

Note

  • Do not set the ICMP shutdown agent in the environment where the I/O fencing function is not used.

  • The ICMP shutdown agent checks responses from the guest OSes over the network paths (administrative LAN/interconnect). The cluster application is switched over when no response can be confirmed from a guest OS. In this case, if the failed guest domain has not stopped completely (for example, when the OS is hanging), the cluster application starts on both guest domains and both instances access the shared disk at the same time. Using the I/O fencing function prevents the two guest domains from accessing the shared disk simultaneously.

5.1.2.6.1 Confirming Route Information

Record the IP addresses of the following networks, which are used to check whether each node is alive:

  • Cluster interconnect (IP address of CIP)

  • Public LAN/Administrative LAN (IP address of standard port)
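
To confirm beforehand that each recorded address is reachable by ICMP, you can run a simple check on Oracle Solaris (the address below is hypothetical):

# ping 192.168.10.1
192.168.10.1 is alive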

5.1.2.6.2 Using the Shutdown Configuration Wizard

Starting up the shutdown configuration wizard

From the CF main window of the Cluster Admin screen, select [Tools] menu and then [Shutdown Facility] -> [Configuration Wizard]. The shutdown configuration wizard will start.

Note

You can also configure the shutdown facility immediately after you complete the CF configuration with the CF wizard. The following confirmation popup screen will appear. Click [Yes] to start the shutdown configuration wizard.

Selecting a shutdown agent

The selection screen for the shutdown agent will appear.

Confirm the hardware machine type and select the appropriate shutdown agent.

After selection, click [Next].

Information

If you select a shutdown agent, the timeout value is automatically set. For details on the timeout value of the shutdown agent, see "7.2.2 Configuration file of SF" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."

  • ICMP

    Timeout value = 7 x Number of IP addresses (seconds)

    If the calculated value is less than 20, the timeout value is 20 seconds.
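
    For example, with 4 IP addresses the timeout value is 7 x 4 = 28 seconds; with 2 IP addresses the calculated value of 14 is less than 20, so the timeout value is 20 seconds.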

Setting Route Information

The screen for entering the route information will appear.

Set the IP address recorded in "5.1.2.6.1 Confirming Route Information."

Number of IP addresses

Specify the number of IP addresses used to check if each node is alive.

Specify the number from 1 to 10.

If fewer than 10 IP addresses are available (for example, if only 5 IP addresses exist), select a number from 1 to 5.

IP address n

Specify the IP address used to check if each node is alive.

Specify as many IP addresses as the number specified in "Number of IP addresses."

To check whether each node is alive, using multiple routes is recommended to ensure reliable determination of an abnormality.

However, to prioritize automatic switchover over reliable abnormality determination, specify "1" for the number of IP addresses and set only the IP address of the cluster interconnect.

If only the cluster interconnect is set, automatic switchover is enabled even when communication over the cluster interconnect is disabled while communication over other LAN routes remains possible (when the communication target node responds to ping).

Select the following IP addresses:

  • Cluster interconnect (IP address of CIP)

  • Public LAN/Administrative LAN (IP address of standard port)

Note

  • The ICMP shutdown agent switches the cluster application automatically only when the remote node cannot be confirmed to be alive on any of the specified routes.

  • If communication over the cluster interconnect is disabled even though communication with the remote node over the public LAN/administrative LAN is possible, the node state becomes LEFTCLUSTER. Automatic switchover cannot be performed until the node is recovered manually.

  • Do not use the takeover IP address (takeover virtual interface).

After all the settings are done, click [Next].

Entering node weights and administrative IP address

The screen for entering the weights of the nodes and the IP addresses for the administrative LAN will appear.

Enter the weights of the nodes and the IP addresses for the administrative LAN.

Weight

Specify the SF node weight.

When the I/O fencing function is used, the SF node weight is disabled, so the value is always "1".

Admin IP

Enter an IP address directly or click the tab to select the host name that is assigned to the administrative IP address.

Available IP addresses are IPv4 and IPv6 addresses.

IPv6 link local addresses are not available.

After all the settings are done, click [Next].

Saving the configuration

Confirm and then save the configuration.

In the left-hand panel of the window, those nodes that constitute the cluster are displayed. The SF node weight is displayed in brackets after those nodes. The timeout value of the shutdown agent is displayed in brackets after the shutdown agent.

Click [Next]. A popup screen will appear for confirmation.

Select [Yes] to save the setting.

Displaying the configuration of the shutdown facility

If you save the setting, a screen displaying the configuration of the shutdown facility will appear. On this screen, you can confirm the configuration of the shutdown facility on each node by selecting each node in turn.

Information

You can also view the configuration of the shutdown facility by selecting [Shutdown Facility] -> [Show Status] from [Tools] menu.

Shut State

"Unknown" is shown during normal system operation. If an error occurs and the shutdown facility stops the relevant node successfully, "Unknown" will change to "KillWorked".

Test State

Indicates the state in which the path to shut down the node is tested when a node error occurs. If the test of the path has not been completed, "Unknown" will be displayed. If the configured shutdown agent operates normally, "Unknown" will be changed to "TestWorked".

Init State

Indicates the state in which the shutdown agent is initialized.

To exit the configuration wizard, click Finish. Click Yes in the confirmation popup screen that appears.

Note

On this screen, confirm that the shutdown facility is operating normally.

If "InitFailed" is displayed in the Initial state even when the configuration of the shutdown facility has been completed or if "Unknown" is displayed in the Test state or "TestFailed" is highlighted in red, the agent or hardware configuration may contain an error. Check the /var/adm/messages file and the console for an error message. Then, apply appropriate countermeasures as instructed the message that is output.