PRIMECLUSTER Installation and Administration Guide 4.2 (Linux)


3.1.4 Setting Up the Cluster High-Speed Failover Function

Overview

If heartbeat monitoring fails because of a node failure, the PRIMECLUSTER shutdown facility removes the failed node. If this happens while a crash dump is being collected, you might not be able to obtain the information needed for troubleshooting.

The cluster high-speed failover function prevents node elimination during crash dump collection and, at the same time, enables the ongoing operations on the failed node to be moved quickly to another node.

The crash dump collection facility varies depending on the version of RHEL being used.

Version of Red Hat Enterprise Linux                                  Crash dump collection facility
RHEL-AS3/RHEL-ES3                                                    Netdump
RHEL-AS3 batch correction U05011/RHEL-ES3 batch correction U05011    Netdump or Diskdump
RHEL-AS4 batch correction U05111                                     Diskdump

Netdump

As shown in the above figure, when a heartbeat failure occurs, the cluster high-speed failover function sets and refers to the panic status through the Netdump server. The node that detected the heartbeat error treats the failed node as Offline without forcibly powering off the node that is outputting the crash dump, so that the detecting node can take over the transactions.
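
The takeover decision described above can be sketched as a small shell function. This is only an illustration of the logic, not the actual shutdown agent: the function name "decide_action" and the status labels "panic" and "unknown" are hypothetical and are not defined by PRIMECLUSTER.

```shell
#!/bin/sh
# Hypothetical sketch: "panic" and "unknown" are illustrative labels only,
# not values defined by PRIMECLUSTER or Netdump.

# decide_action STATUS
#   STATUS is the panic status observed for the node whose heartbeat failed.
#   Prints the action that the surviving node would take.
decide_action() {
    case "$1" in
        panic)
            # The failed node is writing a crash dump: treat it as Offline
            # and take over its work without forcibly powering it off.
            echo "assume_offline_and_take_over"
            ;;
        *)
            # No panic status was recorded: the shutdown facility
            # eliminates the node before takeover.
            echo "eliminate_node_then_take_over"
            ;;
    esac
}

decide_action panic
decide_action unknown
```

The point of the sketch is that the panic status is the only input that changes the outcome: when it is set, the crash dump is left to finish; otherwise the usual node elimination applies.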

Netdump server (server dedicated to dump collection)

You must prepare another node, to be used as the Netdump server, independently of the cluster nodes. It must be connected to the LAN for the Netdump server (a dedicated LAN). For example, when you build a cluster system configured with four nodes, you must prepare a total of five nodes, one of which will be used as the Netdump server.

To use the Netdump function, you must first set up the Netdump server and the Netdump clients.

Settings required for the Netdump shutdown agent

Settings for the Netdump server

  1. Confirming the Netdump function

    Confirm that the Netdump server function is available. If not, enable it.

    Use the "runlevel(8)" command and the "chkconfig(8)" command to confirm the operation.

  2. Confirming the NFS function

    The Netdump shutdown agent uses NFS. Confirm that NFS is available. If it is not, enable it.

    Use the "runlevel(8)" command and the "chkconfig(8)" command to confirm the operation.

  3. Setting to avoid rebooting

    After crash dump collection, Netdump reboots the node from which the dump was collected. Set up the following in "/etc/netdump.conf" to prevent the node from rebooting after dump collection.

      noreboot=true

  4. Setting the NFS function

    Set up the following in "/etc/exports":

      /var/crash/log/netdump_status NodeA(ro,no_root_squash) NodeB(ro,no_root_squash)

  5. Rebooting the system

    Reboot the system.

      # shutdown -r now
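
Before rebooting, the two file settings from steps 3 and 4 can be checked with a short script. The following is a minimal sketch: the helper names "has_noreboot" and "exports_status_dir" are hypothetical, the node names NodeA and NodeB are taken from the example entry above, and the script works on scratch copies rather than on the real "/etc/netdump.conf" and "/etc/exports".

```shell
#!/bin/sh
# Sketch only: has_noreboot and exports_status_dir are hypothetical helpers,
# not commands provided by Netdump or PRIMECLUSTER.

# has_noreboot FILE
#   Succeeds if FILE contains an uncommented "noreboot=true" line.
has_noreboot() {
    grep -Eq '^[[:space:]]*noreboot=true[[:space:]]*$' "$1"
}

# exports_status_dir FILE NODE
#   Succeeds if FILE has an entry for /var/crash/log/netdump_status that
#   lists NODE with the options (ro,no_root_squash).
exports_status_dir() {
    grep -E '^[[:space:]]*/var/crash/log/netdump_status[[:space:]]' "$1" |
        grep -Fq "$2(ro,no_root_squash)"
}

# Demonstrate against scratch copies, not the live configuration files.
tmpdir=$(mktemp -d)
printf 'noreboot=true\n' > "$tmpdir/netdump.conf"
printf '/var/crash/log/netdump_status NodeA(ro,no_root_squash) NodeB(ro,no_root_squash)\n' \
    > "$tmpdir/exports"

has_noreboot "$tmpdir/netdump.conf" && echo "noreboot: ok"
for node in NodeA NodeB; do
    exports_status_dir "$tmpdir/exports" "$node" && echo "export to $node: ok"
done
rm -rf "$tmpdir"
```

To check a real Netdump server, the same functions could be pointed at "/etc/netdump.conf" and "/etc/exports" with the actual cluster node names.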

Settings for the Netdump client (cluster system configuration node)

  1. Confirming the NFS function

    Confirm that NFS is available. If it is not, enable it. This operation must be executed on all the nodes that constitute the cluster system.

    Use the "runlevel(8)" command and the "chkconfig(8)" command to confirm the operation.

  2. Setting the NFS function

    This operation must be executed on all the nodes that constitute the cluster system.

  3. Rebooting the system

    Reboot the system.

    This operation must be executed on all the nodes that constitute the cluster system.

      # shutdown -r now
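
The NFS confirmation with "runlevel(8)" and "chkconfig(8)" can be sketched as follows. This is a hedged example: it assumes that the NFS service is registered under the name "nfs", the helper "enabled_in_runlevel" is hypothetical, and the live check is skipped when the two commands are not installed.

```shell
#!/bin/sh
# Sketch only. Assumption: the NFS init script is registered as "nfs".

# enabled_in_runlevel LEVEL LINE
#   LINE is one line of "chkconfig --list SERVICE" output, for example
#   "nfs  0:off 1:off 2:off 3:on 4:on 5:on 6:off".
#   Succeeds if the service is marked "on" for runlevel LEVEL.
enabled_in_runlevel() {
    case " $(printf '%s' "$2" | tr '\t' ' ') " in
        *" $1:on "*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v chkconfig >/dev/null 2>&1 && command -v runlevel >/dev/null 2>&1; then
    # runlevel prints the previous and the current runlevel, e.g. "N 3".
    level=$(runlevel | awk '{print $2}')
    line=$(chkconfig --list nfs 2>/dev/null)
    if enabled_in_runlevel "$level" "$line"; then
        echo "nfs is enabled in runlevel $level"
    else
        echo "nfs is not enabled in runlevel $level; consider: chkconfig nfs on"
    fi
else
    echo "chkconfig or runlevel not found; skipping live check"
fi
```

Because the check must be made on every node of the cluster system, a script of this kind would be run on each node in turn.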

Diskdump

As shown in the above figure, when a heartbeat monitoring failure occurs, the cluster high-speed failover function sets and refers to the panic status through the RSB or BMC (Baseboard Management Controller). The node that detects the failure can treat the other node as stopped and take over its ongoing operations without eliminating the node that is collecting the crash dump.

Required setting for the Diskdump shutdown agent

  1. Configure Diskdump

    To use Diskdump, you must first configure it.

  2. Check Diskdump

    Check that Diskdump is available. If it is not, enable it using the "runlevel(8)" and "chkconfig(8)" commands.
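
As a rough aid for step 1, the following sketch reads the dump device from a Diskdump settings file. It rests on assumptions that this guide does not state: that Diskdump keeps its settings in a file such as "/etc/sysconfig/diskdump" containing a "DEVICE=" line, that the helper "diskdump_device" is a hypothetical name, and that "/dev/sdb1" is a made-up partition used only for illustration.

```shell
#!/bin/sh
# Sketch only. Assumptions: the settings file has a DEVICE= line, and
# /dev/sdb1 is an invented partition name used purely for illustration.

# diskdump_device FILE
#   Prints the value of the last uncommented DEVICE= line in FILE.
diskdump_device() {
    sed -n 's/^[[:space:]]*DEVICE=//p' "$1" | tail -n 1
}

# Work on a scratch file so that the sketch does not touch real settings.
conf=$(mktemp)
printf '# dump partition\nDEVICE=/dev/sdb1\n' > "$conf"

dev=$(diskdump_device "$conf")
if [ -n "$dev" ]; then
    echo "dump device configured: $dev"
else
    echo "no dump device configured"
fi
rm -f "$conf"
```

An empty result would indicate that step 1 has not been completed yet.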

Prerequisites for the other shutdown agent settings

After you have completed configuring the Netdump shutdown agent or the Diskdump shutdown agent, set up the remote service board (RSB), IPMI (Intelligent Platform Management Interface), or BLADE server.

Prerequisites for the RSB shutdown agent settings

Set the following for the remote service board (RSB):

For details, see the operation manual provided with the remote service board and the "ServerView User Guide."

Prerequisites for the IPMI shutdown agent settings

Set the following for the IPMI user.

For details, see the "User Guide" provided with the hardware and the "ServerView User Guide."

Prerequisites for the BLADE shutdown agent settings

Set the following for the BLADE server:

For details, see the operation manual provided with the hardware and the "ServerView User Guide."



All Rights Reserved, Copyright(C) FUJITSU LIMITED 2006