ETERNUS SF Storage Cruiser Install Guide 13.2 - Solaris (TM) Operating System / Linux / Microsoft(R) Windows(R) -

Appendix A High-availability admin servers

Admin servers in cluster systems ensure high availability and maximize uptime.

A.1 What are cluster systems? 

A cluster system provides high availability by operating two or more servers as one virtual server.

If a system runs on a single server machine and that machine or its server application fails, all operations stop until the machine is rebooted.

In a cluster system, where two or more server machines are linked together, if the primary server machine becomes unusable due to a machine or application error, the secondary server machine can restart the application that was running on the failed machine and resume the operation, reducing downtime. This switchover from a failed server to another server is called failover.

These linked server machines are collectively referred to as a cluster, and each server machine in the cluster is called a node.

Clusters of server machines are classified into the following types.


For details, refer to the "PRIMECLUSTER Installation and Administration Guide" for the respective operating system.

A.2 Supported software 

The following ETERNUS SF Storage Cruiser software is supported:

A.3 Manager installation

This section describes the procedure for manager installation on a cluster system.


To distinguish the two physical nodes, one is called the primary node and the other the secondary node. The primary node is the operating node on which the cluster service (cluster application) runs at initial boot time; the secondary node is the standby node on which the cluster service (cluster application) stands ready at initial boot time.


A.3.1 New manager installation 

++A.3.1.1 Preparation

Cluster installation and configuration

To operate managers as cluster services (cluster applications) in a cluster system, install and configure manager software on each node. Be sure to check the device names and the mount point names when configuring the shared disks.
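
As an illustration, the shared-disk device and mount point can be checked as follows (the GDS class name, volume name, and mount point /zzz are assumptions for this example, not your actual configuration):

# ls -l /dev/sfdsk/class0001/dsk/volume0001 <RETURN>
# df -k <RETURN>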


An operation in which the manager runs as a cluster service (cluster application) is referred to as a manager operation; collectively, these are called Manager operations.
For details about PRIMECLUSTER, refer to the "PRIMECLUSTER Installation and Administration Guide" for the respective operating system.

Resource configuration

Manager installation on a cluster system requires the following resources.

Note 1: When one server machine operates as both admin servers, they use a single shared disk.

Note 2: Required when Interstage Application Server Smart Repository is installed.

Shared disk space

The shared disks contain the following directories used by ETERNUS SF Storage Cruiser.

++A.3.1.2 Installation

Install manager.

On both the primary node and the secondary node, perform the procedure from the beginning of "4.1.3 Installation procedure" (in Solaris OS) or "4.2.3 Installation procedure" (in Linux) for the manager, using the same steps.

A.4 Manager operation setup

This section describes the procedure for setting up managers as cluster services (cluster applications) in a cluster system.


Prior to setup, decide whether to add the manager operations to existing cluster services (cluster applications) or to create new cluster services (cluster applications) for them.


A.4.1 Advance checks

Check that the manager installation is complete. For the admin server, check that the procedure from the beginning of "4.1.3 Installation procedure" (in Solaris operating system) or "4.2.3 Installation procedure" (in Linux) is complete.


Use the cluster setup command to configure "LOGICAL_MANAGER_IP" in "/etc/opt/FJSVssmgr/current/sanma.conf".


A.4.2 Manager operation setup (outline) 

Set up managers on the admin servers.
The following figure shows the setup flow.


A.4.3 Manager operation setup (details)

Set up managers as cluster services (cluster applications) using the following steps. Only a user with system administrator privileges can conduct the following procedures.

If the system has been rebooted after manager installation, stop the managers before performing the following procedure.

++For the admin server

  1. When adding a manager to an existing cluster service (cluster application), stop the cluster service (cluster application) using the cluster admin of the cluster system.

     

  2. Configure the shared disk using PRIMECLUSTER GDS.


    For details, refer to "PRIMECLUSTER Global Disk Services Guide (Solaris(TM) operating system version)" of the respective operating system.

     

  3. Mount the shared data disk for the server on the primary node.
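
    As an illustration only, with a PRIMECLUSTER GDS volume the mount might look as follows (the class name, volume name, and mount point /zzz are assumptions for this example):

    # mount /dev/sfdsk/class0001/dsk/volume0001 /zzz <RETURN>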

     

  4. Specify the port numbers for the server in "/etc/services" on both the primary node and the secondary node.
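
    The entries take the standard /etc/services form. The following line is an illustrative sketch only: the service name and port number are placeholder assumptions, so use the names and numbers given in the installation procedure for your version, and keep them identical on both nodes.

    sscruisera    4917/tcp    # manager service (example name and port)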

     

  5. Perform the procedure described in "4.1.4.2 Administrator login account creation" (in Solaris operating system) or "4.2.4.2 Administrator login account creation" (in Linux) only on the primary node. The created login account information will be set on the shared data disk when the cluster setup command is executed.

     

  6. Execute the cluster setup command on the primary node.

    After preventing access by other users to the shared data disk for the server, execute the following cluster setup command on the primary node.

    # /opt/FJSVssmgr/cluster/esc_clsetup -k Primary -m Mount_point_on_the_shared_data_disk_for_the_server -i Server_takeover_IP_address <RETURN>


     

  7. Check the configurations.

    Information specified with the command is output. Check the output information and enter "y". To cancel setup at this point, enter "n".

    # /opt/FJSVssmgr/cluster/esc_clsetup -k Primary -m /zzz -i 10.10.10.10 <RETURN>
    ETERNUS SF Storage Cruiser settings are as follows.
          Cluster system : PRIMECLUSTER
          Node type      : Primary
          Mount point    : /zzz
          IP Address     : 10.10.10.10
    Manager cluster setup : Are you sure? [y,n,?]

     

  8. Setup on the primary node is executed.

    FJSVrcx:INFO:27700:esc_clsetup:primary node setup completed

    Running the command performs the cluster setup and sets "LOGICAL_MANAGER_IP" in /etc/opt/FJSVssmgr/current/sanma.conf on the shared disk:

    LOGICAL_MANAGER_IP="10.10.10.10";

     

  9. Unmount the shared data disk for the server from the primary node.

     

  10. Mount the shared data disk for the server on the secondary node.

     

  11. Execute the cluster setup command on the secondary node.

    After preventing access by other users to the shared data disk for the server, execute the following cluster setup command on the secondary node.

    # /opt/FJSVssmgr/cluster/esc_clsetup -k Secondary -m Mount_point_on_the_shared_data_disk_for_the_server <RETURN>

     

  12. Check the configurations.

    Information specified with the command is output. Check the output information and enter "y". To cancel setup at this point, enter "n".

    # /opt/FJSVssmgr/cluster/esc_clsetup -k Secondary -m /zzz <RETURN>
    ETERNUS SF Storage Cruiser settings are as follows.
          Cluster system : PRIMECLUSTER
          Node type      : Secondary
          Mount point    : /zzz
          IP Address     : 10.10.10.10
    Manager cluster setup : Are you sure? [y,n,?]

     

  13. Setup on the secondary node is executed.

    FJSVrcx:INFO:27701:esc_clsetup:secondary node setup completed

     

  14. Unmount the shared data disk for the server from the secondary node.

     

  15. Create a cluster service (cluster application).


    For details, refer to "PRIMECLUSTER Installation and Administration Guide" for the respective operating system.

    For Solaris OS:

    Use the userApplication Configuration Wizard of the cluster system to create the following required PRIMECLUSTER resources on the cluster service (cluster application).

    After creating the resources 1. to 4. described below, create the cluster service (cluster application) and register the resources on it.

    1. Cmdline resource (Create the Cmdline resources for ESC.)

      Select "Cmdline" for Resource creation in the userApplication Configuration Wizard to create the resources through path input, and make the following configurations.

      • Start script:
        /opt/FJSVssmgr/cluster/cmd/rcxclstartcmd

      • Stop script:
        /opt/FJSVssmgr/cluster/cmd/rcxclstopcmd

      • Check script:
        /opt/FJSVssmgr/cluster/cmd/rcxclcheckcmd

      Flag (attribute value) settings: When a value other than "NO" is specified for the cluster service (cluster application) attribute value "StandbyTransitions", set ALLEXITCODES to "yes" and STANDBYCAPABLE to "yes".

      • When using Interstage Application Server Smart Repository, the manager must start after Interstage Application Server Smart Repository starts.

    2. Ipaddress resource (Configure the takeover logical IP address for the cluster system.)

      Select Ipaddress for Resource creation in the userApplication Configuration Wizard and configure the takeover logical IP address. Configure the takeover logical IP address on the client-side NIC. For the takeover network type, select "IP address takeover".

    3. Fsystem resource (Configure the mount point of the shared data disk for the server.)

      Select Fsystem for Resource creation in the userApplication Configuration Wizard and configure the file system. If no mount point definition exists, refer to the "PRIMECLUSTER Installation and Administration Guide" and define the mount point.

    4. Gds resource (Specify the configuration created for the shared data disk for the server.)

      Select Gds for Resource creation in the userApplication Configuration Wizard and configure the shared disk. Specify the configuration created for the shared data disk for the server.

    For Linux:

    Use RMS Wizard for cluster system to create the following PRIMECLUSTER resources on the cluster service (cluster application).

    To create a new cluster service (cluster application), select "Application-Create", specifying the primary node for Machines[0] and the secondary node for Machines[1]. In the "Machines+Basics" settings, set AutoStartUp to "yes", AutoSwitchOver to "HostFailure|ResourceFailure|ShutDown", and HaltFlag to "yes".
    Then, create the following resources 1. to 4. on the created cluster service (cluster application).

    Perform these settings with the RMS Wizard on any one of the nodes configured in the cluster.

    1. Cmdline resources (Create Cmdline resources for ESC)

      Select "CommandLines" with the RMS Wizard to perform the following settings.

      • Start script :
        /opt/FJSVssmgr/cluster/cmd/rcxclstartcmd

      • Stop script :
        /opt/FJSVssmgr/cluster/cmd/rcxclstopcmd

      • Check script :
        /opt/FJSVssmgr/cluster/cmd/rcxclcheckcmd

      Flag settings (attribute value): If anything other than "without value" is specified for attribute value StandbyTransitions of the cluster service (cluster application), set the Flags ALLEXITCODES(E) and STANDBYCAPABLE(O) to "Enabled".

      • If the manager is shared with a Systemwalker Centric Manager operations management server or division management server, configure it to start after Centric Manager.

    2. Gls resources (Set a takeover logical IP address to be used for cluster system.)

      Select "Gls:Global-Link-Sevices" with the RMS Wizard to set a takeover logical IP address The takeover logical IP address is set to the client NIC. If the configuration information for takeover logical IP address has not been created, refer to "PRIMECLUSTER Global Link Services Instruction Manual (Redundancy function in transmission lines) (Linux)" to create it.

    3. Fsystem resources (Set mount points of management server-shared disks.)

      Select "LocalFileSystems" with the RMS Wizard to set file systems. If mount points have not been defined, refer to "PRIMECLUSTER Installation Guide (Linux)" to define.

    4. Gds resources (Specify the settings for management server shared disks.)

    Select "Gds:Global-Disk-Sevices" with the RMS Wizard to set shared disks. Specify the settings created for management server-shared disks.

  16. Perform the procedure described in "4.1.4.3 Rebooting the system" (in Solaris operating system) or "4.2.4.3 Rebooting the system" (in Linux) on both the primary node and the secondary node.

     

A.5 Admin server daemons

After cluster setup, the following daemons run as resources of the cluster service (cluster application) for the manager operations.
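
A quick, illustrative check that these daemons are running on the active node is to list the manager processes (the grep pattern is an assumption based on the package directory name):

# ps -ef | grep FJSVssmgr <RETURN>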


A.6 Advisory notes

This section provides advisory notes on admin server operation in cluster systems.

  1. /etc/opt/FJSVssmgr/current/sanma.conf configuration

    Specify a logical IP address as the takeover IP address for the LOGICAL_MANAGER_IP parameter in /etc/opt/FJSVssmgr/current/sanma.conf. Normally, however, the cluster setup command configures the takeover IP address automatically.
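
    To verify the value after setup (10.10.10.10 is the example address used throughout this appendix):

    # grep LOGICAL_MANAGER_IP /etc/opt/FJSVssmgr/current/sanma.conf <RETURN>
    LOGICAL_MANAGER_IP="10.10.10.10";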

  2. Cluster service (cluster application) switchover

    Events that occur during cluster service switchover cannot be output.

  3. Troubleshooting material

    To collect troubleshooting material, refer to "E.1 Troubleshooting information" of the "ETERNUS SF Storage Cruiser User's Guide".

  4. Commands

    Command execution methods do not differ between regular operation and cluster operation.

  5. Messages

    Please refer to "Appendix J Messages and corrective actions" of the "ETERNUS SF Storage Cruiser User's Guide" and "A.8 Cluster-related messages".

A.7 Cluster environment deletion

This section describes how to delete cluster environments for the manager operations.


A.7.1 Cluster environment deletion procedure (outline) 

The following figure shows the manager operation cluster environment deletion flow.


A.7.2 Cluster environment deletion procedure (details)

Delete cluster environments using the following steps. Perform the deletion procedure on the admin server in the cluster system. Only a user with system administrator privileges can conduct the following procedures.

++For the admin server

  1. Stop the cluster service (cluster application) configured for the manager operation using the cluster admin of the cluster system.

     

  2. Delete resources for management tasks registered to the target cluster service (cluster application).



    For details, refer to the "PRIMECLUSTER Installation and Administration Guide" for the respective operating system version.

    For Solaris:

    Remove the following resources with the userApplication Configuration Wizard:

    For Linux:

    Using the RMS Wizard for the cluster system, delete the resources for management tasks registered to the target cluster service (cluster application). If the cluster service consists only of the resources for the management server, also delete the cluster service (cluster application) itself.

    Delete the following resources with the RMS Wizard.

    Deletion with the RMS Wizard is performed on any one of the nodes in the cluster.

     

  3. Check that the shared data disk for the server is unmounted on the primary node and the secondary node.
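
    For example, with /zzz as the example mount point, no output from the following command on each node indicates that the disk is not mounted:

    # df -k | grep /zzz <RETURN>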

     

  4. Mount the shared data disk for the server on the secondary node.

     

  5. Execute the cluster unsetup command on the secondary node.

    After preventing access by other users to the shared data disk for the server, execute the following cluster unsetup command on the secondary node.

    # /opt/FJSVssmgr/cluster/esc_clunsetup <RETURN>
    or
    # /opt/FJSVssmgr/cluster/esc_clunsetup -l <RETURN>


    Option specification is not required under normal conditions, but specify the -l option if local free disk space on the secondary node is insufficient.

  6. Check the configurations (no option in this example).

    Check the configurations and enter "y". To cancel unsetup at this point, enter "n".

    ETERNUS SF Storage Cruiser settings were as follows.
          Cluster system : PRIMECLUSTER
          Node type      : Secondary
          Mount point    : /zzz
          IP Address     : 10.10.10.10
          Mode           : Normal (restore from Shared Disk)
    Manager cluster deletion : Are you sure? [y,n,?]

     

  7. Deletion on the secondary node is executed.

    FJSVrcx:INFO:27703:esc_clunsetup:secondary node deletion completed

     

  8. Unmount the shared data disk for the admin server from the secondary node.

     

  9. Mount the shared data disk for the admin server on the primary node.

     

  10. Execute the cluster unsetup command on the primary node.

    After preventing access by other users to the shared data disk for the server, execute the following cluster unsetup command on the primary node.

    # /opt/FJSVssmgr/cluster/esc_clunsetup <RETURN>
    or
    # /opt/FJSVssmgr/cluster/esc_clunsetup -l <RETURN>

    Option specification is not required under normal conditions, but specify the -l option if local free disk space on the primary node is insufficient.

  11. Check the configurations (no option in this example).

    Check the configurations and enter "y". To cancel unsetup at this point, enter "n".

    ETERNUS SF Storage Cruiser settings were as follows.
          Cluster system : PRIMECLUSTER
          Node type      : Primary
          Mount point    : /zzz
          IP Address     : 10.10.10.10
          Mode           : Normal (restore from Shared Disk)
    Manager cluster deletion : Are you sure? [y,n,?]

     

  12. Deletion on the primary node is executed.

    FJSVrcx:INFO:27702:esc_clunsetup:primary node deletion completed

     

  13. Unmount the shared data disk for the server from the primary node.

     

  14. Delete the port numbers for the server from /etc/services on both the primary node and the secondary node.

     

  15. Start the cluster service (cluster application) using the cluster admin of the cluster system. If the cluster service (cluster application) was deleted in step 2, this step is not required.

     

  16. Execute the following command on both the primary node and the secondary node to stop nwsnmp-trapd.

    # /opt/FJSVswstt/bin/mpnm-trapd stop <RETURN>

     

  17. Uninstall managers from both the primary node and the secondary node using the procedure described in "9.1 [Solaris OS] Manager uninstallation" (in Solaris operating system) or "9.2 [Linux] Manager uninstallation" (in Linux).

     

A.8 Cluster-related messages

This section describes messages output at the time of cluster environment setup and unsetup.


A.8.1 Informative messages 


 

FJSVrcx:INFO:27700:command:primary node setup completed

[Description]

The cluster setup on the primary node ended normally.

[Corrective action]

Proceed to the next operation according to the cluster environment configuration procedure.

 


 

FJSVrcx:INFO:27701:command:secondary node setup completed

[Description]

The cluster setup on the secondary node ended normally.

[Corrective action]

Proceed to the next operation according to the cluster environment configuration procedure.

 


 

FJSVrcx:INFO:27702:command:primary node deletion completed

[Description]

The cluster deletion on the primary node ended normally.

[Corrective action]

Proceed to the next operation according to the cluster environment configuration procedure.

 


 

FJSVrcx:INFO:27703:command:secondary node deletion completed

[Description]

The cluster deletion on the secondary node ended normally.

[Corrective action]

Proceed to the next operation according to the cluster environment configuration procedure.

 


 

FJSVrcx:INFO:27733:command:canceled

[Description]

The command was canceled.

[Corrective action]

No specific action is required.

 


 

FJSVrcx:INFO:27751:command:cluster deletion (erase shared disk data) completed

[Description]

The shared disk data deletion is completed.

[Corrective action]

If any node remains on which the cluster environment has not been deleted, delete that cluster environment in force mode (specify -f when executing esc_clunsetup). After the deletion is completed, uninstall the manager.
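
A minimal sketch of the force-mode invocation (no other options assumed):

# /opt/FJSVssmgr/cluster/esc_clunsetup -f <RETURN>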

 


 

FJSVrcx:INFO:27753:command:cluster deletion (force mode) completed

[Description]

The cluster unsetup in force mode was completed.

[Corrective action]

Uninstall the manager.

 

A.8.2 Warning messages 


 

FJSVrcx:WARNING:47752:command:cluster deletion (force mode) completed excluding erase shared disk data

[Description]

Except for shared disk data deletion, the cluster unsetup in force mode is completed.

[Corrective action]

Check that the shared disk is accessible, and delete shared disk data (using esc_clunsetup with -e MountPoint).
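
For example, with /zzz as the example mount point used in this appendix:

# /opt/FJSVssmgr/cluster/esc_clunsetup -e /zzz <RETURN>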

 

A.8.3 Error messages 


 

FJSVrcx:ERROR:67704:command:option:illegal option

[Description]

The option is incorrect. Usage will be output.

[Corrective action]

Check the command then re-execute.

 


 

FJSVrcx:ERROR:67706:command:ipaddress:invalid IP address format

[Description]

The IP address format is incorrect.

[Corrective action]

Check the IP address.

 


 

FJSVrcx:ERROR:67707:command:setup command already running

[Description]

The cluster setup command or the cluster unsetup command is already active.

[Corrective action]

Check whether the cluster setup command or the cluster unsetup command is running somewhere else.

 


 

FJSVrcx:ERROR:67709:command:not privileged

[Description]

The command was activated by a non-OS administrator (non-root user).

[Corrective action]

Execute the command as an OS administrator (root user).

 


 

FJSVrcx:ERROR:67710:command:mountpoint:not mounted

[Description]

The shared disk has not been mounted on mount point mountpoint.

[Corrective action]

Check the mount status of the shared disk for admin server shared data.
If the specified mountpoint ends with a trailing "/", remove it and re-execute.

 


 

FJSVrcx:ERROR:67711:command:software:not installed

[Description]

Cluster software software has not been installed.

[Corrective action]

Check the installation status of the cluster software software.

 


 

FJSVrcx:ERROR:67714:command:nodetype: cluster setup already completed in this node

[Description]

The cluster environment of node type nodetype has been configured on the current node.

[Corrective action]

Check the cluster environment status on the current node.

 


 

FJSVrcx:ERROR:67715:command:cluster setup already completed in another node

[Description]

The cluster environment of the node type specified with the cluster setup command has been configured on another node.

[Corrective action]

Check whether the node type specified with the executed command is correct.
Check whether the cluster environment of the node type specified with the executed command has already been configured on another node.

 


 

FJSVrcx:ERROR:67716:command:primary node setup not completed

[Description]

The cluster environment has not been configured on the primary node.

[Corrective action]

Check whether the mounted shared disk for admin server shared data is correct.
Configure the cluster environment on the primary node.

 


 

FJSVrcx:ERROR:67717:command:switch:parameter:parameter conflict

[Description]

Data different from the previous one was specified.

[Corrective action]

Check the argument values of the command.

 


 

FJSVrcx:ERROR:67725:command:option:illegal option

[Description]

The option is incorrect. Usage will be output.

[Corrective action]

Check the command then re-execute.

 


 

FJSVrcx:ERROR:67726:command:secondary node not deleted

[Description]

The cluster environment on the secondary node is not deleted.

[Corrective action]

Check whether the mounted shared disk for admin server shared data is correct.
Delete the cluster environment then re-execute the command.

 


 

FJSVrcx:ERROR:67727:command:no cluster setup node

[Description]

The cluster environment has not been configured.

[Corrective action]

Check whether the admin server cluster environment has been configured.

 


 

FJSVrcx:ERROR:67740:command:cluster setup failed (setup data invalid)

[Description]

The cluster setup failed. The cluster environment configurations are invalid.

[Corrective action]

Delete the cluster environment in force mode then uninstall the manager.

 


 

FJSVrcx:ERROR:67741:command:setup command not installed

[Description]

The module for cluster setup has not been installed.

[Corrective action]

Check whether the manager installation is valid.

 


 

FJSVrcx:ERROR:67742:command:file:setup data inconsistency

[Description]

An inconsistency was found in the cluster environment configurations.

[Corrective action]

Collect the following files and contact Fujitsu technical staff.

• The file indicated by file
• For manager: all files under /opt/FJSVssmgr/cluster/env


 

FJSVrcx:ERROR:67743:command:setup data invalid

[Description]

The cluster environment configurations are invalid.

[Corrective action]

Delete the cluster environment in force mode (using -f when executing esc_clunsetup). After the deletion is completed, uninstall the manager.

 


 

FJSVrcx:ERROR:67744:command:shared disk data invalid

[Description]

Data on the shared disk is invalid.

[Corrective action]

Delete the cluster environment in force mode (using -f when executing esc_clunsetup). After the deletion is completed, uninstall the manager.

 


 

FJSVrcx:ERROR:67745:command:property:manager:value:setup data conflict

[Description]

The property value property of the cluster setup command does not match the configuration value value on the admin server manager for which cluster setup is done.

[Corrective action]

Specify the same configuration value to perform setup.

 


 

FJSVrcx:ERROR:67747:command:cluster setup failed

[Description]

The cluster setup failed.

[Corrective action]

Check the execution environment then re-execute the command. If the situation is not improved, contact Fujitsu technical staff.

 


 

FJSVrcx:ERROR:67748:command: can not execute in normal cluster setup node

[Description]

The cluster setup is in a normal state. The unsetup command cannot be executed with shared disk data deletion specified.

[Corrective action]

Shared disk data deletion (using esc_clunsetup with -e MountPoint) cannot be performed.
Delete the cluster environment following the procedure given in "A.7.2 Cluster environment deletion procedure (details)".

 


 

FJSVrcx:ERROR:67749:command:cluster deletion failed

[Description]

The cluster unsetup failed.

[Corrective action]

Delete the cluster environment in force mode (using -f when executing esc_clunsetup). After the deletion is completed, uninstall the manager.

 


 

FJSVrcx:ERROR:67750:command:cluster deletion (erase shared disk data) failed

[Description]

The shared disk data deletion failed.

[Corrective action]

Check the execution environment then re-execute the command. If the situation is not improved, contact Fujitsu technical staff.

 



All Rights Reserved, Copyright(C) FUJITSU LIMITED 2008