PRIMECLUSTER Installation and Administration Guide 4.1 (for Solaris(TM) Operating System)


1.4 Test

Purpose

When you build a cluster system using PRIMECLUSTER, you need to confirm before starting production operations that the entire system will operate normally and cluster applications will continue to run in the event of failures.

For 1:1 standby operation, the PRIMECLUSTER system takes an operation mode like the one shown in the figure below.

The PRIMECLUSTER system switches to different operation modes according to the state transitions shown in the figure below. To check that the system operates normally, you must test all operation modes and each state transition that switches to an operation mode.

[State transitions of the PRIMECLUSTER system]
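The requirement to test every operation mode and every state transition can be tracked with a simple checklist. The following Python is a minimal sketch, not part of PRIMECLUSTER: the mode names are taken from the state table in this section, while `REQUIRED_TRANSITIONS` and `untested` are hypothetical names. The transition set here is only a placeholder (every ordered pair of distinct modes) and should be replaced with the transitions actually shown in the figure.

```python
from itertools import permutations

# Operation modes as named in the state table of this section.
MODES = (
    "Dual instance operation",
    "Single instance operation",
    "Stopped state",
)

# ASSUMPTION: placeholder set containing every ordered pair of distinct
# modes. Replace it with the transitions drawn in the figure.
REQUIRED_TRANSITIONS = set(permutations(MODES, 2))


def untested(tested_modes, tested_transitions):
    """Return (missing_modes, missing_transitions) still to be tested."""
    done_modes = set(tested_modes)
    missing_modes = [m for m in MODES if m not in done_modes]
    missing_transitions = sorted(REQUIRED_TRANSITIONS - set(tested_transitions))
    return missing_modes, missing_transitions


if __name__ == "__main__":
    modes, transitions = untested(
        ["Dual instance operation"],
        [("Dual instance operation", "Single instance operation")],
    )
    print("Modes not yet tested:", modes)
    print("Transitions not yet tested:", len(transitions))
```

Testing is complete, in the sense described above, only when both returned lists are empty.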

PRIMECLUSTER system state: Description

Dual instance operation: A cluster application is running, and it can switch to the other instance in the event of a failure (failover). The two instance states in dual instance operation are OPERATING and STANDBY. Even if an error occurs on the operating system, the standby system takes over the ongoing operations as the new operating system. This ensures the availability of the cluster application even after failover.

Single instance operation: A cluster application is running, but failover is disabled. The two instance states in single instance operation are OPERATING and STOP. Because there is no standby system in this mode, a cluster application cannot switch to the other instance in the event of a failure, so ongoing operations are disrupted.

Stopped state: A cluster application is stopped.

The above-mentioned "OPERATING", "STANDBY", and "STOP" are defined by the state of RMS and the state of the cluster application as follows:

State     | RMS state | Cluster application state | Remark
OPERATING | Operating | Online                    | -
STANDBY   | Operating | Offline or Standby        | -
STOP      | Stopped   | Unknown *                 | SysNode is Offline

* RMS determines the cluster application state. When RMS is stopped, the cluster application state is unknown.
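The definitions above can be expressed as a small lookup. The following Python is a minimal sketch, not a PRIMECLUSTER API: the function names `instance_state` and `operation_mode` and the plain string values are assumptions chosen to mirror the table. Treating every combination without an OPERATING instance as the stopped state is also an assumption, based on the description "A cluster application is stopped."

```python
from typing import Optional


def instance_state(rms_state: str, app_state: Optional[str]) -> str:
    """Classify one instance as OPERATING, STANDBY, or STOP per the table."""
    if rms_state == "Stopped":
        # RMS determines the application state, so with RMS stopped the
        # application state is unknown and is ignored here.
        return "STOP"
    if rms_state == "Operating":
        if app_state == "Online":
            return "OPERATING"
        if app_state in ("Offline", "Standby"):
            return "STANDBY"
    raise ValueError(
        f"combination not covered by the table: {rms_state!r}, {app_state!r}"
    )


def operation_mode(state_a: str, state_b: str) -> str:
    """Derive the system operation mode for 1:1 standby operation."""
    pair = {state_a, state_b}
    if pair == {"OPERATING", "STANDBY"}:
        return "Dual instance operation"
    if pair == {"OPERATING", "STOP"}:
        return "Single instance operation"
    if "OPERATING" not in pair:
        # ASSUMPTION: no instance is running the application, so the
        # cluster application is regarded as stopped.
        return "Stopped state"
    raise ValueError(f"unexpected combination: {state_a!r}, {state_b!r}")


if __name__ == "__main__":
    a = instance_state("Operating", "Online")
    b = instance_state("Operating", "Standby")
    print(a, b, operation_mode(a, b))
```

Combinations outside the table, such as two OPERATING instances, raise an error instead of being classified silently.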

Main tests for PRIMECLUSTER system operation

Startup test

Conduct a startup test and confirm the following:

Clear fault

If a failure occurs in a cluster application, the state of that application changes to Faulted.

To run this application in the cluster system again, you must execute "Clear Fault" to clear the Faulted state.


Conduct a clear-fault test and confirm the following:

Switchover

Conduct a failover or switchover test and confirm the following:

You need to know the operation downtime in the event of a failure, so measure the switching time for each failure detection cause and check the recovery time.
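One way to record these measurements is to time the interval from the moment a failure is triggered until the service is available again. The following Python is a minimal sketch under stated assumptions: `measure_switch_time`, `trigger_failure`, and `service_is_online` are hypothetical names, and the two callables must be supplied by the tester (for example, a script that causes the failure and a check that the application responds on the other instance). Nothing here calls PRIMECLUSTER itself.

```python
import time
from typing import Callable


def measure_switch_time(
    trigger_failure: Callable[[], None],
    service_is_online: Callable[[], bool],
    timeout_s: float = 600.0,
    poll_interval_s: float = 1.0,
) -> float:
    """Return the seconds from triggering a failure until the service is online.

    The result covers failure detection plus switching, and its resolution
    is limited by poll_interval_s.
    """
    start = time.monotonic()  # monotonic clock: unaffected by clock changes
    trigger_failure()
    while time.monotonic() - start < timeout_s:
        if service_is_online():
            return time.monotonic() - start
        time.sleep(poll_interval_s)
    raise TimeoutError(f"service not online within {timeout_s} s")


if __name__ == "__main__":
    # Stand-in hooks for demonstration only: the "service" comes back
    # on the third poll.
    polls = iter([False, False, True])
    elapsed = measure_switch_time(
        lambda: None, lambda: next(polls), timeout_s=5, poll_interval_s=0.1
    )
    print(f"recovery time: {elapsed:.2f} s")
```

Running the measurement once per failure detection cause and keeping the results side by side makes it easier to compare each recovery time against the downtime the operation can tolerate.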

Replacement test

Conduct a replacement test and confirm the following:

Stop

Conduct a stop test and confirm the following:

Work process continuity

Conduct a work process continuity test and confirm the following:



All Rights Reserved, Copyright (C) FUJITSU LIMITED 2005