1.2.1 Applications, resources, and objects

RMS relies on a virtual representation of the cluster called a configuration. The configuration represents each machine, application, and system resource as an object, and the objects are logically arranged in a tree structure according to their dependencies. For instance, suppose a user application depends on a network interface and a file system in order to operate properly. In the tree structure, the corresponding application object would appear as a parent and the network and file system objects would appear as its children. The tree structure is commonly known as a graph.

Each object in the graph contains the state of the corresponding item along with any other parameters that may be required. An object is typically in the online (enabled, available) state or the offline (disabled, unavailable) state, but other states are possible according to the type of object. For the complete list of states supported by RMS, see "Appendix B States".

At runtime, the configuration is managed by the RMS base monitor, which initiates actions when an object's state changes, or, in the case of a timeout, when an object has remained in the same state for some specified time interval even though a change was expected. This design is known as a state machine.

Nodes and heartbeats

Machines that are members of a cluster are called nodes. When RMS monitors the health of a node, its highest priority is to detect a complete failure of the node or its base monitor. Its second priority is to detect slow response times that may be caused by system overloads. RMS uses two mechanisms, heartbeat and ELM to detect these problems.

Refer to the "2.2 RMS heartbeat operation".

Detectors

RMS monitors each resource by using detectors, which are processes that deliver status reports to the RMS base monitor process. RMS interprets the status reports to determine the state of the corresponding virtual object. When an object's state changes, RMS takes action according to the parameters set in the object. Each object may be associated with a detector.

Detectors are persistent: when RMS starts on a cluster node it starts the detectors for its configuration, which normally continue to run on that node until RMS is shut down. RMS has the ability to restart a detector if it terminates prematurely.

A complete list of the states that can be reported by detectors or displayed in the user interface is presented later in this chapter.

Scripts

Each object is linked with multiple scripts. These scripts are shell scripts. Normally, each script is described to interact with items in the operating system such as user applications or physical resources.

Some scripts are reactive: they define the actions that RMS should take in response to state changes. Other scripts are proactive: they define the actions that RMS should use to take control of individual objects. For instance, RMS would process one script when a resource reports a transition from the online state to the offline state; however, RMS would process a different script when it must force the resource to the offline state.

After performing their programmed tasks, the scripts exit and return an exit code to the base monitor.

A complete list of the scripts that may be specified for RMS objects is presented later in this chapter.

Object types

Most high-availability applications rely on a set of physical resources such as network interfaces, files systems, or virtual disks. RMS represents these as gResource objects. Most gResource objects have scripts that allow them to be brought online or taken offline.

Internally, RMS represents an actual application that runs in the operating system environment as a userApplication object. The set of gResource objects that represent the actual application's resource requirements are called its dependent resources. Bringing a userApplication object to the online state, along with all of its dependent resources, is called online processing. Taking a userApplication object to the offline state, along with all of its dependent resources, is called offline processing.

Each node that may run one or more applications in the high availability configuration is represented by a SysNode object. Like gResource objects and userApplication objects, SysNode objects can be brought online or taken offline, and they have an associated set of scripts. However, booting up or shutting down the corresponding physical machine requires more than simple script processing.

For the complete list of the RMS object types supported by the Wizard Tools, see "Appendix C Object types".

Shutdown Facility

The shutdown facility issues a forcible shutdown instruction to each node in the cluster. When it is necessary to take a SysNode object offline, RMS instructs the forcible shutdown of the target node by using the shutdown facility. RMS waits for successful completion of the node kill before switching any userApplication to another SysNode. This prevents any user application from running on two machines at the same time, which could lead to data corruption.

For more information about the Shutdown Facility, see "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide" for your operating system.