6.6.1.7.1 What Is the Process Monitoring Function?

PRIMECLUSTER Installation and Administration Guide 4.1 (for Solaris(TM) Operating System)

Overview of the process monitoring function

The process monitoring function monitors the live state of processes. The main features are as follows:

Changes in the live state of a process can be monitored.
(Setup is simple: the user does not need to prepare commands for monitoring the live state of a process.)
RMS is notified immediately of any change in the live state of a process, which enables high-speed switchover.
If a process terminates abnormally because of an unexpected error, that process is automatically restarted.

A relationship diagram of the process monitoring function and RMS is shown below. The process monitoring function consists of three components: the "clmonproc" command, the Process Monitoring Daemon (prmd), and the Detector (hvdet_prmd).

"clmonproc" command
The "clmonproc" command is executed from the Online or Offline script. The command requests prmd to start a specified process and to stop live monitoring.
prmd daemon
prmd is a daemon process that starts a process and stops live monitoring according to requests received from the "clmonproc" command. If the live state of a process being monitored changes, prmd notifies hvdet_prmd immediately.
hvdet_prmd daemon
After receiving information about a change in the live state of a process from prmd, hvdet_prmd notifies the RMS Base Monitor (BM) of the change.
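
The notification chain described above can be pictured with a small Python model. This is only an illustrative sketch: the class and method names are hypothetical stand-ins and are not PRIMECLUSTER interfaces. It also assumes that starting a process counts as a state change that is reported, which the text does not state explicitly.

```python
from typing import Dict, List, Tuple


class BaseMonitorStub:
    """Hypothetical stand-in for the RMS Base Monitor (BM); it only records events."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, bool]] = []

    def notify(self, name: str, alive: bool) -> None:
        self.events.append((name, alive))


class DetectorStub:
    """Hypothetical stand-in for hvdet_prmd: forwards each change to the BM."""

    def __init__(self, bm: BaseMonitorStub) -> None:
        self.bm = bm

    def on_state_change(self, name: str, alive: bool) -> None:
        self.bm.notify(name, alive)


class MonitorDaemonStub:
    """Hypothetical stand-in for prmd: tracks live state and reports changes only."""

    def __init__(self, detector: DetectorStub) -> None:
        self.detector = detector
        self.state: Dict[str, bool] = {}

    def start(self, name: str) -> None:
        # Models a start request issued by clmonproc from the Online script.
        self._set(name, True)

    def stop_monitoring(self, name: str) -> None:
        # Models a stop request issued by clmonproc from the Offline script.
        self.state.pop(name, None)

    def process_exited(self, name: str) -> None:
        # Only processes that are still monitored produce a notification.
        if name in self.state:
            self._set(name, False)

    def _set(self, name: str, alive: bool) -> None:
        if self.state.get(name) != alive:
            self.state[name] = alive
            self.detector.on_state_change(name, alive)
```

In this model, a termination that occurs after monitoring has been stopped produces no notification, which mirrors the role of the stop request from the Offline script.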

Benefits of using the process monitoring function

Described below are the benefits of using the process monitoring function.

Easy setup

Because prmd itself checks whether the monitored processes exist, the user does not need to create a check command (a command that determines whether a monitored process exists) for each process. The user can therefore monitor the existence of processes with little work.

High-speed detection of abnormal process termination

If the process monitoring function is not used, abnormal termination of a monitored process is detected by having a Cmdline resource execute the aforementioned check command periodically. Detection can therefore be delayed by up to one execution interval of the check command. When the process monitoring function is used, prmd detects abnormal termination of monitored processes through signal processing, so abnormal termination is detected faster than when a check command is executed periodically.
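
The difference between the two detection styles can be sketched in Python. This is a minimal illustration and not prmd's implementation: the helper names are hypothetical, the monitored process is a throwaway child of the script, and a blocking wait on the child stands in for the signal processing mentioned above.

```python
import subprocess
import sys
import time


def detect_by_polling(proc, interval):
    """Notice termination with a periodic check, as a check command would.

    Returns (exit_code, number_of_checks_performed).
    """
    checks = 0
    while True:
        time.sleep(interval)          # nothing is noticed between checks
        checks += 1
        if proc.poll() is not None:   # poll() returns None while the child runs
            return proc.returncode, checks


def detect_by_waiting(proc):
    """Block until the operating system reports that the child has ended."""
    return proc.wait()


def spawn_failing_child(delay, exit_code):
    """Hypothetical monitored process: sleeps, then exits with exit_code."""
    script = f"import sys, time; time.sleep({delay}); sys.exit({exit_code})"
    return subprocess.Popen([sys.executable, "-c", script])


if __name__ == "__main__":
    print(detect_by_polling(spawn_failing_child(0.2, 3), interval=0.5))
    print(detect_by_waiting(spawn_failing_child(0.2, 3)))
```

With polling, the termination is noticed only at the next check, so the delay grows with the interval. With the blocking wait, the caller resumes as soon as the child ends.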

Automatic restart of any process that terminates abnormally

If any process terminates abnormally because of an unexpected error, the process monitoring function restarts that process automatically.
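
As a rough sketch of automatic restart, the following Python function reruns a command whenever it exits abnormally. Two points are assumptions made for illustration: a nonzero exit status is treated as abnormal termination, and the number of restarts is capped by a hypothetical `max_restarts` parameter. The text does not say how prmd decides either point.

```python
import subprocess
import sys


def run_with_restart(argv, max_restarts):
    """Run argv and restart it after each abnormal exit.

    A nonzero exit status is treated as abnormal (assumption).
    Returns (final_exit_code, restarts_performed).
    """
    restarts = 0
    while True:
        code = subprocess.run(argv).returncode
        # Stop on a normal exit, or when the restart budget is used up.
        if code == 0 or restarts >= max_restarts:
            return code, restarts
        restarts += 1


if __name__ == "__main__":
    # Hypothetical monitored process that always fails with status 1.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_with_restart(failing, max_restarts=2))
```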

Reduction of CPU resource consumption

To shorten the time required to detect abnormal termination of a monitored process without using the process monitoring function, you must shorten the execution interval of the check command. However, because the check command is then generated and executed more frequently, it may consume a large amount of CPU resources. A command such as "ps" is generally used as the check command, and because "ps" itself uses a relatively large amount of CPU resources, the consumption may become even more pronounced.

When the process monitoring function is used, prmd uses signal processing to detect abnormal termination of the monitored process. No CPU-intensive activity, such as issuing a check command periodically, takes place.

With the method that uses Cmdline resources, the number of check commands increases in proportion to the number of RMS objects, because a check command is executed for each RMS object defined for a process to be monitored. Therefore, if many check commands are executed periodically, a large amount of CPU resources may be used.
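
The proportional growth can be made concrete with a little arithmetic. The function below is only a sketch; the object counts and intervals in the usage example are illustrative values, not PRIMECLUSTER defaults.

```python
def check_runs_per_hour(rms_objects: int, interval_seconds: int) -> int:
    """Check-command executions per hour when each RMS object is polled.

    Each object runs one check per interval, so the total grows linearly
    with the number of objects and inversely with the interval.
    """
    if rms_objects < 0 or interval_seconds <= 0:
        raise ValueError("rms_objects must be >= 0 and interval_seconds > 0")
    return rms_objects * (3600 // interval_seconds)


if __name__ == "__main__":
    for objects, interval in [(10, 10), (20, 10), (10, 5)]:
        print(objects, interval, check_runs_per_hour(objects, interval))
```

Doubling the number of RMS objects or halving the interval doubles the number of check commands that must be generated and executed.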

When the process monitoring function is used, a single prmd always monitors the live state of all monitored processes. Therefore, the CPU resources used by prmd do not grow in proportion to the number of processes to be monitored.
