If an error is detected, an SNMP Trap can be sent to the server on which the SNMP manager is running.
This function allows the SNMP manager to monitor the cluster system.
For details on how to set this function, see "6.10.1 Setting Contents of a Cluster Application" in "PRIMECLUSTER Installation and Administration Guide (Linux)."
SNMP Trap uses SNMP version 2c. Each SNMP Trap that is sent contains the following information:
- OID corresponding to the resource type
- Message indicating the resource failure
SNMP Trap is sent with the snmptrap command, which is executed in the following form:
snmptrap -v 2c -c <Community name> <Destination host> .1.3.6.1.4.1.211.4.68.257 <OID> s <Message>
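As an illustration of how the placeholders are filled in, the following sketch prints a complete command line without sending anything. The function name print_trap_command, the community name "public", and the host "snmpmgr.example.com" are assumptions made for this example only. The argument order simply mirrors the form shown above, so check it against the snmptrap version installed in your environment before running the printed command.

```shell
#!/bin/sh
# Base OID taken from the command form above.
BASE_OID=".1.3.6.1.4.1.211.4.68.257"

# Print (do not run) the snmptrap command line so it can be reviewed.
# $1: community name, $2: destination host, $3: OID, $4: message
# The message is wrapped in double quotes so that spaces and < > are not
# interpreted by the shell. Messages that themselves contain double quotes
# are not handled by this sketch.
print_trap_command() {
    printf 'snmptrap -v 2c -c %s %s %s %s s "%s"\n' \
        "$1" "$2" "$BASE_OID" "$3" "$4"
}

# "public" and "snmpmgr.example.com" are placeholder values, not defaults.
print_trap_command public snmpmgr.example.com "${BASE_OID}.1" \
    "Resource <ManageProgram000_Cmd_APP1> transitioned to a Faulted state"
```

The message must be passed as a single argument, which is why it is quoted in the printed command.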
The following table shows the OIDs that are set in SNMP Trap.
No | OID | Description |
---|---|---|
1 | .1.3.6.1.4.1.211.4.68.257.1 | This OID is set when a resource failure occurs in an application or middleware. |
2 | .1.3.6.1.4.1.211.4.68.257.2 | This OID is set when a network failure occurs. It is set when a failure is detected in the Gls resource, or in the takeover network resource. |
3 | .1.3.6.1.4.1.211.4.68.257.3 | This OID is set when a file system failure is detected. It is set when a failure is detected in the Fsystem resource. |
4 | .1.3.6.1.4.1.211.4.68.257.4 | This OID is set when a disk failure is detected. It is set when a failure is detected in the Gds resource. |
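On the SNMP manager side, a trap handler can use the table above to turn a received OID into a failure category. The following is a minimal sketch of that lookup. The function name classify_trap_oid is hypothetical and is not part of PRIMECLUSTER or of any SNMP tool.

```shell
#!/bin/sh
# Map a trap OID to the failure category listed in the table above.
# The OID is accepted with or without the leading dot.
# Returns 1 and prints "unknown" for any OID that is not in the table.
classify_trap_oid() {
    oid="${1#.}"    # strip one leading dot, if present
    case "$oid" in
        1.3.6.1.4.1.211.4.68.257.1)
            echo "application or middleware resource failure" ;;
        1.3.6.1.4.1.211.4.68.257.2)
            echo "network failure (Gls or takeover network resource)" ;;
        1.3.6.1.4.1.211.4.68.257.3)
            echo "file system failure (Fsystem resource)" ;;
        1.3.6.1.4.1.211.4.68.257.4)
            echo "disk failure (Gds resource)" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}

classify_trap_oid ".1.3.6.1.4.1.211.4.68.257.4"
```

Stripping the leading dot first lets the same lookup work whether the manager reports OIDs in absolute form or not.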
"Message indicating the resource failure" contains the message that RMS outputs to switchlog when the resource failure occurs. The message indicates which failure was detected in which resource.
Example of a message:
2014-07-28 17:07:09.254:(DET, 7): ERROR: FAULT REASON: Resource <ManageProgram000_Cmd_APP1> transitioned to a Faulted state due to the resource unexpectedly becoming Offline.
You can also confirm the cause of the resource failure and the action to take by searching for the message in "PRIMECLUSTER Messages."
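If the receiving side needs only the name of the failed resource, it can be cut out of the message text. The following sketch assumes that the message has the same "Resource <name>" form as the example above. Other RMS messages may be formatted differently. The function name extract_resource is hypothetical.

```shell
#!/bin/sh
# Print the resource name enclosed in < > after the word "Resource".
# Prints nothing if the message does not contain that pattern.
extract_resource() {
    printf '%s\n' "$1" | sed -n 's/.*Resource <\([^>]*\)>.*/\1/p'
}

msg='2014-07-28 17:07:09.254:(DET, 7): ERROR: FAULT REASON: Resource <ManageProgram000_Cmd_APP1> transitioned to a Faulted state due to the resource unexpectedly becoming Offline.'
extract_resource "$msg"
```

With the example message above, this prints ManageProgram000_Cmd_APP1.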
Note
- If an error occurs in the communication path between the cluster node and the host on which the SNMP manager is running, resource failures cannot be notified by SNMP Trap.
- If multiple resource failures are detected in one userApplication at the same time and the userApplication is switched over, only one of the resource failures is notified. After removing the cause of the notified resource failure on the cluster node, check the switchlog file for other resource failures.
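One way to check the switchlog file for failures that were not notified is to list every resource that appears in a fault message. The following sketch assumes that each such entry contains the text "FAULT REASON: Resource <name>", as in the example message in this section. The function name list_fault_reasons is hypothetical, and the path of the switchlog file must be supplied by the caller because it depends on the RMS configuration.

```shell
#!/bin/sh
# List the unique resource names found in "FAULT REASON" entries.
# $1: path to a switchlog file (supplied by the caller)
list_fault_reasons() {
    grep 'FAULT REASON' "$1" |
        sed -n 's/.*Resource <\([^>]*\)>.*/\1/p' |
        sort -u
}

# Usage (the path is supplied as the first script argument):
#   sh this_script.sh <path to switchlog>
if [ -n "${1:-}" ] && [ -r "$1" ]; then
    list_fault_reasons "$1"
fi
```

Comparing this list with the resource named in the received SNMP Trap shows whether any other resource also failed.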