The self-checking function can be enabled as follows.
Enable the self-checking function.
# /opt/FJSVhanet/usr/sbin/hanetparam -e yes
Check the changed parameters.
# /opt/FJSVhanet/usr/sbin/hanetparam print
[Fast switching]
Line monitor interval(w) :5
Line monitor message output (m) :0
Cluster failover (l) :5
Cluster failover in unnormality (c):OFF
Line status message output (s) :OFF
[NIC switching]
Standby patrol interval(p) :15
Standby patrol message output(o) :3
[Virtual NIC]
LinkDown detection time (q) :0
LinkUp detection time (r) :1
Link monitor starting delay (g) :5
[Common Setting]
Hostname resolution by file(h) :YES
Self-checking function(e) :YES
Reboot the system. After reboot, the self-checking function will be enabled.
The self-checking function can be disabled as follows.
Disable the self-checking function.
# /opt/FJSVhanet/usr/sbin/hanetparam -e no
Check the changed parameters.
# /opt/FJSVhanet/usr/sbin/hanetparam print
[Fast switching]
Line monitor interval(w) :5
Line monitor message output (m) :0
Cluster failover (l) :5
Cluster failover in unnormality (c):OFF
Line status message output (s) :OFF
[NIC switching]
Standby patrol interval(p) :15
Standby patrol message output(o) :3
[Virtual NIC]
LinkDown detection time (q) :0
LinkUp detection time (r) :1
Link monitor starting delay (g) :5
[Common Setting]
Hostname resolution by file(h) :YES
Reboot the system. After reboot, the self-checking function will be disabled.
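Either procedure can be verified from a script by reading the "Self-checking function(e)" line in the output of hanetparam print. The following is a minimal sketch, not part of the product: the function name self_check_state is hypothetical, and it assumes the output format shown above. It prints nothing if the line is absent.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with GLS): reads "hanetparam print"
# output on stdin and prints the value of the self-checking parameter,
# for example YES. Prints nothing if the parameter line is not present.
self_check_state() {
    sed -n 's/^[[:space:]]*Self-checking function(e)[[:space:]]*:[[:space:]]*\([A-Za-z]*\).*$/\1/p'
}

# Usage on a host where GLS is installed:
#   /opt/FJSVhanet/usr/sbin/hanetparam print | self_check_state
```

Because the function only parses text, it can be tried against saved command output before it is used on a live system.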
The following describes how the monitoring is performed with the self-checking function. The virtual driver and control daemon are monitored periodically.
The monitoring targets are as follows. Note that a system-wide hang or error status cannot be detected.
Monitoring target | Error type | Error detection method |
---|---|---|
Driver | Hung-up | No response from the virtual driver for 60 seconds |
Driver | I/O error | Information is not received from the driver five times in a row |
Daemon | Hung-up | There is no response from the control daemon for 300 seconds |
Daemon | I/O error | Information is not received from the control daemon five times in a row |
Daemon | Stopped process detection | There is no control daemon process |
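The "five times in a row" rule for I/O errors can be illustrated with a small simulation. This is only a sketch of the counting rule under assumed input, not the GLS implementation: the function name first_io_error and the "ok"/"fail" input format are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of the "five times in a row" rule.
# Reads one poll result per line ("ok" or "fail") on stdin and prints the
# 1-based poll number at which the fifth consecutive failure occurs.
# Prints nothing if five consecutive failures never occur.
first_io_error() {
    awk -v limit=5 '
        $1 == "fail" { run++ }        # extend the current failure run
        $1 != "fail" { run = 0 }      # any success resets the run
        run == limit { print NR; exit }
    '
}
```

A single successful poll resets the count, so isolated failures do not accumulate toward an error.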
If an error has been detected, a message similar to the following will be output to the system log.
An error occurred in the virtual driver
The following message is output and the monitoring function stops. Collect troubleshooting information, and then reboot the system.
ERROR: 97427: sha driver error has been detected. code=xxx
xxx: error type (hung-up or I/O error)
An error occurred in the control daemon
The following message is output. After that, if there is no response from the control daemon for 300 seconds, the monitoring function will stop.
ERROR: 97627: hanetctld error has been detected. code=xxx
xxx: error type (hung-up, I/O error, or stopped process)
However, if the control daemon recovers, the following message is output and monitoring continues.
INFO: 97727: hanetctld recovery has been detected.
If the above message is not output, reboot the system after collecting troubleshooting information.
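To decide whether a reboot is needed, the system log can be scanned for the three message numbers above. The following is a minimal sketch: the function name last_selfcheck_event is hypothetical, and the match relies only on the message text shown in this section. The system log location depends on the operating system and is not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: reads system log lines on stdin and prints the most
# recent self-checking event found:
#   driver_error      (message 97427)
#   daemon_error      (message 97627)
#   daemon_recovered  (message 97727)
#   none              (no self-checking message present)
last_selfcheck_event() {
    awk '
        /ERROR: 97427:/ { state = "driver_error" }
        /ERROR: 97627:/ { state = "daemon_error" }
        /INFO: 97727:/  { state = "daemon_recovered" }
        END { print (state == "" ? "none" : state) }
    '
}

# Usage (replace the path with the system log file of your OS):
#   last_selfcheck_event < /path/to/system/log
```

A result of daemon_error with no later recovery corresponds to the case in which troubleshooting information should be collected and the system rebooted.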
Note that placing a script in the following location allows the script to be executed when an error is detected. For more details, see "3.12.2 Setting user command execution function."
/etc/opt/FJSVhanet/script/system/monitor
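As a starting point, a script placed at this location could simply record that it was called. The following is a sketch under assumptions: the arguments passed to the script are not described in this section, so they are recorded verbatim without interpretation, and the function name format_event and the tag gls-monitor are hypothetical. See "3.12.2 Setting user command execution function" for the actual calling convention.

```shell
#!/bin/sh
# Hypothetical body for /etc/opt/FJSVhanet/script/system/monitor.
# The meaning of the arguments is an unknown here, so they are only
# recorded as received.
format_event() {
    printf 'GLS self-check event: args=[%s]\n' "$*"
}

# When installed as the monitor script, forward the record to the system log:
#   logger -t gls-monitor "$(format_event "$@")"
```

Keeping the script short and non-blocking is a cautious design choice, since it runs at a time when the system may already be under load.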
Information
Rebooting the system is recommended after the monitoring function has stopped.
If a hung-up state or an I/O error was detected because of temporary system load, the self-checking function can be restored by restarting it as follows.
# /opt/FJSVhanet/etc/sbin/hanetmond
If the self-checking function fails to restart, collect materials for examination and then contact field engineers to report the error message.
In this case, an error may have occurred or system resources may be low. To resolve these problems, reboot the system.