GLS provides highly reliable transfer routes by using the control daemon and virtual driver.
The self-checking function monitors their states periodically and notifies users if an error occurs. The function is enabled automatically.
Figure 2.46 Self-checking function
Note
The self-checking function does not detect system-wide errors or hangs. Use the cluster to detect these.
The following describes how the monitoring is performed with the self-checking function. The virtual driver and control daemon are monitored periodically.
Figure 2.47 Error detection of the self-checking function
The monitoring targets are as follows. A system-wide hang or error status cannot be detected.
Monitoring target | Error type | Error detection method |
---|---|---|
Driver | Hung-up | No response from the virtual driver for 15 seconds |
Driver | I/O error | Information is not received from the driver five times in a row |
Daemon | Hung-up | There is no response from the control daemon for 300 seconds |
Daemon | I/O error | Information is not received from the control daemon five times in a row |
Daemon | Stopped process detection | There is no control daemon process |
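The detection rules in the table can be read as a small decision procedure. The following is a minimal sketch for illustration only, not the actual GLS implementation. The function name `classify_state`, its arguments, the order in which the checks are applied, and the use of "greater than or equal to" at each threshold are all assumptions; only the values 15 seconds, 300 seconds, and five times come from the table.

```shell
#!/bin/sh
# Thresholds taken from the table above.
DRIVER_HANG_SECS=15
DAEMON_HANG_SECS=300
MAX_MISSED_REPORTS=5

# classify_state TARGET SILENT_SECS MISSED_REPORTS [PROCESS_ALIVE]   (hypothetical helper)
#   TARGET          "driver" or "daemon"
#   SILENT_SECS     seconds since the target last responded
#   MISSED_REPORTS  consecutive polls that returned no information
#   PROCESS_ALIVE   1 if the daemon process exists, 0 if not (ignored for the driver)
# Prints "stopped process", "hungup", "I/O error", or "ok".
classify_state() {
    target=$1
    silent=$2
    missed=$3
    alive=${4:-1}

    if [ "$target" = "daemon" ]; then
        limit=$DAEMON_HANG_SECS
        # Assumed precedence: a missing process is reported before anything else.
        if [ "$alive" -eq 0 ]; then
            echo "stopped process"
            return
        fi
    else
        limit=$DRIVER_HANG_SECS
    fi

    if [ "$silent" -ge "$limit" ]; then
        echo "hungup"
    elif [ "$missed" -ge "$MAX_MISSED_REPORTS" ]; then
        echo "I/O error"
    else
        echo "ok"
    fi
}
```

For example, `classify_state driver 20 0` prints `hungup`, while `classify_state daemon 20 0 1` prints `ok` because the control daemon is allowed 300 seconds before it is treated as hung.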
If an error has been detected, the following messages will be output to the system log. After that, the monitoring function stops. To restart monitoring, reboot the system after collecting troubleshooting information.
An error occurred in the virtual driver
The following message is output and the monitoring function stops.
ERROR: 97427: sha driver error has been detected. code=xxx
xxx: error type (hungup or I/O error)
An error occurred in the control daemon
The following message is output. After that, if there is no response from the control daemon for 300 seconds, the monitoring function will stop.
ERROR: 97627: hanetctld error has been detected. code=xxx
xxx: error type (hungup, I/O error, or stopped process)
However, if the control daemon recovers, the following message is output and monitoring continues.
ERROR: 97727: hanetctld recovery has been detected.
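Because all three messages carry a fixed message number, they can be picked out of the system log by pattern. The following is a minimal sketch, assuming the messages appear in the log exactly as shown above; the function name `summarize_selfcheck` is hypothetical, and the log path in the usage comment is an assumption that depends on the syslog configuration.

```shell
#!/bin/sh
# summarize_selfcheck (hypothetical helper): read system-log lines on stdin
# and label the self-checking messages by their message number.
summarize_selfcheck() {
    while IFS= read -r line; do
        case $line in
            *"ERROR: 97427:"*) echo "driver error (monitoring stopped): $line" ;;
            *"ERROR: 97627:"*) echo "daemon error: $line" ;;
            *"ERROR: 97727:"*) echo "daemon recovered (monitoring continues): $line" ;;
        esac
    done
}

# Usage (log path is an assumption; adjust it to the local syslog settings):
#   summarize_selfcheck < /var/adm/messages
```

Lines that do not contain one of the three message numbers are ignored, so the output lists only self-checking events in the order in which they were logged.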
Note that placing a script in the following location allows the script to be executed when an error is detected. For more details, see "3.6.10 Setting User command execution function."
/etc/opt/FJSVhanet/script/system/monitor
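As a starting point, a user script might do nothing more than record that it was called. The following is a minimal sketch under stated assumptions: the log file path, the `GLS_HOOK_LOG` variable, and the `record_event` function are hypothetical, and the script does not interpret its arguments because their meaning is defined in "3.6.10 Setting User command execution function," not here.

```shell
#!/bin/sh
# Hypothetical log destination; override with GLS_HOOK_LOG if required.
LOG_FILE="${GLS_HOOK_LOG:-/var/tmp/gls_monitor_hook.log}"

# record_event (hypothetical helper): append a timestamp and the arguments
# exactly as received, without assuming what each argument means.
record_event() {
    printf '%s self-check hook invoked: %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}

# In a real script, the last line would pass the script's arguments through:
#   record_event "$@"
```

Keeping the script this small makes it unlikely to block or fail while an error is being handled; any notification or recovery action can be added once the argument format has been confirmed.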
Information
Rebooting the system is recommended after the monitoring function has stopped.
If a hung-up error or an I/O error was detected because of a temporary system load, the self-checking function can be restored by restarting it as follows.
# svcadm restart fjsvhanet-poll
If the self-checking function fails to restart, collect troubleshooting information, and then contact field engineers to report the error message.
In this case, an error may have occurred or system resources may be low. To resolve these problems, reboot the system.
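The restart can be followed by a check of the service state. The following is a minimal sketch, assuming that `fjsvhanet-poll` is managed by SMF as the restart command above suggests; the function name `restart_selfcheck` and the default wait time are hypothetical. An `online` state only shows that SMF considers the service to be running, so the system log should still be checked for new error messages.

```shell
#!/bin/sh
SERVICE=fjsvhanet-poll   # service name from the restart command above

# restart_selfcheck [WAIT_SECS]   (hypothetical helper)
# Restarts the service, waits briefly, then prints the SMF state.
# Returns 0 if the state is "online", 1 otherwise.
restart_selfcheck() {
    wait_secs=${1:-5}    # assumed settle time; not a documented value
    svcadm restart "$SERVICE" || return 1
    sleep "$wait_secs"
    state=$(svcs -H -o state "$SERVICE" 2>/dev/null)
    echo "$SERVICE state: ${state:-unknown}"
    [ "$state" = "online" ]
}

# Usage:
#   restart_selfcheck || echo "restart failed; collect troubleshooting information"
```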