A server for which Auto-Recovery is enabled is automatically switched over to its spare server when Resource Coordinator VE both detects a hardware failure on the server and determines that its physical OS (or VM host) has stopped. These two conditions are detailed below.
Detecting hardware failures from servers
A hardware failure can be detected from specific SNMP traps sent to the Admin Server from either the ServerView Agent or the server management unit. Alternatively, Resource Coordinator VE can detect a failure by periodically polling the status of each managed server.
Detectable hardware failures
CPU faults
Memory errors
Fan failures
Temperature abnormalities
Detecting that a physical OS (or VM host) has stopped
A physical OS (or VM host) is considered to have stopped abnormally when the following conditions are both met:
An abnormal server status is obtained from a server management unit, and it is not possible to communicate with either the ServerView Agent or the Resource Coordinator VE Agent.
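The two trigger conditions can be summarized as a simple boolean check. The following Python sketch only illustrates the logic described above. The ServerState class, its field names, and both functions are hypothetical and are not part of any Resource Coordinator VE interface. It also assumes that "not possible to communicate with either agent" means that neither agent can be reached.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    """Hypothetical snapshot of what the Admin Server knows about one managed server."""
    hardware_failure_detected: bool        # learned from an SNMP trap or from status polling
    management_unit_status_abnormal: bool  # status reported by the server management unit
    serverview_agent_reachable: bool
    rcve_agent_reachable: bool             # Resource Coordinator VE Agent


def os_has_stopped(state: ServerState) -> bool:
    # Assumption: the OS counts as stopped only when the management unit reports
    # an abnormal status AND neither agent can be reached.
    return (
        state.management_unit_status_abnormal
        and not state.serverview_agent_reachable
        and not state.rcve_agent_reachable
    )


def switchover_conditions_met(state: ServerState) -> bool:
    # Both conditions must hold: a hardware failure and a stopped OS (or VM host).
    return state.hardware_failure_detected and os_has_stopped(state)
```

In this sketch, a server that reports a hardware failure while one of its agents still answers does not satisfy the conditions, because its operating system is not considered stopped.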
Note
Because only a limited range of hardware failures can be detected on rack-mount and tower servers, Auto-Recovery cannot be enabled for such servers. Instead, it is recommended to perform manual switchovers whenever necessary.
Auto-Recovery is not triggered on servers that are in maintenance mode.
Even if a hardware failure is detected, Auto-Recovery will not be triggered while the operating system on the target server is still responding. In such cases, shutting down (or restarting) the server will temporarily stop the operating system, which satisfies the conditions for Auto-Recovery and triggers an automatic switchover. To prevent this switchover, set the server to maintenance mode before shutting it down (or restarting it).
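The shutdown scenario in this note can be walked through with a small Python sketch. It is an illustration only, assuming the rules stated in this section. The function name and its parameters are hypothetical and do not correspond to any Resource Coordinator VE command or API.

```python
def auto_recovery_triggered(
    auto_recovery_enabled: bool,
    in_maintenance_mode: bool,
    hardware_failure_detected: bool,
    os_stopped: bool,
) -> bool:
    """Hypothetical decision function combining the rules in this section."""
    # Auto-Recovery must be enabled, and maintenance mode suppresses it entirely.
    if not auto_recovery_enabled or in_maintenance_mode:
        return False
    # Both trigger conditions must hold at the same time.
    return hardware_failure_detected and os_stopped


# A hardware failure was detected, but the OS is still running: no switchover.
before_shutdown = auto_recovery_triggered(True, False, True, False)

# The administrator shuts the server down: the OS stops, so a switchover follows.
after_shutdown = auto_recovery_triggered(True, False, True, True)

# Same shutdown, but maintenance mode was set first: no switchover.
after_shutdown_in_maintenance = auto_recovery_triggered(True, True, True, True)

print(before_shutdown, after_shutdown, after_shutdown_in_maintenance)
```

Under these assumptions the three cases evaluate to False, True, and False, which matches the behavior the note describes.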