A cluster system in an AWS environment uses the asynchronous forcible stop method.
In this method, a node where an error occurs is forcibly stopped, and switching starts before the stop of that node is completed. This shortens the switching time and prevents the business from stopping in the LEFTCLUSTER state because a node stop is incomplete.
In addition, a function of the mirroring among servers blocks data synchronization communication at the time of switching. This prevents multiple nodes from accessing the business data simultaneously and protects the data.
When a cluster partition occurs, each node operates as follows according to its survival priority.
Node with high survival priority
Forcibly stopping a remote node by the panic instruction (in an AWS Nitro System environment) and the power-off instruction
Operating a local node without waiting for the stop of a forcibly stopped node
Node with low survival priority
Waiting until the forcible stop process performed by the node with high survival priority is completed
Checking the state of the instance of a local node and panicking the local node
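The per-node behavior above can be sketched as a small decision function. This is an illustrative model only, not the product's implementation: the Node class, the partition_actions function, the numeric survival_priority field (larger means higher), and the nitro flag are all hypothetical names. The sketch also assumes that the panic instruction is issued only when the remote instance runs on the AWS Nitro System, that the power-off instruction is always issued, and that the two nodes never have equal priority.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical model of a cluster node (not a product data structure)."""
    name: str
    survival_priority: int  # assumption: larger value means higher priority
    nitro: bool = True      # assumption: instance runs on the AWS Nitro System


def partition_actions(local: Node, remote: Node) -> List[str]:
    """Return the ordered actions the local node takes on a cluster partition."""
    if local.survival_priority == remote.survival_priority:
        # Equal priorities are outside the scope of this sketch.
        raise ValueError("sketch assumes distinct survival priorities")

    if local.survival_priority > remote.survival_priority:
        actions = []
        if remote.nitro:
            # Panic instruction is modeled as Nitro-only (assumption).
            actions.append(f"panic {remote.name}")
        actions.append(f"power-off {remote.name}")
        # Asynchronous: the local node does not wait for the stop to finish.
        actions.append(f"continue operating {local.name} without waiting")
        return actions

    # Low-priority side: wait, check the local instance, then panic itself.
    return [
        f"wait for forcible stop by {remote.name}",
        f"check instance state of {local.name}",
        f"panic {local.name}",
    ]
```

For example, with node1 at priority 2 and node2 at priority 1, partition_actions(node1, node2) yields the panic, power-off, and continue steps, while partition_actions(node2, node1) yields the wait, check, and panic steps.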
Note
When an OS panic occurs, the cluster node may be powered off while the memory dump is being output, and it may not be possible to collect a complete memory dump.
The slice preceding degenerated option of the mirroring among servers cannot be set.
After an OS panic, the OS is not restarted automatically and must be restarted manually.
A shared file system service cannot be used. The mirroring among servers must be used when taking over the volume data.