Checking Crash Dump
Check the crash dump directory for a crash dump created after the switchover occurred. The time at which a dump was collected can be determined from its time stamp, for example with the "ls(1)" command.
If a crash dump after the switchover is found
Save the crash dump.
If a crash dump after the switchover is not found
If the failed node is restartable, manually collect a crash dump before restarting it.
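As a rough illustration, the time stamp check could be scripted as shown below. This is a sketch only: the helper name dumps_after is hypothetical, /var/crash is assumed to be the crash dump directory (the actual location depends on the dump configuration of the node), and the -newermt test requires GNU find.

```shell
# Hypothetical helper (not part of PRIMECLUSTER).
# Lists the entries directly under a crash dump directory whose
# modification time is later than the given time.
#   $1: crash dump directory (assumed here to be /var/crash)
#   $2: switchover time, e.g. "2011-04-20 12:30:00"
dumps_after() {
    find "$1" -mindepth 1 -maxdepth 1 -newermt "$2" -print
}

# Example call:
#   dumps_after /var/crash "2011-04-20 12:30:00"
```

If the command prints nothing, no crash dump newer than the given time exists in that directory.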
Information
A crash dump is stored as a file on the node in which the error occurred.
If a guest OS is forcibly stopped by the shutdown facility, or if the guest OS panics, in an environment where the KVM virtual machine function is used, the crash dump is stored in the following location on the host OS.
/var/crash/<shutdown time of the guest OS (YYYYMMDDHHMMSS)>.<Domain name for the guest OS>.core
Example: node1 was forcibly stopped at 12:34:56 on April 20, 2011
/var/crash/20110420123456.node1.core
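The naming rule above can be expressed as a small shell function. This is a sketch for illustration; the function name guest_dump_path is hypothetical and is not a PRIMECLUSTER command.

```shell
# Hypothetical helper: builds the host OS path of a guest OS crash dump
# from the naming rule described above.
#   $1: shutdown time of the guest OS (YYYYMMDDHHMMSS)
#   $2: domain name of the guest OS
guest_dump_path() {
    printf '/var/crash/%s.%s.core\n' "$1" "$2"
}

# Example call:
#   guest_dump_path 20110420123456 node1
# prints /var/crash/20110420123456.node1.core
```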
Collecting Crash Dump
On the following models in a physical environment (with PRIMECLUSTER installed on a physical machine or on a host OS in a KVM environment), a crash dump caused by an OS panic cannot be collected.
RHEL8 environment in PRIMERGY RX1330M3
RHEL8 environment in PRIMERGY RX4770M3
RHEL8 environment in PRIMERGY TX1320M3
RHEL8 environment in PRIMERGY TX1330M3
PRIMERGY CX1430M1 environment
When manually collecting a crash dump, follow the procedure below. Otherwise, the node is shut down while the crash dump is being collected, and the collection ends before the dump is complete.
Stopping the shutdown facility
Execute the following command on all the nodes to stop the shutdown facility.
# sdtool -e
Collecting a crash dump
Collect a crash dump manually using one of the following methods.
Press the NMI button on the main device.
Press <Alt> + <SysRq> + <C> on the console.
Checking the LEFTCLUSTER state
Execute the following command on any node to make sure that the state of the node collecting a crash dump has become LEFTCLUSTER. If the node is not in the LEFTCLUSTER state, wait about 10 seconds and check it again.
# cftool -n
Note
If the time to detect the CF heartbeat timeout has been changed (that is, CLUSTER_TIMEOUT is set in the /etc/default/cluster.config file), wait for the heartbeat timeout period, and then make sure that the node is in the LEFTCLUSTER state.
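The repeated check could be automated along the following lines. This is a minimal sketch under stated assumptions: the helper names node_state and wait_leftcluster are hypothetical, and the output of "cftool -n" is assumed to be a table whose first column is the node name and whose third column is the state. Verify the actual output format on your system before relying on it.

```shell
# Hypothetical helper: reads "cftool -n" style output on standard input
# and prints the state of the given node.
# ASSUMPTION: column 1 is the node name and column 3 is the state.
node_state() {
    awk -v n="$1" '$1 == n { print $3 }'
}

# Hypothetical helper: polls until the node is in the LEFTCLUSTER state.
#   $1: node name
#   $2: seconds between checks (default 10)
#   $3: maximum number of checks (default 30)
# Returns 0 once LEFTCLUSTER is seen, 1 if the checks run out.
wait_leftcluster() {
    node=$1
    interval=${2:-10}
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        if [ "$(cftool -n | node_state "$node")" = "LEFTCLUSTER" ]; then
            return 0
        fi
        sleep "$interval"
        tries=$((tries - 1))
    done
    return 1
}

# Example call:
#   wait_leftcluster node2
```

If CLUSTER_TIMEOUT has been changed, the interval and the number of checks would need to be chosen so that the total wait covers the heartbeat timeout period.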
Recovering the node from the LEFTCLUSTER state
Refer to "5.2 Recovering from LEFTCLUSTER" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide" to recover the node from the LEFTCLUSTER state.
Starting the shutdown facility
Execute the following command on all the nodes except the node collecting a crash dump to start the shutdown facility.
# sdtool -b