If an error on the primary master server causes a switch to the secondary master server (in a replicated configuration), or an error on one side of a replicated network causes a switch to the other side, check the messages output to the system log (/var/log/messages).
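For reference, recent cluster-related entries can be extracted from the system log with standard commands. The following is only an illustrative sketch; the filter keywords are assumptions and should be adjusted to the messages actually output in your environment.

#!/bin/sh
# Illustrative sketch: show the most recent system log entries that mention
# the cluster (RMS) or the redundant line control function (hanet).
# The keyword list is an assumption; adjust it to your environment.
grep -iE 'rms|hanet|cluster' /var/log/messages | tail -n 50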
The status of the HA cluster and network replication can be checked with the following commands.
The HA cluster status can be checked by running the hvdisp command on both the primary master server and the secondary master server.
Refer to the online help for details of the hvdisp(1M) command.
The following examples show the status of the HA cluster, as seen from the primary and secondary master servers, after operation has switched to the secondary master server because of an error on the primary master server.
Because the status of app1 on the primary master server is "Faulted", the cause of the error must be removed from that server.
# hvdisp -a <Enter>
Local System: master1RMS
Configuration: /opt/SMAW/SMAWRrms/build/bdpp.us
Resource                   Type     HostName     State      StateDetails
-----------------------------------------------------------------------------
master1RMS                 SysNode               Online
master2RMS                 SysNode               Online
app1                       userApp               Faulted    Failed Over
Machine001_app1            andOp    master1RMS
Machine000_app1            andOp    master2RMS   Offline
ManageProgram000_Cmd_APP1  gRes                  Offline
Ipaddress000_Gls_APP1      gRes                  Offline
# hvdisp -a <Enter>
Local System: master2RMS
Configuration: /opt/SMAW/SMAWRrms/build/bdpp.us
Resource                   Type     HostName     State      StateDetails
-----------------------------------------------------------------------------
master1RMS                 SysNode               Online
master2RMS                 SysNode               Online
app1                       userApp               Online
app1                       userApp  master1RMS   Online
Machine001_app1            andOp    master2RMS   Online
Machine000_app1            andOp    master1RMS
ManageProgram000_Cmd_APP1  gRes                  Online
Ipaddress000_Gls_APP1      gRes                  Online
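As a supplementary illustration, the check for "Faulted" resources can be scripted. The following sketch is not part of the product; it simply searches the hvdisp -a output for the keyword "Faulted" and assumes the output format shown above.

#!/bin/sh
# Illustrative sketch: report any RMS resource whose State is "Faulted"
# in the hvdisp -a output. Matching lines are printed by grep.
if hvdisp -a | grep -w 'Faulted'; then
    echo "One or more resources are Faulted. Remove the cause of the error."
else
    echo "No Faulted resources were found."
fi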
The example below shows the recovered state after the cause of the error has been removed from the primary master server.
Failback to the primary master server is now possible, because the status of app1 on that server is "Offline".
# hvdisp -a <Enter>
Local System: master1RMS
Configuration: /opt/SMAW/SMAWRrms/build/bdpp.us
Resource                   Type     HostName     State      StateDetails
-----------------------------------------------------------------------------
master1RMS                 SysNode               Online
master2RMS                 SysNode               Online
app1                       userApp               Offline
app1                       userApp  master2RMS   Online
Machine001_app1            andOp    master2RMS
Machine000_app1            andOp    master1RMS   Offline
ManageProgram000_Cmd_APP1  gRes                  Offline
Ipaddress000_Gls_APP1      gRes                  Offline
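The failback precondition can be checked in a similar way. The sketch below is an illustration only; the field positions and the application name app1 are assumptions based on the output format shown above.

#!/bin/sh
# Illustrative sketch: on the primary master server, check whether app1 is
# "Offline" locally (the precondition for failback). The local app1 line has
# no HostName column, so it is selected by its field count (an assumption
# based on the hvdisp -a output format shown above).
STATE=$(hvdisp -a | awk '$1 == "app1" && $2 == "userApp" && NF == 3 { print $3; exit }')
if [ "$STATE" = "Offline" ]; then
    echo "app1 is Offline on this server; failback is possible."
else
    echo "app1 state on this server: ${STATE:-unknown}; failback is not yet possible."
fi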
The status of the primary and secondary master servers can be checked by executing the cftool command on the respective server.
Refer to the online help for details of the cftool(1M) command.
In the examples below, an error has occurred in the cluster interconnect (CIP) and the status of the remote node cannot be determined. In this situation, a cluster partition is assumed to have occurred. The first output was obtained on master1 and the second on master2.
# cftool -n <Enter>
Node     Number  State         Os       Cpu
master1       1  UP            Linux    EM64T
master2       2  LEFTCLUSTER   Linux    EM64T

# cftool -n <Enter>
Node     Number  State         Os       Cpu
master1       1  LEFTCLUSTER   Linux    EM64T
master2       2  UP            Linux    EM64T
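For reference, the node state check can also be performed mechanically. The following sketch is an illustration only and assumes the cftool -n output format shown above.

#!/bin/sh
# Illustrative sketch: print a warning for any node whose State column in
# the cftool -n output is not "UP" (for example LEFTCLUSTER or DOWN).
# The header line is skipped; column positions are assumptions.
cftool -n | awk 'NR > 1 && $3 != "UP" { print "WARNING: node " $1 " is in state " $3 }'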
Check the status of the replicated network using the dsphanet command. The status can be checked on both the master servers and the slave servers.
Refer to "7.4 dsphanet Command" under "Chapter 7 Command references" in the "PRIMECLUSTER Global Link Services Configuration and Administration Guide 4.3 Redundant Line Control Function" for information on the dsphanet command.
# /opt/FJSVhanet/usr/sbin/dsphanet <Enter>
[IPv4,Patrol / Virtual NIC]
 Name       Status   Mode CL   Device
+----------+--------+----+----+------------------------------------------------+
 sha0       Active   d    ON   eth5(ON),eth9(OFF)
 sha1       Active   p    OFF  sha0(ON)

[IPv6]
 Name       Status   Mode CL   Device
+----------+--------+----+----+------------------------------------------------+

# /opt/FJSVhanet/usr/sbin/dsphanet <Enter>
[IPv4,Patrol / Virtual NIC]
 Name       Status   Mode CL   Device
+----------+--------+----+----+------------------------------------------------+
 sha0       Active   e    OFF  eth5(ON),eth9(OFF)
 sha1       Active   p    OFF  sha0(ON)

[IPv6]
 Name       Status   Mode CL   Device
+----------+--------+----+----+------------------------------------------------+
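As a final illustration, the dsphanet output can be screened for virtual interfaces that are not in the "Active" state. The sketch below assumes the output format shown above. Note that in NIC switching mode the standby NIC is normally displayed as (OFF) in the Device column, so a device shown as OFF does not by itself indicate an error.

#!/bin/sh
# Illustrative sketch: confirm that every GLS virtual interface (shaX)
# reported by dsphanet is "Active", and show its Device column for reference.
# Column positions are assumptions based on the output format shown above.
/opt/FJSVhanet/usr/sbin/dsphanet | awk '
    $1 ~ /^sha/ {
        if ($2 == "Active")
            print $1 " is Active (devices: " $NF ")"
        else
            print "WARNING: " $1 " is " $2
    }'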