PRIMECLUSTER supports standby operation of the NetWorker server. The NetWorker-specific process monitoring facility (detector) provided by PRIMECLUSTER Wizard for NetWorker automatically detects failures and initiates recovery, which improves the availability of the backup system.
Running the NetWorker server on a cluster system allows the tape unit and the NetWorker management database (index and NetWorker settings) to be shared. If a failover occurs, the standby node takes over the backup operation.
Backup using the logical node is also enabled.
Note
The NetWorker storage node serves as a NetWorker client.
Backup takeover is made possible by the NetWorker-specific RMS (Reliant Monitor Services) configuration script provided by PRIMECLUSTER Wizard for NetWorker, together with the process monitoring facility that monitors the NetWorker server on PRIMECLUSTER.
When a failover occurs during a backup operation, the backup is automatically switched to the standby node. The NetWorker server starts automatically on the standby node (the next online node), and the backup is restarted without the NetWorker backup definitions having to be set up again.
In addition, data that was backed up by the previously online node can be recovered on the next online node.
NetWorker errors are detected and corrected automatically by the NetWorker-specific RMS (Reliant Monitor Services) configuration script provided by PRIMECLUSTER Wizard for NetWorker, working with the process monitoring facility that monitors the NetWorker server on PRIMECLUSTER.
This automatic detection and recovery function reduces the downtime of the backup application.
RMS configuration script
RMS (Reliant Monitor Services) runs the RMS configuration script to start or stop NetWorker. When an error occurs during NetWorker operation, RMS reruns the configuration script through the AutoRecover function to restore NetWorker operation. During NetWorker startup, the script also checks the index of the NetWorker management database; if an error is found in the index, it is corrected automatically.
NetWorker process monitoring facility
The PRIMECLUSTER NetWorker process monitoring facility monitors the NetWorker processes. When it detects a NetWorker error, it notifies RMS of the error. RMS first attempts to restart NetWorker once; if the restart fails, RMS brings the Online node to the Faulted state. The process monitoring facility continues monitoring NetWorker even after it has notified RMS of the error. When NetWorker is restarted by the RMS configuration script, the process monitoring facility starts monitoring NetWorker again.
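The interaction between the process monitoring facility and RMS described above can be sketched as a small state model. This is an illustration only, not the PRIMECLUSTER implementation: the names `NodeState`, `handle_error`, and `restart_networker` are hypothetical, and the only behavior taken from the text is that RMS tries one restart and faults the node if that restart fails.

```python
from enum import Enum
from typing import Callable


class NodeState(Enum):
    ONLINE = "Online"
    FAULTED = "Faulted"


def handle_error(restart_networker: Callable[[], bool],
                 max_restarts: int = 1) -> NodeState:
    """Model the reaction of RMS to an error reported by the monitoring facility.

    restart_networker is a hypothetical callback that returns True when
    NetWorker comes back up. It is tried max_restarts times (once, as in the
    text); if every attempt fails, the node is treated as Faulted.
    """
    for _ in range(max_restarts):
        if restart_networker():
            return NodeState.ONLINE  # restart worked; monitoring starts again
    return NodeState.FAULTED  # restart failed; the node becomes Faulted


# Usage: a successful restart keeps the node Online, a failed one faults it.
state_after_good_restart = handle_error(lambda: True)
state_after_bad_restart = handle_error(lambda: False)
```

In this sketch a Faulted result stands for the point at which the standby node would take over; the failover itself is not modeled.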
Details on each error and its recovery are given below:
Failure recovery
The process monitoring facility monitors NetWorker and notifies RMS when it detects an error. If the AutoRecover attribute is enabled, RMS runs the NetWorker startup script after receiving the error notification.
See
To determine whether the AutoRecover function should be enabled, refer to the PRIMECLUSTER manuals.
Recovery of the NetWorker management database index
The RMS configuration script uses NetWorker commands to check the index and determine whether the index of the NetWorker management database needs to be corrected. If correction is necessary, the script corrects the index of the affected client, again by executing NetWorker commands.
An index error can be caused by the following:
A failure occurs in NetWorker during backup
A running node is powered off during backup, and a failover occurs
Whether the index is corrected automatically is controlled by the recovery mode parameter provided by PRIMECLUSTER Wizard for NetWorker, which you can set to Yes or No.
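The index check and the effect of the recovery mode parameter can be pictured with the following sketch. It is a simplified model under stated assumptions, not the actual RMS configuration script: `recover_indexes`, `index_is_damaged`, and `repair_index` are hypothetical names standing in for the NetWorker commands the script runs, and reporting a damaged index for manual correction when the recovery mode is No is an assumption made for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def recover_indexes(
    clients: Iterable[str],
    index_is_damaged: Callable[[str], bool],
    repair_index: Callable[[str], None],
    recovery_mode: bool,
) -> Tuple[List[str], List[str]]:
    """Check each client's index and repair it if recovery mode allows.

    Returns (repaired, needs_manual_correction). Both callbacks are
    hypothetical placeholders for the NetWorker commands used by the script.
    """
    repaired: List[str] = []
    needs_manual: List[str] = []
    for client in clients:
        if not index_is_damaged(client):
            continue  # index is consistent; nothing to do
        if recovery_mode:
            repair_index(client)  # correct only the affected client's index
            repaired.append(client)
        else:
            needs_manual.append(client)  # assumption: left for manual correction
    return repaired, needs_manual


# Usage: one of two clients has a damaged index (hypothetical client names).
damaged = {"client-b"}
auto_result = recover_indexes(
    ["client-a", "client-b"], lambda c: c in damaged, lambda c: None, True)
manual_result = recover_indexes(
    ["client-a", "client-b"], lambda c: c in damaged, lambda c: None, False)
```

With the recovery mode set to Yes, only the damaged client is repaired; with No, the same client is merely reported.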
See
For details on how to set the recovery mode, see the "PRIMECLUSTER Wizard for NetWorker Configuration and Administration Guide."
For details on how to correct index errors manually, see the "PRIMECLUSTER Wizard for NetWorker Configuration and Administration Guide."