Systemwalker Operation Manager  User's Guide
FUJITSU Software

13.1 Continuing Job Operations at Schedule Server System Down

The continuous execution mode switching command (jmmode command) defines whether network jobs and distributed execution jobs continue to run while the schedule server is down. This section describes how to set the mode with the jmmode command. Perform the following operations in the same way on all of the linked schedule servers and execution servers.

Procedure

  1. Stopping the services or the daemons

    Terminate the following services/daemons.

    • Jobscheduler service/daemon

    • Job Execution Control service/daemon

  2. Executing the continuous execution mode switching command (jmmode command)

    Execute the jmmode command with the continue operand specified to enable the continuous execution mode for network and distributed execution jobs. For details on the jmmode command, see the "Shared Commands" section of the Systemwalker Operation Manager Reference Guide.

  3. Starting the services or the daemons

    Start the following services/daemons.

    • Job Execution Control service/daemon

    • Jobscheduler service/daemon
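The procedure above can be sketched as an ordered plan. This is a minimal illustration only: it prints the steps rather than executing them, the `Step` type and `build_switch_plan` function are hypothetical names, and the command line `jmmode continue` is an assumed form of "the jmmode command with the continue operand"; the exact syntax is given in the Reference Guide. The stop and start orders simply follow the order in which the lists above name the services.

```python
from dataclasses import dataclass
from typing import List

# Names taken from the procedure, in the order the stop step lists them.
# How each service or daemon is stopped and started is platform-specific
# and is deliberately not modeled here.
SERVICES_STOP_ORDER = ["Jobscheduler", "Job Execution Control"]


@dataclass(frozen=True)
class Step:
    action: str  # "stop", "run", or "start"
    target: str  # service/daemon name, or a command line


def build_switch_plan(operand: str = "continue") -> List[Step]:
    """Return the ordered steps for switching the continuous execution mode."""
    if operand not in ("continue", "cancel"):
        raise ValueError(f"unsupported operand: {operand!r}")
    plan = [Step("stop", name) for name in SERVICES_STOP_ORDER]
    # Assumption: the operand is passed as a plain argument. Check the
    # Reference Guide for the exact jmmode syntax before running anything.
    plan.append(Step("run", f"jmmode {operand}"))
    # The start step lists the services in the reverse of the stop order.
    plan += [Step("start", name) for name in reversed(SERVICES_STOP_ORDER)]
    return plan


if __name__ == "__main__":
    for number, step in enumerate(build_switch_plan(), start=1):
        print(f"{number}. {step.action}: {step.target}")
```

Keeping the plan as data makes it easy to review the same sequence for every linked schedule server and execution server before anything is changed.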

Note

  • On Windows systems, only network jobs and distributed execution jobs can continue when the schedule server system goes down. Jobs that are executing on the schedule server itself cannot continue.

    On UNIX systems, only network jobs and distributed execution jobs in job nets with the Job Execution Control attribute can continue when the schedule server system goes down. Jobs in job nets with any other attribute, and jobs that are executing on the schedule server itself, cannot continue.

  • If a JCL file is used, job execution cannot continue when the schedule server system goes down.

  • If a Jobscheduler command (other than the jobschprint command) is used to register job nets or groups, the schedule file is not duplicated. The schedule file may be corrupted if the schedule server system goes down while any of these commands is running.

  • Use the same continuous execution mode on both the schedule server and all of the execution servers to which execution requests are sent. The following problems may occur when the modes differ:

    • When the schedule server system goes down, "Executing" may be displayed even though the jobs have terminated

    • Jobs may output an error message and terminate abnormally

    • Network job execution may be duplicated

  • Continuous job execution is possible only in environments where updates are reliably written to the physical disk. In environments where updates are held only in the cache, information on jobs that were running when the system failed does not remain on the disk, so continuous job execution is not guaranteed.

  • In the UNIX version, continuous job execution applies to an abnormal stop of the Job Execution Control daemon. In the Windows version, it applies to both an abnormal stop and a normal stop of the service.

    This difference is intended to handle the following discrepancy in stop operations when failover is performed in a cluster system, in order to enable continuous job execution in the Windows version as well:

    • UNIX version: A stop of the Job Execution Control daemon is regarded as an abnormal stop.

    • Windows version: A stop of the Job Execution Control service is regarded as a normal stop.
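The eligibility rules in the notes above can be summarized as a small decision function. This is a sketch under stated assumptions: the `Job` type, its field names, and the platform and stop-kind strings are hypothetical labels chosen for illustration and are not part of the product. It assumes the continuous execution mode has already been set to continue on every server involved.

```python
from dataclasses import dataclass

PLATFORMS = ("windows", "unix")


@dataclass(frozen=True)
class Job:
    kind: str  # "network", "distributed", or "local" (runs on the schedule server)
    uses_jcl_file: bool = False
    # Relevant on UNIX only: True when the job net has the
    # Job Execution Control attribute.
    job_execution_control_attr: bool = True


def can_continue(job: Job, platform: str) -> bool:
    """Return True if the notes allow the job to continue after a
    schedule server system down (continue mode assumed)."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    if job.kind not in ("network", "distributed"):
        return False  # jobs executing on the schedule server cannot continue
    if job.uses_jcl_file:
        return False  # JCL file jobs cannot continue
    if platform == "unix" and not job.job_execution_control_attr:
        return False  # UNIX requires the Job Execution Control attribute
    return True


def stop_is_covered(platform: str, stop_kind: str) -> bool:
    """Return True if continuous execution applies to this kind of stop
    of the Job Execution Control service/daemon."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    if platform == "windows":
        return stop_kind in ("abnormal", "normal")
    return stop_kind == "abnormal"
```

For example, `can_continue(Job("network", job_execution_control_attr=False), "unix")` is false, while the same job on `"windows"` is eligible because the attribute condition is stated for UNIX only.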

Note

Continuous execution mode for network jobs and distributed execution jobs

By default, the continuous execution mode is set to cancel network jobs and distributed execution jobs when the schedule server system goes down. In that case, the jobs are treated as interrupted (completion code = 239) when the schedule server starts up after the system down.
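Because cancel is the default, a server on which the mode was never set counts as cancel when checking that all servers use the same mode. The following sketch illustrates that check; the server names are hypothetical, and how the current mode is read from a real server is not covered here.

```python
from typing import Dict, Optional

DEFAULT_MODE = "cancel"  # default mode stated in the text
INTERRUPTED_CODE = 239   # completion code reported for interrupted jobs


def effective_mode(configured: Optional[str]) -> str:
    """Treat an unset mode (None) as the default."""
    return configured if configured is not None else DEFAULT_MODE


def modes_are_consistent(servers: Dict[str, Optional[str]]) -> bool:
    """Return True if every server ends up with the same effective mode."""
    return len({effective_mode(mode) for mode in servers.values()}) <= 1


def expected_code_after_restart(configured: Optional[str]) -> Optional[int]:
    """In cancel mode, jobs cut off by the system down are reported as
    interrupted at the next schedule server startup. For continue mode
    this sketch makes no claim and returns None."""
    if effective_mode(configured) == "cancel":
        return INTERRUPTED_CODE
    return None


if __name__ == "__main__":
    # Hypothetical server names; None means the mode was never set.
    servers = {"sched01": "continue", "exec01": "continue", "exec02": None}
    print(modes_are_consistent(servers))
```

In this example `exec02` falls back to cancel, so the check reports an inconsistency with the two servers set to continue.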

Note

Duplication of schedule files

When schedule information file redundancy is specified with the jmmode command, write operations to the backup file will be synchronized as well as write operations to the schedule information file. Synchronization of write operations to the backup file helps prevent inconsistencies from occurring in the schedule information file, but it also adversely affects the scheduling and execution performance of groups, job nets and jobs. Before deciding to use this function, carry out thorough performance validation tests to determine whether job nets are running correctly according to schedule. Even when redundancy is specified, schedule information file discrepancies can still occur in some situations, such as when a power failure occurs while the operating system is writing to the hard disk. Consider making regular backups to guard against such situations.

Degradation of scheduling and execution performance can be avoided by not implementing schedule information file redundancy with the jmmode command. However, this will increase the risk of schedule information file inconsistencies due to events such as unexpected power outages, so regular backups should be made to facilitate recovery in the event of such problems.

Note

Policy distribution [Windows version]

Suppose that policy information (the execution parameter information for the Jobscheduler) is extracted from a server environment where the continuous execution mode for network and distributed execution jobs has been enabled, and is distributed and applied to another server environment where the continuous execution mode has not been enabled. In this case, after distributing the policy information, execute the jmmode command with the continue operand specified on the distribution server to enable the continuous execution mode.

Conversely, suppose that policy information (the execution parameter information for the Jobscheduler) is extracted from a server environment where the continuous execution mode for network and distributed execution jobs has not been enabled, and is distributed and applied to another server environment where the continuous execution mode has been enabled. In this case, after distributing the policy information, execute the jmmode command with the cancel operand specified on the distribution server to set the continuous execution mode to cancel.
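The two policy distribution cases above reduce to one rule: when the modes differ, run jmmode with the operand that corresponds to the mode of the environment the policy was extracted from. A minimal sketch of that rule follows; the function and parameter names are hypothetical, and the case where both environments already use the same mode is not described in the text, so the sketch returns None for it.

```python
from typing import Optional

MODES = ("continue", "cancel")


def operand_after_distribution(source_mode: str, target_mode: str) -> Optional[str]:
    """Return the jmmode operand to specify after distributing the policy,
    or None when the two environments already use the same mode.

    source_mode: mode of the environment the policy was extracted from.
    target_mode: mode of the environment the policy is applied to.
    """
    for mode in (source_mode, target_mode):
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
    if source_mode == target_mode:
        return None  # case not covered by the text
    # The operand follows the source environment's mode in both cases.
    return source_mode
```

For example, `operand_after_distribution("continue", "cancel")` returns `"continue"`, which corresponds to the first case described above.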