PRIMECLUSTER Concepts Guide 4.3
FUJITSU Software

1.7.2 Oracle Solaris

This section describes the availability of cluster systems in the following Oracle Solaris environments.

1.7.2.1 Oracle Solaris (Physical environment and Oracle VM Server for SPARC environment)

This section describes the availability of cluster systems in the physical environment and the Oracle VM Server for SPARC environment on Oracle Solaris.

Table 1.2 Availability according to each cluster system configuration

| Monitoring target | Physical environment | Oracle VM Server for SPARC environment: cluster system between guest domains among different physical partitions | Oracle VM Server for SPARC environment: cluster system between guest domains within the same physical partition | Oracle VM Server for SPARC environment: cluster system between control domains |
|---|---|---|---|---|
| (1) Physical partition | Y | Y | N | Y |
| (2) Shared disk and path of disk access | Y | Y | N | Y |
| (3) Public LAN | Y | Y | N | Y |
| (4) OS (physical and control domains) | Y | Y | Y*1 | Y |
| (5) OS (guest domain) | - | Y | Y | Y*2 |
| (6) Service (cluster application) | Y | Y | Y | Y*3 |

Service continuity when an error occurs. Y: Available, N: Unavailable

*1 The service can continue because the OS in the guest domain remains available even when an OS error occurs in the control domain.

*2 The OS in the guest domain cannot be monitored directly. Instead, PRIMECLUSTER in the control domain monitors the state of the guest domain (the state displayed by ldm list-domain). When that state indicates an error, the service can continue by switching the OS in the guest domain to the standby system.

*3 The service (cluster application) on the control domain can be monitored, but the service on the guest domain cannot.
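Footnote *2 relies on the guest domain state reported by ldm list-domain. The following is only a minimal sketch of how such a state check could look. It assumes the pipe-delimited, parseable output style of `ldm list-domain -p`; the sample text, the domain names, and the rule that any state other than "active" counts as an error are all hypothetical. The guide does not say how PRIMECLUSTER itself evaluates the state.

```python
def parse_domain_states(parseable_output):
    """Return {domain name: state} from `ldm list-domain -p`-style text."""
    states = {}
    for line in parseable_output.splitlines():
        fields = line.strip().split("|")
        if fields[0] != "DOMAIN":
            continue  # skip the VERSION header and any other record types
        record = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
        if "name" in record and "state" in record:
            states[record["name"]] = record["state"]
    return states


def domains_in_error(states, healthy=("active",)):
    """Assumption: any state outside `healthy` is treated as an error state."""
    return sorted(name for name, state in states.items() if state not in healthy)


# Hypothetical sample text, not captured from a real system.
SAMPLE = """VERSION 1.6
DOMAIN|name=primary|state=active|flags=normal,control,vio-service
DOMAIN|name=guest1|state=active|flags=normal
DOMAIN|name=guest2|state=inactive|flags=
"""

states = parse_domain_states(SAMPLE)
print(domains_in_error(states))
```

Keeping the set of healthy states as a parameter makes the error rule easy to adjust once the states that actually trigger a switchover are known.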

Figure 1.13 Physical environment

Figure 1.14 Oracle VM Server for SPARC environment

How errors are detected for each monitoring target

  1. Physical partition

    Asynchronous monitoring, linked with the system monitoring function of the server, immediately detects a panic or a reset triggered by an error in the CPU, memory, or other components, and the service is switched to the standby system.

  2. Shared disk and path of disk access

    In combination with the volume management function (GDS), the system detects a disk access failure or a disk access path failure (monitored by the Gds resource). The service is switched to the standby system when the disk cannot be accessed or when an error occurs in the entire disk access path.

  3. Public LAN

    In combination with the network multiplexing function (GLS), the system detects a failure of a network adapter or of a path in the public LAN (monitored by the Gls resource). The service is switched to the standby system when an error occurs in the entire network communication path.

  4. OS (physical and control domains)

    A panic or a reset of the OS is immediately detected by asynchronous monitoring, and the service is switched to the standby system. A hang-up of the OS in the control domain is detected by fixed-cycle monitoring of the cluster interconnect (LAN), and the service is switched to the standby system.

    For a cluster system between guest domains within the same physical partition, an OS error in the control domain cannot be detected because there is only one control domain.

  5. OS (guest domain)

    A panic or a reset of the OS is immediately detected by asynchronous monitoring, and the service is switched to the standby system. A hang-up of the OS in the guest domain is detected by fixed-cycle monitoring of the cluster interconnect (LAN), and the service is switched to the standby system.

    For a cluster system between control domains, a service error in a guest domain cannot be detected.

  6. Service (cluster application)

    When a resource error of the cluster application occurs, the service is switched to the standby system.
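The list above distinguishes two detection styles: asynchronous monitoring, which reports a panic or reset immediately, and fixed-cycle monitoring of the cluster interconnect, which notices a hang-up only after heartbeats stop arriving. The following is a minimal sketch of that difference, not PRIMECLUSTER's implementation. The class name, method names, node names, and the 10-second timeout are all hypothetical.

```python
class ClusterMonitor:
    """Toy model combining asynchronous reports with fixed-cycle heartbeats."""

    def __init__(self, heartbeat_timeout):
        self.heartbeat_timeout = heartbeat_timeout  # seconds (hypothetical value)
        self.last_heartbeat = {}    # node -> time of the last heartbeat received
        self.reported_down = set()  # nodes reported by asynchronous monitoring

    def record_heartbeat(self, node, now):
        """Fixed-cycle path: a heartbeat arrived over the interconnect."""
        self.last_heartbeat[node] = now

    def report_panic_or_reset(self, node):
        """Asynchronous path: the failure is known at once, with no timeout."""
        self.reported_down.add(node)

    def failed_nodes(self, now):
        """Nodes reported down, plus nodes silent for longer than the timeout."""
        silent = {n for n, t in self.last_heartbeat.items()
                  if now - t > self.heartbeat_timeout}
        return sorted(silent | self.reported_down)


monitor = ClusterMonitor(heartbeat_timeout=10.0)
monitor.record_heartbeat("nodeA", now=0.0)
monitor.record_heartbeat("nodeB", now=0.0)

monitor.report_panic_or_reset("nodeA")  # panic: visible immediately
print(monitor.failed_nodes(now=1.0))    # ['nodeA']

# nodeB hangs and sends nothing more; it is noticed only after the timeout.
print(monitor.failed_nodes(now=5.0))    # ['nodeA']
print(monitor.failed_nodes(now=12.0))   # ['nodeA', 'nodeB']
```

In the sketch, a hang-up takes up to the timeout to detect, whereas a reported panic or reset appears in the failed set at once. This matches the word "immediately" used above for asynchronous monitoring.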

1.7.2.2 Oracle Solaris (Oracle Solaris Kernel Zones environment)

This section describes the availability of the following cluster systems in Oracle Solaris Kernel Zones environments.

Figure 1.15 Oracle Solaris Kernel Zones environment (among different physical partitions)

Figure 1.16 Oracle Solaris Kernel Zones environment (within the same physical partition)

How errors are detected for each monitoring target

  1. Physical partition

    Asynchronous monitoring, linked with the system monitoring function of the server, immediately detects a panic or a reset triggered by an error in the CPU, memory, or other components, and the service is switched to the standby system.

  2. Shared disk and path of disk access

    In combination with the volume management function (GDS), the system detects a disk access failure or a disk access path failure (monitored by the Gds resource). The service is switched to the standby system when the disk cannot be accessed or when an error occurs in the entire disk access path.

  3. Public LAN

    In combination with the network multiplexing function (GLS), the system detects a failure of a network adapter or of a path in the public LAN (monitored by the Gls resource). The service is switched to the standby system when an error occurs in the entire network communication path.

  4. OS (physical and control domains)

    A panic or a reset of the OS is immediately detected by asynchronous monitoring, and the service is switched to the standby system. Additionally, a hang-up of the OS is detected by fixed-cycle monitoring of the cluster interconnect (LAN), and the service is switched to the standby system.

    For a cluster system between Kernel Zones within the same physical partition, an OS error in the control domain cannot be detected because there is only one control domain.

  5. OS (guest domain)

    A panic or a reset of the OS is immediately detected by asynchronous monitoring, and the service is switched to the standby system. Additionally, a hang-up of the OS is detected by fixed-cycle monitoring of the cluster interconnect (LAN), and the service is switched to the standby system.

    For a cluster system between Kernel Zones within the same guest domain, an OS error in the guest domain cannot be detected because there is only one guest domain.

  6. OS (Kernel Zones)

    A panic, a reset, or a hang-up of the OS is detected by fixed-cycle monitoring of the cluster interconnect (LAN), and the service is switched to the standby system.

  7. Service (cluster application)

    When a resource error of the cluster application occurs, the service is switched to the standby system.
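Items 2 and 3 above state that the service is switched only when an error occurs in the entire disk access path or the entire network communication path. One way to read this is that a switchover is needed only once every redundant path has failed, because GDS or GLS absorbs the loss of a single path. The sketch below encodes that reading. It is an interpretation, and the function name and path names are hypothetical.

```python
def should_switch_over(path_healthy):
    """Decide on a switchover from a mapping of path name -> health flag.

    Assumption: "the entire communication path" means every redundant path
    has failed; a single failed path is absorbed by the remaining ones.
    """
    if not path_healthy:
        raise ValueError("at least one path must be monitored")
    return not any(path_healthy.values())


# Hypothetical path names for a two-adapter public LAN.
print(should_switch_over({"net0": True, "net1": True}))    # False
print(should_switch_over({"net0": False, "net1": True}))   # False
print(should_switch_over({"net0": False, "net1": False}))  # True
```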

1.7.2.3 Oracle Solaris (Oracle Solaris Zones environment)

This section describes the availability of cluster systems in the following Oracle Solaris Zones environments.

Table 1.4 Availability according to each cluster system configuration

| Monitoring target | Cold-standby | Warm-standby | Single-node cluster |
|---|---|---|---|
| (1) Physical partition | Y | Y | - |
| (2) Shared disk and path of disk access | Y | Y | - |
| (3) Public LAN | Y | Y | - |
| (4) OS (global zone) | Y | Y | - |
| (5) OS (non-global zone) | Y | Y | Y*1 |
| (6) Service (cluster application) | Y | Y | Y*2 |

Service continuity when an error occurs. Y: Available, N: Unavailable

*1 When an error is detected, the service can be continued by restarting the non-global zone.

*2 When an error is detected, the service can be continued by restarting the service (cluster application).

Figure 1.17 Oracle Solaris Zones environment

How errors are detected for each monitoring target

  1. Physical partition

    Asynchronous monitoring, linked with the system monitoring function of the server, immediately detects a panic or a reset triggered by an error in the CPU, memory, or other components, and the service is switched to the standby system.

  2. Shared disk and path of disk access

    In combination with the volume management function (GDS), the system detects a disk access failure or a disk access path failure (monitored by the Gds resource). The service is switched to the standby system when the disk cannot be accessed or when an error occurs in the entire disk access path.

  3. Public LAN

    In combination with the network multiplexing function (GLS), the system detects a failure of a network adapter or of a path in the public LAN (monitored by the Gls resource). The service is switched to the standby system when an error occurs in the entire network communication path.

  4. OS (global zone)

    A panic or a reset of the OS is immediately detected by asynchronous monitoring, and the service is switched to the standby system. A hang-up of the OS in the global zone is detected by fixed-cycle monitoring of the cluster interconnect (LAN), and the service is switched to the standby system.

  5. OS (non-global zone)

    • The system checks whether login to the non-global zone (using the zlogin command) is possible. If login fails, the service is switched to the standby system.

    • For a single-node cluster, the non-global zone is restarted instead.

  6. Service (cluster application)

    When a resource error of the cluster application occurs, the service is switched to the standby system.

    For a single-node cluster, the non-global zone is restarted.

    The following resources can be registered:

    • Fsystem resource: resource for a switchover file system created on the shared disk (ZFS or UFS can be used)

    • Procedure resource: resource for FUJITSU middleware such as Interstage or Systemwalker

    • ISV resource: resource for an ISV product such as Oracle or NetWorker (provided by a Wizard product)

    • Process monitoring resource: resource for an individual process such as a user application

    • Cmdline resource: resource for a user's own script or command
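Item 5 above describes checking the non-global zone by attempting a login with the zlogin command. The following is a minimal sketch of such a probe and of the resulting recovery choice. It assumes that running a trivial command through `zlogin <zone> <command>` is an acceptable stand-in for the login check; the guide does not specify the exact check PRIMECLUSTER performs. The function names, the zone name, the timeout, and the action labels are hypothetical.

```python
import subprocess


def zone_login_ok(zone, run=subprocess.run, timeout=30):
    """Return True if a trivial command can be run in `zone` via zlogin."""
    try:
        result = run(["zlogin", zone, "/usr/bin/true"],
                     capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # zlogin missing, not runnable, or hung
    return result.returncode == 0


def recovery_action(zone, single_node, run=subprocess.run):
    """Map the probe result to the action described in the text above."""
    if zone_login_ok(zone, run=run):
        return "none"
    # Single-node cluster: no standby exists, so the zone is restarted.
    return "restart-zone" if single_node else "switch-to-standby"


def fake_failed_login(cmd, **kwargs):
    """Stand-in runner so the sketch runs without a Solaris host."""
    return subprocess.CompletedProcess(cmd, returncode=1)


print(recovery_action("zone1", single_node=False, run=fake_failed_login))
print(recovery_action("zone1", single_node=True, run=fake_failed_login))
```

Passing the command runner as a parameter keeps the decision logic testable on any machine. On a real global zone the default `subprocess.run` would invoke zlogin directly, which normally requires sufficient privileges.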