
2.2.1 Virtual Machine Function

The virtual machine function operates PRIMECLUSTER systems in virtualized environments for Oracle Solaris.

There are the following virtualized environments:

  • Oracle VM Server for SPARC

  • Oracle Solaris Zones

Note

When installing PRIMECLUSTER in a virtual machine environment, do not perform the following procedures:

  • Stopping the guest domain and the I/O root domain temporarily (suspend)

  • Restarting the guest domain and the I/O root domain from the temporarily stopped state (resume)

2.2.1.1 Cluster Systems in Oracle VM Server for SPARC Environment

2.2.1.1.1 Cluster System Configuration in Oracle VM Server for SPARC Environment

The following cluster system configurations are supported in an Oracle VM Server for SPARC environment. The monitoring capabilities and notes for each configuration are listed below; "Guest domain" and "Control domain" indicate whether PRIMECLUSTER is installed in that type of domain (Y: installed, N: not installed).

Cluster system between guest domains, within the same physical partition (guest domain: Y, control domain: N)

Monitoring:

- Cluster application errors on the guest domain or the I/O root domain

- OS errors on the guest domain or the I/O root domain

Notes: Because this environment comprises only one physical partition, all cluster nodes stop when a physical partition failure occurs. This configuration is therefore not suitable for production operation.

Cluster system between guest domains, among different physical partitions (guest domain: Y, control domain: Y)

Monitoring:

- Cluster application errors

- OS errors on the control domain, the guest domain, or the I/O root domain

- Hardware (network, shared disk, and access route) faults

- Physical partition errors

Notes: You must build the cluster system between cabinets.

Cluster system between control domains (guest domain: N, control domain: Y)

Monitoring:

- Cluster application errors on the control domain

- Control domain OS errors

- Control domain hardware (network, shared disk, and access route) faults

- Errors in the guest domain status (as displayed by the ldm list-domain command)

Notes: PRIMECLUSTER does not monitor the status of guest domains and applications.

Note

A tagged VLAN interface cannot be used for the cluster interconnect.

Cluster system between guest domains within the same physical partition

This configuration runs the cluster system on guest domains or I/O root domains within a single physical partition. It is useful for verifying the operation of cluster applications running on PRIMECLUSTER. The following types of error monitoring are performed in this configuration. This configuration is supported only on SPARC M10.

  • The cluster application error on the guest domain or the I/O root domain

  • The OS error on the guest domain or the I/O root domain

    Figure 2.1 Cluster system between guest domains within the same physical partition

Note

  • Because this environment comprises a single physical partition, all cluster nodes stop when a physical partition failure occurs, and operations come to a halt. This configuration is therefore not suitable for production operation.

  • Use the same type of domain throughout the cluster. A cluster cannot be configured with different types of domains, for example, between a guest domain and an I/O root domain, or between a control domain and an I/O root domain.

  • When using a virtual disk as a shared disk of a cluster between guest domains in PRIMECLUSTER, you need to specify a timeout option of the virtual disk.

    [Specifying a timeout option]

    If the timeout option is omitted or set to 0, an I/O error does not occur even if the service domain stops; the guest domain waits for the service domain to recover.
    If the timeout option is set to a value greater than 0, an I/O error occurs after the specified number of seconds has passed.

    The following explains how to specify a timeout option:

    Example 1: Setting the timeout to 15 seconds when assigning a virtual disk.

    # ldm add-vdisk timeout=15 vdisk0 disk0@primary-vds0 guest0

    Example 2: Setting the timeout to 15 seconds for an already assigned virtual disk.

    # ldm set-vdisk timeout=15 vdisk0 guest0

    For details on the timeout option, see the Oracle VM administration guide.
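    To check the timeout value that is actually set, you can display the virtual disk settings of the guest domain (a sample check using the domain name guest0 from the examples above; the output is abbreviated and its exact layout depends on the Oracle VM Server for SPARC version):

    # ldm list -o disk guest0
    NAME
    guest0

    DISK
        NAME    VOLUME              TOUT  ID  DEVICE   SERVER   MPGROUP
        vdisk0  disk0@primary-vds0  15    0   disk@0   primary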

Cluster system between guest domains among different physical partitions

This configuration enables the cluster system to operate between guest domains or I/O root domains (including I/O domains) among different physical partitions. In a cluster system that consists only of guest domains and I/O root domains, the nodes that constitute the cluster may enter the LEFTCLUSTER state when a physical partition failure occurs. Installing PRIMECLUSTER addresses this by automatically switching cluster applications on the guest domain or I/O root domain even when a physical partition failure occurs. The following types of error monitoring are performed in this configuration. This configuration is supported only on SPARC M10.

  • Cluster application errors on a control domain, a guest domain, or an I/O root domain

  • OS errors on a control domain, a guest domain, or an I/O root domain

  • Hardware (network, shared disk, and access route) faults

  • Physical partition errors

However, use this function only after careful consideration of system design, because it restricts other functions; for example, the RMS priority (ShutdownPriority) setting is disabled.

Note

  • When building the cluster system on multiple physical partitions within a single cabinet, operations stop if the cabinet fails. Therefore, you must build the cluster system between cabinets.

  • When creating the cluster application on the control domain, the guest domain, or the I/O root domain, do not specify the RMS priority (ShutdownPriority) attribute.

  • Set the survival priority of guest domains or I/O root domains so as to be the same order relation as that of the control domain.

  • When a failure of the control domain (including a cluster application error) is detected and the control domain cannot be forcibly stopped, all the guest domains or all the I/O domains within the failed physical partition are stopped regardless of whether a cluster exists. This is because the physical partition itself is forcibly stopped.

  • When a virtual I/O is set on the control domain, the guest domain within the failed physical partition may be stopped regardless of whether a cluster exists.

  • Use the same type of domain throughout the cluster. A cluster cannot be configured with different types of domains, for example, between a guest domain and an I/O root domain, or between a control domain and an I/O root domain.

  • When using a virtual disk as a shared disk of a cluster between guest domains in PRIMECLUSTER, you need to specify a timeout option of the virtual disk.

    [Specifying a timeout option]

    If the timeout option is omitted or set to 0, an I/O error does not occur even if the service domain stops; the guest domain waits for the service domain to recover.
    If the timeout option is set to a value greater than 0, an I/O error occurs after the specified number of seconds has passed.

    The following explains how to specify a timeout option:

    Example 1: Setting the timeout to 15 seconds when assigning a virtual disk.

    # ldm add-vdisk timeout=15 vdisk0 disk0@primary-vds0 guest0

    Example 2: Setting the timeout to 15 seconds for an already assigned virtual disk.

    # ldm set-vdisk timeout=15 vdisk0 guest0

    For details on the timeout option, see the Oracle VM administration guide.

Figure 2.2 Cluster System between guest domains among different physical partitions

Figure 2.3 Switching image when the physical partition failure occurred

Cluster system between control domains

This configuration applies PRIMECLUSTER on the control domain in an environment where the guest domain is configured, so that the cluster on the control domain can monitor the state of the guest domain.

In this configuration, operation can continue even when hardware (networks or disks) fails, because a failover of the control domain starts the guest domain on the other control domain. Applying PRIMECLUSTER to the control domain monitors the following failures that would disable applications on guest domains:

  • The cluster application error on the control domain

  • The control domain OS error

  • The control domain hardware (network, shared disk, and access route) faults

  • Errors in the guest domain status (the state displayed by the ldm list-domain command)

When a failure occurs, the guest domain is switched to the standby system, providing a highly reliable guest domain environment.
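The guest domain status referred to above is the STATE column in the output of the ldm list-domain command executed on the control domain. For example (sample output; the domain name guest0 and the values shown are illustrative only):

# ldm list-domain
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv-  UART    8     8G       0.2%  10d
guest0           active     -n----  5000    8     8G       0.1%  10d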

Figure 2.4 Cluster System between control domains

Note

  • PRIMECLUSTER does not monitor the status of guest domains and applications.

When using a cluster system between control domains, NIC switching mode is the only GLS redundant line control method supported.

Note

  • GLS must be installed in both control and guest domains.

  • The I/O used in a guest domain must be assigned only from virtual disks provided by a control domain.

  • Multiple guest domains on the same control domain cannot share a GDS shared class. When configuring multiple guest domains, create a separate shared class for each guest domain (see the sketch after this note).

  • When a failure of the control domain (including a cluster application error) is detected and the control domain cannot be forcibly stopped, all the guest domains or all the I/O domains within the failed physical partition are stopped regardless of whether a cluster exists. This is because the physical partition itself is forcibly stopped.

  • When the virtual I/O is set on the control domain, the guest domain within the failed physical partition may be stopped regardless of whether a cluster exists.
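The following is a minimal sketch of the shared class note above: each guest domain gets its own GDS shared class. ClassA, ClassB, the node names cont1 and cont2, and the disk names are hypothetical; for the actual procedure, see the "PRIMECLUSTER Global Disk Services Configuration and Administration Guide."

# sdxdisk -M -c ClassA -a type=shared,scope=cont1:cont2 -d c2t1d0=DiskA1
# sdxdisk -M -c ClassB -a type=shared,scope=cont1:cont2 -d c2t2d0=DiskB1

Because ClassA and ClassB are separate classes, the virtual disks of one guest domain never share a shared class with those of another.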

2.2.1.1.2 Migration for a Cluster System in Oracle VM Server for SPARC Environment

The following two types of the Migration function of Oracle VM Server for SPARC can be used for a cluster system:

  • Live Migration: migrating an active guest domain

  • Cold Migration: migrating a stopped guest domain

These functions can be used in combination with ServerView Resource Orchestrator Cloud Edition. (The PRIMECLUSTER patch [T007881SP-02 or later for Solaris 10, or T007882SP-02 or later for Solaris 11] must be applied.)

The Migration function of Oracle VM Server for SPARC can be used with the cluster system configurations described in "2.2.1.1.1 Cluster System Configuration in Oracle VM Server for SPARC Environment."
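For reference, the Migration function itself is invoked with the ldm migrate-domain command of Oracle VM Server for SPARC (a minimal sketch; guest0 and target-host are hypothetical names, and the migration procedure for a cluster system must follow the steps described in this manual):

# ldm migrate-domain guest0 target-host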

2.2.1.1.3 When Migrating a Cluster System in the Physical Environment to a Guest Domain in Oracle VM Server for SPARC Environment (Physical to Virtual)

You can migrate a cluster system running PRIMECLUSTER 4.2A00 or later in a physical environment to a guest domain (or an I/O root domain) in an Oracle VM Server for SPARC environment (Physical to Virtual: hereafter referred to as P2V). (Only SPARC M10 is supported.)


Figure 2.10 Cluster system before migration

Figure 2.11 Cluster system after migration

System requirements for migration
  • PRIMECLUSTER version

    PRIMECLUSTER 4.2A00 or later

  • Supported OS

    Solaris 10

  • GLS redundant line switching method

    NIC switching mode and GS/SURE linkage mode

  • File system in a shared disk

    UFS, ZFS, and GFS (only for Solaris 10)

Note

  • The disk size of GDS volumes in a shared disk must be the same in the migration source and migration destination.

  • You must migrate user data beforehand by using ETERNUS storage migration or a LUN-to-LUN copy function such as REC.

  • Use GDS and GLS functions in the cluster system on a guest domain after migration.

Use the same configuration as the migration source after the migration.

See

If a setting has to be changed after completing the cluster system migration, see the following to change the setting:

  • "Part 4 System Configuration Modification"

  • "PRIMECLUSTER Global Link Services Configuration and Administration Guide 4.3: Redundant Line Control Function"

  • "PRIMECLUSTER Global Disk Services Configuration and Administration Guide 4.3"

  • "PRIMECLUSTER Global File Services Configuration and Administration Guide 4.3"

Note

There are some important points when using a cluster system in an Oracle VM Server for SPARC Environment. For details, see "12.2 Precautions on Using Cluster Systems in Oracle VM Server for SPARC Environments."

2.2.1.2 Cluster System Operating in Oracle Solaris Zones Environment

In an Oracle Solaris Zones environment, applications in a non-global zone become inoperable when an error occurs in the global zone or the non-global zone.
Applying PRIMECLUSTER to the global zone and non-global zones provides status monitoring and a switchover function. This makes it possible to switch over to a standby system when an error occurs, achieving high reliability for the non-global zone.

You can build Oracle Solaris Zones environments on guest domains in Oracle VM Server for SPARC environments (only for SPARC M10) as well as in physical server environments.

In addition, when the global zone runs Solaris 10, existing systems running on Solaris 8 or Solaris 9 can also be run on Solaris 10 by migrating them to a non-global zone with Oracle Solaris Legacy Containers (OSLC). (*1)

PRIMECLUSTER provides status monitoring and a switchover function for non-global zones running on Solaris 8 or Solaris 9. This makes it possible to switch over to a standby system when an error occurs, achieving high reliability for non-global zones running on Solaris 8 or Solaris 9.

(*1) To check whether a middleware product you are using is available in a non-global zone with Oracle Solaris Legacy Containers, see the manual of that middleware product.
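The state of the global zone and non-global zones can be checked from the global zone with the zoneadm command (sample output; the zone name zone1 and the zone path are illustrative only):

# zoneadm list -cv
  ID NAME     STATUS     PATH            BRAND    IP
   0 global   running    /               native   shared
   1 zone1    running    /zones/zone1    native   shared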

Figure 2.12 Switchover When a Global Zone OS Error Occurs

Figure 2.13 Switchover When an Application Error Occurs in a Non-Global Zone

When using a cluster system of three or more nodes, standby servers can be consolidated by preparing one standby server for multiple operational servers. An example is shown below.

Figure 2.14 Switchover When a Global Zone OS Error Occurs in a Three-Node Zones Environment

When using a single-node cluster, the status of the OS and applications in the non-global zone is monitored. Availability is improved by automatically restarting the non-global zone or an application in the non-global zone for recovery when an error is detected. An example is shown in the following figure.

Figure 2.15 Operations When an OS Error Occurs in a Non-Global Zone in a Single-Node Cluster Zones Environment

Note

  • It is not possible to change the "cluster name" or "CF node name" in the non-global zone.

  • The following functions or commands cannot be used in the non-global zone:

    • Automatic configure

    • Shared disk device connection confirmation

    • Operator intervention

    • Fault resource identification

    • Patrol diagnosis

    • clsyncfile (distributes a file between cluster nodes)

  • Operations are not taken over between non-global zones running on the same global zone.