The virtual machine function operates PRIMECLUSTER systems in virtualized environments for Oracle Solaris.
The following virtualized environments are supported:
Oracle VM Server for SPARC
Oracle Solaris Zones environment
Note
When installing PRIMECLUSTER in a virtual machine environment, do not perform the following procedures:
Stopping the guest domain or the I/O root domain temporarily (suspend)
Restarting the guest domain or the I/O root domain from the temporarily stopped state (resume)
The following cluster system configurations are supported in an Oracle VM Server for SPARC Environment:
Cluster system between guest domains within the same physical partition (Supported only for SPARC M10)
Cluster system between guest domains among different physical partitions (Supported only for SPARC M10)
Cluster system between control domains
The following table shows, for each cluster system configuration, whether PRIMECLUSTER is installed in the guest domain and the control domain, the errors that are monitored, and notes.
| Cluster system configuration | Guest domain | Control domain | Monitoring | Notes |
|---|---|---|---|---|
| Cluster system between guest domains, within the same physical partition | Y | N | - The cluster application error on the guest domain or the I/O root domain<br>- The OS error on the guest domain or the I/O root domain | Since this environment comprises only one physical partition, all cluster nodes stop when a physical partition failure occurs. Therefore, this configuration is not suitable for production use. |
| Cluster system between guest domains, among different physical partitions | Y | Y | - The cluster application error<br>- The OS error on the control domain, the guest domain, or the I/O root domain<br>- Hardware (network, shared disk, and the route) faults<br>- The physical partition error | You must build the cluster system between cabinets. |
| Cluster system between control domains | N | Y | - The cluster application error on the control domain<br>- The control domain OS error<br>- Control domain hardware (network, shared disk, and the route) faults<br>- The error of the guest domain status (displayed by the ldm list-domain command) | PRIMECLUSTER does not monitor the statuses of guest domains and applications. |
Note
A tagged VLAN interface cannot be used for the cluster interconnect.
This configuration enables the cluster system to operate on guest domains or I/O root domains within a single physical partition. It is useful for verifying the operation of cluster applications running on PRIMECLUSTER. The following types of error monitoring are performed in this configuration. This configuration is supported only on SPARC M10.
The cluster application error on the guest domain or the I/O root domain
The OS error on the guest domain or the I/O root domain
Note
Since this environment comprises a single physical partition, all cluster nodes stop when a physical partition failure occurs, and transactions come to a halt. Therefore, this configuration is not suitable for production use.
Use the same domain type for all nodes in the cluster. A cluster cannot be configured with different types of domains, for example, between a guest domain and an I/O root domain, or between a control domain and an I/O root domain.
When using a virtual disk as a shared disk of a cluster between guest domains in PRIMECLUSTER, you need to specify the timeout option of the virtual disk.
If the timeout option is omitted or set to 0, no I/O error occurs even if the service domain stops; I/O waits until the service domain recovers.
If the timeout option is set to a value greater than 0, an I/O error occurs after the specified number of seconds has passed.
The following explains how to specify a timeout option:
Example 1: Specifying a timeout of 15 seconds when assigning a virtual disk.
# ldm add-vdisk timeout=15 vdisk0 disk0@primary-vds0 guest0
Example 2: Specifying a timeout of 15 seconds for an already assigned virtual disk.
# ldm set-vdisk timeout=15 vdisk0 guest0
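For reference, the timeout set for an assigned virtual disk can be checked from the control domain; a hedged sketch reusing the guest0 domain from the examples above (the exact output layout depends on the Oracle VM Server for SPARC version; the timeout value appears in the disk section of the output).
# ldm list -o disk guest0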
For details on the timeout option, see the Oracle VM Server for SPARC Administration Guide.
This configuration enables the cluster system to operate between guest domains or I/O root domains (including I/O domains) among different physical partitions. In a cluster system that consists only of guest domains and I/O root domains, the cluster nodes may enter the LEFTCLUSTER state when a physical partition failure occurs. To deal with this, installing PRIMECLUSTER enables cluster applications on the guest domain or I/O root domain to be switched over automatically even when a physical partition failure occurs. The following types of error monitoring are performed in this configuration. This configuration is supported only on SPARC M10.
Cluster application errors on a control domain, a guest domain, or an I/O root domain
OS errors on a control domain, a guest domain, or an I/O root domain
Hardware (network, shared disk and the route) faults
Physical partition errors
However, use this function after careful consideration of the system design, because it restricts other functions; for example, the RMS priority (ShutdownPriority) setting is disabled.
Note
When building the cluster system on multiple physical partitions within a single cabinet, transactions stop if the cabinet fails. Therefore, you must build the cluster system between cabinets.
When creating the cluster application on the control domain, the guest domain, or the I/O root domain, do not specify the RMS priority (ShutdownPriority) attribute.
Set the survival priority of guest domains or I/O root domains so that it is in the same order relation as that of the control domains.
When a failure of the control domain (including a cluster application error) is detected and the control domain cannot be forcibly stopped, all guest domains or all I/O domains within the failed physical partition are stopped regardless of whether a cluster exists, because the physical partition itself is forcibly stopped.
When virtual I/O is configured on the control domain, the guest domain within the failed physical partition may be stopped regardless of whether a cluster exists.
Use the same domain type for all nodes in the cluster. A cluster cannot be configured with different types of domains, for example, between a guest domain and an I/O root domain, or between a control domain and an I/O root domain.
When using a virtual disk as a shared disk of a cluster between guest domains in PRIMECLUSTER, you need to specify the timeout option of the virtual disk.
If the timeout option is omitted or set to 0, no I/O error occurs even if the service domain stops; I/O waits until the service domain recovers.
If the timeout option is set to a value greater than 0, an I/O error occurs after the specified number of seconds has passed.
The following explains how to specify a timeout option:
Example 1: Specifying a timeout of 15 seconds when assigning a virtual disk.
# ldm add-vdisk timeout=15 vdisk0 disk0@primary-vds0 guest0
Example 2: Specifying a timeout of 15 seconds for an already assigned virtual disk.
# ldm set-vdisk timeout=15 vdisk0 guest0
For details on the timeout option, see the Oracle VM Server for SPARC Administration Guide.
In this configuration, PRIMECLUSTER is applied to the control domain in an environment where guest domains are configured, so that the cluster on the control domain can monitor the status of the guest domains.
Even when hardware (networks or disks) fails, operation can continue because failover of the control domain starts the guest domain on the other control domain. Applying PRIMECLUSTER to the control domain monitors the following failures that disable applications on guest domains:
The cluster application error on the control domain
The control domain OS error
The control domain hardware (network, shared disk and the route) fault
Errors of the guest domain status (the status displayed by the ldm list-domain command)
When a failure occurs, the guest domain is switched over to the standby system, providing a highly reliable guest domain environment.
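For reference, the guest domain status referred to above is the state reported by the ldm list-domain command on the control domain; a sketch with a hypothetical guest domain guest0 (the output varies with your configuration).
# ldm list-domain
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv-  UART    16    8G       0.2%  12d
guest0           active     -n----  5000    8     4G       0.1%  12d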
Note
PRIMECLUSTER does not monitor the status of guest domains and applications.
When using a cluster system between control domains, NIC switching mode is the only redundant line control method supported by GLS.
Note
GLS must be installed in both control and guest domains.
Only a virtual disk provided by the control domain can be assigned as the I/O used by a guest domain.
Multiple guest domains on the same control domain cannot share a GDS shared class. When configuring multiple guest domains, create a separate shared class for each guest domain.
When a failure of the control domain (including a cluster application error) is detected and the control domain cannot be forcibly stopped, all guest domains or all I/O domains within the failed physical partition are stopped regardless of whether a cluster exists, because the physical partition itself is forcibly stopped.
When virtual I/O is configured on the control domain, the guest domain within the failed physical partition may be stopped regardless of whether a cluster exists.
The following two types of the Migration function can be used in a cluster system in an Oracle VM Server for SPARC environment:
Live Migration
Transferring an active guest domain.
Cold Migration
Transferring an inactive guest domain.
(The patch for PRIMECLUSTER [T007881SP-02 or later for Solaris 10, or T007882SP-02 or later for Solaris 11] needs to be applied.)
These functions can be used in combination with ServerView Resource Orchestrator Cloud Edition. (The patch for PRIMECLUSTER [T007881SP-02 or later for Solaris 10, or T007882SP-02 or later for Solaris 11] needs to be applied.)
The Migration function of Oracle VM Server for SPARC can be used in the following cluster system configuration:
Cluster system between guest domains among different physical partitions (Supported only for SPARC M10)
By using the Migration function of Oracle VM Server for SPARC in a cluster system, you can perform server maintenance while keeping a redundant configuration of active and standby servers.
You can also keep a redundant configuration of active and standby servers between physical partitions during server maintenance by configuring a cluster system that includes a spare server in a control domain in addition to the active and standby servers.
With Cold Migration, an inactive guest domain can be transferred to and started on the spare server.
A redundant configuration for active and standby servers can be maintained even during the maintenance of a standby server.
There are prerequisites for using the Migration function of Oracle VM Server for SPARC in a cluster system. For details, see "Chapter 14 When Using the Migration Function in Oracle VM Server for SPARC Environment."
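For reference, both Live Migration and Cold Migration are initiated from the control domain with the ldm migrate-domain command; a hedged sketch with hypothetical names (guest domain guest0, destination host target-host), assuming the prerequisites above are met.
Checking whether the Migration is possible (dry run):
# ldm migrate-domain -n guest0 root@target-host
Performing the Migration:
# ldm migrate-domain guest0 root@target-host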
Note
A cluster system is not switched during the Migration.
Do not perform the Migration during a cluster system switchover.
In a physical environment, you can migrate a cluster system that uses PRIMECLUSTER 4.2A00 or later to a guest domain (or an I/O root domain) in an Oracle VM Server for SPARC environment (Physical to Virtual, hereafter referred to as P2V). (Supported only for SPARC M10)
See
For how to migrate with P2V, see "Chapter 15 When Using Oracle VM Server for SPARC P2V Tool to Migrate a Cluster System."
For specification changes of PRIMECLUSTER after migration, see the following:
"PRIMECLUSTER Global Link Services Configuration and Administration Guide 4.3: Redundant Line Control Function"
"PRIMECLUSTER Global Disk Services Configuration and Administration Guide 4.3"
For system requirements and notes on migration, see "Oracle VM Server for SPARC Administration Guide."
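For reference, the P2V conversion itself uses Oracle's ldmp2v tool in three phases: collect on the physical source system, then prepare and convert on the destination control domain. A hedged outline with hypothetical names (data directory /var/tmp/p2v-data, guest domain guest0); see the Oracle documentation above for the exact options (for example, an install image may be required for the convert phase).
# ldmp2v collect -d /var/tmp/p2v-data     (on the migration source system)
# ldmp2v prepare -d /var/tmp/p2v-data guest0     (on the destination control domain)
# ldmp2v convert -d /var/tmp/p2v-data guest0     (on the destination control domain)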
The system requirements for the cluster system to be migrated are as follows:

| Item | Requirement |
|---|---|
| PRIMECLUSTER version | PRIMECLUSTER 4.2A00 or later |
| Supported OS | Solaris 10 |
| GLS redundant line switching method | NIC switching mode and GS/SURE linkage mode |
| File system on a shared disk | UFS, ZFS, and GFS (Solaris 10 only) |
Note
The disk size of GDS volumes in a shared disk must be the same in the migration source and migration destination.
You must migrate user data beforehand by using ETERNUS storage migration or a LUN-to-LUN copy function such as REC.
Use GDS and GLS functions in the cluster system on a guest domain after migration.
Use the same configuration as the migration source after the migration.
See
If a setting has to be changed after completing the cluster system migration, see the following to change the setting:
"PRIMECLUSTER Global Link Services Configuration and Administration Guide 4.3: Redundant Line Control Function"
"PRIMECLUSTER Global Disk Services Configuration and Administration Guide 4.3"
"PRIMECLUSTER Global File Services Configuration and Administration Guide 4.3"
Note
There are some important points when using a cluster system in an Oracle VM Server for SPARC Environment. For details, see "12.2 Precautions on Using Cluster Systems in Oracle VM Server for SPARC Environments."
In an Oracle Solaris Zones environment, applications in a non-global zone become inoperable when an error occurs in the global zone or the non-global zone.
Applying PRIMECLUSTER to the global zone and non-global zone provides status monitoring and a switchover function. Through these means, it becomes possible to switch over to a standby system in the event of an error occurring, and to achieve high reliability for the non-global zone.
You can build Oracle Solaris Zones environments on guest OS domains in Oracle VM Server for SPARC Environments (only for SPARC M10) as well as on physical server environments.
In addition, when the global zone is Solaris 10, the existing systems running on Solaris 8 or Solaris 9 can also be run on Solaris 10 by migrating them to the non-global zone with Oracle Solaris Legacy Containers (OSLC). (*1)
PRIMECLUSTER provides a status monitoring and switchover function for the non-global zone running on Solaris 8 or Solaris 9. Through these means, it becomes possible to switch over to a standby system in the event of an error occurring, and to achieve high reliability for the non-global zone running on Solaris 8 or Solaris 9.
(*1) To check whether a middleware product you are using is available in a non-global zone that uses Oracle Solaris Legacy Containers, see the manual of the respective middleware product.
Global zone status monitoring and switchover
PRIMECLUSTER monitors the following statuses:
Global zone OS errors
Global zone hardware (network, shared disk, and the route) faults
If PRIMECLUSTER detects an OS error, it stops all of the non-global zones operating on that global zone and switches them over to the standby system.
Non-global zone status monitoring and switchover
PRIMECLUSTER monitors the following statuses:
Non-global zone status
OS errors on the non-global zones
Status of applications operating on the non-global zones
If PRIMECLUSTER detects an error, it switches the affected non-global zones over to the standby system.
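For reference, the state of non-global zones can be checked in the global zone with the zoneadm command; a sketch with a hypothetical zone name zone1 (the output varies with your environment).
# zoneadm list -cv
  ID NAME     STATUS     PATH               BRAND    IP
   0 global   running    /                  native   shared
   1 zone1    running    /zones/zone1       native   shared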
In a cluster system comprised of three or more nodes, the standby servers can be consolidated by preparing one standby server for multiple operating servers. An example is shown below.
In a single-node cluster, the status of the OS and applications in the non-global zone is monitored. Availability is increased by automatically restarting the non-global zone or an application in the non-global zone to recover when an error is detected. An example is shown in the following figure.
Note
It is not possible to change the "cluster name" or "CF node name" in the non-global zone.
The following functions or commands cannot be used in the non-global zone:
Automatic configuration
Shared disk device connection confirmation
Operator intervention
Fault resource identification
Patrol diagnosis
clsyncfile (distributes a file between cluster nodes)
Operations are not taken over between non-global zones operating on the same global zone.