PRIMECLUSTER supports two cluster configurations (virtual machine function) in a VMware environment: clustering between guest OSes on multiple ESX hosts, and clustering between guest OSes on a single ESX host.
When an error occurs on an ESX host or a guest OS within a VMware environment, applications on that guest OS stop working. In clustering between guest OSes on multiple ESX hosts, with PRIMECLUSTER applied to the guest OSes, applications fail over from the active guest OS to a standby guest OS when an error occurs, which provides a highly reliable guest OS environment.
When shared disks are used and a failover occurs while a guest OS is not completely stopped (for example, when the OS is hanging), I/O fencing can put the unstopped guest OS into the panic state and thereby stop it reliably. I/O fencing also prevents both guest OSes from accessing the shared disk at the same time, so that the failover is safe.
In clustering between guest OSes on a single ESX host, an automatic switchover occurs only for an application error on guest OSes.
Figure H.1 Cluster Systems in a VMware Environment
See
For details on VMware, see the documentation for VMware.
Note
Up to two nodes can be added to one cluster system.
If the guest OS fails in clustering between guest OSes on a single ESX host, the node enters the LEFTCLUSTER state. For how to recover from the LEFTCLUSTER state, see "6.2 Recovering from LEFTCLUSTER" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide." For subsequent operations, see "7.2 Operating the PRIMECLUSTER System."
Cluster systems between guest OSes on multiple ESX hosts in a VMware environment check guest OS statuses via network paths (administrative LAN or interconnect) before effecting a failover. As a result, guest OSes on which an error occurred may not be completely stopped (for example, when the OS is hanging). Therefore, when using shared disks, be sure to set up I/O fencing.
I/O fencing must be set up at an early stage of configuring the cluster application.
When using I/O fencing, the shared disk device should be managed by GDS.
When using a switchover file system on the shared disk, do not specify the device of the file system to be mounted in the /etc/fstab.pcl file by a label name or by a name defined with the udev functionality. Specify a device name that begins with /dev/sfdsk.
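The device-name rule above can be checked mechanically. The following is a minimal sketch, not a PRIMECLUSTER tool. It assumes that each entry has fstab-style whitespace-separated fields with the device first, and that cluster-managed entries may carry a "#RMS#" prefix. The class, volume, and mount point names in the sample are hypothetical.

```python
# Sketch only: the "#RMS#" prefix and the field layout are assumptions,
# and every name in SAMPLE is made up for illustration.
RMS_PREFIX = "#RMS#"
GDS_PREFIX = "/dev/sfdsk/"

SAMPLE = """\
#RMS#/dev/sfdsk/class0001/dsk/volume0001 /mnt/swdsk1 ext3 noauto 0 0
#RMS#LABEL=data /mnt/swdsk2 ext3 noauto 0 0
#RMS#/dev/disk/by-id/scsi-example /mnt/swdsk3 ext3 noauto 0 0
# an ordinary comment line
"""


def non_gds_devices(fstab_text):
    """Return the device fields that do not begin with /dev/sfdsk/."""
    rejected = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if line.startswith(RMS_PREFIX):
            # Drop the assumed cluster marker and keep the entry itself.
            line = line[len(RMS_PREFIX):].strip()
        elif line.startswith("#"):
            continue  # ordinary comment, not an entry
        fields = line.split()
        if not fields:
            continue  # blank line or bare marker
        device = fields[0]
        if not device.startswith(GDS_PREFIX):
            rejected.append(device)
    return rejected


if __name__ == "__main__":
    for device in non_gds_devices(SAMPLE):
        print("not a /dev/sfdsk device:", device)
```

In this sample, the label entry and the by-id entry are reported, and the /dev/sfdsk entry is accepted.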
Because of the settings of the shutdown facility, each node enters the LEFTCLUSTER state if a failure occurs in the cluster interconnect. For details on the settings of the shutdown facility, see "H.2.3.2 Setting Up the Shutdown Facility." For how to recover from the LEFTCLUSTER state, see "6.2 Recovering from LEFTCLUSTER" in "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide." For subsequent operations, see "7.2 Operating the PRIMECLUSTER System."
Exclude any virtual machine on which PRIMECLUSTER is installed from the targets of the VMware cluster functions (such as VMware HA, VMware FT, VMware DRS, and VMware DPM).
The following functions are not available in a virtual machine in which PRIMECLUSTER is to be installed.
VMware vMotion
VMware Storage vMotion
Migration with VMware vCenter Converter
Snapshot of VMware
Hot clone
Backup by Data Recovery
Backup by VCB
Set the path policy for the Native Multipathing (NMP) as follows:
When using VMware vSphere 4.x
Set to "Most Recently Used".
When using VMware vSphere 5.0 Update1 or later, or VMware vSphere 5.1 or later
Set to "Most Recently Used" or "Round Robin".
Settings other than the above are not supported.
For support of third-party multipath software, contact field engineers.
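The version-dependent path policy rule above can be expressed as a small lookup. This is an illustrative sketch only: the function name and the (major, minor, update) representation of a vSphere version are assumptions. An empty result means that the version does not appear in the list above, not that some other policy applies.

```python
# Sketch only: the version is modelled as (major, minor, update) integers,
# which is an assumption made for this example.
MRU = "Most Recently Used"
ROUND_ROBIN = "Round Robin"


def supported_path_policies(major, minor, update=0):
    """Return the NMP path policies listed above for a vSphere version."""
    if major == 4:
        return {MRU}
    if (major, minor) == (5, 0) and update >= 1:
        return {MRU, ROUND_ROBIN}
    if (major, minor) >= (5, 1):
        return {MRU, ROUND_ROBIN}
    return set()  # version not covered by the list above


def is_supported(policy, major, minor, update=0):
    """True if the policy is listed for the given vSphere version."""
    return policy in supported_path_policies(major, minor, update)


if __name__ == "__main__":
    print(sorted(supported_path_policies(4, 1)))
    print(sorted(supported_path_policies(5, 0, update=1)))
    print(is_supported("Fixed", 5, 1))
```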
The following environments and functions are not supported:
iSCSI and FCoE
ESX hosts with different versions
N-Port ID Virtualization (NPIV)
When using VMware vSphere 4.x, the virtual machine must use hardware version 7.
If the system volume on a disk device cannot be accessed in a SAN boot configuration, the PRIMECLUSTER failure detection function may not operate, depending on the status of the system. In this case, an operator must perform a manual switchover.
Use a shared disk that supports SCSI-3 Persistent Reservation.
When using file systems created on the shared disk as Fsystem resources, register all the file systems that are created on the same disk (LUN) or in the same disk class to the same userApplication. Because of a restriction of I/O fencing, you cannot create multiple file systems on one disk (LUN) or in one disk class and register them to different userApplications to monitor and control them separately.
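The Fsystem registration rule above can be checked before the cluster application is built. The following is a minimal sketch under stated assumptions: the planned layout is given as (mount point, disk or disk class, userApplication) tuples, and all names in the sample are hypothetical. It does not read any PRIMECLUSTER configuration.

```python
from collections import defaultdict

# Sketch only: the tuple layout and every name below are made up
# for illustration.
PLAN = [
    ("/mnt/swdsk1", "class0001", "userApp_0"),
    ("/mnt/swdsk2", "class0001", "userApp_1"),  # same class, different app
    ("/mnt/swdsk3", "class0002", "userApp_1"),
]


def split_storage_units(fsystems):
    """Return {disk or class: [userApplications]} for units split across apps."""
    apps_by_unit = defaultdict(set)
    for _mount_point, unit, user_application in fsystems:
        apps_by_unit[unit].add(user_application)
    return {
        unit: sorted(apps)
        for unit, apps in apps_by_unit.items()
        if len(apps) > 1
    }


if __name__ == "__main__":
    for unit, apps in split_storage_units(PLAN).items():
        print(unit, "is registered to more than one userApplication:", apps)
```

In this sample, class0001 is reported because its two file systems belong to different userApplications. class0002 passes the check.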