
D.1.10 Cluster System Related Error

For cluster system related errors, take the action indicated for whichever of the following situations applies.

(1) The error message "ERROR: class: cannot operate in cluster environment, ..." is output, and the operation cannot be conducted on the class class.

Explanation

A local class created while the cluster control facility was inactive cannot be used directly in a cluster system. When the cluster control facility is activated, the following message is output to the system log and the GDS daemon log file, and the local class becomes nonoperational.

ERROR: class: cannot operate in cluster environment, created when cluster control facility not ready

This error message is output when the cluster control facility is activated on a node on which such a local class exists.

Resolution

Re-create the local class in the cluster system according to the following steps.


1) In the CF main window of Cluster Admin, execute [Stop CF] in the [Tools] menu to stop CF.

2) Back up volume data if necessary.

3) Delete the class.

4) In the CF main window of Cluster Admin, execute [Load driver] to start CF.

5) Re-create the class and volumes deleted in step 3).

6) Restore the volume data backed up in step 2) as needed.
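
The following is a minimal sketch of steps 2) through 6) for a hypothetical single-disk local class; the class name Class1, disk name Disk1, physical device sda, volume name Volume1, volume size (32768 blocks), and backup file path are examples only, so adjust each command to your configuration.

2) # sdxvolume -N -c Class1 -v Volume1
   # dd if=/dev/sfdsk/Class1/dsk/Volume1 of=/backup/Volume1.img bs=1M
3) # sdxvolume -F -c Class1 -v Volume1
   # sdxvolume -R -c Class1 -v Volume1
   # sdxdisk -R -c Class1 -d Disk1      (the class is deleted when its last disk is removed)
5) # sdxdisk -M -c Class1 -a type=local -d sda=Disk1
   # sdxvolume -M -c Class1 -v Volume1 -s 32768
6) # dd if=/backup/Volume1.img of=/dev/sfdsk/Class1/dsk/Volume1 bs=1M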


See

For information on how to operate the CF main window of Cluster Admin, see "PRIMECLUSTER Cluster Foundation (CF) Configuration and Administration Guide."


(2) The PRIMECLUSTER CF clinitreset command fails, outputting the error message "FJSVcluster: ERROR: clinitreset: 6675: ..."

Explanation

If the PRIMECLUSTER resource database is initialized with the PRIMECLUSTER CF clinitreset command while a class exists in the cluster system, the clinitreset command fails, outputting the following error message.

FJSVcluster: ERROR: clinitreset: 6675: Cannot run this command because Global Disk Services has already been set up.

When a node containing a shadow class is rebooted because of an event such as a shutdown or panic, the shadow class is deleted, but the /dev/sfdsk/Class_Name directory is not. If the clinitreset command is executed in this state, it also fails, outputting the above error message.

Resolution

  1. On all nodes in the cluster system, view the object configuration and delete any existing classes. Deleting a class loses its volume data, so back up the data in advance if necessary.

  2. On all nodes in the cluster system, check whether any class directories exist under the /dev/sfdsk directory, and delete them if they do. The following shows an example where a directory for class Class1 exists.
    _adm and _diag are special files used by GDS and must not be deleted.

    # cd /dev/sfdsk
    # ls
    _adm _diag Class1
    # rm -rf Class1
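
After this cleanup has been completed on every node, initializing the resource database should no longer fail with the 6675 error. The following is a minimal verification sketch; the clinitreset path follows the usual FJSVcluster command layout and is shown here as an assumption.

    # ls /dev/sfdsk
    _adm _diag
    # /etc/opt/FJSVcluster/bin/clinitreset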

(3) The cluster application becomes "Inconsistent."

Explanation

If a shared class is not used as an RMS resource, volumes included in the class are started on node startup. If a cluster application that uses those volumes is then started, the cluster application becomes "Inconsistent" because the volumes are already active. By default, classes are not used as RMS resources; a class must be explicitly made available as an RMS resource.

Resolution

Make the shared class available as an RMS resource. After performing the procedure, restart the cluster application (see the sketch below).
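
As an illustration of the restart, a cluster application can be taken offline and switched back online with the following RMS commands; the application name app1 and SysNode name node1RMS are examples only.

    # hvutil -f app1
    # hvswitch app1 node1RMS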

(4) The GFS Shared File System is not mounted on node startup.

Explanation

If a shared class is used as an RMS resource, volumes included in the class are not started on node startup. Therefore, the GFS Shared File System on those volumes is not mounted on node startup either. By default, classes are not used as RMS resources, but they can be made available as RMS resources.

Resolution

Take one of the following actions.

a) When using the shared class as an RMS resource, do not create the GFS Shared File System on volumes in the class, but create it on volumes in a different class (see the sketch after this list).

b) When not using the shared class as an RMS resource, make the class unavailable as an RMS resource again. After performing the procedure, reboot the system.
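
As an illustration of action a), the GFS Shared File System is created on a volume in another class. The sketch below assumes class c1, volume v1, and sharing nodes node1 and node2; check the GFS documentation for the exact sfcmkfs options for your environment.

    # sfcmkfs -o node=node1,node2 /dev/sfdsk/c1/dsk/v1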

(5) The error message "ERROR: class: cannot operate shared objects, ..." is output, and the shared class class cannot be created; or the error message "ERROR: cluster communication failure" is output, and automatic resource registration fails.

Explanation

This phenomenon occurs when the GDS package is installed before other products, such as PRIMECLUSTER HA Server or PRIMECLUSTER Enterprise Edition, are installed and the cluster system is set up.

To determine whether this is the cause in your case, check the installation dates and times of the FJSVsdx-bas and FJSVclapi packages. If the FJSVsdx-bas package was installed earlier than the FJSVclapi package, this is the cause of the current trouble.

Example

Results of checking installation dates and times

  • Installation date and time of the FJSVsdx-bas package

    # rpm -qi FJSVsdx-bas
    Name        : FJSVsdx-bas                  Relocations: (not relocatable)
    ...
    Install Date: Tue Jan 26 15:27:22 2010         Build Host: xxxxxxxx
    ...
  • Installation date and time of the FJSVclapi package

    # rpm -qi FJSVclapi
    Name        : FJSVclapi                    Relocations: (not relocatable)
    ...
    Install Date: Wed Jan 27 19:08:13 2010         Build Host: xxxxxxxx
    ...

In this case, the installation date and time of the FJSVsdx-bas package is earlier than that of the FJSVclapi package. From this you can conclude that the GDS package had already been installed before PRIMECLUSTER HA Server, PRIMECLUSTER Enterprise Edition, or another such product was installed.
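
The two installation timestamps can also be compared in one step with rpm's query format (a convenience sketch; the date output format depends on the rpm version):

    # rpm -q --queryformat '%{NAME}: %{INSTALLTIME:date}\n' FJSVsdx-bas FJSVclapi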

Resolution

Install the FJSVsdx-bas package again, overwriting the existing one.

  1. Insert CD2, used for installing PRIMECLUSTER, into the CD-ROM drive and mount it. In the following, the CD mount point is defined as <CDROM_DIR>.

    # mount /media/cdrom
  2. Install the FJSVsdx-bas package again, overwriting the existing one.

    # cd <CDROM_DIR>/Linux/pkgs
    # rpm -Uvh --force <package name>

    <Package name> differs depending on the distribution. Specify the name according to the chart shown below.

        Distribution        Package name
        ------------------  ----------------------------
        RHEL8(Intel64)      FJSVsdx-bas.rhel8_x86_64.rpm
        RHEL9(Intel64)      FJSVsdx-bas.rhel9_x86_64.rpm

  3. Reboot the system.

    # /sbin/shutdown -r now

(6) The disk space of a file system on a shared disk is "Full" (100%).

Explanation

The disk space may become "Full" (100%) while using the switchover file system created on a shared class volume.

Resolution

The recovery procedure is shown below.

  1. Check the volume

    On a node which is not the target for the recovery, confirm that the volume containing the target file is stopped.

    Execute the following command on a node other than the recovery target.

    # sdxinfo -V -c class

    Example) When the class name is "c0" and the volume name is "v0"

    # sdxinfo -V -c c0
    OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
    ------ ------- ------- ------- ---- --- -------- -------- -------- --------
    ...
    volume v0      c0      g0      off  on  131072   163839   32768    STOP
    ...

    Make sure that STATUS is STOP on the line where NAME is "v0."

  2. Start the volume

    On the target node for the recovery, execute the following command.

    # sdxvolume -N -c class -v volume

    Example) When the class name is "c0" and the volume name is "v0"

    # sdxvolume -N -c c0 -v v0
  3. Mount the file system

    On the target node for the recovery, execute the following command.

    Example) When the class name is "c0", the volume name is "v0", the file system type is "ext4", and the mount point is "/mnt"

    # mount -t ext4 /dev/sfdsk/c0/dsk/v0 /mnt
  4. Delete unnecessary files

    On the target node for the recovery, delete unnecessary data under <mount_point>.
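
    Example) When the mount point is "/mnt" and the unnecessary file is /mnt/old.log (the file name is hypothetical)

    # df -h /mnt
    # rm /mnt/old.log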

  5. Unmount the file system

    On the target node for the recovery, execute the following command.

    Example) When the mount point is "/mnt"

    # umount /mnt
  6. Stop the volume

    On the target node for the recovery, execute the following command.

    # sdxvolume -F -c class -v volume

    Example) When the class name is "c0" and the GDS volume name is "v0"

    # sdxvolume -F -c c0 -v v0
  7. Clear the "Faulted" state of the cluster application

    Execute the following command on all nodes which compose the cluster.

    # hvutil -c userApplication_name

    Example) When the cluster application name is "app1"

    # hvutil -c app1
  8. Start the cluster application

    Execute the following command on an active node.

    # hvswitch userApplication_name SysNode

    Example) When the cluster application of node 1 is "app1"

    # hvswitch app1 node1RMS
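
    After the switch, the state of the cluster application can be checked, for example, with the RMS hvdisp command (a supplementary sketch):

    # hvdisp -a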

See

For the hvutil and hvswitch commands, see the hvutil(1M) and hvswitch(1M) manual pages.