If the slice status is one of the following statuses, take the actions as indicated for the relevant situation.
(1) Mirror slice configuring the mirror volume is in INVALID status.
Explanation
You can check the status of the slice configuring the volume as follows.
# sdxinfo -S -o Volume1
OBJ    CLASS   GROUP   DISK    VOLUME   STATUS
------ ------- ------- ------- -------- --------
slice  Class1  Group1  Object1 Volume1  ACTIVE
slice  Class1  Group1  Object2 Volume1  INVALID
In this example, among the slices that exist in volume Volume1, the slice within Object2 is in INVALID status, as shown in the STATUS field. Object2 is a disk or lower level group connected to the highest level mirror group Group1.
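The check described above can be scripted. The following is a minimal sketch, not part of GDS: the function name non_active_slices is hypothetical, and it assumes the six-column layout (OBJ CLASS GROUP DISK VOLUME STATUS) shown in the example output.

```shell
# Read "sdxinfo -S" output on stdin and print the DISK and STATUS columns
# of every slice row whose STATUS is not ACTIVE.
# Assumption: whitespace-separated columns OBJ CLASS GROUP DISK VOLUME STATUS.
non_active_slices() {
    awk '$1 == "slice" && $6 != "ACTIVE" { print $4, $6 }'
}

# Typical use on a node where GDS is installed:
#   sdxinfo -S -o Volume1 | non_active_slices
```

For the example output, this would print the single line "Object2 INVALID".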
The following five events could possibly cause the INVALID status of the mirror slice Volume1.Object2.
An I/O error occurred on the mirror slice Volume1.Object2.
(Cause a) A disk component relevant to Object2 failed to operate properly, and an I/O error occurred on the mirror slice Volume1.Object2.
(Cause b) A component other than disks relevant to Object2 (such as an I/O adapter, an I/O cable, an I/O controller, a power supply, or a fan) failed to operate properly, and an I/O error occurred on the mirror slice Volume1.Object2.
An I/O error occurred on the mirror slice Volume1.Object1 during synchronization copying to the mirror slice Volume1.Object2.
(Cause a') A disk component relevant to Object1 failed to operate properly during synchronization copying to the mirror slice Volume1.Object2, and an I/O error occurred on the copy source slice Volume1.Object1.
(Cause b') A component other than disks relevant to Object1 (such as an I/O adapter, an I/O cable, an I/O controller, a power supply, or a fan) failed to operate properly during synchronization copying to the mirror slice Volume1.Object2, and an I/O error occurred on the copy source slice Volume1.Object1.
Others
(Cause c) Synchronization copying to the mirror slice Volume1.Object2 was canceled by, for example, selecting [Cancel Copying] from the GDS Management View, executing the sdxcopy command, or a power outage.
An I/O error occurred due to a SCSI timeout. Whether a SCSI timeout has occurred can be determined from the following message:
kernel: mptscsih: iocn: attempting task abort! (sc=e00001401a0dce80)
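A search for this message can be sketched as follows. The function name count_task_aborts is hypothetical, the pattern is derived only from the single sample line above, and which log to feed it (for example /var/log/messages) is an assumption that depends on the system's logging configuration.

```shell
# Read kernel log text on stdin and print how many lines look like the
# SCSI task-abort message quoted above.
# Assumption: the driver prefix is "mptscsih:" as in the sample message.
count_task_aborts() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function usable under "set -e".
    grep -c 'mptscsih: .*attempting task abort' || true
}

# Possible use (log location is an assumption):
#   count_task_aborts < /var/log/messages
```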
Resolution
1) Identify the physical disk name of a faulty disk using the sdxinfo command.
(Example A1)
# sdxinfo -G -o Volume1
# sdxinfo -D -o Volume1 -e long
In this example, Object2 is a disk connected with the highest level group Group1. As indicated in the E field, an I/O error occurred on the disk Object2, and the possible cause is (Cause a) or (Cause b). The physical disk name of the disk Object2 is sdb as shown in the DEVNAM field.
In example A1, if the value 0 is in the E field of the disk Object2 including a slice in the INVALID status and if the value 1 is in the E field of the disk Object1 that is mirrored with the disk Object2, it indicates an I/O error occurred on the disk Object1 and the possible cause is (Cause a') or (Cause b'). In such a case, see "(2) The copy destination slice was made INVALID due to an I/O error generated on the copy source slice during synchronization copying." and perform restoration.
If the E field of any disk does not contain the value 1 in example A1, the possible cause is (Cause c).
(Example B1)
# sdxinfo -G -o Volume1
# sdxinfo -D -o Volume1 -e long
In this example, Object2 is a lower level group connected with the highest level group Group1. As indicated in the E field, an I/O error occurred on the disk Disk3 connected with Object2 and the possible cause is (Cause a) or (Cause b). The physical disk name corresponding to Disk3 is sdc as shown in the DEVNAM field.
In example B1, if the value 0 is in the E field of the disks (Disk3 and Disk4) connected with the lower level group Object2 including a slice in the INVALID status and if the value 1 is in the E field of a disk (Disk1 or Disk2) connected with the lower level group Object1 that is mirrored with the lower level group Object2, it indicates an I/O error occurred on the disk connected with Object1 and the possible cause is (Cause a') or (Cause b'). In such a case, see "(2) The copy destination slice was made INVALID due to an I/O error generated on the copy source slice during synchronization copying." and perform restoration.
In example B1, if there are no disks with "1" in the E field, the possible cause of the INVALID status is (Cause c).
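The decision rule in step 1) can be summarized as a small shell function. This is only a sketch: classify_cause and its two arguments are hypothetical, and the E field values must still be read from the sdxinfo -D -e long output by the operator. For a lower level group (example B1), pass 1 if any disk connected with that group shows 1 in the E field.

```shell
# classify_cause E_INVALID_SIDE E_MIRROR_SIDE
#   E_INVALID_SIDE: E field (0 or 1) of the disk(s) holding the INVALID slice
#   E_MIRROR_SIDE:  E field (0 or 1) of the disk(s) mirrored with them
classify_cause() {
    invalid_side_e=$1
    mirror_side_e=$2
    if [ "$invalid_side_e" = "1" ]; then
        # I/O error on the side that holds the INVALID slice
        echo "Cause a or b"
    elif [ "$mirror_side_e" = "1" ]; then
        # I/O error on the copy source side; continue with section (2)
        echo "Cause a' or b'"
    else
        # No I/O error recorded on either side
        echo "Cause c"
    fi
}

# Example: Object2 shows E=1, Object1 shows E=0
#   classify_cause 1 0
```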
2) Refer to disk driver log messages and check the physical disk abnormalities.
Disk hardware failures can also be caused by failures or defects in components other than the disks, such as I/O adapters, I/O cables, I/O controllers, power supplies, and fans.
Contact field engineers and identify which component failed or might be defective.
If there are no failed or defective components, the possible cause of the INVALID status is (Cause c).
The resolution procedure is illustrated below for each of the three causes a, b, and c.
a. For (Cause a)
a1) Perform the following operations before and after disk swapping. For the procedures for swapping disks from Operation Management View, see "7.3.1.2 Operation Procedure."
Before swapping the disks, execute the following command.
(Example A1)
# sdxswap -O -c Class1 -d Object2
In the example, disk Object2 connected to the highest level group Group1 will be swapped.
(Example B1)
# sdxswap -O -c Class1 -d Disk3
In the example, disk Disk3 will be swapped. Disk3 is a disk connected to lower level group Object2, which is a lower level group of the highest level group Group1.
a2) Swap disks.
a3) After swapping the disks, execute the following command.
(Example A1)
# sdxswap -I -c Class1 -d Object2
or
(Example B1)
# sdxswap -I -c Class1 -d Disk3
a4) Check the slice status according to step 3).
b. For (Cause b)
b1) Shut down the system once, repair the disabled part, and reboot the system. Consequently, synchronization copying is performed and the mirroring status is restored.
b2) Check the slice status according to step 3).
c. For (Cause c)
c1) Perform synchronization copying of mirror volume.
# sdxcopy -B -c Class1 -v Volume1
c2) Check the slice status according to step 3).
3) You can confirm the recovery of the slice configuring the volume, as shown below.
# sdxinfo -S -o Volume1
In this example, the slices within Object1 and Object2 are both in ACTIVE status. This indicates that the recovery process was completed successfully.
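The confirmation in step 3) can be expressed as a pass/fail check. The sketch below is an illustration only: the function name all_slices_active is hypothetical, and it assumes that sdxinfo -S prints one row per slice with STATUS in the sixth whitespace-separated column.

```shell
# Read "sdxinfo -S" output on stdin.
# Exit status 0: at least one slice row was seen and every slice is ACTIVE.
# Exit status 1: no slice rows were seen, or some slice is not ACTIVE.
all_slices_active() {
    awk '
        $1 == "slice" { total++; if ($6 != "ACTIVE") bad++ }
        END { if (total > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Possible use on a node where GDS is installed:
#   sdxinfo -S -o Volume1 | all_slices_active && echo "recovery confirmed"
```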
(2) The copy destination slice was made INVALID due to an I/O error generated on the copy source slice during synchronization copying.
Explanation
When an I/O error occurs on the copy source slice during synchronization copying, the copy destination slice becomes INVALID while the source slice is still ACTIVE.
The following two events are possible causes.
(Cause a) A disk component of the copy source failed to operate properly, and an I/O error occurred on the copy source slice.
(Cause b) A component other than the copy source disk (such as an I/O adapter, an I/O cable, an I/O controller, a power supply, or a fan) failed to operate properly during synchronization copying, and an I/O error occurred on the copy source slice.
For details on determining whether the status is relevant to one of these events and identifying the physical disk name of a faulty disk, see [Explanation] and step 1) of [Resolution] described in "(1) Mirror slice configuring the mirror volume is in INVALID status."
Resolution
First examine the physical disk abnormalities referring to disk driver log messages and so on. Then contact field engineers and locate the disabled or faulty part.
When the possible cause is (Cause b), shut down the system once, repair the disabled part, and reboot the system. Consequently, synchronization copying is performed and the mirroring status is restored.
When the possible cause is (Cause a), follow the procedures below and repair the slice. The procedure is illustrated for each of the following three situations.
A. For /(root), /usr, or /var [EFI]
B. For the swap area [EFI]
C. For others (other than /(root), /usr, /var, swap)
The following illustrates restoration procedures when the class name is Class1, the volume name is Volume1, the name of a faulty disk of the copy source is Disk1, and the name of a disk of the copy destination is Disk2 as examples.
# sdxinfo -S -o Volume1
# sdxinfo -G -o Volume1
# sdxinfo -D -o Volume1 -e long
A. For /(root), /usr, or /var [EFI]
Follow "(4) System cannot be booted. (Failure of all boot disks)" in "D.1.5 System Disk Abnormality [EFI]" for restoration. In the procedure of "Resolution", replace only the faulty copy source disk. Do not replace all the disks that are registered in the root class.
B. For the swap area [EFI]
B.1) Check the swap volume.
# swapon -s
Filename Type Size Used Priority
/dev/sfdsk/gdssys32 partition 4194296 0 -1 (*1)
(*1) In RHEL8.6 or later, the path for the GDS logical volume is /dev/sfdsksys32.
B.2) Remove the volume from the swap area.
# swapoff /dev/sfdsk/gdssys32
Depending on the part or severity of failure in disks that constitute the volume, the swapoff(8) command may fail due to an I/O error. In this event, remove the volume from the swap area by performing steps B.2.1) through B.2.2).
B.2.1) Comment out the swap line to prevent use of the volume as a swap area after the system is rebooted.
# vim /etc/fstab
Before edit:
/dev/sfdsk/gdssys32 swap swap defaults 0 0
After edit:
#/dev/sfdsk/gdssys32 swap swap defaults 0 0
B.2.2) Reboot the system.
# shutdown -r now
B.3) Stop the volume.
# sdxvolume -F -c Class1 -v Volume1
B.4) Restore the status of the copy destination slice in INVALID status.
# sdxfix -V -c Class1 -d Disk2 -v Volume1
B.5) Verify that the restored copy destination slice is in STOP status and the copy source slice is in INVALID status now.
# sdxinfo -S -o Volume1
B.6) Start the volume.
# sdxvolume -N -c Class1 -v Volume1
B.7) Add the volume to the swap area again.
# swapon /dev/sfdsk/gdssys32
When step B.2.1) was performed, undo the edit that was made in the /etc/fstab file.
# vim /etc/fstab
Before edit:
#/dev/sfdsk/gdssys32 swap swap defaults 0 0
After edit:
/dev/sfdsk/gdssys32 swap swap defaults 0 0
B.8) Remove the faulty copy source disk from GDS management to give it a replaceable status.
# sdxswap -O -c Class1 -d Disk1
B.9) Swap the faulty copy source disk.
B.10) Put the swapped disk back in control of GDS management to make it available.
# sdxswap -I -c Class1 -d Disk1
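The /etc/fstab edits in steps B.2.1) and B.7) can also be made non-interactively. The following is a sketch under stated assumptions: the function names comment_out_swap and restore_swap are hypothetical, the device path is used as a literal prefix of the fstab line (so it should not contain regular expression metacharacters), and the file is rewritten in place through a temporary file created with mktemp. Back up /etc/fstab before trying this on a real system.

```shell
# comment_out_swap FSTAB DEVICE
# Prefix the line that starts with DEVICE with '#' (step B.2.1).
comment_out_swap() {
    fstab=$1
    dev=$2
    tmp=$(mktemp) || return 1
    # '|' is the sed delimiter because DEVICE contains '/'.
    sed "s|^${dev}[[:space:]]|#&|" "$fstab" > "$tmp" &&
        cat "$tmp" > "$fstab"   # overwrite in place, keeping owner and mode
    rc=$?
    rm -f "$tmp"
    return $rc
}

# restore_swap FSTAB DEVICE
# Remove the leading '#' again (undo in step B.7).
restore_swap() {
    fstab=$1
    dev=$2
    tmp=$(mktemp) || return 1
    sed "s|^#\(${dev}[[:space:]]\)|\1|" "$fstab" > "$tmp" &&
        cat "$tmp" > "$fstab"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Possible use (as root):
#   comment_out_swap /etc/fstab /dev/sfdsk/gdssys32
#   restore_swap     /etc/fstab /dev/sfdsk/gdssys32
```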
C. For others (other than /(root), /usr, /var, and swap)
C.1) Exit applications using the volume.
C.2) Unmount the file system on the volume when it has been mounted.
In this example, the volume has been used as an ext4 file system.
# umount /dev/sfdsk/Class1/dsk/Volume1
Depending on the part or the severity of failure in disks that compose the volume, the umount command may fail due to an I/O error. In this event, unmount the file system performing steps C.2.1) through C.2.3).
C.2.1) If class Class1 is registered with a cluster application, remove the cluster application.
C.2.2) Comment out the /dev/sfdsk/Class1/dsk/Volume1 line in the /etc/fstab file to prevent mounting of the volume after the system is rebooted.
C.2.3) Reboot the system.
C.3) Stop the volume.
# sdxvolume -F -c Class1 -v Volume1
C.4) Restore the status of the copy destination slice in the INVALID status.
# sdxfix -V -c Class1 -d Disk2 -v Volume1
C.5) Verify that the restored copy destination slice is in the STOP status and the copy source slice is in the INVALID status now.
# sdxinfo -S -o Volume1
C.6) Start the volume.
# sdxvolume -N -c Class1 -v Volume1
C.7) The consistency of volume data may have been lost. Restore the backup data or perform repair using the fsck(8) command if necessary.
Note
If an I/O error occurred in the copy source slice during just resynchronization copying after the system went down, restoration may be possible with the fsck(8) command.
When step C.2.2) was performed, undo the edit that was made in the /etc/fstab file.
When step C.2.1) was performed, re-create the cluster application removed in step C.2.1).
C.8) Remove the faulty copy source disk from GDS management to give it a replaceable status.
# sdxswap -O -c Class1 -d Disk1
C.9) Swap the faulty copy source disk.
C.10) Put the swapped disk back in control of GDS management to make it available.
# sdxswap -I -c Class1 -d Disk1
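To review the order of steps C.3) through C.10) before running anything, the sequence can be printed as a plan. This sketch only echoes the commands given in this procedure and does not execute them; the function name print_restore_plan and its argument order are hypothetical.

```shell
# print_restore_plan CLASS VOLUME SOURCE_DISK DEST_DISK
# Print (do not run) the commands of steps C.3) to C.10) in order.
# Lines beginning with '#' in the output mark manual steps.
print_restore_plan() {
    class=$1
    volume=$2
    src_disk=$3   # faulty copy source disk
    dst_disk=$4   # copy destination disk holding the INVALID slice
    cat <<EOF
sdxvolume -F -c $class -v $volume
sdxfix -V -c $class -d $dst_disk -v $volume
sdxinfo -S -o $volume
sdxvolume -N -c $class -v $volume
# C.7: restore backup data or run fsck(8) if necessary
sdxswap -O -c $class -d $src_disk
# C.9: physically swap the faulty copy source disk
sdxswap -I -c $class -d $src_disk
EOF
}

# Example with the names used in this section:
#   print_restore_plan Class1 Volume1 Disk1 Disk2
```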
(3) Slice configuring the volume is in TEMP status.
Explanation
The slice was not attached after it was detached with the sdxslice command. Alternatively, you have not performed [Attach Slice] after performing [Detach Slice] from Operation Management View.
Resolution
Attach the slice again with the sdxslice command, or perform [Attach Slice] from Operation Management View as necessary.
(4) Slice configuring the volume is in TEMP-STOP status.
Explanation
The slice was not activated after it was stopped with the sdxslice command, or the node on which the slice was detached is not the current node. Alternatively, you have not performed [Activate Slice] after performing [Stop Slice] from Operation Management View.
Resolution
Activate the slice or take over the slice with the sdxslice command as needed. Alternatively, perform [Activate Slice] from Operation Management View.
(5) Slice configuring the volume is in COPY status.
Explanation
Synchronization copying is currently in progress in order to attach a slice. Alternatively, synchronization copying is in progress between a master and a proxy.
Resolution
Wait until synchronization copying is complete. Note that a slice in the process of synchronization copying will not restrict you from accessing an active volume.
(6) Slice configuring the volume is in NOUSE status.
Explanation
When the disk related to the slice is in DISABLE or SWAP status, the slice becomes NOUSE to inhibit slice operations.
Resolution
Recover the disk in DISABLE or SWAP status. For details, see "D.1.2 Disk Status Abnormality."
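As a quick reference, the mapping from slice status to the action described in items (1) through (6) can be sketched as a lookup function. The function name slice_action is hypothetical, and the action texts are short summaries of this section, not GDS output.

```shell
# slice_action STATUS
# Print a one-line summary of the action this section describes for STATUS.
# Returns 1 for a status that this section does not cover.
slice_action() {
    case $1 in
        INVALID)   echo "identify the cause, then follow (1) or (2)" ;;
        TEMP)      echo "attach the slice again" ;;
        TEMP-STOP) echo "activate or take over the slice" ;;
        COPY)      echo "wait for synchronization copying to complete" ;;
        NOUSE)     echo "recover the disk in DISABLE or SWAP status" ;;
        *)         echo "not covered in this section"; return 1 ;;
    esac
}

# Example:
#   slice_action TEMP-STOP
```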