PRIMECLUSTER Global Disk Services Configuration and Administration Guide 4.5
FUJITSU Software

F.1.1 Slice Status Abnormality

If the slice status is one of the following statuses, take action as indicated for the relevant situation.

(1) Mirror slice configuring the mirror volume is in INVALID status.

Explanation

You can check the status of the slice configuring the volume as follows.

# sdxinfo -S -o Volume1
OBJ    CLASS   GROUP   DISK      VOLUME   STATUS
------ ------- ------- -------   -------- --------
slice  Class1  Group1  Object1   Volume1  ACTIVE
slice  Class1  Group1  Object2   Volume1  INVALID

In this example, among the slices that exist in volume Volume1, the slice within Object2 is in INVALID status, as shown in the STATUS field. Object2 is a disk or lower level group connected to the highest level mirror group Group1.
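When several volumes have to be checked, the STATUS field can be filtered mechanically. The following is a minimal sketch, assuming the six-column sdxinfo -S layout shown above. The function name non_active_slices is hypothetical and is not part of GDS.

```shell
#!/bin/sh
# Sketch only: reads `sdxinfo -S` output on stdin and prints every slice
# whose STATUS column is not ACTIVE, in the form "VOLUME.DISK STATUS".
# Assumes the column order OBJ CLASS GROUP DISK VOLUME STATUS shown above.
# non_active_slices is a hypothetical helper name, not a GDS command.
non_active_slices() {
    awk '$1 == "slice" && $6 != "ACTIVE" { print $5 "." $4, $6 }'
}
```

For example, `sdxinfo -S -o Volume1 | non_active_slices` would print `Volume1.Object2 INVALID` for the output shown above.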

The following five events could possibly cause the INVALID status of the mirror slice Volume1.Object2.

Resolution

1) Identify the physical disk name of a faulty disk using the sdxinfo command.

(Example A1)

# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Object1:Object2     17596416 17498112 0
# sdxinfo -D -o Volume1 -e long
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Object1 mirror Class1  Group1  c1t1d0  17596416 *        node1            ENABLE  0
disk   Object2 mirror Class1  Group1  c2t3d0  17596416 *        node1            ENABLE  1

In this example, Object2 is a disk connected with the highest level group Group1. As indicated in the E field, an I/O error occurred on the disk Object2, and the possible cause is (Cause a) or (Cause b). The physical disk name of the disk Object2 is c2t3d0 as shown in the DEVNAM field.

In example A1, if the value 0 is in the E field of the disk Object2 including a slice in the INVALID status and if the value 1 is in the E field of the disk Object1 that is mirrored with the disk Object2, it indicates an I/O error occurred on the disk Object1 and the possible cause is (Cause a') or (Cause b'). In such a case, see "(2) The copy destination slice was made INVALID due to an I/O error generated on the copy source slice during synchronization copying." and perform restoration.

If the E field of any disk does not contain the value 1 in example A1, the possible cause is (Cause c).

(Example B1)

# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Object1:Object2     35127296 35028992 0
group  Object1 Class1  Disk1:Disk2         35127296 *        *
group  Object2 Class1  Disk3:Disk4         35127296 *        *
# sdxinfo -D -o Volume1 -e long
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   stripe Class1  Object1 c1t1d0  17596416 *        node1            ENABLE  0
disk   Disk2   stripe Class1  Object1 c1t2d0  17596416 *        node1            ENABLE  0
disk   Disk3   stripe Class1  Object2 c2t3d0  17682084 *        node1            ENABLE  1
disk   Disk4   stripe Class1  Object2 c2t4d0  17682084 *        node1            ENABLE  0

In this example, Object2 is a lower level group connected with the highest level group Group1. As indicated in the E field, an I/O error occurred on the disk Disk3 connected with Object2 and the possible cause is (Cause a) or (Cause b). The physical disk name corresponding to Disk3 is c2t3d0 as shown in the DEVNAM field.

In example B1, if the 0 value is in the E field of the disks (Disk3 and Disk4) connected with the lower level group Object2 including a slice in the INVALID status and if the value 1 is in the E field of the disk (Disk1 or Disk2) connected with the lower level group Object1 that is mirrored with the lower level group Object2, it indicates an I/O error occurred on the disk connected with Object1 and the possible cause is (Cause a') or (Cause b'). In such a case, see "(2) The copy destination slice was made INVALID due to an I/O error generated on the copy source slice during synchronization copying." and perform restoration.

In example B1, if there are no disks with "1" in the E field, the possible cause of the INVALID status is (Cause c).
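The E field check described in step 1) can be sketched as a small filter. This is an illustration only, assuming the eleven-column sdxinfo -D -e long layout shown in examples A1 and B1, where NAME is the second field, DEVNAM is the sixth field, and E is the last field. The function name errored_disks is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `sdxinfo -D ... -e long` output on stdin and prints
# "NAME DEVNAM" for every disk whose E field (last column) is 1.
# Assumes the column order shown in examples A1 and B1.
# errored_disks is a hypothetical helper name, not a GDS command.
errored_disks() {
    awk '$1 == "disk" && $NF == "1" { print $2, $6 }'
}
```

For example, `sdxinfo -D -o Volume1 -e long | errored_disks` would print `Disk3 c2t3d0` for the output in example B1. If nothing is printed, no disk has the value 1 in the E field, which corresponds to the (Cause c) case.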


2) Refer to disk driver log messages and check for physical disk abnormalities.

Disk hardware failures can be caused not only by the disks themselves but also by failures or defects in other components, such as I/O adapters, I/O cables, I/O controllers, power supplies, and fans.

Contact field engineers to identify which component has failed or might be defective.

If there are no failed or defective components, the possible cause of the INVALID status is (Cause c).

The resolution procedure is illustrated below for each of the three causes a, b, and c.

a. For (Cause a)

a1) Perform the following operations before and after disk swapping. For the procedures for swapping disks from Operation Management View, see "5.3.4 Disk Swap."

Before swapping the disks, execute the following command.

(Example A1)

# sdxswap -O -c Class1 -d Object2

In the example, disk Object2 connected to the highest level group Group1 will be swapped.

(Example B1)

# sdxswap -O -c Class1 -d Disk3

In the example, disk Disk3 will be swapped. Disk3 is a disk connected to lower level group Object2, which is a lower level group of the highest level group Group1.

a2) Swap disks.

a3) After swapping the disks, execute the following command.

(Example A3)

# sdxswap -I -c Class1 -d Object2

or

(Example B3)

# sdxswap -I -c Class1 -d Disk3

a4) Check the slice status according to step 3).
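The two sdxswap invocations in steps a1) and a3) can be wrapped so that the commands are reviewed before they are run. The following is a minimal sketch. The RUN variable and the function names swap_out and swap_in are hypothetical; only the sdxswap options shown above are used.

```shell
#!/bin/sh
# Sketch only: wraps the two sdxswap invocations from steps a1) and a3).
# RUN is a hypothetical switch. When RUN is unset, the commands are only
# printed (dry run). Set RUN to an empty string to execute them.
RUN=${RUN-echo}

# Step a1): exclude the disk before the physical swap.
# $1 = class name, $2 = disk name
swap_out() {
    $RUN sdxswap -O -c "$1" -d "$2"
}

# Step a3): return the disk after the physical swap.
# $1 = class name, $2 = disk name
swap_in() {
    $RUN sdxswap -I -c "$1" -d "$2"
}
```

With RUN unset, `swap_out Class1 Object2` only prints `sdxswap -O -c Class1 -d Object2`. The physical swap in step a2) still has to be carried out between the two calls.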

b. For (Cause b)

b1) Shut down the system once, repair the disabled part, and reboot the system. Consequently, synchronization copying is performed and the mirroring status is restored.

b2) Check the slice status according to step 3).

c. For (Cause c)

c1) Perform synchronization copying of mirror volume.

# sdxcopy -B -c Class1 -v Volume1

c2) Check the slice status according to step 3).


3) You can confirm the recovery of the slice configuring the volume, as shown below.

# sdxinfo -S -o Volume1
OBJ    CLASS   GROUP   DISK    VOLUME  STATUS
------ ------- ------- ------- ------- --------
slice  Class1  Group1  Object1 Volume1 ACTIVE
slice  Class1  Group1  Object2 Volume1 ACTIVE

In this example, the slices within Object1 and Object2 are both in ACTIVE status. This indicates that the recovery process was completed successfully.
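In a script, the recovery check in step 3) may be more convenient as an exit status than as printed output. The following is a minimal sketch, assuming the six-column sdxinfo -S layout shown above. The function name all_slices_active is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `sdxinfo -S` output on stdin and succeeds (exit 0)
# only if at least one slice line is present and every slice is ACTIVE.
# Assumes the STATUS value is the sixth column, as shown above.
# all_slices_active is a hypothetical helper name, not a GDS command.
all_slices_active() {
    awk '$1 == "slice" { n++; if ($6 != "ACTIVE") bad++ }
         END { exit !(n > 0 && bad == 0) }'
}
```

For example, `sdxinfo -S -o Volume1 | all_slices_active && echo recovered` would print `recovered` for the output shown in step 3). Empty input is treated as a failure, so a failed sdxinfo call is not mistaken for a successful recovery.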


(2) The copy destination slice was made INVALID due to an I/O error generated on the copy source slice during synchronization copying.

Explanation

When an I/O error occurs on the copy source slice during synchronization copying, the copy destination slice becomes INVALID while the source slice is still ACTIVE.

The following two events are possible causes.

(Cause a)

A disk component of the copy source failed to operate properly, and an I/O error occurred on the copy source slice.

(Cause b)

A component other than the copy source disk (such as an I/O adapter, an I/O cable, an I/O controller, a power supply, or a fan) failed to operate properly during synchronization copying, and an I/O error occurred on the copy source slice.

For details on determining whether the status is relevant to one of these events and identifying the physical disk name of a faulty disk, see [Explanation] and step 1) of [Resolution] described in "(1) Mirror slice configuring the mirror volume is in INVALID status."

Resolution

First, check for physical disk abnormalities by referring to disk driver log messages and similar sources. Then contact field engineers to locate the failed or faulty part.

When the possible cause is (Cause b), shut down the system once, repair the disabled part, and reboot the system. Consequently, synchronization copying is performed and the mirroring status is restored.

When the possible cause is (Cause a), follow the procedures below and repair the slice. The procedure is illustrated for each of the following three situations.

A. For /(root), /usr, or /var

B. For the swap area

C. For others (other than /(root), /usr, /var, swap)

The restoration procedures below use the following names as examples: the class name is Class1, the volume name is Volume1, the faulty copy source disk is Disk1, and the copy destination disk is Disk2.

# sdxinfo -S -o Volume1
OBJ    CLASS   GROUP   DISK    VOLUME  STATUS
------ ------- ------- ------- ------- --------
slice  Class1  Group1  Disk1   Volume1 ACTIVE
slice  Class1  Group1  Disk2   Volume1 INVALID
# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Disk1:Disk2         17596416 17498112 0
# sdxinfo -D -o Volume1 -e long
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   mirror Class1  Group1  c1t1d0  17596416 *        node1            ENABLE  1
disk   Disk2   mirror Class1  Group1  c2t3d0  17596416 *        node1            ENABLE  0

A. For /(root), /usr, or /var

Contact field engineers.


B. For the swap area

B.1) Remove the volume from the swap area.

# swap -d /dev/sfdsk/Class1/dsk/Volume1

Depending on the location or severity of the failure in the disks that constitute the volume, the swap -d command may fail due to an I/O error. In this event, remove the volume from the swap area by performing steps B.1.1) through B.1.3).

B.1.1) Comment out the /dev/sfdsk/Class1/dsk/Volume1 line in the /etc/vfstab file to prevent use of the volume as a swap area after the system is rebooted.

# vi /etc/vfstab

Before edit:
/dev/sfdsk/Class1/dsk/Volume1 - - swap - no -

After edit:
#/dev/sfdsk/Class1/dsk/Volume1 - - swap - no -

B.1.2) To remove the dump device after the system is rebooted, edit the definition of DUMPADM_DEVICE in the /etc/dumpadm.conf file.

# vi /etc/dumpadm.conf

Before edit:
DUMPADM_DEVICE=/dev/sfdsk/Class1/dsk/Volume1

After edit:
DUMPADM_DEVICE=swap

B.1.3) Reboot the system.

# shutdown -y -i6 -g0

B.2) Stop the volume.

# sdxvolume -F -c Class1 -v Volume1

B.3) Restore the status of the copy destination slice in INVALID status.

# sdxfix -V -c Class1 -d Disk2 -v Volume1

B.4) Verify that the restored copy destination slice is in STOP status and the copy source slice is in INVALID status now.

# sdxinfo -S -o Volume1
OBJ    CLASS   GROUP   DISK    VOLUME  STATUS
------ ------- ------- ------- ------- --------
slice  Class1  Group1  Disk1   Volume1 INVALID
slice  Class1  Group1  Disk2   Volume1 STOP

B.5) Start the volume.

# sdxvolume -N -c Class1 -v Volume1

B.6) Add the volume to the swap area again.

# swap -a /dev/sfdsk/Class1/dsk/Volume1

If step B.1.1) was performed, undo the edit that was made in the /etc/vfstab file.

# vi /etc/vfstab

Before edit:
#/dev/sfdsk/Class1/dsk/Volume1 - - swap - no -

After edit:
/dev/sfdsk/Class1/dsk/Volume1 - - swap - no -

B.7) Remove the faulty copy source disk from GDS management to make it replaceable.

# sdxswap -O -c Class1 -d Disk1

B.8) Swap the faulty copy source disk.

B.9) Put the swapped disk back under GDS management to make it available.

# sdxswap -I -c Class1 -d Disk1
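The /etc/vfstab edits in steps B.1.1) and B.6) can also be made without an interactive editor. The following is a minimal sketch, assuming the device path appears as the first field of its line, as in the example entries above. The function names comment_out_swap and uncomment_swap are hypothetical. Both functions write the edited file to standard output and leave the input file untouched, so the result can be reviewed before it replaces /etc/vfstab.

```shell
#!/bin/sh
# Sketch only: non-interactive equivalents of the vfstab edits in
# steps B.1.1) and B.6). Both functions print the edited file to stdout
# and do not modify the input file.
# $1 = device path, $2 = path of the vfstab file to read
# The function names are hypothetical helpers, not part of GDS or Solaris.

# Step B.1.1): prefix the matching entry with "#".
comment_out_swap() {
    awk -v dev="$1" '$1 == dev { print "#" $0; next } { print }' "$2"
}

# Step B.6): remove the "#" that was added in step B.1.1).
uncomment_swap() {
    awk -v dev="$1" '$1 == "#" dev { sub(/#/, ""); print; next } { print }' "$2"
}
```

For example, `comment_out_swap /dev/sfdsk/Class1/dsk/Volume1 /etc/vfstab > /tmp/vfstab.new` writes the edited copy to /tmp/vfstab.new, where /tmp/vfstab.new is an arbitrary example path.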

C. For others (other than /(root), /usr, /var, and swap)

C.1) Exit applications using the volume.

C.2) Unmount the file system on the volume when it has been mounted.

# umount /dev/sfdsk/Class1/dsk/Volume1

Depending on the part or severity of failure in disks that constitute the volume, the umount command may fail due to an I/O error. In this event, execute the umount command again with the -f option.

C.3) Stop the volume.

# sdxvolume -F -c Class1 -v Volume1

C.4) Restore the status of the copy destination slice in INVALID status.

# sdxfix -V -c Class1 -d Disk2 -v Volume1

C.5) Verify that the restored copy destination slice is in STOP status and the copy source slice is in INVALID status now.

# sdxinfo -S -o Volume1
OBJ    CLASS   GROUP   DISK    VOLUME  STATUS
------ ------- ------- ------- ------- --------
slice  Class1  Group1  Disk1   Volume1 INVALID
slice  Class1  Group1  Disk2   Volume1 STOP

C.6) Start the volume.

# sdxvolume -N -c Class1 -v Volume1

C.7) The consistency of volume data may have been lost. Restore the backup data or perform repair using the fsck(1M) command if necessary.

Note

If the I/O error occurred on the copy source slice during resynchronization copying after the system went down, it may be possible to restore the data with the fsck(1M) command.

C.8) Remove the faulty copy source disk from GDS management to make it replaceable.

# sdxswap -O -c Class1 -d Disk1

C.9) Swap the faulty copy source disk.

C.10) Put the swapped disk back under GDS management to make it available.

# sdxswap -I -c Class1 -d Disk1
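The verification in steps B.4) and C.5) can be sketched as a check that returns an exit status, which is useful before the volume is started again. This is an illustration only, assuming the six-column sdxinfo -S layout shown above. The function name fix_applied is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `sdxinfo -S` output on stdin and succeeds (exit 0)
# only if the copy source slice is INVALID and the copy destination
# slice is STOP, which is the state expected after the sdxfix step.
# $1 = copy source disk name, $2 = copy destination disk name
# fix_applied is a hypothetical helper name, not a GDS command.
fix_applied() {
    awk -v src="$1" -v dst="$2" '
        $1 == "slice" && $4 == src { s = $6 }
        $1 == "slice" && $4 == dst { d = $6 }
        END { exit !(s == "INVALID" && d == "STOP") }'
}
```

For example, `sdxinfo -S -o Volume1 | fix_applied Disk1 Disk2` succeeds for the output shown in steps B.4) and C.5), and fails for the output shown before the sdxfix command was executed.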

(3) Slice configuring the volume is in TEMP status.

Explanation

The slice was not attached again after it was detached with the sdxslice command. Alternatively, you have not performed [Attach Slice] after performing [Detach Slice] from Operation Management View.

Resolution

Attach the slice again with the sdxslice command, or perform [Attach Slice] from Operation Management View as necessary.


(4) Slice configuring the volume is in TEMP-STOP status.

Explanation

The slice was not activated after it was stopped with the sdxslice command, or the node on which the slice was detached is not the current node. Alternatively, you have not performed [Activate Slice] after performing [Stop Slice] from Operation Management View.

Resolution

Activate the slice or take over the slice with the sdxslice command as needed. Alternatively, perform [Activate Slice] from Operation Management View.


(5) Slice configuring the volume is in COPY status.

Explanation

Synchronization copying is currently in progress in order to attach a slice, or synchronization copying is in progress between a master and a proxy.

Resolution

Wait until synchronization copying is complete. Note that you can still access an active volume while one of its slices is being synchronized.
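Waiting for synchronization copying to finish can be sketched as a polling loop. This is an illustration only, assuming the six-column sdxinfo -S layout used elsewhere in this section. The function name wait_for_copy and the 30-second default interval are hypothetical choices.

```shell
#!/bin/sh
# Sketch only: polls `sdxinfo -S -o VOLUME` until no slice of the volume
# is in COPY status.
# $1 = volume name, $2 = polling interval in seconds (default 30)
# wait_for_copy and the default interval are hypothetical choices.
# Note: the loop also ends if sdxinfo prints nothing (for example, if the
# command fails), so check the final slice status separately.
wait_for_copy() {
    while sdxinfo -S -o "$1" |
          awk '$1 == "slice" && $6 == "COPY" { found = 1 } END { exit !found }'
    do
        sleep "${2:-30}"
    done
}
```

For example, `wait_for_copy Volume1 60` checks the slices of Volume1 once a minute and returns when none of them is in COPY status.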


(6) Slice configuring the volume is in NOUSE status.

Explanation

When the disk related to the slice is in DISABLE or SWAP status, the slice becomes NOUSE to inhibit slice operations.

Resolution

Recover the disk in DISABLE or SWAP status. For details, see "F.1.2 Disk Status Abnormality."