PRIMECLUSTER Global Disk Services Configuration and Administration Guide 4.5
FUJITSU Software

F.1.3 Volume Status Abnormality

If the volume status is one of the following statuses, take action as indicated for the relevant situation.

(1) Mirror volume is in INVALID status.

Explanation

You can confirm the status of the volume as shown below.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID

In this example, volume Volume1 that exists in the highest level group Group1 is in INVALID status, as shown in the STATUS field.

If none of the mirror slices that make up the mirror volume contains valid data (that is, none is in ACTIVE or STOP status), the mirror volume becomes INVALID. You cannot start a volume in INVALID status.
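The STATUS check above can also be scripted. The following is a minimal sketch, not part of GDS: it assumes the column layout shown in the example output, and the function name invalid_volumes is hypothetical. It reads sdxinfo -V output on standard input and prints the NAME of every volume whose STATUS is INVALID.

```shell
#!/bin/sh
# Hypothetical helper (not part of GDS): print the NAME of every volume
# whose STATUS field is INVALID. Reads `sdxinfo -V` output on stdin and
# assumes STATUS is the last column, as in the example above.
invalid_volumes() {
    awk '$1 == "volume" && $NF == "INVALID" { print $2 }'
}

# Demonstration on the sample output from this section. On a live system
# the input would come from the real command instead:
#   sdxinfo -V -o Volume1 | invalid_volumes
invalid_volumes <<'EOF'
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID
EOF
```

For the sample data, the only line printed is Volume1. The header, separator, and PRIVATE rows are skipped because they do not match both conditions.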

There are two reasons that may cause this INVALID status.

(Cause a)

Disk is in DISABLE status.

(Cause b)

Master-proxy relationship was cancelled forcibly while master data was being copied to proxy.

Resolution

1) Check whether there is a disk in DISABLE status within the group with which the volume is associated, as follows.

(Example A1)

# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Disk1:Disk2         17596416        0     0
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0  17596416 node1            ENABLE
disk   Disk2   mirror Class1  Group1  c2t3d0  17596416 node1            DISABLE

In this example, disks Disk1 and Disk2 are connected to the highest level mirror group Group1. The disk Disk2 is in DISABLE status as shown in the STATUS field.

(Example B1)

# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Group2:Group3       35127296 17530880     0
group  Group2  Class1  Disk1:Disk2         35127296        *     0
group  Group3  Class1  Disk3:Disk4         35127296        *     0
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   stripe Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   stripe Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   stripe Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   stripe Class1  Group3  c2t4d0  17682084 node1            ENABLE

In this example, lower level stripe groups, Group2 and Group3 are connected to the highest level mirror group Group1. Disk Disk2 which is connected to Group2 is in DISABLE status as shown in the STATUS field.


2) If the possible cause is (Cause a), restore the disk by following the procedures in "F.1.2 Disk Status Abnormality."


3) From the disks and lower level groups connected to the highest level mirror group, determine the disk or lower level group to which the slice you will use to recover data belongs. Then, execute the sdxfix command to recover data.

(Example A3)

# sdxfix -V -c Class1 -d Disk1 -v Volume1

In this example, Volume1 is recovered using the slice on disk Disk1.

(Example B3)

# sdxfix -V -c Class1 -g Group3 -v Volume1

In this example, Volume1 is recovered using the slice in the lower level stripe group Group3.


4) Start the volume.

# sdxvolume -N -c Class1 -v Volume1 -e nosync

5) Access Volume1 and check its contents. Restore backup data or run fsck to regain data integrity as necessary.


6) Perform synchronization copying on volume.

# sdxcopy -B -c Class1 -v Volume1
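
Steps 3) through 6) above always run in the same order, so it can help to print the sequence for review before typing it. The sketch below only echoes the commands and does not execute them. The function name print_mirror_recovery_plan and its argument order are assumptions made for illustration; the command lines themselves are the ones shown in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GDS): print, without running, the
# command sequence for recovering an INVALID mirror volume.
# Arguments: class, source option (-d for a disk, -g for a lower level
# group), source name, volume name.
print_mirror_recovery_plan() {
    class=$1
    src_opt=$2
    src_name=$3
    volume=$4
    case $src_opt in
        -d|-g) ;;
        *) echo "source option must be -d or -g" >&2; return 1 ;;
    esac
    echo "sdxfix -V -c $class $src_opt $src_name -v $volume"
    echo "sdxvolume -N -c $class -v $volume -e nosync"
    echo "# check the volume contents; restore backup data or run fsck as necessary"
    echo "sdxcopy -B -c $class -v $volume"
}

# Example A3: recover Volume1 using the slice on Disk1.
print_mirror_recovery_plan Class1 -d Disk1 Volume1
```

Calling the function with -g Group3 instead of -d Disk1 prints the sequence for Example B3. Any other source option is rejected so that a mistyped flag is not silently passed through.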

(2) Single volume is in INVALID status.

Explanation

You can confirm the status of the volume as shown below.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 INVALID
volume *       Class1  Disk1   *    *      65536  8421375  8355840 FREE

In the example, the single volume Volume1 that exists on single disk Disk1 is in INVALID status, as shown in the STATUS field.

You cannot start a volume in INVALID status.

There are two reasons that may cause this INVALID status.

(Cause a)

Single disk is in DISABLE status. In this case, the single slice becomes NOUSE status.

(Cause b)

Master-proxy relationship was cancelled forcibly while master data was being copied to proxy.

Resolution

1) Check whether the single disk is in DISABLE status, as shown below.

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8355840 node1            DISABLE

In this example, the single disk Disk1 is in DISABLE status, as shown in the STATUS field.


2) If the possible cause is (Cause a), restore the disk by following the procedures given in section "F.1.2 Disk Status Abnormality."


3) Execute the sdxfix command to recover the single volume's data.

# sdxfix -V -c Class1 -d Disk1 -v Volume1

4) Start the volume.

# sdxvolume -N -c Class1 -v Volume1

5) Access Volume1 and check its content. Restore backup data or run the fsck command to regain data integrity as necessary.
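
The disk check in step 1) above can be sketched as a small filter. This is a hedged example, not a GDS tool: the function name disabled_disks is hypothetical, and it assumes the sdxinfo -D column layout shown in this section (DEVNAM in the sixth column, STATUS in the last column).

```shell
#!/bin/sh
# Hypothetical helper (not part of GDS): print "NAME DEVNAM" for every
# disk whose STATUS field is DISABLE. Reads `sdxinfo -D` output on stdin.
disabled_disks() {
    awk '$1 == "disk" && $NF == "DISABLE" { print $2, $6 }'
}

# Demonstration on the sample output from step 1). On a live system:
#   sdxinfo -D -o Volume1 | disabled_disks
disabled_disks <<'EOF'
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8355840 node1            DISABLE
EOF
```

For the sample data this prints the disk name Disk1 followed by its physical disk name c1t11d0. No output means no disk in the listing is in DISABLE status, which points toward (Cause b).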


(3) Stripe volume or volume in concatenation group is in INVALID status.

Explanation

You can confirm the status of the volume as shown below.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID

In this example, volume Volume1 that exists in the highest level group Group1 is in INVALID status, as shown in the STATUS field.

If any of the disks related to the volume is in DISABLE status, the slices that make up the volume change to NOUSE status, and the volume becomes INVALID. You cannot start a volume in INVALID status.

Resolution

1) You can confirm the status of the disk related to the volume as shown below.

# sdxinfo -G -o Volume1 -e long
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH
------ ------- ------- ------------------- -------- -------- ----- ------ ------ -----
group  Group1  Class1  Group2:Group3       70189056 65961984     * *      stripe    32
group  Group2  Class1  Disk1:Disk2         35127296        *     * *      concat     *
group  Group3  Class1  Disk3:Disk4         35127296        *     * *      concat     *
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   concat Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   concat Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   concat Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   concat Class1  Group3  c2t4d0  17682084 node1            ENABLE

In this example, the lower level concatenation groups Group2 and Group3 are connected to the highest level stripe group Group1, and Disk2 connected to Group2 is in DISABLE status as shown in the STATUS field.


2) Follow the procedures in "F.1.2 Disk Status Abnormality" and restore the disk status.


3) Execute the sdxfix command to recover the volume's data. With -g option, indicate the highest level group name (in this example, Group1).

# sdxfix -V -c Class1 -g Group1 -v Volume1

4) Start the volume.

# sdxvolume -N -c Class1 -v Volume1

5) Access Volume1 and check its content. Restore backup data or run the fsck command to regain data integrity as necessary.


(4) Master volume is in INVALID status.

Explanation

If copying data from the proxy volume to the master volume fails, for example because of an I/O error, the master volume to which the data was being copied becomes INVALID.

Resolution

1) Check if there is a DISABLE status disk in the group to which the volume belongs with the following command.

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0   8421376 *                ENABLE
disk   Disk2   mirror Class1  Group1  c1t2d0   8421376 *                DISABLE

In this example, Disk2 is in DISABLE status.

If there is a disk in DISABLE status, see section "(1) Disk is in DISABLE status." in "F.1.2 Disk Status Abnormality," and check which of the causes listed in that section applies. If the possible cause is (Cause a) or (Cause b), follow the procedures and restore the disk.


2) Follow the procedures given in section "(1) Mirror slice configuring the mirror volume is in INVALID status." in "F.1.1 Slice Status Abnormality," and check if there is a disk hardware abnormality. If there is, identify the faulty part.

When the abnormality is caused by a failed or defective non-disk component, repair the faulty part.


3) Procedures to restore the data for different scenarios are given below.

a) Procedures to recover master volume data using proxy volume.

a1) In order to check if the proxy volume that will be used to recover data is separated from the master volume, execute the sdxinfo -V -e long command, and check the PROXY field.

a2) If the proxy volume is not separated, execute the following command.

# sdxproxy Part -c Class1 -p Volume2

a3) Exit all applications accessing the proxy volume. When using the proxy volume as a file system, execute unmount.

a4) If the proxy volume is started, execute the following command.

# sdxvolume -F -c Class1 -v Volume2

a5) Recover master volume data using the proxy volume's data.

# sdxproxy RejoinRestore -c Class1 -p Volume2

b) Procedures to recover data using backup data.

b1) When the volume is in INVALID status, you must first change it to STOP status. Decide on the disk (slice) you wish to use to recover data, and execute the sdxfix command.

# sdxfix -V -c Class1 -d Disk1 -v Volume1

In this example, Volume1 is restored using the slice on Disk1.

b2) When the volume to be restored is stopped, start it with the following command.

# sdxvolume -N -c Class1 -v Volume1 -e nosync

b3) Access the volume to be restored and check its contents. Restore backup data or run fsck to regain data integrity as necessary.

b4) When mirroring is configured with the volume, perform synchronization copying.

# sdxcopy -B -c Class1 -v Volume1

c) Procedures to swap some disks connected to the group.

c1) If you plan to restore the INVALID master volume later using data of a related proxy volume, or to use data of related proxy volumes after restoring the master volume, part those proxy volumes with the sdxproxy Part command.

# sdxproxy Part -c Class1 -p Volume2

c2) When there is a volume in INVALID status in the group, change it to STOP status with the sdxfix -V command. With the -d option, specify a disk that has no abnormality.

# sdxfix -V -c Class1 -d Disk1 -v Volume1

c3) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk" and "5.3.4 Disk Swap."

c4) Recover the master volume data. If data will be recovered using the proxy volume, follow procedures described in a). If data will be recovered using backup data on media such as tapes, follow procedures described in b).


d) Procedures to swap all disks connected to the group.

d1) Exit all applications accessing the master volume and the proxy volume that will be used to recover data. When using the proxy volume or the master volume as a file system, execute unmount.

d2) Stop the master volume and proxy volume in d1).

# sdxvolume -F -c Class1 -v Volume1
# sdxvolume -F -c Class1 -v Volume2

d3) Execute the sdxproxy RejoinRestore command and restore the master volume data using the proxy volume in d1). If the command terminates normally and the master volume is not in INVALID status, the restoration is complete, and you do not need to perform steps d4) onward.

# sdxproxy RejoinRestore -c Class1 -p Volume2  

d4) Execute the sdxproxy Swap command and swap the slices of the master volume with the proxy volume in d1).

# sdxproxy Swap -c Class1 -p Volume2

d5) After step d4), the master volume is no longer in INVALID status, and the proxy volume becomes INVALID. Follow the procedures given in section "(5) Proxy volume is in INVALID status." in "F.1.3 Volume Status Abnormality," and restore the proxy volume in INVALID status.

d6) Execute the sdxproxy Swap command and swap the slices of the master volume and the proxy volume you swapped in step d4).

# sdxproxy Swap -c Class1 -p Volume2

e) Procedures to swap disks connected to the master group.

e1) Exit all applications accessing the master group, and volumes in the proxy group that will be used to recover data. When using the volume as a file system, execute unmount.

e2) Stop all volumes in the master group and the proxy group in e1).

# sdxvolume -F -c Class1 -v Volume1
# sdxvolume -F -c Class1 -v Volume2

e3) Execute the sdxproxy RejoinRestore command and restore the master group data using the proxy group in e1). If the command terminates normally and none of the master volumes is in INVALID status, the restoration is complete, and you do not need to perform steps e4) onward.

# sdxproxy RejoinRestore -c Class1 -p Group2

e4) Execute the sdxproxy Swap command and swap the slices of the master group and the proxy group in e1).

# sdxproxy Swap -c Class1 -p Group2

e5) After step e4), the master volume is no longer in INVALID status, and the proxy volume becomes INVALID. Follow the procedures given in section "(5) Proxy volume is in INVALID status." in "F.1.3 Volume Status Abnormality," and restore the proxy volume in INVALID status.

e6) Execute the sdxproxy Swap command and swap the slices of the master group and the proxy group you swapped in step e4).

# sdxproxy Swap -c Class1 -p Group2

(5) Proxy volume is in INVALID status.

Explanation

If copying data from the master volume to the proxy volume fails, for example because of an I/O error, the proxy volume to which the data was being copied becomes INVALID.

Resolution

1) Check if there is a DISABLE status disk in the group to which the volume belongs with the following command.

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0   8421376 *                ENABLE
disk   Disk2   mirror Class1  Group1  c1t2d0   8421376 *                DISABLE

In this example, Disk2 is in DISABLE status.

If there is a disk in DISABLE status, see section "(1) Disk is in DISABLE status." in "F.1.2 Disk Status Abnormality," and check which of the three causes (a, b, and c) listed in that section applies. If it is due to (Cause a) or (Cause b), follow the procedures and restore the disk.


2) Follow the procedures given in "(1) Mirror slice configuring the mirror volume is in INVALID status." in "F.1.1 Slice Status Abnormality," and check if there is a disk hardware abnormality. If there is, identify the faulty part. When the abnormality was caused by a failed or defective non-disk component, repair the faulty part.


3) Procedures to restore the data for different scenarios are given below.

a) Procedures to recover proxy volume data using the master volume.

a1) In order to check if the proxy volume is separated from the master volume, execute the sdxinfo -V -e long command, and check the PROXY field.

a2) If the proxy volume is not separated, execute the following command.

# sdxproxy Part -c Class1 -p Volume2

a3) Rejoin the proxy volume with the master volume.

# sdxproxy Rejoin -c Class1 -p Volume2

b) Procedures to swap some disks connected to the group.

b1) Cancel the relationship with master volume using the sdxproxy Break command.

# sdxproxy Break -c Class1 -p Volume2

b2) Separate the volumes that are in INVALID status in the group with the sdxfix -V command, and change them to STOP status. With the -d option, specify a disk that has no abnormality.

# sdxfix -V -c Class1 -d Disk1 -v Volume2

b3) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk," or section "5.3.4 Disk Swap."

b4) Join the master and the proxy again with the sdxproxy Join command.

# sdxproxy Join -c Class1 -m Volume1 -p Volume2

c) Procedures to swap all disks connected to the group.

c1) Cancel the relationship with the master using the sdxproxy Break command.

# sdxproxy Break -c Class1 -p Volume2

c2) Exit all applications accessing the volume in the group. When using the volume as a file system, execute unmount.

c3) Stop all volumes in the group.

# sdxvolume -F -c Class1 -v Volume2

c4) Check the volume configuration of the group (such as volume names and sizes) with the sdxinfo command, and keep a note of it.

c5) Remove all volumes in the group.

# sdxvolume -R -c Class1 -v Volume2

c6) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk," or section "5.3.4 Disk Swap."

c7) Create the volume that you removed in step c5) again.

# sdxvolume -M -c Class1 -g Group2 -v Volume2 -s size

c8) Stop the volume you created in step c7).

# sdxvolume -F -c Class1 -v Volume2

Note

When the volume is in INVALID status, the error message "ERROR: disk: volume in INVALID status" will be displayed. Ignore the message and proceed to the next step.

c9) Join the master volume and the proxy volume again, with the sdxproxy Join command.

# sdxproxy Join -c Class1 -m Volume1 -p Volume2

d) Procedures to swap disks connected to the proxy group.

d1) Cancel the relationship with the master using the sdxproxy Break command.

# sdxproxy Break -c Class1 -p Group2

d2) Exit all applications accessing the volume in the group. When using the volume as a file system, execute unmount.

d3) Stop all volumes in the group.

# sdxvolume -F -c Class1 -v Volume2

Note

When the volume is in INVALID status, the error message "ERROR: disk: volume in INVALID status" will be displayed. Ignore the message and proceed to the next step.

d4) Remove all volumes in the group.

# sdxvolume -R -c Class1 -v Volume2

d5) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk," or section "5.3.4 Disk Swap."

d6) Join the master group and the proxy group again with the sdxproxy Join command.

# sdxproxy Join -c Class1 -m Group1 -p Group2 -a Volume1=Volume2:on

(6) Volume is in STOP status.

Explanation

Normally, volumes automatically start when the system is booted and become ACTIVE. The volume status will change to STOP when the volume is stopped with the Stop Volume menu in the GDS Management View or the sdxvolume -F command.

In a cluster system, among volumes within GDS shared classes registered with cluster applications, volumes other than proxy volumes start or stop according to the cluster application modes. If a cluster application is in Offline mode, volumes other than proxy volumes are in STOP status.

Accessing a volume in STOP status will result in an EIO error (I/O error) or an ENXIO error (No such device or address).

See

If, in a cluster system, volumes in a shared class that is not registered with a cluster application do not start at node startup, see "(4) The GFS Shared File System is not mounted on node startup." in "F.1.9 Cluster System Related Error."

Resolution

Start the volumes with the Start Volume menu in GDS Management View or the sdxvolume -N command as necessary.

To start volumes within a GDS shared class registered with a cluster application, change the cluster application mode to Online.


(7) I/O error occurs although mirror volume is in ACTIVE status.

Explanation

A mirror volume consists of multiple slices. If an I/O error occurs on one slice, that slice is detached, and access to the volume still completes normally.

However, if an I/O error occurs while only one of the slices configuring the volume is ACTIVE, access to the volume results in an error. In that case, the status of the slice and of the volume remains ACTIVE.

Situations likely to cause this problem are described below for a two-way multiplex mirroring configuration, in which two disks or two lower level groups are connected to a group. Example ways to prevent each situation are also described.

(Situation 1)

One of the slices was detached with the sdxslice -M command in order to back up volume data. While the volume was being accessed, an I/O error occurred on the other slice.

(Prevention 1)

Before executing the sdxslice -M command, connect a reserved disk and temporarily configure a three-way multiplex mirroring, or make the mirrored volume available for backup.

(Situation 2)

While restoring a slice with an I/O error, an I/O error also occurred on another slice.

(Prevention 2)

Securing a spare disk within the class reduces, to a certain degree, the effects of a delay in restoring the slice.

Resolution

Identify the cause of I/O error occurrence in the last ACTIVE slice, by referring to the disk driver log message.

Resolutions are described below assuming the following three circumstances:

  1. Error occurred due to a disk component failure. Will attempt recovery using backup data.

  2. Error occurred due to a disk component failure. Will attempt data recovery from a slice in INVALID status.

  3. Error occurred due to a failed or defective non-disk component.


a. When the error cause is a disk component failure and recovery is performed using backup data

a1) When the error was caused by a disk component failure, no slice with valid data exists. Restore data from the backup data following the procedures given below.

a2) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute it again with the -f option.

a3) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1

a4) If there is a TEMP status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

a5) If there is a NOUSE status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

a6) Record the volume size which can be checked as follows.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Group1  off  on     32768  4161535  4128768 STOP

In this example, the volume size is 4128768 blocks, as shown in the BLOCKS field for Volume1.

a7) Remove the volume with the sdxvolume command.

# sdxvolume -R -c Class1 -v Volume1

a8) Swap disks following the procedures given in "5.3.4 Disk Swap" and "D.8 sdxswap - Swap disk."

a9) Create a volume with the sdxvolume command again. For the number_of_blocks, use the size recorded in a6), in this example, 4128768.

# sdxvolume -M -c Class1 -g Group1 -v Volume1 -s number_of_blocks

a10) Finally, restore the backup data to Volume1.


b. When the error cause is a disk component failure and data is restored from a slice in INVALID status

b1) If the error was caused by a disk failure and no backup data exists, or the backup data is too old, restore data from the detached INVALID status slice by following the procedures given below.

b2) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute it again with the -f option.

b3) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1

b4) If there is a TEMP status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

b5) If there is a NOUSE status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

b6) Decide which mirror slice to use as the source of data for recovering the volume. Then, execute the sdxfix command.

(Example 1)

# sdxfix -V -c Class1 -d Disk2 -v Volume1

In this example, data is recovered from a mirror slice in the disk Disk2 which is connected to the highest level mirror group.

(Example 2)

# sdxfix -V -c Class1 -g Group2 -v Volume1

In this example, data is recovered from a mirror slice in the lower level group Group2 which is connected to the highest level mirror group.

b7) Start the volume.

# sdxvolume -N -c Class1 -v Volume1 -e nosync

b8) Create a backup of Volume1, and run fsck as necessary to regain data integrity.

b9) Lastly, swap disks following the procedures given in "5.3.4 Disk Swap" and "D.8 sdxswap - Swap disk."


c. When the error cause is a non-disk component failure or defect

A slice with valid data still exists on the disk. Shut down the system, recover the failed component, and then reboot the system. Synchronization copying is performed automatically, and the mirroring status is recovered.


(8) An I/O error occurs on a single volume.

Explanation

Since a single volume consists of only one slice, accessing the volume at the time of an I/O error will result in an error. However, the status of slice and volume will remain ACTIVE.

Resolution

Identify the cause of I/O error occurrence by referring to the disk driver log message.

How to resolve the problem is described in two cases:

  1. When the error cause is a disk component failure and recovery is performed using backup data

  2. When the error cause is a non-disk component failure or defect


a. When the error cause is a disk component failure and recovery is performed using backup data

a1) In the event of a disk component failure, there will be no slice with valid data. Follow the procedures below and restore the data using the backup data. In this example, Disk1 (c1t11d0) has a failure.

# sdxinfo -D -o Disk1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8493876 node1            ENABLE

a2) Find the volumes on the faulty disk using the sdxinfo command, and record their sizes.

# sdxinfo -V -o Disk1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 ACTIVE
volume Volume2 Class1  Disk1   off  on     65536  4194303  4128768 ACTIVE
volume *       Class1  Disk1   *    *    4194304  8421375  4227072 FREE

In this example, Volume1 and Volume2 are on the faulty disk Disk1. As shown in the BLOCKS field, the size of Volume1 is 32,768 blocks and the size of Volume2 is 4,128,768 blocks.

a3) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute it again with the -f option.

a4) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1,Volume2

a5) Remove the volumes with the sdxvolume command.

# sdxvolume -R -c Class1 -v Volume1
# sdxvolume -R -c Class1 -v Volume2

a6) Before swapping the disks, execute the following command.

# sdxswap -O -c Class1 -d Disk1

Note

If the disk is the only remaining disk in the disk class, the command results in an error as shown below. In that event, follow the steps a6'), a7') and a8').

SDX:sdxswap: ERROR: Disk1: The last ENABLE disk in class cannot be swapped

a7) Swap the disks.

a8) After swapping the disks, execute the following command.

# sdxswap -I -c Class1 -d Disk1

a6') Before swapping the disks, execute the following command.

Note

If no error is output in a6), the steps a6'), a7'), and a8') are not required.

# sdxdisk -R -c Class1 -d Disk1

a7') Swap the disks.

a8') After swapping the disks, execute the following command.

# sdxdisk -M -c Class1 -d c1t11d0=Disk1:single

a9) Create volumes with the sdxvolume command again. For the -s option, use the sizes recorded in a2); in this example, 32768 and 4128768.

# sdxvolume -M -c Class1 -d Disk1 -v Volume1 -s 32768
# sdxvolume -M -c Class1 -d Disk1 -v Volume2 -s 4128768

a10) Finally, restore the backup data to Volume1 and Volume2.
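
Steps a2) and a9) above record the volume sizes and later re-create the volumes with the same sizes. As a sketch under stated assumptions, the saved sdxinfo -V output can be turned into the matching sdxvolume -M command lines. The function name recreate_commands is hypothetical. It assumes single volumes (so the GROUP column holds the single disk name) and the column layout shown in a2), and it only prints the commands for review.

```shell
#!/bin/sh
# Hypothetical helper (not part of GDS): read saved `sdxinfo -V` output
# for a single disk on stdin and print one `sdxvolume -M` line per named
# volume. Rows whose NAME is "*" (PRIVATE and FREE areas) are skipped.
# Columns used: NAME=$2, CLASS=$3, GROUP (disk name)=$4, BLOCKS=$9.
recreate_commands() {
    awk '$1 == "volume" && $2 != "*" {
        printf "sdxvolume -M -c %s -d %s -v %s -s %s\n", $3, $4, $2, $9
    }'
}

# Demonstration on the sample output recorded in step a2).
recreate_commands <<'EOF'
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 ACTIVE
volume Volume2 Class1  Disk1   off  on     65536  4194303  4128768 ACTIVE
volume *       Class1  Disk1   *    *    4194304  8421375  4227072 FREE
EOF
```

For the sample data, the two printed lines match the commands shown in step a9). Saving the sdxinfo output to a file before step a5) matters, because the volume rows are no longer listed once the volumes are removed.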


b. When the error cause is a non-disk component failure or defect

Shut down the system, recover the failed component, and then reboot the system. Slice data is valid and there is no need to restore the data.

However, as the E field of the disk information shown by the sdxinfo -e long command will display "1" due to an I/O error, run the sdxfix -D command to cancel the I/O error status.

# sdxfix -D -c class name -d disk name -e online -x NoRdchk

(9) An I/O error occurs on a stripe volume or a volume in a concatenation group.

Explanation

Since a stripe volume or a volume within a concatenation group consists of only one slice, accessing the volume at the time of an I/O error will also result in an error. However, the status of slice and volume will remain ACTIVE.

Resolution

Identify the cause of I/O error occurrence by referring to the disk driver log message.

You can confirm the error status of the disk related to volume and the physical disk name as shown below.

# sdxinfo -D -o Volume1 -e long
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   concat Class1  Group2  c1t1d0  17596416        * node1            ENABLE  0
disk   Disk2   concat Class1  Group2  c1t2d0  17596416        * node1            ENABLE  1
disk   Disk3   concat Class1  Group3  c2t3d0  17682084        * node1            ENABLE  0
disk   Disk4   concat Class1  Group3  c2t4d0  17682084        * node1            ENABLE  0

In this example, an I/O error has occurred on Disk2, as shown by the value 1 in the E field. The physical disk name corresponding to Disk2 is c1t2d0, as shown in the DEVNAM field.
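
Reading the E field by eye becomes error-prone when many disks are listed. The following is a minimal sketch, assuming the sdxinfo -D -e long column layout shown above (DEVNAM in the sixth column, E in the last column). The function name io_error_disks is hypothetical and is not part of GDS.

```shell
#!/bin/sh
# Hypothetical helper (not part of GDS): print "NAME DEVNAM" for every
# disk whose E field is 1. Reads `sdxinfo -D -e long` output on stdin.
io_error_disks() {
    awk '$1 == "disk" && $NF == "1" { print $2, $6 }'
}

# Demonstration on the sample output above. On a live system:
#   sdxinfo -D -o Volume1 -e long | io_error_disks
io_error_disks <<'EOF'
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   concat Class1  Group2  c1t1d0  17596416        * node1            ENABLE  0
disk   Disk2   concat Class1  Group2  c1t2d0  17596416        * node1            ENABLE  1
disk   Disk3   concat Class1  Group3  c2t3d0  17682084        * node1            ENABLE  0
disk   Disk4   concat Class1  Group3  c2t4d0  17682084        * node1            ENABLE  0
EOF
```

For the sample data this prints Disk2 together with c1t2d0. The filter relies on the -e long form of the listing; without that option the last column is STATUS, so nothing is printed.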

How to resolve the problem is described in two cases:

  1. When the error cause is a disk component failure and recovery is performed using backup data

  2. When the error cause is a non-disk component failure or defect


a. When the error cause is a disk component failure and recovery is performed using backup data

a1) In the event of a disk component failure, there will be no slices with valid data. Follow the procedures below and restore the data using the backup data.

a2) Record the configuration information of the group that was related to the failed disk using the sdxinfo command.

# sdxinfo -G -o Disk2 -e long
OBJ    NAME    CLASS   DISKS         BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH ACTDISK
------ ------- ------- ------------- -------- -------- ----- ------ ------ ----- -------
group  Group1  Class1  Group2:Group3 70189056 65961984     * *      stripe    32 *
group  Group2  Class1  Disk1:Disk2   35127296        *     * *      concat     * *
group  Group3  Class1  Disk3:Disk4   35127296        *     * *      concat     * *

In this example, the lower level concatenation groups Group2 and Group3 are connected to the highest level stripe group Group1. The disks Disk1 and Disk2 are connected to Group2, and the disks Disk3 and Disk4 are connected to Group3. The stripe width for Group1 is 32 blocks.

a3) Find the volumes in the highest level group related to the faulty disk using the sdxinfo command, and record their sizes.

# sdxinfo -V -o Disk2
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  *    *      65536    98303    32768 ACTIVE
volume Volume2 Class1  Group1  *    *      98304  4227071  4128768 ACTIVE
volume *       Class1  Group1  *    *    4227072 70189055 65961984 FREE

In this example, Volume1 and Volume2 exist in the highest level group Group1, that is related to the faulty disk Disk2. The size of Volume1 is 32768 blocks, and the size of Volume2 is 4128768 blocks as shown in the BLOCKS field.

a4) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute it again with the -f option.

a5) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1,Volume2

a6) Remove the volumes with the sdxvolume command.

# sdxvolume -R -c Class1 -v Volume1
# sdxvolume -R -c Class1 -v Volume2

a7) Disconnect the faulty disk from the group. If the group is in a hierarchical structure, disconnect from the higher group in descending order.

# sdxgroup -D -c Class1 -h Group1 -l Group2
# sdxdisk -D -c Class1 -g Group2 -d Disk2

In this example, the faulty disk Disk2 is connected to Group2, and Group2 is connected to Group1. Therefore, you should disconnect Group2 first, and then Disk2.

a8) Before swapping the disks, execute the following command.

# sdxswap -O -c Class1 -d Disk2

Note

If the disk is the only remaining disk in the disk class, the command results in an error as shown below. In that event, follow the steps a8'), a9') and a10').

SDX:sdxswap: ERROR: Disk2: The last ENABLE disk in class cannot be swapped

a9) Swap the disks.

a10) After swapping the disks, execute the following command.

# sdxswap -I -c Class1 -d Disk2

a8') Before swapping the disks, execute the following command.

Note

If no error is output in a8), the steps a8'), a9'), and a10') are not required.

# sdxdisk -R -c Class1 -d Disk2

a9') Swap the disks.

a10') After swapping the disks, execute the following command.

# sdxdisk -M -c Class1 -d c1t2d0=Disk2

a11) Connect the swapped disk to the group, referring to the group information recorded in a2). If the groups were in a hierarchical structure, connect the groups in an ascending order.

# sdxdisk -C -c Class1 -g Group2 -d Disk2
# sdxgroup -C -c Class1 -h Group1 -l Group2 -a type=stripe,width=32

a12) Create volumes with the sdxvolume command again. For the -s option, use the size recorded in a3), in this example, 32768 and 4128768.

# sdxvolume -M -c Class1 -g Group1 -v Volume1 -s 32768 -a pslice=off
# sdxvolume -M -c Class1 -g Group1 -v Volume2 -s 4128768 -a pslice=off

a13) Finally, restore the backup data to Volume1 and Volume2.


b. When the error cause is a non-disk component failure or defect

Shut down the system, recover the failed component, and then reboot the system. Slice data is valid and there is no need to restore the data.

However, as the E field of the disk information shown by the sdxinfo -e long command will display "1" due to an I/O error, run the sdxfix -D command to cancel the I/O error status.

# sdxfix -D -c class name -d disk name -e online -x NoRdchk