F.1.3 Volume Status Abnormality

If the volume status is one of the following statuses, take action as indicated for the relevant situation.

(1) Mirror volume is in INVALID status.

Explanation

You can confirm the status of the volume as shown below.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID

In this example, volume Volume1 that exists in the highest level group Group1 is in INVALID status, as shown in the STATUS field.

If none of the mirror slices constituting the mirror volume contains valid data (ACTIVE or STOP), the mirror volume becomes INVALID. You cannot start a volume in INVALID status.
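When many volumes have to be checked, the STATUS column can be read by a script instead of by eye. The following is a minimal sketch, assuming the output of sdxinfo -V has been captured as text and keeps the column layout shown above; the function names parse_sdxinfo and invalid_volumes are hypothetical helpers, not part of GDS.

```python
def parse_sdxinfo(text):
    """Turn sdxinfo table output into a list of dicts keyed by column name."""
    rows, header = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("-") or fields[0] == "#":
            continue  # skip blank lines, the dashed rule, and echoed prompts
        if fields[0] == "OBJ":
            header = fields
        elif header and len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def invalid_volumes(text):
    """Return the names of volumes whose STATUS column reads INVALID."""
    return [row["NAME"] for row in parse_sdxinfo(text)
            if row["OBJ"] == "volume" and row["STATUS"] == "INVALID"]


# Sample text copied from the sdxinfo -V example above.
SAMPLE = """\
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID
"""

print(invalid_volumes(SAMPLE))
```

The sketch only reports volume names; deciding on the cause and the recovery procedure still follows the steps in this section.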

The INVALID status has two possible causes.

(Cause a): Disk is in DISABLE status.

(Cause b): Master-proxy relationship was cancelled forcibly while master data was being copied to proxy.

Resolution

1) Check whether there is a disk in DISABLE status within the group with which the volume is associated, as follows.

(Example A1)

# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Disk1:Disk2         17596416        0 0

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0  17596416 node1            ENABLE
disk   Disk2   mirror Class1  Group1  c2t3d0  17596416 node1            DISABLE

In this example, disks Disk1 and Disk2 are connected to the highest level mirror group Group1. The disk Disk2 is in DISABLE status as shown in the STATUS field.

(Example B1)

# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Group2:Group3       35127296 17530880 0
group  Group2  Class1  Disk1:Disk2         35127296        * 0
group  Group3  Class1  Disk3:Disk4         35127296        * 0

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   stripe Class1  Group2  c1t1d0  17596416 node1            ENABLE  
disk   Disk2   stripe Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   stripe Class1  Group3  c2t3d0  17682084 node1            ENABLE 
disk   Disk4   stripe Class1  Group3  c2t4d0  17682084 node1            ENABLE

In this example, the lower level stripe groups Group2 and Group3 are connected to the highest level mirror group Group1. Disk2, which is connected to Group2, is in DISABLE status, as shown in the STATUS field.
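The check in step 1) can also be scripted. The following is a minimal sketch, assuming the output of sdxinfo -D has been captured as text in the layout of Example B1; disabled_disks is a hypothetical helper name.

```python
def disabled_disks(text):
    """Return (disk, group, device) for every disk row in DISABLE status."""
    header, found = None, []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "OBJ":
            header = fields                      # remember the column names
        elif header and fields and fields[0] == "disk":
            row = dict(zip(header, fields))
            if row["STATUS"] == "DISABLE":
                found.append((row["NAME"], row["GROUP"], row["DEVNAM"]))
    return found


# Sample text copied from the sdxinfo -D output of Example B1.
SAMPLE_D = """\
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   stripe Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   stripe Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   stripe Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   stripe Class1  Group3  c2t4d0  17682084 node1            ENABLE
"""

print(disabled_disks(SAMPLE_D))
```

An empty result suggests that (Cause a) does not apply, in which case (Cause b) is the remaining candidate.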

2) If the possible cause is (Cause a), restore the disk by following the procedures in "F.1.2 Disk Status Abnormality."

3) From the disks and lower level groups connected to the highest level mirror group, determine the disk or lower level group to which the slice you will use to recover data belongs. Then, execute the sdxfix command to recover data.

(Example A3)

# sdxfix -V -c Class1 -d Disk1 -v Volume1

In this example, Volume1 is recovered from the slice in disk Disk1.

(Example B3)

# sdxfix -V -c Class1 -g Group3 -v Volume1

In this example, Volume1 is recovered from the slice in lower level stripe group Group3.
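The choice between the -d and -g options in step 3) can be illustrated with a short script. This is only a sketch under stated assumptions: the group membership and disk statuses are entered by hand from the sdxinfo -G and sdxinfo -D output, healthy_members is a hypothetical helper, and ENABLE status alone does not prove that the data on a slice is current. The result is therefore a list of candidates, not a decision.

```python
def healthy_members(top_group, groups, disk_status):
    """List direct members of top_group whose disks are all ENABLE.

    groups maps a group name to its members (the DISKS column split on ':');
    disk_status maps a disk name to its STATUS value.
    """
    def disks_under(member):
        if member in groups:      # a lower level group: expand it recursively
            return [d for m in groups[member] for d in disks_under(m)]
        return [member]           # a plain disk

    result = []
    for member in groups[top_group]:
        if all(disk_status[d] == "ENABLE" for d in disks_under(member)):
            flag = "-g" if member in groups else "-d"
            result.append((flag, member))
    return result


# Values taken from Example B1.
groups = {"Group1": ["Group2", "Group3"],
          "Group2": ["Disk1", "Disk2"],
          "Group3": ["Disk3", "Disk4"]}
status = {"Disk1": "ENABLE", "Disk2": "DISABLE",
          "Disk3": "ENABLE", "Disk4": "ENABLE"}

for flag, name in healthy_members("Group1", groups, status):
    print(f"sdxfix -V -c Class1 {flag} {name} -v Volume1")
```

With the values of Example B1, the only candidate printed is the command of Example B3.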

4) Start the volume.

# sdxvolume -N -c Class1 -v Volume1 -e nosync

5) Access Volume1 and check its contents. Restore backup data or run fsck to regain data integrity as necessary.

6) Perform synchronization copying on the volume.

# sdxcopy -B -c Class1 -v Volume1

(2) Single volume is in INVALID status.

Explanation

You can confirm the status of the volume as shown below.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 INVALID
volume *       Class1  Disk1   *    *      65536  8421375  8355840 FREE

In this example, the single volume Volume1 that exists on single disk Disk1 is in INVALID status, as shown in the STATUS field.

You cannot start a volume in INVALID status.

The INVALID status has two possible causes.

(Cause a): Single disk is in DISABLE status. In this case, the single slice enters NOUSE status.

(Cause b): Master-proxy relationship was cancelled forcibly while master data was being copied to proxy.

Resolution

1) Check whether the single disk is in DISABLE status, as shown below.

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8355840 node1            DISABLE

In this example, the single disk Disk1 is in DISABLE status, as shown in the STATUS field.

2) If the possible cause is (Cause a), restore the disk by following the procedures given in section "F.1.2 Disk Status Abnormality."

3) Execute the sdxfix command to recover the single volume's data.

# sdxfix -V -c Class1 -d Disk1 -v Volume1

4) Start the volume.

# sdxvolume -N -c Class1 -v Volume1

5) Access Volume1 and check its content. Restore backup data or run the fsck command to regain data integrity as necessary.

(3) Stripe volume or volume in concatenation group is in INVALID status.

Explanation

You can confirm the status of the volume as shown below.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID

In this example, volume Volume1 that exists in the highest level group Group1 is in INVALID status, as shown in the STATUS field.

If any of the disks related to the volume is in DISABLE status, the slices constituting that volume enter NOUSE status, and the volume becomes INVALID. You cannot start a volume in INVALID status.

Resolution

1) Check the status of the disks related to the volume, as shown below.

# sdxinfo -G -o Volume1 -e long
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH
------ ------- ------- ------------------- -------- -------- ----- ------ ------ -----
group  Group1  Class1  Group2:Group3       70189056 65961984 *     *      stripe 32
group  Group2  Class1  Disk1:Disk2         35127296        * *     *      concat *
group  Group3  Class1  Disk3:Disk4         35127296        * *     *      concat *

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   concat Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   concat Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   concat Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   concat Class1  Group3  c2t4d0  17682084 node1            ENABLE

In this example, the lower level concatenation groups Group2 and Group3 are connected to the highest level stripe group Group1, and Disk2 connected to Group2 is in DISABLE status as shown in the STATUS field.

2) Follow the procedures in "F.1.2 Disk Status Abnormality" and restore the disk status.

3) Execute the sdxfix command to recover the volume's data. With the -g option, specify the highest level group name (in this example, Group1).

# sdxfix -V -c Class1 -g Group1 -v Volume1
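In a nested configuration, the highest level group is the one that is not listed in the DISKS column of any other group. The following is a minimal sketch of that lookup, assuming the sdxinfo -G -e long output from step 1) has been captured as text; highest_level_groups is a hypothetical helper name.

```python
def highest_level_groups(text):
    """Return groups that are not members of any other group."""
    header, names, members = None, [], set()
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "OBJ":
            header = fields
        elif header and fields and fields[0] == "group":
            row = dict(zip(header, fields))
            names.append(row["NAME"])
            members.update(row["DISKS"].split(":"))  # disks or lower groups
    return [name for name in names if name not in members]


# Sample text copied from the sdxinfo -G -e long output in step 1).
SAMPLE_G = """\
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH
------ ------- ------- ------------------- -------- -------- ----- ------ ------ -----
group  Group1  Class1  Group2:Group3       70189056 65961984 *     *      stripe 32
group  Group2  Class1  Disk1:Disk2         35127296        * *     *      concat *
group  Group3  Class1  Disk3:Disk4         35127296        * *     *      concat *
"""

print(highest_level_groups(SAMPLE_G))
```

For the sample, the result is Group1, which is the name passed to the -g option above.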

4) Start the volume.

# sdxvolume -N -c Class1 -v Volume1

5) Access Volume1 and check its content. Restore backup data or run the fsck command to regain data integrity as necessary.

(4) Master volume is in INVALID status.

Explanation

If copying from the proxy volume to the master volume fails, for example because of an I/O error, the master volume to which the data is being copied becomes INVALID.

Resolution

1) With the following command, check whether there is a disk in DISABLE status in the group to which the volume belongs.

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0   8421376 *                ENABLE
disk   Disk2   mirror Class1  Group1  c1t2d0   8421376 *                DISABLE

In this example, Disk2 is in DISABLE status.

If there is a disk in DISABLE status, see section "(1) Disk is in DISABLE status." in "F.1.2 Disk Status Abnormality," and check which of the causes listed in that section applies. If the possible cause is (Cause a) or (Cause b), follow the procedures and restore the disk.

2) Follow the procedures given in section "(1) Mirror slice configuring the mirror volume is in INVALID status." in "F.1.1 Slice Status Abnormality," and check if there is a disk hardware abnormality. If there is, identify the faulty part.

When the abnormality is caused by a failed or defective non-disk component, repair the faulty part.

3) Procedures to restore the data for different scenarios are given below.

When caused by a non-disk component failure:
- When recovering data using the proxy volume:
  -> Follow steps a) to restore.
- When recovering data using backup data on media such as tapes:
  -> Follow steps b) to restore.
When caused by a disk component failure:
- When the disk does not belong to a master group:
  - When some disks connected to the group have failed:
    -> Follow steps c) to restore.
  - When all disks connected to the group have failed:
    -> Follow steps d) to restore.
- When the disk belongs to a master group:
  -> Follow steps e) to restore.
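The selection above is a small decision tree. The following sketch restates it as a function so that the branching is explicit; master_recovery_steps and its parameter names are hypothetical, and the inputs are answers the administrator has already established in steps 1) and 2).

```python
def master_recovery_steps(disk_failure, use_proxy=True,
                          in_master_group=False, all_disks_failed=False):
    """Return the letter of the procedure to follow for an INVALID master volume."""
    if not disk_failure:
        # Non-disk component failure: choose by the data source for recovery.
        return "a" if use_proxy else "b"
    if in_master_group:
        return "e"
    # Disk failure in a group that is not a master group.
    return "d" if all_disks_failed else "c"


print(master_recovery_steps(disk_failure=False, use_proxy=False))
print(master_recovery_steps(disk_failure=True, all_disks_failed=True))
```

The two calls print b and d respectively.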

a) Procedures to recover master volume data using proxy volume.

a1) In order to check if the proxy volume that will be used to recover data is separated from the master volume, execute the sdxinfo -V -e long command, and check the PROXY field.

a2) If the proxy volume is not separated, execute the following command.

# sdxproxy Part -c Class1 -p Volume2

a3) Exit all applications accessing the proxy volume. When using the proxy volume as a file system, execute unmount.

a4) If the proxy volume is started, execute the following command.

# sdxvolume -F -c Class1 -v Volume2

a5) Recover master volume data using the proxy volume's data.

# sdxproxy RejoinRestore -c Class1 -p Volume2

b) Procedures to recover data using backup data.

b1) When the volume is in INVALID status, you must first change it to STOP status. Decide on the disk (slice) you wish to use to recover data, and execute the sdxfix command.

# sdxfix -V -c Class1 -d Disk1 -v Volume1

In this example, Volume1 is restored from the slice in Disk1.

b2) When the volume to be restored is stopped, start it with the following command.

# sdxvolume -N -c Class1 -v Volume1 -e nosync

b3) Access the volume to be restored and check its contents. Restore backup data or run fsck to regain data integrity as necessary.

b4) When mirroring is configured with the volume, perform synchronization copying.

# sdxcopy -B -c Class1 -v Volume1

c) Procedures to swap some disks connected to the group.

c1) If you intend to restore the INVALID master volume later from a proxy volume related to it, or to use the data of its proxy volumes after the master volume has been restored, part those proxy volumes with the sdxproxy Part command.

# sdxproxy Part -c Class1 -p Volume2

c2) When there is a volume in INVALID status in the group, change it to STOP status with the sdxfix -V command. With the -d option, specify a disk that has no abnormality.

# sdxfix -V -c Class1 -d Disk1 -v Volume1

c3) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk" and "5.3.4 Disk Swap."

c4) Recover the master volume data. If data will be recovered using the proxy volume, follow procedures described in a). If data will be recovered using backup data on media such as tapes, follow procedures described in b).

d) Procedures to swap all disks connected to the group.

d1) Exit all applications accessing the master volume and the proxy volume that will be used to recover data. When using the proxy volume or the master volume as a file system, execute unmount.

d2) Stop the master volume and proxy volume in d1).

# sdxvolume -F -c Class1 -v Volume1
# sdxvolume -F -c Class1 -v Volume2

d3) Execute the sdxproxy RejoinRestore command and restore the master volume data using the proxy volume in d1). If the command terminates normally and the master volume is not in INVALID status, the restoration is complete and steps d4) onward are not required.

# sdxproxy RejoinRestore -c Class1 -p Volume2

d4) Execute the sdxproxy Swap command and swap the slices of the master volume with the proxy volume in d1).

# sdxproxy Swap -c Class1 -p Volume2

d5) After step d4), the master volume is no longer in INVALID status, and the proxy volume becomes INVALID. Follow the procedures given in section "(5) Proxy volume is in INVALID status." in "F.1.3 Volume Status Abnormality," and restore the proxy volume in INVALID status.

d6) Execute the sdxproxy Swap command and swap the slices of the master volume and the proxy volume you swapped in step d4).

# sdxproxy Swap -c Class1 -p Volume2

e) Procedures to swap disks connected to the master group.

e1) Exit all applications accessing volumes in the master group and volumes in the proxy group that will be used to recover data. When using a volume as a file system, execute unmount.

e2) Stop all volumes in the master group and the proxy group in e1).

# sdxvolume -F -c Class1 -v Volume1
# sdxvolume -F -c Class1 -v Volume2

e3) Execute the sdxproxy RejoinRestore command and restore the master group data using the proxy group in e1). If the command terminates normally and none of the master volumes is in INVALID status, the restoration is complete and steps e4) onward are not required.

# sdxproxy RejoinRestore -c Class1 -p Group2

e4) Execute the sdxproxy Swap command and swap the slices of the master group and the proxy group in e1).

# sdxproxy Swap -c Class1 -p Group2

e5) After step e4), the master volume is no longer in INVALID status, and the proxy volume becomes INVALID. Follow the procedures given in section "(5) Proxy volume is in INVALID status." in "F.1.3 Volume Status Abnormality," and restore the proxy volume in INVALID status.

e6) Execute the sdxproxy Swap command and swap the slices of the master group and the proxy group you swapped in step e4).

# sdxproxy Swap -c Class1 -p Group2

(5) Proxy volume is in INVALID status.

Explanation

If copying from the master volume to the proxy volume fails, for example because of an I/O error, the proxy volume to which the data is being copied becomes INVALID.

Resolution

1) With the following command, check whether there is a disk in DISABLE status in the group to which the volume belongs.

# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0   8421376 *                ENABLE
disk   Disk2   mirror Class1  Group1  c1t2d0   8421376 *                DISABLE

In this example, Disk2 is in DISABLE status.

If there is a disk in DISABLE status, see section "(1) Disk is in DISABLE status." in "F.1.2 Disk Status Abnormality," and check which of the three causes listed in that section (a, b, or c) applies. If it is due to (Cause a) or (Cause b), follow the procedures and restore the disk.

2) Follow the procedures given in "(1) Mirror slice configuring the mirror volume is in INVALID status." in "F.1.1 Slice Status Abnormality," and check if there is a disk hardware abnormality. If there is, identify the faulty part. When the abnormality was caused by a failed or defective non-disk component, repair the faulty part.

3) Procedures to restore the data for different scenarios are given below.

When caused by a non-disk component failure:
-> Follow steps a) to restore.
When caused by a disk component failure:
- When the disk does not belong to a proxy group:
  - When some disks connected to the group have failed:
    -> Follow steps b) to restore.
  - When all disks connected to the group have failed:
    -> Follow steps c) to restore.
- When the disk belongs to a proxy group:
  -> Follow steps d) to restore.
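For orientation, the selection above can be paired with an outline of the commands that each procedure uses. The following is a sketch only: proxy_recovery and OUTLINES are hypothetical names, the outlines are summaries of procedures a) to d) in this section, and the full steps remain authoritative.

```python
# Command outline per procedure, summarized from steps a) to d) of this section.
OUTLINES = {
    "a": ["sdxproxy Part (only if not yet parted)", "sdxproxy Rejoin"],
    "b": ["sdxproxy Break", "sdxfix -V", "swap disks", "sdxproxy Join"],
    "c": ["sdxproxy Break", "sdxvolume -F", "sdxvolume -R", "swap disks",
          "sdxvolume -M", "sdxvolume -F", "sdxproxy Join"],
    "d": ["sdxproxy Break", "sdxvolume -F", "sdxvolume -R", "swap disks",
          "sdxproxy Join"],
}


def proxy_recovery(disk_failure, in_proxy_group=False, all_disks_failed=False):
    """Return (procedure letter, command outline) for an INVALID proxy volume."""
    if not disk_failure:
        letter = "a"
    elif in_proxy_group:
        letter = "d"
    else:
        letter = "c" if all_disks_failed else "b"
    return letter, OUTLINES[letter]


letter, outline = proxy_recovery(disk_failure=True, all_disks_failed=True)
print(letter, " -> ".join(outline))
```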

a) Procedures to recover proxy volume data using the master volume.

a1) In order to check if the proxy volume is separated from the master volume, execute the sdxinfo -V -e long command, and check the PROXY field.

a2) If the proxy volume is not separated, execute the following command.

# sdxproxy Part -c Class1 -p Volume2

a3) Rejoin the proxy volume with the master volume.

# sdxproxy Rejoin -c Class1 -p Volume2

b) Procedures to swap some disks connected to the group.

b1) Cancel the relationship with master volume using the sdxproxy Break command.

# sdxproxy Break -c Class1 -p Volume2

b2) Change the volumes that are in INVALID status in the group to STOP status with the sdxfix -V command. With the -d option, specify a disk that has no abnormality.

# sdxfix -V -c Class1 -d Disk1 -v Volume2

b3) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk," or section "5.3.4 Disk Swap."

b4) Join the master and the proxy again with the sdxproxy Join command.

# sdxproxy Join -c Class1 -m Volume1 -p Volume2

c) Procedures to swap all disks connected to the group.

c1) Cancel the relationship with the master using the sdxproxy Break command.

# sdxproxy Break -c Class1 -p Volume2

c2) Exit all applications accessing the volume in the group. When using the volume as a file system, execute unmount.

c3) Stop all volumes in the group.

# sdxvolume -F -c Class1 -v Volume2

c4) Check the volume configuration of the group (such as volume names and sizes) with the sdxinfo command, and keep a note of it.

c5) Remove all volumes in the group.

# sdxvolume -R -c Class1 -v Volume2

c6) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk," or section "5.3.4 Disk Swap."

c7) Create the volume that you removed in step c5) again.

# sdxvolume -M -c Class1 -g Group2 -v Volume2 -s size

c8) Stop the volume you created in step c7).

# sdxvolume -F -c Class1 -v Volume2

Note

When the volume is in INVALID status, the error message "ERROR: disk: volume in INVALID status" will be displayed. Ignore the message and proceed to the next step.

c9) Join the master volume and the proxy volume again, with the sdxproxy Join command.

# sdxproxy Join -c Class1 -m Volume1 -p Volume2

d) Procedures to swap disks connected to the proxy group.

d1) Cancel the relationship with the master using the sdxproxy Break command.

# sdxproxy Break -c Class1 -p Group2

d2) Exit all applications accessing the volume in the group. When using the volume as a file system, execute unmount.

d3) Stop all volumes in the group.

# sdxvolume -F -c Class1 -v Volume2

Note

When the volume is in INVALID status, the error message "ERROR: disk: volume in INVALID status" will be displayed. Ignore the message and proceed to the next step.

d4) Remove all volumes in the group.

# sdxvolume -R -c Class1 -v Volume2

d5) Follow the procedures and swap the disks. For details, see "D.8 sdxswap - Swap disk," or section "5.3.4 Disk Swap."

d6) Join the master group and the proxy group again with the sdxproxy Join command.

# sdxproxy Join -c Class1 -m Group1 -p Group2 -a Volume1=Volume2:on

(6) Volume is in STOP status.

Explanation

Normally, volumes automatically start when the system is booted and become ACTIVE. The volume status will change to STOP when the volume is stopped with the Stop Volume menu in the GDS Management View or the sdxvolume -F command.

In a cluster system, among volumes within GDS shared classes registered with cluster applications, volumes other than proxy volumes start or stop according to the cluster application modes. If a cluster application is in Offline mode, volumes other than proxy volumes are in STOP status.

Accessing a volume in STOP status will result in an EIO error (I/O error) or an ENXIO error (No such device or address).
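An application or monitoring script can recognize these two error numbers and point the operator to the volume status. The following is a minimal sketch using the standard errno module; explain_volume_error is a hypothetical helper, and because EIO can also indicate a real I/O failure, the message is worded as a hint rather than a diagnosis.

```python
import errno


def explain_volume_error(exc):
    """Map an OSError raised while accessing a volume to a hint for the operator."""
    if exc.errno in (errno.EIO, errno.ENXIO):
        return "volume may be in STOP status; check it with sdxinfo -V"
    return "unrelated error: " + errno.errorcode.get(exc.errno, "unknown")


# Simulated errors; in practice exc would come from an except OSError block
# around the open() or read() call on the volume's device file.
print(explain_volume_error(OSError(errno.ENXIO, "No such device or address")))
print(explain_volume_error(OSError(errno.ENOENT, "No such file or directory")))
```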

See

If, in a cluster system, volumes in a shared class that is not registered with a cluster application do not start at node startup, see "(4) The GFS Shared File System is not mounted on node startup." in "F.1.9 Cluster System Related Error."

Resolution

Start the volumes with the Start Volume menu in GDS Management View or the sdxvolume -N command as necessary.

To start volumes within a GDS shared class registered with a cluster application, change the cluster application mode to Online.

(7) I/O error occurs although mirror volume is in ACTIVE status.

Explanation

A mirror volume consists of multiple slices. In the event of an I/O error, the failed slice is detached, so access to the volume still completes normally.

However, if an I/O error occurs when only one of the slices constituting the volume is ACTIVE, access to the volume results in an error. In this case, the status of both the slice and the volume remains ACTIVE.
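The behavior described above can be pictured with a toy model. This is an illustration only, not GDS code: io_error_on_slice is a hypothetical function, and the model assumes that a detached slice is shown as INVALID, as in the resolution part of this section.

```python
def io_error_on_slice(slices, failed):
    """Toy model: return (new slice statuses, whether volume access succeeds)."""
    active = [name for name, status in slices.items() if status == "ACTIVE"]
    if failed not in active:
        raise ValueError(f"{failed} is not an ACTIVE slice")
    if len(active) > 1:
        updated = dict(slices)
        updated[failed] = "INVALID"   # the failed slice is detached
        return updated, True          # another ACTIVE slice serves the I/O
    # Last ACTIVE slice: it is not detached, and the access fails.
    return dict(slices), False


mirror = {"Disk1": "ACTIVE", "Disk2": "ACTIVE"}
mirror, ok = io_error_on_slice(mirror, "Disk1")
print(mirror, ok)
mirror, ok = io_error_on_slice(mirror, "Disk2")
print(mirror, ok)
```

After the second error the statuses are unchanged and the access fails, which is why the volume status alone does not reveal the problem.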

Situations likely to result in this problem are described below using a two-way multiplex mirroring configuration, where two disks or two lower level groups are connected to a group. Examples of how to prevent such problems are also described.

(Situation 1): One of the slices was detached with the sdxslice -M command in order to back up volume data. While accessing the volume, an I/O error occurred on the other slice.

(Prevention 1): Before executing the sdxslice -M command, connect a reserved disk and temporarily configure a three-way multiplex mirroring, or make the mirrored volume available for backup.

(Situation 2): While restoring a slice with an I/O error, an I/O error also occurred on another slice.

(Prevention 2): Securing a spare disk within the class reduces, to a certain degree, the effects of a delay in restoring the slice.

Resolution

Identify the cause of I/O error occurrence in the last ACTIVE slice, by referring to the disk driver log message.

Resolutions are described below for the following three circumstances:

- The error occurred due to a disk component failure, and recovery will be attempted using backup data.
- The error occurred due to a disk component failure, and data recovery will be attempted from a slice in INVALID status.
- The error occurred due to a failed or defective non-disk component.

a. When the error cause is a disk component failure and recovery is performed using backup data

a1) When the error was caused by a disk component failure, no slice with valid data exists. Restore data from the backup data following the procedures given below.

a2) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute the unmount command with the -f option.

a3) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1

a4) If there is a TEMP status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

a5) If there is a NOUSE status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

a6) Record the volume size, which you can check as follows.

# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Group1  off  on     32768  4161535  4128768 STOP

In this example, the volume size is 4128768 blocks, as shown in the BLOCKS field for Volume1.

a7) Remove the volume with the sdxvolume command.

# sdxvolume -R -c Class1 -v Volume1

a8) Swap disks following the procedures given in "5.3.4 Disk Swap" and "D.8 sdxswap - Swap disk."

a9) Create a volume with the sdxvolume command again. For number_of_blocks, use the size recorded in step a6) (4128768 in this example).

# sdxvolume -M -c Class1 -g Group1 -v Volume1 -s number_of_blocks
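To avoid transcription errors, the size recorded in step a6) can be carried into the command of step a9) by a script. The following is a minimal sketch, assuming the sdxinfo -V output was saved as text before the volume was removed; recreate_command is a hypothetical helper that only formats the command and does not execute it.

```python
def recreate_command(sdxinfo_v_output, volume):
    """Build the sdxvolume -M command for a volume from saved sdxinfo -V output."""
    for line in sdxinfo_v_output.splitlines():
        fields = line.split()
        # Volume rows have ten columns; BLOCKS is the ninth.
        if len(fields) == 10 and fields[0] == "volume" and fields[1] == volume:
            cls, group, blocks = fields[2], fields[3], int(fields[8])
            return f"sdxvolume -M -c {cls} -g {group} -v {volume} -s {blocks}"
    raise LookupError(f"{volume} not found in the saved output")


# Sample text copied from the sdxinfo -V output in step a6).
SAVED = """\
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Group1  off  on     32768  4161535  4128768 STOP
"""

print(recreate_command(SAVED, "Volume1"))
```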

a10) Finally, restore the backup data to Volume1.

b. When the error cause is a disk component failure and data is restored from a slice in INVALID status

b1) When the error was caused by a disk failure and no backup data exists, or the backup data is too old, restore data from the detached INVALID status slice by following the procedures given below.

b2) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute the unmount command with the -f option.

b3) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1

b4) If there is a TEMP status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

b5) If there is a NOUSE status slice within the volume, attempt recovery following the procedures given in "F.1.1 Slice Status Abnormality."

b6) Determine the mirror slice from which the volume will be recovered. Then, execute the sdxfix command.

(Example 1)

# sdxfix -V -c Class1 -d Disk2 -v Volume1

In this example, data is recovered from a mirror slice in the disk Disk2 which is connected to the highest level mirror group.

(Example 2)

# sdxfix -V -c Class1 -g Group2 -v Volume1

In this example, data is recovered from a mirror slice in the lower level group Group2 which is connected to the highest level mirror group.

b7) Start the volume.

# sdxvolume -N -c Class1 -v Volume1 -e nosync

b8) Create a backup of Volume1, and run fsck to regain data integrity as necessary.

b9) Lastly, swap disks following the procedures given in "5.3.4 Disk Swap" and "D.8 sdxswap - Swap disk."

c. When the error cause is a non-disk component failure or defect

A slice with valid data still exists on the disk. Shut down the system, recover the failed component, and then reboot the system. Synchronization copying is performed automatically and the mirroring status is recovered.

(8) An I/O error occurs on a single volume.

Explanation

Since a single volume consists of only one slice, accessing the volume at the time of an I/O error will result in an error. However, the status of the slice and the volume will remain ACTIVE.

Resolution

Identify the cause of I/O error occurrence by referring to the disk driver log message.

The resolution is described for the following two cases:

- When the error cause is a disk component failure and recovery is performed using backup data
- When the error cause is a non-disk component failure or defect

a. When the error cause is a disk component failure and recovery is performed using backup data

a1) In the event of a disk component failure, there will be no slice with valid data. Follow the procedures below and restore the data using the backup data. In this example, Disk1 (c1t11d0) has a failure.

# sdxinfo -D -o Disk1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8493876 node1            ENABLE

a2) Using the sdxinfo command, find the volumes on the faulty disk and record their sizes.

# sdxinfo -V -o Disk1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 ACTIVE
volume Volume2 Class1  Disk1   off  on     65536  4194303  4128768 ACTIVE
volume *       Class1  Disk1   *    *    4194304  8421375  4227072 FREE

In this example, Volume1 and Volume2 are on the faulty Disk1. As shown in the BLOCKS field, the size of Volume1 is 32,768 blocks and the size of Volume2 is 4,128,768 blocks.
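The sizes can be collected and cross-checked in one pass. In the sample output above, BLOCKS equals LASTBLK - 1STBLK + 1 for every row (for Volume2, 4194303 - 65536 + 1 = 4128768). The following sketch uses that relation as a sanity check; it is an observation from the sample rather than a documented guarantee, and volume_sizes is a hypothetical helper.

```python
def volume_sizes(text):
    """Return {volume name: size in blocks} for the named volumes in sdxinfo -V output."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and the unnamed PRIVATE and FREE areas (NAME is '*').
        if len(fields) == 10 and fields[0] == "volume" and fields[1] != "*":
            first, last, blocks = int(fields[6]), int(fields[7]), int(fields[8])
            if last - first + 1 != blocks:
                raise ValueError(f"inconsistent block range for {fields[1]}")
            sizes[fields[1]] = blocks
    return sizes


# Sample text copied from the sdxinfo -V -o Disk1 output above.
SAMPLE_V = """\
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 ACTIVE
volume Volume2 Class1  Disk1   off  on     65536  4194303  4128768 ACTIVE
volume *       Class1  Disk1   *    *    4194304  8421375  4227072 FREE
"""

print(volume_sizes(SAMPLE_V))
```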

a3) Exit the application accessing the volume. When the volume is used as a file system, execute the unmount command. If the unmount command fails with an I/O error, execute the unmount command with the -f option.

a4) Stop the volumes with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1,Volume2

a5) Remove the volumes with the sdxvolume command.

# sdxvolume -R -c Class1 -v Volume1
# sdxvolume -R -c Class1 -v Volume2

a6) Before swapping the disks, execute the following command.

# sdxswap -O -c Class1 -d Disk1

Note

If the disk is the only remaining disk in the disk class, the command results in an error as shown below. In that event, follow the steps a6'), a7') and a8').

SDX:sdxswap: ERROR: Disk1: The last ENABLE disk in class cannot be swapped

a7) Swap the disks.

a8) After swapping the disks, execute the following command.

# sdxswap -I -c Class1 -d Disk1

a6') Before swapping the disks, execute the following command.

Note

If no error is output in a6), the steps a6'), a7'), and a8') are not required.

# sdxdisk -R -c Class1 -d Disk1

a7') Swap the disks.

a8') After swapping the disks, execute the following command.

# sdxdisk -M -c Class1 -d c1t11d0=Disk1:single

a9) Create volumes with the sdxvolume command again. For the -s option, use the sizes recorded in step a2) (32768 and 4128768 in this example).

# sdxvolume -M -c Class1 -d Disk1 -v Volume1 -s 32768
# sdxvolume -M -c Class1 -d Disk1 -v Volume2 -s 4128768

a10) Finally, restore the backup data to Volume1 and Volume2.

b. When the error cause is a non-disk component failure or defect

Shut down the system, recover the failed component, and then reboot the system. Slice data is valid and there is no need to restore the data.

However, as the E field of the disk information shown by the sdxinfo -e long command will display "1" due to an I/O error, run the sdxfix -D command to cancel the I/O error status.

# sdxfix -D -c class name -d disk name -e online -x NoRdchk

(9) An I/O error occurs on a stripe volume or a volume in a concatenation group.

Explanation

Since a stripe volume or a volume within a concatenation group consists of only one slice, accessing the volume at the time of an I/O error will also result in an error. However, the status of the slice and the volume will remain ACTIVE.

Resolution

Identify the cause of I/O error occurrence by referring to the disk driver log message.

You can check the error status of the disks related to the volume, and their physical disk names, as shown below.

# sdxinfo -D -o Volume1 -e long
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   concat Class1  Group2  c1t1d0  17596416        * node1            ENABLE      0
disk   Disk2   concat Class1  Group2  c1t2d0  17596416        * node1            ENABLE      1
disk   Disk3   concat Class1  Group3  c2t3d0  17682084        * node1            ENABLE      0
disk   Disk4   concat Class1  Group3  c2t4d0  17682084        * node1            ENABLE      0

In this example, an I/O error has occurred on Disk2, as shown in the E field. The physical disk name corresponding to Disk2 is c1t2d0, as shown in the DEVNAM field.
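The E field can be read by a script in the same way. The following is a minimal sketch, assuming the output of sdxinfo -D -e long has been captured as text in the layout shown above; disks_with_io_error is a hypothetical helper name.

```python
def disks_with_io_error(text):
    """Return (disk, physical device) pairs whose E field is 1."""
    header, hits = None, []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "OBJ":
            header = fields
        elif header and fields and fields[0] == "disk" and len(fields) == len(header):
            row = dict(zip(header, fields))
            if row["E"] == "1":
                hits.append((row["NAME"], row["DEVNAM"]))
    return hits


# Sample text copied from the sdxinfo -D -e long output above.
SAMPLE_LONG = """\
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   concat Class1  Group2  c1t1d0  17596416        * node1            ENABLE      0
disk   Disk2   concat Class1  Group2  c1t2d0  17596416        * node1            ENABLE      1
disk   Disk3   concat Class1  Group3  c2t3d0  17682084        * node1            ENABLE      0
disk   Disk4   concat Class1  Group3  c2t4d0  17682084        * node1            ENABLE      0
"""

print(disks_with_io_error(SAMPLE_LONG))
```

Note that every disk in the sample is ENABLE; only the E field distinguishes the disk on which the error occurred.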

How to resolve the problem is described in two cases:

When the error cause is a disk component failure and recovery is performed using backup data
When the error cause is a non-disk component failure or defect

a. When the error cause is a disk component failure and recovery is performed using backup data

a1) In the event of a disk component failure, there will be no slices with valid data. Follow the procedures below and restore the data using the backup data.

a2) Record the configuration information of the group that was related to the failed disk using the sdxinfo command.

# sdxinfo -G -o Disk2 -e long
OBJ    NAME    CLASS   DISKS         BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH ACTDISK
------ ------- ------- ------------- -------- -------- ----- ------ ------ ----- -------
group  Group1  Class1  Group2:Group3 70189056 65961984 *     *      stripe 32    *
group  Group2  Class1  Disk1:Disk2   35127296        * *     *      concat *     *
group  Group3  Class1  Disk3:Disk4   35127296        * *     *      concat *     *

In this example, the lower level concatenation groups Group2 and Group3 are connected to the highest level stripe group Group1. The disks Disk1 and Disk2 are connected to Group2, and the disks Disk3 and Disk4 are connected to Group3. The stripe width for Group1 is 32 blocks.

a3) Using the sdxinfo command, search for the volumes that exist in the highest level group related to the faulty disk.

# sdxinfo -V -o Disk2
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  *    *      65536    98303    32768 ACTIVE
volume Volume2 Class1  Group1  *    *      98304  4227071  4128768 ACTIVE
volume *       Class1  Group1  *    *    4227072 70189055 65961984 FREE

In this example, Volume1 and Volume2 exist in the highest level group Group1, which is related to the faulty disk Disk2. The size of Volume1 is 32768 blocks, and the size of Volume2 is 4128768 blocks, as shown in the BLOCKS field.
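
Because the volumes are removed later in this procedure, it can help to save their names and sizes to a file at this point. The following is a minimal sketch under the assumption that the output has the column layout shown above (NAME in field 2, BLOCKS in field 9); volume_sizes is a hypothetical helper name.

```shell
# Hypothetical helper: print "name size" for each named volume.
# Reads "sdxinfo -V" output on standard input.
# Rows whose NAME is "*" (the PRIVATE and FREE areas) are skipped.
# Assumption: NAME = field 2 and BLOCKS = field 9, as in the example above.
volume_sizes() {
    awk '$1 == "volume" && $2 != "*" { print $2, $9 }'
}

# Example use on a system where the sdxinfo command is available
# (the output file name is arbitrary):
#   sdxinfo -V -o Disk2 | volume_sizes > /var/tmp/volume_sizes.txt
```

For the example output above, the helper prints "Volume1 32768" and "Volume2 4128768" on separate lines.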

a4) Exit the applications accessing the volumes. If a volume is used as a file system, unmount it with the umount command. If the umount command fails with an I/O error, run it again with the -f option.

a5) Stop the volume with the sdxvolume command.

# sdxvolume -F -c Class1 -v Volume1,Volume2

a6) Remove the volumes with the sdxvolume command.

# sdxvolume -R -c Class1 -v Volume1
# sdxvolume -R -c Class1 -v Volume2

a7) Disconnect the faulty disk from the group. If the group is in a hierarchical structure, disconnect in descending order, starting from the highest level group.

# sdxgroup -D -c Class1 -h Group1 -l Group2
# sdxdisk -D -c Class1 -g Group2 -d Disk2

In this example, the faulty disk Disk2 is connected to Group2, and Group2 is connected to Group1. Therefore, you should disconnect Group2 first, and then Disk2.
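
The disconnect order can also be derived from the group information recorded in a2). The following is a minimal sketch, assuming the column layout of the sdxinfo -G -e long output shown in a2) (NAME in field 2, CLASS in field 3, colon-separated members in field 4). The helper name disconnect_order is hypothetical, and the helper only prints the commands; it does not run them.

```shell
# Hypothetical helper: print the disconnect commands for a faulty disk,
# highest level group first.
# Usage: disconnect_order DISK < saved "sdxinfo -G ... -e long" output
# Assumption: NAME = field 2, CLASS = field 3, and field 4 lists the
# connected disks or lower level groups separated by ":".
disconnect_order() {
    awk -v disk="$1" '
    $1 == "group" {
        n = split($4, member, ":")
        for (i = 1; i <= n; i++) parent[member[i]] = $2
        class[$2] = $3
    }
    END {
        # Walk up from the disk to the highest level group.
        depth = 0
        node = disk
        while (node in parent) {
            chain[++depth] = node
            node = parent[node]
        }
        if (depth == 0) exit 1          # disk not found in any group
        chain[depth + 1] = node         # highest level group
        c = class[chain[2]]
        # Disconnect lower level groups from the top down, then the disk.
        for (i = depth; i >= 2; i--)
            print "sdxgroup -D -c " c " -h " chain[i + 1] " -l " chain[i]
        print "sdxdisk -D -c " c " -g " chain[2] " -d " chain[1]
    }'
}
```

Fed the group information from a2) with the argument Disk2, the helper prints the same two commands, in the same order, as shown in this step.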

a8) Before swapping the disks, execute the following command.

# sdxswap -O -c Class1 -d Disk2

Note

If the disk is the only remaining disk in the disk class, the command results in an error as shown below. In that event, follow the steps a8'), a9') and a10').

SDX:sdxswap: ERROR: Disk2: The last ENABLE disk in class cannot be swapped

a9) Swap the disks.

a10) After swapping the disks, execute the following command.

# sdxswap -I -c Class1 -d Disk2

a8') Before swapping the disks, execute the following command.

Note

If no error is output in a8), the steps a8'), a9'), and a10') are not required.

# sdxdisk -R -c Class1 -d Disk2

a9') Swap the disks.

a10') After swapping the disks, execute the following command.

# sdxdisk -M -c Class1 -d c1t2d0=Disk2
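
The choice between steps a8) to a10) and steps a8') to a10') depends on whether the command in a8) reports the error quoted in the Note. The following is a minimal sketch of that check. The helper name is hypothetical, and the sketch assumes the message text is exactly as quoted above; because this document does not state which output stream carries the message, the example use captures both.

```shell
# Hypothetical helper: succeed (exit status 0) if the text on standard
# input contains the "last ENABLE disk" error quoted in the Note above.
is_last_enable_disk_error() {
    grep -q 'The last ENABLE disk in class cannot be swapped'
}

# Example use on a system where the sdxswap command is available:
#   if sdxswap -O -c Class1 -d Disk2 2>&1 | is_last_enable_disk_error; then
#       echo "Follow steps a8'), a9') and a10')."
#   fi
```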

a11) Connect the swapped disk to the group, referring to the group information recorded in a2). If the groups were in a hierarchical structure, connect them in ascending order, starting from the lowest level group.

# sdxdisk -C -c Class1 -g Group2 -d Disk2
# sdxgroup -C -c Class1 -h Group1 -l Group2 -a type=stripe,width=32
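
The values passed to the -a option come from the TYPE and WIDTH fields recorded in a2). The following is a minimal sketch that builds the attribute string from the saved output, assuming the column layout shown in a2) (TYPE in field 9, WIDTH in field 10). The helper name group_attrs is hypothetical, and the output is meant to be checked against the recorded information before use.

```shell
# Hypothetical helper: print the attribute string for a group.
# Usage: group_attrs GROUP < saved "sdxinfo -G ... -e long" output
# Assumption: TYPE = field 9 and WIDTH = field 10, as in the a2) example.
# The width is included only when the WIDTH field is numeric.
group_attrs() {
    awk -v g="$1" '$1 == "group" && $2 == g {
        if ($10 ~ /^[0-9]+$/)
            print "type=" $9 ",width=" $10
        else
            print "type=" $9
    }'
}
```

For Group1 in the a2) example, the helper prints "type=stripe,width=32", which matches the -a argument used in this step.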

a12) Create the volumes with the sdxvolume command again. For the -s option, use the sizes recorded in a3), in this example, 32768 and 4128768.

# sdxvolume -M -c Class1 -g Group1 -v Volume1 -s 32768 -a pslice=off
# sdxvolume -M -c Class1 -g Group1 -v Volume2 -s 4128768 -a pslice=off
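
If the sdxinfo -V output from a3) was saved to a file, the commands for this step can be generated from it rather than typed by hand. The following is a minimal sketch, assuming the column layout shown in a3) (NAME in field 2, CLASS in field 3, GROUP in field 4, BLOCKS in field 9). The helper name is hypothetical, the -a pslice=off attribute is copied from the example in this step, and the helper only prints the commands so that they can be reviewed before being run.

```shell
# Hypothetical helper: print one "sdxvolume -M" command per named volume.
# Reads saved "sdxinfo -V" output on standard input.
# Rows whose NAME is "*" (the PRIVATE and FREE areas) are skipped.
# Assumption: NAME = field 2, CLASS = field 3, GROUP = field 4,
# BLOCKS = field 9; "-a pslice=off" is taken from the example above.
recreate_volume_commands() {
    awk '$1 == "volume" && $2 != "*" {
        print "sdxvolume -M -c " $3 " -g " $4 " -v " $2 " -s " $9 " -a pslice=off"
    }'
}
```

For the output recorded in a3), the helper prints the same two commands as shown in this step.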

a13) Finally, restore the backup data to Volume1 and Volume2.

b. When the error cause is a non-disk component failure or defect

Shut down the system, recover the failed component, and then reboot the system. Slice data is valid and there is no need to restore the data.

However, because of the I/O error, the E field of the disk information displayed by the sdxinfo -e long command still shows "1". Run the sdxfix -D command to clear the I/O error status.

# sdxfix -D -c class name -d disk name -e online -x NoRdchk