PRIMECLUSTER Global Disk Services Configuration and Administration Guide 4.1 (Solaris(TM) Operating System)
Appendix F Troubleshooting > F.1 Resolving Problems
If the volume status is one of the following statuses, take action as indicated for the relevant situation.
Stripe volume or volume in concatenation group is in INVALID status.
I/O error occurs although mirror volume is in ACTIVE status.
An I/O error occurs on a stripe volume or a volume in a concatenation group.
You can confirm the status of the volume as shown below.
# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID
In this example, volume Volume1 that exists in the highest level group Group1 is in INVALID status, as shown in the STATUS field.
If none of the mirror slices that make up the mirror volume contains valid data (that is, none is in ACTIVE or STOP status), the mirror volume becomes INVALID. You cannot start a volume in INVALID status.
There are two reasons that may cause this INVALID status.
1) Confirm that there is a disk in DISABLE status within the group with which the volume is associated as follows.
(Example A1)
# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Disk1:Disk2         17596416        0     0
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0  17596416 node1            ENABLE
disk   Disk2   mirror Class1  Group1  c2t3d0  17596416 node1            DISABLE
In this example, disks Disk1 and Disk2 are connected to the highest level mirror group Group1. The disk Disk2 is in DISABLE status as shown in the STATUS field.
(Example B1)
# sdxinfo -G -o Volume1
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE
------ ------- ------- ------------------- -------- -------- -----
group  Group1  Class1  Group2:Group3       35127296 17530880     0
group  Group2  Class1  Disk1:Disk2         35127296        *     0
group  Group3  Class1  Disk3:Disk4         35127296        *     0
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   stripe Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   stripe Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   stripe Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   stripe Class1  Group3  c2t4d0  17682084 node1            ENABLE
In this example, the lower level stripe groups Group2 and Group3 are connected to the highest level mirror group Group1. Disk Disk2, which is connected to Group2, is in DISABLE status, as shown in the STATUS field.
2) If the possible cause is (Cause a), restore the disk by following the procedures in "Disk Status Abnormality."
3) From the disks and lower level groups connected to the highest level mirror group, determine the disk or lower level group to which the slice you will use to recover data belongs. Then, execute the sdxfix command to recover data.
(Example A3)
# sdxfix -V -c Class1 -d Disk1 -v Volume1
In this example, Volume1 is recovered using the slice on disk Disk1.
(Example B3)
# sdxfix -V -c Class1 -g Group3 -v Volume1
In this example, Volume1 is recovered using the slice in the lower level stripe group Group3.
4) Start the volume.
# sdxvolume -N -c Class1 -v Volume1 -e nosync
5) Access Volume1 and check its contents. Restore backup data or run fsck to regain data integrity as necessary.
6) Perform synchronization copying on the volume.
# sdxcopy -B -c Class1 -v Volume1
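The disk check in step 1) can be scripted when many volumes have to be inspected. The following is a minimal sketch, assuming the output of sdxinfo -D has already been captured as text and that its columns are whitespace-separated as in Example B1. The names parse_sdxinfo and disabled_disks are illustrative and are not part of GDS.

```python
# Sample text copied from Example B1. In practice it would be captured from
# "sdxinfo -D -o Volume1"; how the output is captured is left to the reader.
SDXINFO_D = """\
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   stripe Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   stripe Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   stripe Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   stripe Class1  Group3  c2t4d0  17682084 node1            ENABLE
"""


def parse_sdxinfo(text):
    """Turn whitespace-separated sdxinfo output into a list of row dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        if set(ln.strip()) <= {"-", " "}:
            continue  # skip the dashed ruler under the header
        rows.append(dict(zip(header, ln.split())))
    return rows


def disabled_disks(text):
    """Return (disk, group, device) for every disk whose STATUS is DISABLE."""
    return [(r["NAME"], r["GROUP"], r["DEVNAM"])
            for r in parse_sdxinfo(text) if r["STATUS"] == "DISABLE"]


print(disabled_disks(SDXINFO_D))
```

The group name in each result indicates which disk or lower level group should not be chosen as the recovery source for sdxfix in step 3).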
You can confirm the status of the volume as shown below.
# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 INVALID
volume *       Class1  Disk1   *    *      65536  8421375  8355840 FREE
In the example, the single volume Volume1 that exists on single disk Disk1 is in INVALID status, as shown in the STATUS field.
You cannot start a volume in INVALID status.
There are two reasons that may cause this INVALID status.
1) Confirm that the single disk is in DISABLE status as shown below.
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8355840 node1            DISABLE
In this example, the single disk Disk1 is in DISABLE status, as shown in the STATUS field.
2) If the possible cause is (Cause a), restore the disk by following the procedures given in section "Disk Status Abnormality."
3) Execute the sdxfix command to recover the single volume's data.
# sdxfix -V -c Class1 -d Disk1 -v Volume1
4) Start the volume.
# sdxvolume -N -c Class1 -v Volume1
5) Access Volume1 and check its content. Restore backup data or run the fsck command to regain data integrity as necessary.
You can confirm the status of the volume as shown below.
# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  off  on     65536 17596415 17530880 INVALID
In this example, volume Volume1 that exists in the highest level group Group1 is in INVALID status, as shown in the STATUS field.
If any of the disks related to the volume is in DISABLE status, the slices that make up that volume change to NOUSE status, and the volume becomes INVALID. You cannot start a volume in INVALID status.
1) You can confirm the status of the disk related to the volume as shown below.
# sdxinfo -G -o Volume1 -e long
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH
------ ------- ------- ------------------- -------- -------- ----- ------ ------ -----
group  Group1  Class1  Group2:Group3       70189056 65961984     * *      stripe    32
group  Group2  Class1  Disk1:Disk2         35127296        *     * *      concat     *
group  Group3  Class1  Disk3:Disk4         35127296        *     * *      concat     *
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   concat Class1  Group2  c1t1d0  17596416 node1            ENABLE
disk   Disk2   concat Class1  Group2  c1t2d0  17596416 node1            DISABLE
disk   Disk3   concat Class1  Group3  c2t3d0  17682084 node1            ENABLE
disk   Disk4   concat Class1  Group3  c2t4d0  17682084 node1            ENABLE
In this example, the lower level concatenation groups Group2 and Group3 are connected to the highest level stripe group Group1, and Disk2 connected to Group2 is in DISABLE status as shown in the STATUS field.
2) Follow the procedures in "Disk Status Abnormality" and restore the disk status.
3) Execute the sdxfix command to recover the volume's data. With the -g option, specify the name of the highest level group (in this example, Group1).
# sdxfix -V -c Class1 -g Group1 -v Volume1
4) Start the volume.
# sdxvolume -N -c Class1 -v Volume1
5) Access Volume1 and check its content. Restore backup data or run the fsck command to regain data integrity as necessary.
If copying from the proxy volume to the master volume fails, for example because of an I/O error, the master volume that is the copy destination becomes INVALID.
1) Check if there is a DISABLE status disk in the group to which the volume belongs with the following command.
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0   8421376 *                ENABLE
disk   Disk2   mirror Class1  Group1  c1t2d0   8421376 *                DISABLE
In this example, Disk2 is in DISABLE status.
If there is a disk in DISABLE status, see section "(1) Disk is in DISABLE status" in "Disk Status Abnormality," and check which of the causes listed in that section applies. If the possible cause is (Cause a) or (Cause b), follow the procedures and restore the disk.
2) Follow the procedures given in section "(1) Mirror slice configuring the mirror volume is in INVALID status" in "Slice Status Abnormality," and check if there is a disk hardware abnormality. If there is, identify the faulty part. When the abnormality is caused by a failed or defective non-disk component, repair the faulty part.
3) Procedures to restore the data for different scenarios are given below.
When recovering data using the proxy volume:
  -> Follow steps a) to restore.
When recovering data using backup data on media such as tapes:
  -> Follow steps b) to restore.
When the disk does not belong to master group:
  When some disks connected to the group have failed:
    -> Follow steps c) to restore.
  When all disks connected to the group have failed:
    -> Follow steps d) to restore.
When the disk belongs to master group:
  -> Follow steps e) to restore.
a) Procedures to recover master volume data using proxy volume.
a1) In order to check if the proxy volume that will be used to recover data is separated from the master volume, execute the sdxinfo -V -e long command, and check the PROXY field.
a2) If the proxy volume is not separated, execute the following command.
# sdxproxy Part -c Class1 -p Volume2
a3) Exit all applications accessing the proxy volume. When using the proxy volume as a file system, unmount it.
a4) If the proxy volume is started, execute the following command.
# sdxvolume -F -c Class1 -v Volume2
a5) Recover master volume data using the proxy volume's data.
# sdxproxy RejoinRestore -c Class1 -p Volume2
b) Procedures to recover data using backup data.
b1) When the volume is in INVALID status, you must first change it to STOP status. Decide on the disk (slice) you wish to use to recover data, and execute the sdxfix command.
# sdxfix -V -c Class1 -d Disk1 -v Volume1
In this example, Volume1 is restored using the slice on Disk1.
b2) When the volume to be restored is stopped, start it with the following command.
# sdxvolume -N -c Class1 -v Volume1 -e nosync
b3) Access the volume to be restored and check its contents. Restore backup data or run fsck to regain data integrity as necessary.
b4) When the volume is mirrored, perform synchronization copying.
# sdxcopy -B -c Class1 -v Volume1
c) Procedures to swap some disks connected to the group.
c1) If you plan to restore the INVALID master volume later using data from a related proxy volume, or to use data from related proxy volumes after restoring the master volume, part the proxy volumes with the sdxproxy Part command.
# sdxproxy Part -c Class1 -p Volume2
c2) When there is a volume in INVALID status in the group, change it to STOP status with the sdxfix -V command. With the -d option, specify a disk that has no abnormality.
# sdxfix -V -c Class1 -d Disk1 -v Volume1
c3) Follow the procedures and swap the disks. For details, see "sdxswap - Swap disk" and "Disk Swap."
c4) Recover the master volume data. If data will be recovered using the proxy volume, follow procedures described in a). If data will be recovered using backup data on media such as tapes, follow procedures described in b).
d) Procedures to swap all disks connected to the group.
d1) Exit all applications accessing the master volume and the proxy volume that will be used to recover data. When using the proxy volume or the master volume as a file system, unmount it.
d2) Stop the master volume and the proxy volume from d1).
# sdxvolume -F -c Class1 -v Volume1
d3) Execute the sdxproxy RejoinRestore command and restore the master volume data using the proxy volume from d1). If the command terminates normally and the master volume is not in INVALID status, the restoration is complete, and you do not need to perform step d4) and later.
# sdxproxy RejoinRestore -c Class1 -p Volume2
d4) Execute the sdxproxy Swap command and swap the slices of the master volume with those of the proxy volume from d1).
# sdxproxy Swap -c Class1 -p Volume2
d5) After step d4), the master volume is no longer in INVALID status, and the proxy volume becomes INVALID. Follow the procedures given in section "(5) Proxy volume is in INVALID status" in "Volume Status Abnormality," and restore the proxy volume in INVALID status.
d6) Execute the sdxproxy Swap command and swap back the slices of the master volume and the proxy volume that you swapped in step d4).
# sdxproxy Swap -c Class1 -p Volume2
e) Procedures to swap disks connected to the master group.
e1) Exit all applications accessing volumes in the master group and volumes in the proxy group that will be used to recover data. When using a volume as a file system, unmount it.
e2) Stop all volumes in the master group and in the proxy group from e1).
# sdxvolume -F -c Class1 -v Volume1
e3) Execute the sdxproxy RejoinRestore command and restore the master group data using the proxy group from e1). If the command terminates normally and none of the master volumes is in INVALID status, the restoration is complete, and you do not need to perform step e4) and later.
# sdxproxy RejoinRestore -c Class1 -p Volume2
e4) Execute the sdxproxy Swap command and swap the slices of the master group with those of the proxy group from e1).
# sdxproxy Swap -c Class1 -p Group2
e5) After step e4), the master volume is no longer in INVALID status, and the proxy volume becomes INVALID. Follow the procedures given in section "(5) Proxy volume is in INVALID status" in "Volume Status Abnormality," and restore the proxy volume in INVALID status.
e6) Execute the sdxproxy Swap command and swap back the slices of the master group and the proxy group that you swapped in step e4).
# sdxproxy Swap -c Class1 -p Group2
If copying from the master volume to the proxy volume fails, for example because of an I/O error, the proxy volume that is the copy destination becomes INVALID.
1) Check if there is a DISABLE status disk in the group to which the volume belongs with the following command.
# sdxinfo -D -o Volume1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   mirror Class1  Group1  c1t1d0   8421376 *                ENABLE
disk   Disk2   mirror Class1  Group1  c1t2d0   8421376 *                DISABLE
In this example, Disk2 is in DISABLE status.
If there is a disk in DISABLE status, see section "(1) Disk is in DISABLE status" in "Disk Status Abnormality," and check which of the three causes listed in that section (a, b, and c) applies. If it is due to (Cause a) or (Cause b), follow the procedures and restore the disk.
2) Follow the procedures given in "(1) Mirror slice configuring the mirror volume is in INVALID status" in "Slice Status Abnormality," and check if there is a disk hardware abnormality. If there is, identify the faulty part. When the abnormality is caused by a failed or defective non-disk component, repair the faulty part.
3) Procedures to restore the data for different scenarios are given below.
When caused by a non-disk component failure:
  -> Follow steps a) to restore.
When caused by a disk component failure:
  When the disk does not belong to proxy group:
    When some disks connected to the group have failed:
      -> Follow steps b) to restore.
    When all disks connected to the group have failed:
      -> Follow steps c) to restore.
  When the disk belongs to proxy group:
    -> Follow steps d) to restore.
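The choice among procedures a) to d) can be written down as a small decision function. This is only a sketch of one reading of the scenario list in step 3), in which the group membership and the "some" or "all" disks distinction apply only to disk component failures. The function name and its parameters are hypothetical.

```python
def proxy_restore_procedure(disk_failure,
                            disk_in_proxy_group=False,
                            all_disks_failed=False):
    """Return the letter of the procedure to follow for an INVALID proxy volume.

    disk_failure        -- True if a disk component failed, False for a
                           non-disk component failure
    disk_in_proxy_group -- True if the failed disk belongs to a proxy group
    all_disks_failed    -- True if every disk connected to the group failed
    """
    if not disk_failure:
        return "a"  # recover the proxy volume from the master volume
    if disk_in_proxy_group:
        return "d"  # swap disks connected to the proxy group
    # Disk failure in a group that is not a proxy group.
    return "c" if all_disks_failed else "b"


for args in [(False,), (True, False, False), (True, False, True), (True, True)]:
    print(args, "->", proxy_restore_procedure(*args))
```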
a) Procedures to recover proxy volume data using the master volume.
a1) In order to check if the proxy volume is separated from the master volume, execute the sdxinfo -V -e long command, and check the PROXY field.
a2) If the proxy volume is not separated, execute the following command.
# sdxproxy Part -c Class1 -p Volume2
a3) Rejoin the proxy volume with the master volume.
# sdxproxy Rejoin -c Class1 -p Volume2
b) Procedures to swap some disks connected to the group.
b1) Cancel the relationship with the master volume using the sdxproxy Break command.
# sdxproxy Break -c Class1 -p Volume2
b2) Use the sdxfix -V command to change the volumes in the group that are in INVALID status to STOP status. With the -d option, specify a disk that has no abnormality.
# sdxfix -V -c Class1 -d Disk1 -v Volume2
b3) Follow the procedures and swap the disks. For details, see "sdxswap - Swap disk," or section "Disk Swap."
b4) Join the master and the proxy again with the sdxproxy Join command.
# sdxproxy Join -c Class1 -m Volume1 -p Volume2
c) Procedures to swap all disks connected to the group.
c1) Cancel the relationship with the master using the sdxproxy Break command.
# sdxproxy Break -c Class1 -p Volume2
c2) Exit all applications accessing the volume in the group. When using the volume as a file system, unmount it.
c3) Stop all volumes in the group.
# sdxvolume -F -c Class1 -v Volume2
c4) Check the volume configuration of the group (such as volume names and sizes) with the sdxinfo command, and keep a note of it.
c5) Remove all volumes in the group.
# sdxvolume -R -c Class1 -v Volume2
c6) Follow the procedures and swap the disks. For details, see "sdxswap - Swap disk," or section "Disk Swap."
c7) Create the volume that you removed in step c5) again.
# sdxvolume -M -c Class1 -g Group2 -v Volume2 -s size
c8) Stop the volume you created in step c7).
# sdxvolume -F -c Class1 -v Volume2
c9) Join the master volume and the proxy volume again, with the sdxproxy Join command.
# sdxproxy Join -c Class1 -m Volume1 -p Volume2
d) Procedures to swap disks connected to the proxy group.
d1) Cancel the relationship with the master using the sdxproxy Break command.
# sdxproxy Break -c Class1 -p Group2
d2) Exit all applications accessing the volume in the group. When using the volume as a file system, unmount it.
d3) Stop all volumes in the group.
# sdxvolume -F -c Class1 -v Volume2
d4) Remove all volumes in the group.
# sdxvolume -R -c Class1 -v Volume2
d5) Follow the procedures and swap the disks. For details, see "sdxswap - Swap disk," or section "Disk Swap."
d6) Join the master group and the proxy group again with the sdxproxy Join command.
# sdxproxy Join -c Class1 -m Group1 -p Group2 -a Volume1=Volume2:on
Normally, volumes automatically start when the system is booted and become ACTIVE. The volume status will change to STOP when the volume is stopped with the Stop Volume menu in the GDS Management View or the sdxvolume -F command.
In a cluster system, among volumes within GDS shared classes registered with cluster applications, volumes other than proxy volumes start or stop according to the cluster application modes. If a cluster application is in Offline mode, volumes other than proxy volumes are in STOP status.
Accessing a volume in STOP status will result in an EIO error (I/O error) or an ENXIO error (No such device or address).
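An application or script that reads from a volume can recognize these two error numbers and point the operator to the volume status. The following is a minimal sketch, not a GDS interface: the function names are hypothetical, the device path is supplied by the caller, and an EIO or ENXIO error can also have causes other than STOP status.

```python
import errno

# errno values that the text above names for access to a volume in STOP status.
STOP_VOLUME_ERRNOS = {errno.EIO, errno.ENXIO}


def stop_status_hint(exc):
    """Return a hint string if exc matches the STOP-status symptoms, else None."""
    if isinstance(exc, OSError) and exc.errno in STOP_VOLUME_ERRNOS:
        name = errno.errorcode[exc.errno]
        return f"{name}: the volume may be in STOP status; check it with sdxinfo -V"
    return None


def read_first_block(path, size=512):
    """Try to read from a volume device; path is supplied by the caller."""
    try:
        with open(path, "rb") as dev:
            return dev.read(size)
    except OSError as exc:
        hint = stop_status_hint(exc)
        if hint is None:
            raise  # unrelated error, let it propagate unchanged
        raise RuntimeError(hint) from exc


# Demonstration with constructed exceptions (no real device is touched).
print(stop_status_hint(OSError(errno.ENXIO, "No such device or address")))
print(stop_status_hint(OSError(errno.ENOENT, "No such file or directory")))
```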
If volumes in a shared class that is not registered with a cluster application do not start at node startup in a cluster system, see "(4) The GFS Shared File System is not mounted on node startup" in "Cluster System Related Error."
Start the volumes with the Start Volume menu in GDS Management View or the sdxvolume -N command as necessary.
To start volumes within a GDS shared class registered with a cluster application, change the cluster application mode to Online.
A mirror volume consists of multiple slices. In the event of an I/O error, the failed slice is detached, so access to the volume still completes normally.
However, if an I/O error occurs while only one of the slices that make up the volume is ACTIVE, access to the volume results in an error. In that case, the status of the slice and the volume remains ACTIVE.
The situations that can lead to this problem, and examples of how to resolve them, are described below for a two-way multiplex mirroring configuration in which two disks or two lower level groups are connected to a group.
Identify the cause of I/O error occurrence in the last ACTIVE slice, by referring to the disk driver log message.
Resolutions are described below assuming the following three circumstances:
Error occurred due to a disk component failure. Will attempt recovery using backup data.
Error occurred due to a disk component failure. Will attempt data recovery from a slice in INVALID status.
Error occurred due to a failed or defective non-disk component.
a. When the error cause is a disk component failure and recovery is performed using backup data
a1) When the error was caused by a disk component failure, no slice with valid data exists. Restore data from the backup data following the procedures given below.
a2) Exit the application accessing the volume. When the volume is used as a file system, execute the umount command.
If the umount command fails with an I/O error, execute it again with the -f option.
a3) Stop the volume with the sdxvolume command.
# sdxvolume -F -c Class1 -v Volume1
a4) If there is a TEMP status slice within the volume, attempt recovery following the procedures given in "Slice Status Abnormality."
a5) If there is a NOUSE status slice within the volume, attempt recovery following the procedures given in "Slice Status Abnormality."
a6) Record the volume size which can be checked as follows.
# sdxinfo -V -o Volume1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Group1  off  on     32768  4161535  4128768 STOP
In this example, the volume size is 4128768 blocks, as shown in the BLOCKS field for Volume1.
a7) Remove the volume with the sdxvolume command.
# sdxvolume -R -c Class1 -v Volume1
a8) Swap disks following the procedures given in "Disk Swap" and "sdxswap - Swap disk."
a9) Create a volume with the sdxvolume command again. For the number_of_blocks, use the size recorded in a6), in this example, 4128768.
# sdxvolume -M -c Class1 -g Group1 -v Volume1 -s number_of_blocks
a10) Finally, restore the backup data to Volume1.
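Steps a6) and a9) depend on carrying the block count over correctly from the sdxinfo -V output to the -s option. The following sketch shows one way to do that, assuming the output has been saved as text. The function names are illustrative, and the check that LASTBLK - 1STBLK + 1 equals BLOCKS is a sanity check derived from the sample outputs in this section, not a documented guarantee.

```python
# Output copied from step a6).
SDXINFO_V = """\
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Group1  off  on     32768  4161535  4128768 STOP
"""


def volume_blocks(text, volume):
    """Return the BLOCKS value recorded for the named volume."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 10 and fields[0] == "volume" and fields[1] == volume:
            first, last, blocks = int(fields[6]), int(fields[7]), int(fields[8])
            if last - first + 1 != blocks:
                raise ValueError(f"inconsistent block range for {volume}")
            return blocks
    raise KeyError(volume)


def recreate_command(cls, group, volume, blocks):
    """Build the sdxvolume -M command line used in step a9)."""
    return f"sdxvolume -M -c {cls} -g {group} -v {volume} -s {blocks}"


size = volume_blocks(SDXINFO_V, "Volume1")
print(recreate_command("Class1", "Group1", "Volume1", size))
```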
b. When the error cause is a disk component failure and data is restored from a slice in INVALID status
b1) When the error was caused by a disk failure and no backup data exists, or the backup data is too old to be useful, restore data from the detached slice in INVALID status, following the procedures given below.
b2) Exit the application accessing the volume. When the volume is used as a file system, execute the umount command.
If the umount command fails with an I/O error, execute it again with the -f option.
b3) Stop the volume with the sdxvolume command.
# sdxvolume -F -c Class1 -v Volume1
b4) If there is a TEMP status slice within the volume, attempt recovery following the procedures given in "Slice Status Abnormality."
b5) If there is a NOUSE status slice within the volume, attempt recovery following the procedures given in "Slice Status Abnormality."
b6) Decide which mirror slice will serve as the source of the recovered volume's data. Then, execute the sdxfix command.
(Example 1)
# sdxfix -V -c Class1 -d Disk2 -v Volume1
In this example, data is recovered from a mirror slice in the disk Disk2 which is connected to the highest level mirror group.
(Example 2)
# sdxfix -V -c Class1 -g Group2 -v Volume1
In this example, data is recovered from a mirror slice in the lower level group Group2 which is connected to the highest level mirror group.
b7) Start the volume.
# sdxvolume -N -c Class1 -v Volume1 -e nosync
b8) Create a backup of Volume1 and regain data integrity by running fsck as necessary.
b9) Lastly, swap disks following the procedures given in "Disk Swap" and "sdxswap - Swap disk."
c. When the error cause is a non-disk component failure or defect
A slice with valid data still exists on the disk. Shut down the system once, repair the failed component, and then reboot the system. Synchronization copying is performed automatically and the mirroring status is recovered.
Since a single volume consists of only one slice, accessing the volume at the time of an I/O error will result in an error. However, the status of slice and volume will remain ACTIVE.
Identify the cause of I/O error occurrence by referring to the disk driver log message.
How to resolve the problem is described in two cases:
When the error cause is a disk component failure and recovery is performed using backup data
When the error cause is a non-disk component failure or defect
a. When the error cause is a disk component failure and recovery is performed using backup data
a1) In the event of a disk component failure, there will be no slice with valid data. Follow the procedures below and restore the data using the backup data. In this example, Disk1 (c1t11d0) has a failure.
# sdxinfo -D -o Disk1
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  DEVCONNECT       STATUS
------ ------- ------ ------- ------- ------- -------- ---------------- -------
disk   Disk1   single Class1  *       c1t11d0  8493876 node1            ENABLE
a2) Find the volumes on the faulty disk using the sdxinfo command, and record the size of each volume.
# sdxinfo -V -o Disk1
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Disk1   *    *          0    32767    32768 PRIVATE
volume Volume1 Class1  Disk1   off  on     32768    65535    32768 ACTIVE
volume Volume2 Class1  Disk1   off  on     65536  4194303  4128768 ACTIVE
volume *       Class1  Disk1   *    *    4194304  8421375  4227072 FREE
In this example, Volume1 and Volume2 are within the faulty Disk1. The size of Volume1 would be 32,768 blocks as shown in the BLOCKS field. The size of Volume2 would be 4,128,768 blocks as shown in the BLOCKS field.
a3) Exit the application accessing the volume. When the volume is used as a file system, execute the umount command. If the umount command fails with an I/O error, execute it again with the -f option.
a4) Stop the volume with the sdxvolume command.
# sdxvolume -F -c Class1 -v Volume1,Volume2
a5) Remove the volumes with the sdxvolume command.
# sdxvolume -R -c Class1 -v Volume1
a6) Before swapping the disks, execute the following command.
# sdxswap -O -c Class1 -d Disk1
If the disk is the only remaining disk in the disk class, the command results in an error as shown below.
In that event, follow the steps a6'), a7') and a8').
SDX:sdxswap: ERROR: Disk1: The last ENABLE disk in class cannot be swapped
a7) Swap the disks.
a8) After swapping the disks, execute the following command.
# sdxswap -I -c Class1 -d Disk1
a6') Before swapping the disks, execute the following command.
If no error is output in a6), the steps a6'), a7'), and a8') are not required.
# sdxdisk -R -c Class1 -d Disk1
a7') Swap the disks.
a8') After swapping the disks, execute the following command.
# sdxdisk -M -c Class1 -d c1t11d0=Disk1:single
a9) Create volumes with the sdxvolume command again. For the -s option, use the sizes recorded in a2), in this example, 32768 and 4128768.
# sdxvolume -M -c Class1 -d Disk1 -v Volume1 -s 32768
a10) Finally, restore the backup data to Volume1 and Volume2.
b. When the error cause is a non-disk component failure or defect
Shut down the system once, recover the failed component, and then reboot the system. Slice data is valid and there is no need to restore the data.
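The branch between steps a7) to a8) and steps a6') to a8') depends only on whether sdxswap -O reported the "last ENABLE disk" error. The following sketch makes that branch explicit. It assumes the error text of sdxswap -O has been captured as a string; the function name is hypothetical, and the command lines are the ones shown in the steps above.

```python
# Error text shown in step a6) when the disk is the last ENABLE disk in the class.
LAST_DISK_ERROR = "The last ENABLE disk in class cannot be swapped"


def steps_after_sdxswap_o(cls, disk, devnam, stderr_text):
    """Return the remaining steps once 'sdxswap -O' has been attempted."""
    if LAST_DISK_ERROR in stderr_text:
        # Steps a6'), a7') and a8'): remove the disk from the class,
        # swap it, then register it again as a single disk.
        return [
            f"sdxdisk -R -c {cls} -d {disk}",
            "(swap the physical disk)",
            f"sdxdisk -M -c {cls} -d {devnam}={disk}:single",
        ]
    # Steps a7) and a8): sdxswap -O succeeded, so swap and bring the disk back.
    return ["(swap the physical disk)", f"sdxswap -I -c {cls} -d {disk}"]


error = "SDX:sdxswap: ERROR: Disk1: The last ENABLE disk in class cannot be swapped"
for step in steps_after_sdxswap_o("Class1", "Disk1", "c1t11d0", error):
    print(step)
```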
Since a stripe volume or a volume within a concatenation group consists of only one slice, accessing the volume at the time of an I/O error will also result in an error. However, the status of slice and volume will remain ACTIVE.
Identify the cause of I/O error occurrence by referring to the disk driver log message.
You can confirm the error status of the disk related to volume and the physical disk name as shown below.
# sdxinfo -D -o Volume1 -e long
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   concat Class1  Group2  c1t1d0  17596416        * node1            ENABLE      0
disk   Disk2   concat Class1  Group2  c1t2d0  17596416        * node1            ENABLE      1
disk   Disk3   concat Class1  Group3  c2t3d0  17682084        * node1            ENABLE      0
disk   Disk4   concat Class1  Group3  c2t4d0  17682084        * node1            ENABLE      0
In this example, an I/O error has occurred on Disk2, as shown in the E field. The physical disk name corresponding to Disk2 is c1t2d0, as shown in the DEVNAM field.
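Reading the E and DEVNAM fields can be automated in the same way. The following sketch assumes the sdxinfo -D -e long output has been saved as text and treats any E value other than 0 as an error indication, as in the example above. The function name is illustrative.

```python
# Output copied from the example above.
SDXINFO_D_LONG = """\
OBJ    NAME    TYPE   CLASS   GROUP   DEVNAM  DEVBLKS  FREEBLKS DEVCONNECT       STATUS  E
------ ------- ------ ------- ------- ------- -------- -------- ---------------- ------- -----
disk   Disk1   concat Class1  Group2  c1t1d0  17596416        * node1            ENABLE      0
disk   Disk2   concat Class1  Group2  c1t2d0  17596416        * node1            ENABLE      1
disk   Disk3   concat Class1  Group3  c2t3d0  17682084        * node1            ENABLE      0
disk   Disk4   concat Class1  Group3  c2t4d0  17682084        * node1            ENABLE      0
"""


def disks_with_io_error(text):
    """Map each disk whose E field is not 0 to its physical device name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    name_col = header.index("NAME")
    dev_col = header.index("DEVNAM")
    err_col = header.index("E")
    result = {}
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # dashed ruler or malformed line (ruler shares the column count only by chance)
        if fields[0] != "disk":
            continue  # the ruler line starts with dashes, not "disk"
        if fields[err_col] != "0":
            result[fields[name_col]] = fields[dev_col]
    return result


print(disks_with_io_error(SDXINFO_D_LONG))
```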
How to resolve the problem is described in two cases:
When the error cause is a disk component failure and recovery is performed using backup data
When the error cause is a non-disk component failure or defect
a. When the error cause is a disk component failure and recovery is performed using backup data
a1) In the event of a disk component failure, there will be no slices with valid data. Follow the procedures below and restore the data using the backup data.
a2) Record the configuration information of the group that was related to the failed disk using the sdxinfo command.
# sdxinfo -G -o Disk2 -e long
OBJ    NAME    CLASS   DISKS               BLKS     FREEBLKS SPARE MASTER TYPE   WIDTH
------ ------- ------- ------------------- -------- -------- ----- ------ ------ -----
group  Group1  Class1  Group2:Group3       70189056 65961984     * *      stripe    32
group  Group2  Class1  Disk1:Disk2         35127296        *     * *      concat     *
group  Group3  Class1  Disk3:Disk4         35127296        *     * *      concat     *
In this example, the lower level concatenation groups Group2 and Group3 are connected to the highest level stripe group Group1. The disks Disk1 and Disk2 are connected to Group2, and the disks Disk3 and Disk4 are connected to Group3. The stripe width for Group1 is 32 blocks.
a3) Search the volumes that exist in the highest level group that are related to the faulty disk using the sdxinfo command.
# sdxinfo -V -o Disk2
OBJ    NAME    CLASS   GROUP   SKIP JRM 1STBLK   LASTBLK  BLOCKS   STATUS
------ ------- ------- ------- ---- --- -------- -------- -------- --------
volume *       Class1  Group1  *    *          0    65535    65536 PRIVATE
volume Volume1 Class1  Group1  *    *      65536    98303    32768 ACTIVE
volume Volume2 Class1  Group1  *    *      98304  4227071  4128768 ACTIVE
volume *       Class1  Group1  *    *    4227072 70189055 65961984 FREE
In this example, Volume1 and Volume2 exist in the highest level group Group1, that is related to the faulty disk Disk2. The size of Volume1 is 32768 blocks, and the size of Volume2 is 4128768 blocks as shown in the BLOCKS field.
a4) Exit the application accessing the volume. When the volume is used as a file system, execute the umount command. If the umount command fails with an I/O error, execute it again with the -f option.
a5) Stop the volume with the sdxvolume command.
# sdxvolume -F -c Class1 -v Volume1,Volume2
a6) Remove the volumes with the sdxvolume command.
# sdxvolume -R -c Class1 -v Volume1
a7) Disconnect the faulty disk from the group. If the group is in a hierarchical structure, disconnect in descending order, starting from the highest level group.
# sdxgroup -D -c Class1 -h Group1 -l Group2
In this example, the faulty disk Disk2 is connected to Group2, and Group2 is connected to Group1. Therefore, you should disconnect Group2 first, and then Disk2.
a8) Before swapping the disks, execute the following command.
# sdxswap -O -c Class1 -d Disk2
If the disk is the only remaining disk in the disk class, the command results in an error as shown below. In that event, follow the steps a8'), a9') and a10').
SDX:sdxswap: ERROR: Disk2: The last ENABLE disk in class cannot be swapped
a9) Swap the disks.
a10) After swapping the disks, execute the following command.
# sdxswap -I -c Class1 -d Disk2
a8') Before swapping the disks, execute the following command.
If no error is output in a8), the steps a8'), a9'), and a10') are not required.
# sdxdisk -R -c Class1 -d Disk2
a9') Swap the disks.
a10') After swapping the disks, execute the following command.
# sdxdisk -M -c Class1 -d c1t2d0=Disk2
a11) Connect the swapped disk to the group, referring to the group information recorded in a2). If the groups were in a hierarchical structure, connect the groups in an ascending order.
# sdxdisk -C -c Class1 -g Group2 -d Disk2
a12) Create volumes with the sdxvolume command again. For the -s option, use the size recorded in a3), in this example, 32768 and 4128768.
# sdxvolume -M -c Class1 -g Group1 -v Volume1 -s 32768 -a pslice=off
a13) Finally, restore the backup data to Volume1 and Volume2.
b. When the error cause is a non-disk component failure or defect
Shut down the system once, recover the failed component, and then reboot the system. Slice data is valid and there is no need to restore the data.
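The disconnection order in step a7) and the reconnection order in step a11) follow directly from the group hierarchy recorded in step a2). The following sketch derives both orders from a child-to-parent mapping. The mapping is typed in by hand from the sdxinfo -G output above, and the function name is hypothetical.

```python
# Child -> parent relationships taken from the sdxinfo -G output in step a2).
PARENT_OF = {
    "Group2": "Group1",
    "Group3": "Group1",
    "Disk1": "Group2",
    "Disk2": "Group2",
    "Disk3": "Group3",
    "Disk4": "Group3",
}


def disconnect_order(parent_of, faulty_disk):
    """Return (child, parent) pairs to disconnect, highest level first."""
    chain = []
    seen = set()
    child = faulty_disk
    while child in parent_of and child not in seen:
        seen.add(child)  # guard against a malformed, cyclic mapping
        chain.append((child, parent_of[child]))
        child = parent_of[child]
    return list(reversed(chain))


down = disconnect_order(PARENT_OF, "Disk2")
up = list(reversed(down))  # reconnect in ascending order, lowest level first
print("disconnect:", down)
print("reconnect: ", up)
```

For Disk2 this yields Group2 from Group1 first and then Disk2 from Group2, which matches the order described in step a7).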