This section describes the following operations and concepts required for the Storage Cluster function:
TFOV is a volume for which failover is enabled.
Among the TFOVs created on the Primary Storage and the Secondary Storage, those that have the same Host Logical Unit (HLU) number and capacity are treated as a pair, and their data is synchronized. In addition to the data being synchronized, the volume information on the Secondary Storage changes as shown in "Table 9.1 Change in Volume Information on Pre- and Post-Data Synchronization Secondary Storage".
Volume Information to be Changed | Pre-Data Synchronization | Post-Data Synchronization |
---|---|---|
UID | UID unique to volume on Secondary Storage. | UID of volume on Primary Storage of a pair. |
Product ID | Product ID unique to Secondary Storage. | Product ID of Primary Storage of a pair. |
Note
Even if a volume on the Secondary Storage is removed from synchronization by deleting the TFO Group after its data has been synchronized, the volume keeps its post-data synchronization volume information. Therefore, to continue using the volume on the Secondary Storage, use the ETERNUS CLI to return the volume information to its pre-data synchronization state.
Refer to the ETERNUS Disk storage system manuals for the name and format of the ETERNUS CLI command to use.
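The pairing rule and the change described in Table 9.1 can be illustrated with a minimal sketch. The `Tfov` record, its field names, the capacity unit, and the sample values are assumptions made for illustration only; they are not part of the product or of the ETERNUS CLI.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Tfov:
    """Hypothetical TFOV record; field names are illustrative only."""
    name: str
    hlu: int            # Host Logical Unit number
    capacity_mb: int    # capacity (unit chosen for this sketch)
    uid: str
    product_id: str


def find_sync_pairs(primary: List[Tfov], secondary: List[Tfov]) -> List[Tuple[Tfov, Tfov]]:
    """Return (primary, secondary) pairs whose HLU number and capacity match."""
    by_key = {(v.hlu, v.capacity_mb): v for v in secondary}
    pairs = []
    for p in primary:
        s = by_key.get((p.hlu, p.capacity_mb))
        if s is not None:
            pairs.append((p, s))
    return pairs


def after_synchronization(p: Tfov, s: Tfov) -> Tfov:
    """Secondary volume after synchronization: it presents the UID and
    Product ID of its Primary Storage partner (see Table 9.1)."""
    return replace(s, uid=p.uid, product_id=p.product_id)


# Made-up sample volumes: only HLU 0 matches in both HLU number and capacity.
primary = [Tfov("P0", 0, 1024, "UID-P0", "PROD-P"),
           Tfov("P1", 1, 2048, "UID-P1", "PROD-P")]
secondary = [Tfov("S0", 0, 1024, "UID-S0", "PROD-S"),
             Tfov("S1", 1, 4096, "UID-S1", "PROD-S")]

pairs = find_sync_pairs(primary, secondary)
synced = after_synchronization(*pairs[0])
print(synced)
```

In this sketch, the volume with HLU 1 is not paired because its capacity differs between the two storage systems.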
TFOV data is transferred in synchronization mode using the REC path.
An ETERNUS Disk storage system manages the copy sessions used by the Storage Cluster function separately from Advanced Copy sessions. Because the ETERNUS Disk storage system automatically controls the copy sessions used by the Storage Cluster function, you do not need to configure copy sessions or copy groups in this product.
Point
When a temporary REC route fault (communication break) occurs, a differential copy is executed after the REC route recovers, and the data is automatically restored to the equivalent state. Because failover does not occur until the REC route has recovered, we recommend a redundant REC route configuration.
Failover Mode determines how failover from the Primary Storage to the Secondary Storage is performed. Either of the following can be selected:
Mode | Explanation |
---|---|
Auto | Failover runs automatically when a failure of the Primary Storage is detected. |
Manual | Failover is run manually. |
Note
Set this mode to "Manual" when the REC path interface type is "iSCSI".
Failback Mode determines how failback from the Secondary Storage to the Primary Storage is performed. Either of the following can be specified:
Mode | Explanation |
---|---|
Auto | Failback runs automatically when recovery of the Primary Storage from a failure is detected. |
Manual | Failback is run manually. |
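The mode values and the iSCSI restriction from the Note above can be checked with a small validation routine. This is a sketch only: `TfoPolicy` and `check_policy` are hypothetical names, and the product itself may report such conditions differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TfoPolicy:
    """Hypothetical container for the settings discussed in this section."""
    failover_mode: str    # "Auto" or "Manual"
    failback_mode: str    # "Auto" or "Manual"
    rec_interface: str    # REC path interface type: "FC" or "iSCSI"


def check_policy(policy: TfoPolicy) -> List[str]:
    """Return a list of problems; an empty list means the policy looks valid."""
    problems = []
    for label, mode in (("Failover Mode", policy.failover_mode),
                        ("Failback Mode", policy.failback_mode)):
        if mode not in ("Auto", "Manual"):
            problems.append(f'{label} must be "Auto" or "Manual", not "{mode}"')
    # Rule from the Note: with an iSCSI REC path, Failover Mode must be "Manual".
    if policy.rec_interface == "iSCSI" and policy.failover_mode != "Manual":
        problems.append('Failover Mode must be "Manual" when the REC path '
                        'interface type is "iSCSI"')
    return problems


for p in (TfoPolicy("Auto", "Auto", "FC"), TfoPolicy("Auto", "Manual", "iSCSI")):
    print(p, check_policy(p) or "OK")
```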
Split Mode specifies how volumes in the Primary Storage behave when the REC path is disconnected: whether to give priority to business continuity and continue accepting writes, or to preserve the equivalent state of the data on the Primary Storage and the Secondary Storage.
Specify either of the following:
"Read/Write"(default)
Gives priority to business continuity and continues writing data to the volumes in the Primary Storage.
In this case, data is written only to the volumes in the Primary Storage, so the data is no longer equivalent to the data in the Secondary Storage.
"Read"
Gives priority to maintaining the equivalent state of the data and inhibits writing data to the volumes in the Primary Storage.
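The effect of the two Split Mode settings can be modeled as follows. This is a simplified sketch under stated assumptions: the `PrimaryVolume` class and the set of unsynchronized blocks are illustrative, and the source does not describe how the storage system actually tracks differences for the later differential copy.

```python
from enum import Enum


class SplitMode(Enum):
    READ_WRITE = "Read/Write"   # default: keep accepting writes
    READ = "Read"               # inhibit writes to keep the data equivalent


class PrimaryVolume:
    """Toy model of a Primary Storage volume (hypothetical, for illustration)."""

    def __init__(self, split_mode: SplitMode = SplitMode.READ_WRITE):
        self.split_mode = split_mode
        self.rec_path_connected = True
        self.blocks = {}
        self.unsynced_blocks = set()   # blocks written while the REC path was down

    def write(self, block_no: int, data: bytes) -> bool:
        """Return True if the write is accepted, False if it is inhibited."""
        if not self.rec_path_connected and self.split_mode is SplitMode.READ:
            return False               # "Read": writing is inhibited
        self.blocks[block_no] = data
        if not self.rec_path_connected:
            self.unsynced_blocks.add(block_no)   # data is now nonequivalent
        return True

    def is_equivalent(self) -> bool:
        """True while no write has been accepted without the REC path."""
        return not self.unsynced_blocks


vol = PrimaryVolume()                  # default Split Mode is "Read/Write"
vol.rec_path_connected = False         # simulate an REC path disconnection
print(vol.write(7, b"data"), vol.is_equivalent())
```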
A TFO Group is the unit in which failover operates on one device. It consolidates the connection configuration, policy, status, and maintenance information required to perform failover. A TFO Group includes one or more CA ports and the volumes that can be accessed through those CA ports. An example of a TFO Group is shown in "Figure 9.2 Example of TFO Group".
Figure 9.2 Example of TFO Group
Point
The input conditions for the TFO Group name are as follows:
The name must be 1 to 16 characters long and consist of single-byte alphanumeric characters (A-Z, a-z, 0-9) and special characters. However, ", ? " ' \ * %" cannot be used.
A TFO Group has one of the following statuses. The TFO status changes when failover or failback is executed:
TFO Status | Meaning |
---|---|
Active | Indicates an active side. Accessible from Management Server. |
Standby | Indicates a standby side. Inaccessible from Management Server. |
A TFO Group whose default TFO status is Active when the environment is created is called the "Primary TFO Group"; one whose default TFO status is Standby is called the "Secondary TFO Group".
The Storage Cluster function achieves failover by sharing the WWPN/WWNN between the CA ports of two ETERNUS Disk storage systems and controlling the Link status of each CA port.
Each CA port included in one TFO Group shares a WWPN/WWNN with a CA port included in the TFO Group on the other storage system. This sharing operation is referred to as "Pairing of CA ports", and a pair of CA ports sharing a WWPN/WWNN is referred to as a "CA port pair".
When CA ports are paired, the WWPN/WWNN of the Primary Storage CA port is set as the logical WWPN/WWNN of the Secondary Storage CA port, and the Secondary Storage CA port goes to Linkdown.
An example of a CA port pair is shown in "Figure 9.3 Example of CA Port Pair".
Figure 9.3 Example of CA Port Pair
Automatic Failover is a function that automatically makes the Secondary TFO Group active when a failure is detected in the ETERNUS Disk storage system that contains the Primary TFO Group.
To perform Automatic Failover, a Storage Cluster Controller connected to the management LAN is required. In addition, the interface type of the REC path must be "FC".
When Storage A (Primary Storage) and Storage B (Secondary Storage) are in use and Storage A goes down, failover to Storage B is performed. This operation is illustrated in "Figure 9.4 Operation of Automatic Failover".
Figure 9.4 Operation of Automatic Failover
As the link of each CA port is switched, the status of each TFO Group is also switched automatically, so that the volumes in the Secondary TFO Group become accessible.
Automatic Failback is a function that automatically makes the Primary TFO Group "Active" when recovery from the failure of the ETERNUS Disk storage system that contains the Primary TFO Group is detected.
Deconstruction of Storage Cluster Environment
If a problem occurs that requires an ETERNUS Disk storage system to be replaced, deconstruct the Storage Cluster environment.
To deconstruct the Storage Cluster environment, delete the TFO Groups.
When deleting TFO Groups, select one of the following actions for handling the WWPN/WWNN of the Secondary Storage CA port:
a. Return the Secondary Storage CA port to its original WWPN/WWNN.
b. Do not return the Secondary Storage CA port to its original WWPN/WWNN; continue to use the logical WWPN/WWNN.
If Step "a" is selected, the WWPN/WWNN of the Secondary Storage CA port does not conflict with the WWPN/WWNN of the Primary Storage CA port.
If Step "b" is selected, the device operated as the Primary Storage can be replaced while the Secondary Storage remains accessible from the Management Server.
Note
If the Primary Storage CA port and the Secondary Storage CA port both become active, their WWPN/WWNN conflict with each other, which may cause data corruption. Therefore, when selecting Step "b", observe the following rules:
Do not delete the TFO Group until you have confirmed that the Primary Storage CA port is physically disconnected from the SAN.
Do not connect to the SAN the ETERNUS Disk storage system whose Primary TFO Group was deleted.
Control of Link Status of Primary/Secondary Storage CA Ports
The WWPN/WWNN and the Link status of the CA ports change with CA port pairing and failover. A device whose Link status is Linkup is accessible from the Management Server.
The changes in the Link status of each CA port caused by CA port pairing and failover/failback operations are shown in "Table 9.3 Change in Link Status of CA Port".
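Primary Storage WWPN/WWNN | Primary Storage Link Status | Timing | Secondary Storage Link Status | Secondary Storage WWPN/WWNN |
---|---|---|---|---|
WWPN/WWNN for Primary Storage | Linkup | Pre-CA port pairing | Linkup | WWPN/WWNN for Secondary Storage |
WWPN/WWNN for Primary Storage | Linkup | Post-CA port pairing | Linkdown | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkdown | Primary Storage stops | Linkdown | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkdown | Under failover | Linkdown | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkdown | Failover completed | Linkup | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkdown | Primary Storage recovered | Linkup | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkdown | Failback started | Linkup | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkdown | Under failback | Linkdown | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkup | Failback completed | Linkdown | WWPN/WWNN for Primary Storage |
WWPN/WWNN for Primary Storage | Linkup | Storage Cluster deconstructed (*) | Linkup | WWPN/WWNN for Secondary Storage |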
*: When returning the WWPN/WWNN of Secondary Storage CA port to its original status.
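The Link status changes listed in Table 9.3 can also be expressed as a lookup, which makes it easy to check which device is accessible at each timing. This is an illustrative sketch: `PortState`, `accessible_storages`, and `has_wwn_conflict` are hypothetical names, and the table contents are transcribed on the assumption that each merged cell applies to all the rows it spans.

```python
from typing import Dict, List, NamedTuple


class PortState(NamedTuple):
    primary_link: str     # Link status of the Primary Storage CA port
    secondary_link: str   # Link status of the Secondary Storage CA port
    secondary_wwn: str    # whose WWPN/WWNN the Secondary Storage CA port uses


# Transcription of Table 9.3; the Primary Storage CA port always uses its own WWPN/WWNN.
LINK_STATUS: Dict[str, PortState] = {
    "Pre-CA port pairing":           PortState("Linkup",   "Linkup",   "Secondary"),
    "Post-CA port pairing":          PortState("Linkup",   "Linkdown", "Primary"),
    "Primary Storage stops":         PortState("Linkdown", "Linkdown", "Primary"),
    "Under failover":                PortState("Linkdown", "Linkdown", "Primary"),
    "Failover completed":            PortState("Linkdown", "Linkup",   "Primary"),
    "Primary Storage recovered":     PortState("Linkdown", "Linkup",   "Primary"),
    "Failback started":              PortState("Linkdown", "Linkup",   "Primary"),
    "Under failback":                PortState("Linkdown", "Linkdown", "Primary"),
    "Failback completed":            PortState("Linkup",   "Linkdown", "Primary"),
    "Storage Cluster deconstructed": PortState("Linkup",   "Linkup",   "Secondary"),
}


def accessible_storages(timing: str) -> List[str]:
    """Storages whose CA port is Linkup, i.e. accessible from the Management Server."""
    state = LINK_STATUS[timing]
    result = []
    if state.primary_link == "Linkup":
        result.append("Primary")
    if state.secondary_link == "Linkup":
        result.append("Secondary")
    return result


def has_wwn_conflict(state: PortState) -> bool:
    """True if both ports are Linkup while presenting the same WWPN/WWNN."""
    return (state.primary_link == "Linkup" and state.secondary_link == "Linkup"
            and state.secondary_wwn == "Primary")


for timing in LINK_STATUS:
    print(f"{timing}: {accessible_storages(timing)}")
```

In this transcription no timing has both CA ports in Linkup with the same WWPN/WWNN, which is the condition the preceding Note warns against when Step "b" is selected.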
Storage Cluster Controller
To perform Automatic Failover, a Storage Cluster Controller connected to the management LAN is required.
The two ETERNUS Disk storage systems use the REC path to check whether each other is alive. If the REC path is disconnected, failover may be triggered by a false detection even though both ETERNUS Disk storage systems are running. To prevent this false detection, install a Storage Cluster Controller that communicates with both the Primary and the Secondary ETERNUS Disk storage systems over the management LAN.
An example configuration of the Primary and Secondary ETERNUS Disk storage systems and the Storage Cluster Controller is shown in "Figure 9.5 Structure Example of Life Check via Storage Cluster Controller". For this configuration, the communication status, the device status, and the timing of Automatic Failover are shown in "Table 9.4 Timing of Automatic Failover Operation".
Figure 9.5 Structure Example of Life Check via Storage Cluster Controller
No. | Communication Status (1) | Communication Status (2) | Communication Status (3) | Device Status: Primary Storage | Device Status: Secondary Storage | Timing of Automatic Failover Operation and Status Change |
---|---|---|---|---|---|---|
1 | Y | Y | Y | Alive | Alive | N/A |
2 | N | Y | Y | Alive | Alive | N/A |
3 | Y | N | Y | Alive | Alive | N/A |
4 | Y | Y | N | Alive | Alive | N/A |
5 | N | N | Y | Down | Alive | A Primary Storage: Active -> Standby |
6 | N | Y | N | Alive | Down | N/A |
7 | Y | N | N | Alive | Alive | N/A |
8 | N | N | N | Down | Down | N/A (all blocked) |
Y: Communication enabled status
N: Communication disabled status
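The device status column of Table 9.4 is consistent with a simple rule: a storage system is judged Down only when both routes leading to it are blocked. The sketch below transcribes the table and checks that rule against it. It assumes that route (3) is the route between the Secondary Storage and the Storage Cluster Controller, which this section does not state explicitly, and that the status of merged cells applies to all the rows they span. The names `Row`, `TABLE_9_4`, and `judge` are hypothetical, and the rule is an observation about the table, not a description of the product's internal logic.

```python
from typing import Dict, NamedTuple, Tuple


class Row(NamedTuple):
    primary: str          # "Alive" or "Down"
    secondary: str        # "Alive" or "Down"
    status_change: bool   # True only where the table lists a status change


# Keys are (route 1, route 2, route 3); True = Y (communication enabled).
TABLE_9_4: Dict[Tuple[bool, bool, bool], Row] = {
    (True,  True,  True):  Row("Alive", "Alive", False),   # No. 1
    (False, True,  True):  Row("Alive", "Alive", False),   # No. 2
    (True,  False, True):  Row("Alive", "Alive", False),   # No. 3
    (True,  True,  False): Row("Alive", "Alive", False),   # No. 4
    (False, False, True):  Row("Down",  "Alive", True),    # No. 5
    (False, True,  False): Row("Alive", "Down",  False),   # No. 6
    (True,  False, False): Row("Alive", "Alive", False),   # No. 7
    (False, False, False): Row("Down",  "Down",  False),   # No. 8 (all blocked)
}


def judge(route1: bool, route2: bool, route3: bool) -> Row:
    """Derive the row from the route states.

    route1: Primary Storage <-> Secondary Storage
    route2: Primary Storage <-> Storage Cluster Controller
    route3: assumed to be Secondary Storage <-> Storage Cluster Controller
    """
    primary = "Alive" if (route1 or route2) else "Down"
    secondary = "Alive" if (route1 or route3) else "Down"
    # Only case No. 5 (Primary Down, Secondary Alive) has a status change entry.
    return Row(primary, secondary, primary == "Down" and secondary == "Alive")


for routes, expected in TABLE_9_4.items():
    print(routes, judge(*routes), judge(*routes) == expected)
```

The timing conditions in the Note that follows (routes failing a few seconds apart) are not modeled here.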
Note
Automatic Failover will not operate in the following cases:
When "route (1) between the Primary Storage and Secondary Storage" failed 10 seconds after "route (2) between the Primary Storage and Storage Cluster Controller" had failed.
When "route (2) between the Primary Storage and Storage Cluster Controller" failed 3 seconds after "route (1) between the Primary Storage and Secondary Storage" had failed.
The Storage Cluster Controller and each monitored ETERNUS Disk storage system monitor each other. Therefore, if the Storage Cluster Controller and the monitored ETERNUS Disk storage systems are placed in the same building, the following problem could occur:
If the building is struck by a disaster, all the paths are blocked and failover cannot be performed.
To prevent this problem, we recommend placing the Storage Cluster Controller and each monitored ETERNUS Disk storage system in separate buildings. A location example is shown in "Figure 9.6 Location Example of Storage Cluster Controller and Monitored ETERNUS Disk Storage Systems".
The Storage Cluster Controller can also be installed on the same server as the Management Server.
Figure 9.6 Location Example of Storage Cluster Controller and Monitored ETERNUS Disk Storage Systems