A.3.2 Actions to take when switching is interrupted

If the switch is interrupted while the system is operating with the smart workload recovery feature, check the Lambda function log for error messages about the interruption. To check for error messages, use the AWS Management Console and do the following:

1. From the Region Name drop-down list, select the region in which your system is running.
2. Open the Amazon CloudWatch service administration screen and, in the sidebar, choose "Log Group" from the "Log" menu.
3. From the list of log groups displayed in the Log Groups screen, choose the log group whose name has the following format and includes the name of the Lambda function used on your production system.
```
/aws/lambda/< Lambda function name >
```
4. From the list of log streams displayed in the "Log Stream" block, select the stream that contains logs from around the time when the switch was interrupted on the active system. Use the "Last event time" column to confirm that this is the correct log stream.
5. Check the list of messages displayed on the screen for error messages.
6. Refer to "8.1.4 Error (ERROR) Message" in "PRIMECLUSTER Messages" and take action according to the error message identified in step 5.
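The log group naming rule and the log stream selection above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample stream data, and the "name" and "last_event" keys are hypothetical and stand in for what the console shows in the "Last event time" column.

```python
from datetime import datetime, timezone


def log_group_name(function_name: str) -> str:
    """Build the log group name in the /aws/lambda/<Lambda function name> format."""
    return f"/aws/lambda/{function_name}"


def closest_stream(streams, interrupted_at):
    """Return the stream whose last event time is nearest to the interruption time.

    `streams` is a list of dicts with the hypothetical keys "name" and
    "last_event" (a timezone-aware datetime).
    """
    return min(
        streams,
        key=lambda s: abs((s["last_event"] - interrupted_at).total_seconds()),
    )


# Hypothetical sample data for illustration only.
interrupted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
streams = [
    {"name": "stream-a", "last_event": datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)},
    {"name": "stream-b", "last_event": datetime(2024, 1, 1, 12, 3, tzinfo=timezone.utc)},
]
print(log_group_name("my-switch-function"))
print(closest_stream(streams, interrupted)["name"])
```

Picking the single nearest stream is a simplification. In practice you may need to open neighboring streams as well if the interruption spans more than one.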

A.3.2.2 Recovering According to Instance State

If the error message identified in "A.3.2.1 Reviewing Error Messages" requires you to recover the instance, recover the instance whose switch was interrupted. To recover the instance, use the AWS Management Console and perform the following steps:

A.3.2.2.1 Deleting a Source Instance

To delete the source instance, use the AWS Management Console and do the following.

This procedure is not necessary if the source instance does not appear in the list of instances when you open the Amazon EC2 service administration screen, or if the state of the source instance is terminated.

1. Set the value of the tag "fujitsu.pclswr.is_recovery_target" of the source instance to false.
2. Stop the source instance.
   This step is not required if the instance state is stopped.
3. Note the integer value that identifies the cluster node, stored in the tag "fujitsu.pclswr.id" of the source instance that you stopped in step 2. The noted value is used in "A.3.2.2.5 Creating the target instance" when the error message does not identify the cluster node of the source instance. Do not use it if the error message identifies the cluster node.
4. Terminate the instance that you stopped in step 2.
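For readers who prefer to script this procedure, the following is a minimal sketch that assumes `ec2` is a boto3 EC2 client (for example, one created with `boto3.client("ec2")`). The function name `retire_source_instance` is hypothetical. The sketch does not handle a source instance that no longer exists, in which case the procedure is not needed at all.

```python
RECOVERY_TAG = "fujitsu.pclswr.is_recovery_target"
NODE_ID_TAG = "fujitsu.pclswr.id"


def retire_source_instance(ec2, instance_id):
    """Disable recovery, stop, note the node ID, then terminate the instance.

    `ec2` is assumed to be a boto3 EC2 client. Returns the integer stored in
    the "fujitsu.pclswr.id" tag, or None if the tag is missing or the
    instance is already terminated.
    """
    described = ec2.describe_instances(InstanceIds=[instance_id])
    instance = described["Reservations"][0]["Instances"][0]
    state = instance["State"]["Name"]
    if state == "terminated":
        return None  # The procedure is not necessary in this state.

    # Step 1: exclude the source instance from recovery.
    ec2.create_tags(
        Resources=[instance_id],
        Tags=[{"Key": RECOVERY_TAG, "Value": "false"}],
    )

    # Step 2: stop the instance unless it is already stopped.
    if state != "stopped":
        ec2.stop_instances(InstanceIds=[instance_id])
        ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Step 3: note the integer that identifies the cluster node.
    tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
    node_id = int(tags[NODE_ID_TAG]) if NODE_ID_TAG in tags else None

    # Step 4: terminate the stopped instance.
    ec2.terminate_instances(InstanceIds=[instance_id])
    return node_id
```

The returned value corresponds to the integer noted in step 3, so keep it for "A.3.2.2.5 Creating the target instance" if the error message does not identify the cluster node.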

A.3.2.2.2 Checking the Existence of the target instance

To verify whether the destination instance exists, check whether the error message contains "new instance_id = <new_instance_id>", and then use the AWS Management Console to do one of the following:

If the Error Message Contains "new instance_id = <new_instance_id>"
If the Error Message Does Not Contain "new_instance_id = <new_instance_id>"

If the Error Message Contains "new instance_id = <new_instance_id>"

Depending on the instance ID of the destination instance given in <new_instance_id>, do the following:

If <new_instance_id> is not None

Verify that an instance with the same instance ID exists.

1. Open the Amazon EC2 service administration screen and choose Instances from the sidebar.
2. In the list of displayed instances, check whether the Instance ID column contains the instance ID of the destination instance. If the same instance ID is found, the destination instance exists.

Proceed according to the following table, depending on whether the destination instance exists.

Destination instance	Next Steps
Exists	Proceed to "A.3.2.2.3 Actions to take depending on the status of the target instance".
Does not exist	Proceed to "A.3.2.2.5 Creating the target instance".

If <new_instance_id> is None

Verify that an instance exists whose tag holds the same integer value identifying the cluster node.

1. Open the Amazon EC2 service administration screen and choose Tags from the sidebar.
2. Select the [Manage Tags] button.
3. Ensure that Instance is selected in the filter drop-down list.
4. In the list of displayed instances, find the instance whose value in the "fujitsu.pclswr.id" column matches the following tag, which was set in "A.2.3.1 Creating Cluster Node Instance", and note the value in its "Instance ID" column.

Key	Value
fujitsu.pclswr.id	Same as the system_id value in the message

5. From the sidebar, select "Instance". In the list of displayed instances, check whether the "Instance ID" column contains the instance ID you noted in step 4. If the same instance ID is found, the destination instance exists.

Proceed according to the following table, depending on whether the destination instance exists.

Destination instance	Next Steps
Exists	Proceed to "A.3.2.2.3 Actions to take depending on the status of the target instance".
Does not exist	Proceed to "A.3.2.2.5 Creating the target instance".

If the Error Message Does Not Contain "new_instance_id = <new_instance_id>"

The destination instance does not exist.

Proceed to "A.3.2.2.5 Creating the target instance".
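The branching in this section can be summarized as a small decision helper. The sketch below is illustrative only: the function names are hypothetical, the sample messages in the usage lines are invented, and the pattern accepts both the "new instance_id" and "new_instance_id" spellings because both appear in this section.

```python
import re

# Accepts "new instance_id = ..." and "new_instance_id = ..." (assumption).
_NEW_ID = re.compile(r"new[ _]instance_id\s*=\s*(\S+)")

CREATE = "A.3.2.2.5 Creating the target instance"
CHECK_STATE = "A.3.2.2.3 Actions to take depending on the status of the target instance"


def parse_new_instance_id(message):
    """Return (present, instance_id); instance_id is None when the message says None."""
    match = _NEW_ID.search(message)
    if match is None:
        return False, None
    value = match.group(1).rstrip(".,;)")
    return True, (None if value == "None" else value)


def next_section(message, destination_exists):
    """Pick the next section from the message and the result of the console check.

    `destination_exists` is the outcome of the manual check by instance ID
    (when an ID is given) or by the "fujitsu.pclswr.id" tag (when it is None).
    """
    present, _ = parse_new_instance_id(message)
    if not present or not destination_exists:
        return CREATE
    return CHECK_STATE


# Invented sample messages for illustration only.
print(parse_new_instance_id("switch interrupted: new instance_id = i-0abc123"))
print(next_section("switch interrupted: new_instance_id = None", False))
```

The helper only encodes the routing. Whether the destination instance exists still has to be confirmed in the console as described above.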

A.3.2.2.3 Actions to take depending on the status of the target instance

Check the status of the instance whose existence you confirmed in "A.3.2.2.2 Checking the Existence of the target instance", and take appropriate action. The procedure is as follows:

1. Check the status of the destination instance.
   From the AWS Management Console, open the Amazon EC2 service administration screen and check the Instance State column in the displayed list of instances.
2. Take action according to the state of the destination instance, as shown in the table below.

Instance State	Action
terminated	Because the destination instance does not exist, skip to step 1 in "A.3.2.2.5 Creating the target instance".
running	Because the destination instance was created successfully, skip to step 2 in "A.3.2.2.5 Creating the target instance". Note the instance ID of the destination instance from <new_instance_id> in the error message.
pending	Because the destination instance was not created successfully, proceed to "A.3.2.2.4 Deleting a target instance".
Other than the above	Because an unexpected error might have occurred while creating the instance, collect the troubleshooting information and contact your service representative (SE). If you need to switch the system urgently, collect the troubleshooting information and proceed to "A.3.2.2.4 Deleting a target instance".
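The table above maps directly onto a lookup. The following sketch is one hypothetical way to encode it. The function name, the `urgent` flag, and the tuple layout are assumptions made for illustration.

```python
CREATE = "A.3.2.2.5 Creating the target instance"
DELETE = "A.3.2.2.4 Deleting a target instance"


def action_for_state(state, urgent=False):
    """Return (next_section, start_step, contact_support) for an instance state.

    next_section is None when the only action is to collect the
    troubleshooting information and contact the service representative.
    """
    state = state.lower()
    if state == "terminated":
        return CREATE, 1, False   # Destination does not exist: start at step 1.
    if state == "running":
        return CREATE, 2, False   # Destination was created: start at step 2.
    if state == "pending":
        return DELETE, None, False
    # Any other state: possible unexpected error during instance creation.
    return (DELETE if urgent else None), None, True


print(action_for_state("running"))
print(action_for_state("stopping", urgent=True))
```

For the running state, remember to note the instance ID from the error message, because the sketch returns only the routing.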

A.3.2.2.4 Deleting a target instance

To delete the destination instance, use the AWS Management Console and do the following:

1. Open the "Amazon EC2 Service" administration screen and select "Instances" from the sidebar.
2. In the list of displayed instances, select the check box for the destination instance.
3. From the Instance State drop-down list, select Terminate Instance.

A.3.2.2.5 Creating the target instance

To create the destination instance, use the AWS Management Console and do the following:

1. Create a destination instance from the AMI.
   Create the destination instance as described in "A.2.3.1 Creating Cluster Node Instance", and then note its instance ID. This instance ID is used in step 2 below, in "A.3.2.2.6 Configuring CloudWatch Alarm", and in step 2 of "A.3.2.2.7 Configuring Amazon DynamoDB".
2. Configure the target group for the ELB.
   This step is not required if the destination instance has already been added to the ELB target group.
   Add the instance ID of the destination instance to the ELB target group.
   For information about configuring target groups, see "A.2.3.2 Configuring Network Takeover". However, skip Step 1 of "A.2.3.2 Configuring Network Takeover" and add the instance to an existing target group.
   If you noted the instance ID of the destination instance in step 2 of "A.3.2.2.3 Actions to take depending on the status of the target instance", add that instance ID to the target group.

A.3.2.2.6 Configuring CloudWatch Alarm

Configure CloudWatch alarms.

If CloudWatch alarms are already configured for the target instance, this step is not required.

Set the instance ID of the switched instance as the instance name (InstanceID) of the CloudWatch alarm.

For information about configuring CloudWatch alarms, see "A.2.4.3 Configuring CloudWatch Alarm".

If you noted the instance ID of the switched instance in step 2 of "A.3.2.2.3 Actions to take depending on the status of the target instance", set that instance ID in the CloudWatch alarm.

A.3.2.2.7 Configuring Amazon DynamoDB

1. Review the tables in Amazon DynamoDB.
   Verify that the table "Fujitsu-Pclswr-DB-Switcher" contains an item with the following attributes:

Attribute name	Value
SystemID	Same value as the tag "fujitsu.pclswr.id" of the cluster node instance
InstanceID	Same as the <instance id> of the source instance in the error message

2. Update the table in Amazon DynamoDB according to the results of step 1.
- If the item checked in step 1 exists:
  Update that item in the table "Fujitsu-Pclswr-DB-Switcher".
  Update the value of the attribute "InstanceID" to the instance ID of the switched instance.
  If you noted the instance ID of the switched instance in step 2 of "A.3.2.2.3 Actions to take depending on the status of the target instance", set that instance ID in the attribute "InstanceID".
  If the value of the attribute "State" is SWITCHING, update it to NOT_SWITCHED.

Attribute name	Value
SystemID	Same value as the tag "fujitsu.pclswr.id" of the cluster node instance
InstanceID	Instance ID of the switched instance
State	NOT_SWITCHED

- If the item checked in step 1 does not exist:
  Add the following item to the table "Fujitsu-Pclswr-DB-Switcher". If you noted the instance ID of the switched instance in step 2 of "A.3.2.2.3 Actions to take depending on the status of the target instance", set that instance ID in the attribute "InstanceID".

Attribute name	Value
SystemID	Same value as the tag "fujitsu.pclswr.id" of the cluster node instance
InstanceID	Instance ID of the switched instance
State	NOT_SWITCHED
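The update rule for the "Fujitsu-Pclswr-DB-Switcher" item can be expressed as a pure function over plain dictionaries. This is a sketch under assumptions: the function name is hypothetical, the attribute values are shown as simple Python values rather than DynamoDB typed attributes, and the data type of SystemID is not stated in this section.

```python
def updated_item(existing, system_id, new_instance_id):
    """Return the item to store after an interrupted switch.

    `existing` is the item found in step 1 as a dict, or None if no item
    with the expected SystemID and InstanceID exists.
    """
    if existing is None:
        # No matching item: add a new one with State NOT_SWITCHED.
        return {
            "SystemID": system_id,
            "InstanceID": new_instance_id,
            "State": "NOT_SWITCHED",
        }
    item = dict(existing)  # Copy so the caller's dict is left unchanged.
    item["InstanceID"] = new_instance_id
    if item.get("State") == "SWITCHING":
        item["State"] = "NOT_SWITCHED"
    return item


# Hypothetical values for illustration only.
found = {"SystemID": 1, "InstanceID": "i-0old", "State": "SWITCHING"}
print(updated_item(found, 1, "i-0new"))
print(updated_item(None, 1, "i-0new"))
```

Keeping the rule free of AWS calls makes it easy to review before the result is written back through the console or another tool.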

A.3.2.3 Changing the settings of the switch destination AZ

If the error message identified in "A.3.2.1 Reviewing Error Messages" indicates that the switch was interrupted due to AZ resource exhaustion, take the following actions. Remove the subnet of the AZ whose resources are exhausted from the switch-destination AZs, and add a subnet of an AZ that has not experienced resource exhaustion to the switch-destination AZs. Use the AWS Management Console to do the following:

1. Create an instance in the switch-source AZ according to "A.3.2.2.5 Creating the target instance".
2. Change the value of the tag "fujitsu.pclswr.is_recovery_target" to false on the switch-destination subnet in the AZ that is experiencing resource exhaustion.
3. Note the integer values that identify the cluster nodes from the tag "fujitsu.pclswr.idlist" of the switch-destination subnet in the AZ where resource exhaustion occurred. The tag may hold multiple values. These values are used in step 4.
4. Configure the subnet of the AZ that is to be set as the switch destination.

- To add a new switch-destination subnet:
  1. Create a virtual system that contains the switch-destination subnet, as described in "A.2.1 Creating Virtual System". Make sure that the subnet you create can use API endpoints and mount targets for EFS.
  2. Configure network takeover, as described in "A.2.3.2 Configuring Network Takeover". Make sure that NLB and ALB are available on the subnet you created.
  3. Add the integer values that you noted in step 3 to the subnet tag "fujitsu.pclswr.idlist". This step is not required if they have already been added.
  4. Change the subnet tag "fujitsu.pclswr.is_recovery_target" to true. This step is not required if it has already been changed.

- To use an existing switch-destination subnet:
  1. Based on the current value of the subnet tag "fujitsu.pclswr.idlist" and the integer values that you noted in step 3, decide which cluster node values to set in the tag for the source and destination subnets so that the switch destinations span multiple AZs.
  2. Change the subnet tag "fujitsu.pclswr.is_recovery_target" to true.

Example

The following shows how to set the subnet tag "fujitsu.pclswr.idlist".

The examples below show the setting changes when resource exhaustion occurs in the AZ of SubnetB.

Note that SubnetA, SubnetB, and SubnetC are subnets in different AZs.

Example 1) Only one configured subnet

Change the value of the SubnetB tag "fujitsu.pclswr.is_recovery_target" to false and the SubnetC tag "fujitsu.pclswr.is_recovery_target" to true.

The following shows the settings before and after the change.

Table A.1 Previous setting (only one subnet has been set)
Subnet Name	Tag "fujitsu.pclswr.idlist" Value	Tag "fujitsu.pclswr.is_recovery_target" Value
SubnetA	1,2,3,4	true
SubnetB	1,2,3,4	true
SubnetC	tag "fujitsu.pclswr.idlist" value	false

Table A.2 New setting (only one subnet has been set)
Subnet Name	Tag "fujitsu.pclswr.idlist" Value	Tag "fujitsu.pclswr.is_recovery_target" Value
SubnetA	1,2,3,4	true
SubnetB	-	false
SubnetC	1,2,3,4	true

Example 2) Two or more configured subnets

Change the value of the SubnetB tag "fujitsu.pclswr.is_recovery_target" to false.

The following shows the settings before and after the change.

Table A.3 Previous setting (two or more configured subnets)
Subnet Name	Tag "fujitsu.pclswr.idlist" Value	Tag "fujitsu.pclswr.is_recovery_target" Value
SubnetA	1,3,4	true
SubnetB	1,2,4	true
SubnetC	2,3,4	true

Table A.4 New setting (two or more configured subnets)
Subnet Name	Tag "fujitsu.pclswr.idlist" Value	Tag "fujitsu.pclswr.is_recovery_target" Value
SubnetA	1,2,3,4	true
SubnetB	-	false
SubnetC	1,2,3,4	true
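The change from Table A.3 to Table A.4 can be reproduced with a short sketch. It assumes one possible policy inferred from those two tables: the exhausted subnet is cleared and excluded, and its cluster node values are merged into every remaining subnet that is a recovery target. The function names and the dictionary layout are hypothetical, and other distributions may be equally valid for your system.

```python
def retire_exhausted_subnet(subnets, exhausted):
    """Return new tag settings after excluding the exhausted subnet.

    `subnets` maps a subnet name to {"idlist": set of ints,
    "is_recovery_target": bool}. The input is not modified.
    """
    moved = subnets[exhausted]["idlist"]
    result = {}
    for name, cfg in subnets.items():
        if name == exhausted:
            # Exclude the exhausted subnet and clear its idlist.
            result[name] = {"idlist": set(), "is_recovery_target": False}
        elif cfg["is_recovery_target"]:
            # Merge the exhausted subnet's node values into this subnet.
            result[name] = {"idlist": cfg["idlist"] | moved, "is_recovery_target": True}
        else:
            result[name] = {"idlist": set(cfg["idlist"]), "is_recovery_target": False}
    return result


def format_idlist(ids):
    """Render an idlist the way the tables show it, with "-" for empty."""
    return ",".join(str(i) for i in sorted(ids)) if ids else "-"


# Values taken from Table A.3.
before = {
    "SubnetA": {"idlist": {1, 3, 4}, "is_recovery_target": True},
    "SubnetB": {"idlist": {1, 2, 4}, "is_recovery_target": True},
    "SubnetC": {"idlist": {2, 3, 4}, "is_recovery_target": True},
}
after = retire_exhausted_subnet(before, "SubnetB")
for name in sorted(after):
    print(name, format_idlist(after[name]["idlist"]), after[name]["is_recovery_target"])
```

With the Table A.3 values as input, the result matches Table A.4: SubnetA and SubnetC both hold 1,2,3,4 and SubnetB is excluded.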