3.1.1 Standby Operation

Oracle instance startup

Startup procedure of an Oracle instance is as follows:

1. su - <Oracle user>
2. sqlplus /nolog
3. connect / as sysdba
4. startup nomount or startup mount
5. alter database mount (if “startup nomount” was executed at step 4)
6. alter database open
7. alter pluggable database all open (if UsePDB of the Oracle instance resource is set to yes)
   In an Oracle Data Guard/Oracle Active Data Guard environment, this command is executed when the CDB has been started to an OPEN state in which the PDBs can be started. For details, see "Starting and Stopping CDB and PDB" in "G.1 Feature Outline".
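
The ordering of the statements above can be sketched in Python. This is only an illustration of the documented sequence; the function name `startup_statements` and its parameters are hypothetical and are not part of PRIMECLUSTER Wizard for Oracle.

```python
def startup_statements(start_mode="mount", use_pdb=False):
    """Return the SQL*Plus statements in the documented order.

    start_mode: "nomount" or "mount" (hypothetical parameter).
    use_pdb:    mirrors the UsePDB setting of the Oracle instance resource.
    """
    if start_mode not in ("nomount", "mount"):
        raise ValueError("start_mode must be 'nomount' or 'mount'")
    statements = ["connect / as sysdba", "startup " + start_mode]
    if start_mode == "nomount":
        # An explicit mount is needed only after "startup nomount".
        statements.append("alter database mount;")
    statements.append("alter database open;")
    if use_pdb:
        # Executed only when UsePDB is set to yes.
        statements.append("alter pluggable database all open;")
    return statements
```

For example, `startup_statements("nomount", use_pdb=True)` yields all five statements, while the default call omits the explicit mount and the PDB statement.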

Initialization parameter file and server parameter file
PRIMECLUSTER Wizard for Oracle does not specify an initialization parameter file when it starts an Oracle instance, so the default initialization parameter file is used. Place the initialization parameter file, or a symbolic link to it, at the following default path.
```
<$ORACLE_HOME>/dbs/init<$ORACLE_SID>.ora
```
The server parameter file must be located on the shared disk device because it is changed dynamically. When you use a server parameter file, specify its full path in the initialization parameter file. See “2.2.6 Oracle database Creation and Setting”.
It is recommended that the initialization parameter file settings be the same on the operating nodes and standby nodes.
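
As a rough illustration, the default path and the entry that points to a server parameter file could be derived as follows. The helper names and the example paths are assumptions made for this sketch; `SPFILE` is the Oracle initialization parameter that names a server parameter file.

```python
import posixpath


def default_pfile_path(oracle_home, oracle_sid):
    """Build <$ORACLE_HOME>/dbs/init<$ORACLE_SID>.ora."""
    return posixpath.join(oracle_home, "dbs", "init" + oracle_sid + ".ora")


def spfile_entry(spfile_path):
    """Line for the initialization parameter file that refers to a
    server parameter file by its full path (on the shared disk)."""
    if not posixpath.isabs(spfile_path):
        raise ValueError("the server parameter file path must be a full path")
    return "spfile=" + spfile_path


# Hypothetical values, for illustration only.
print(default_pfile_path("/u01/app/oracle/product/db", "ora1"))
print(spfile_entry("/mnt/shared/ora1/spfileora1.ora"))
```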
Recovery processing
PRIMECLUSTER Wizard for Oracle recovers the Oracle instance in the following cases:
1. When a tablespace is in the ACTIVE state in the V$BACKUP view.
2. When the V$RECOVER_FILE view lists files that require recovery.
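
The two conditions can be expressed as a simple check. The queries below are an assumption about how such a check might be written; the SQL that PRIMECLUSTER Wizard for Oracle actually issues is not described in this section.

```python
# Hypothetical queries against the two views named above.
ACTIVE_BACKUP_SQL = "SELECT COUNT(*) FROM V$BACKUP WHERE STATUS = 'ACTIVE'"
RECOVER_FILE_SQL = "SELECT COUNT(*) FROM V$RECOVER_FILE"


def needs_recovery(active_backup_count, recover_file_count):
    """Return True when either documented condition holds.

    active_backup_count: row count returned by ACTIVE_BACKUP_SQL.
    recover_file_count:  row count returned by RECOVER_FILE_SQL.
    """
    return active_backup_count > 0 or recover_file_count > 0
```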
DBA authentication
PRIMECLUSTER Wizard for Oracle connects to the Oracle instance with the SYSDBA system privilege to start up or stop the Oracle instance/database. A local connection with operating system authentication is used for this purpose.

Oracle instance shutdown

Shutdown procedure of an Oracle instance is as follows:

When users shut down and switch userApplication, the procedure is as follows:

1. su - <Oracle user>
2. sqlplus /nolog
3. connect / as sysdba
4. shutdown <immediate / abort / transactional> (set with StopModeStop)
   Default: immediate
5. If the Oracle instance is not stopped at step 4 (when a mode other than abort was specified), use shutdown abort.
6. If the Oracle instance is not stopped at step 4 or step 5, shut it down forcibly by sending SIGKILL to the background processes.

The procedure of stopping failed Oracle resources including non-Oracle resources is as follows:

1. su - <Oracle user>
2. sqlplus /nolog
3. connect / as sysdba
4. shutdown <immediate / abort> (set with StopModeFail)
   Default: abort
5. If the Oracle instance is not stopped at step 4 (when a mode other than abort was specified), use shutdown abort.
6. If the Oracle instance is not stopped at step 4 or step 5, shut it down forcibly by sending SIGKILL to the background processes.

Even if UsePDB of the Oracle instance resource is set to yes, the PDBs are not stopped individually. The PDBs stop when the CDB is stopped.
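
The escalation in both shutdown procedures can be summarized by the following sketch. The function name, its parameters, and the step labels are illustrative assumptions; only the modes and defaults come from the procedures above.

```python
def shutdown_plan(stop_mode=None, resource_failed=False):
    """Return the ordered escalation steps for stopping an Oracle instance.

    resource_failed=False models StopModeStop (default: immediate).
    resource_failed=True  models StopModeFail (default: abort).
    """
    if resource_failed:
        allowed, default = ("immediate", "abort"), "abort"
    else:
        allowed, default = ("immediate", "abort", "transactional"), "immediate"
    mode = default if stop_mode is None else stop_mode
    if mode not in allowed:
        raise ValueError("unsupported stop mode: " + mode)

    plan = ["shutdown " + mode]
    if mode != "abort":
        # Second attempt, used only if the first shutdown did not stop it.
        plan.append("shutdown abort")
    # Last resort when the instance is still running.
    plan.append("send SIGKILL to background processes")
    return plan
```

Each later entry is attempted only when the preceding one has not stopped the instance.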

Listener startup

Startup procedure of a Listener is as follows:

1. su - <Oracle user>
2. lsnrctl start <ListenerName>
3. Make sure that a listener process exists.

Listener shutdown

Shutdown procedure of a Listener is as follows:

1. su - <Oracle user>
2. lsnrctl stop <ListenerName>
3. Make sure that a listener process does not exist.
4. If the Listener is not stopped at step 3, shut it down forcibly by sending SIGKILL to the background process.

Oracle ASM instance startup

Oracle ASM instance startup procedure is as follows:

Oracle Database 10g R2/11g R1/11g R2/12c R1(12.1.0.1)
1. su - <Oracle user>
2. sqlplus /nolog
3. connect / as sysdba (In case of Oracle Database 11g or later, "connect / as sysasm")
4. startup mount (When the state is already STARTED, "alter diskgroup all mount;")

Oracle Database 12c R1 PSR12.1.0.2 or later
1. su - <Oracle user>
2. Confirm that Oracle Restart has started.
3. srvctl enable asm
4. srvctl start asm
5. srvctl disable asm

Oracle ASM instance shutdown

Oracle ASM instance shutdown procedure is as follows:

Oracle Database 10g R2/11g R1/11g R2/12c R1(12.1.0.1)
- When the operator manually shuts down or switches userApplication
  1. su - <Oracle user>
  2. sqlplus /nolog
  3. connect / as sysdba (In case of Oracle Database 11g or later, "connect / as sysasm")
  4. shutdown <immediate/abort/transactional> (set with "StopModeStop")
    default : immediate
  5. If Oracle ASM has not stopped in step 4, execute "shutdown abort" (when a mode other than "abort" was chosen in step 4).
  6. If Oracle ASM has not stopped in step 4 or 5, terminate the background processes by sending SIGKILL.
- When shutting down because of a resource failure (including failures of resources other than Oracle ASM)
  1. su - <Oracle user>
  2. sqlplus /nolog
  3. connect / as sysdba (In case of Oracle Database 11g or later, "connect / as sysasm")
  4. shutdown <immediate/abort> (set with "StopModeFail")
    default : abort
  5. If Oracle ASM has not stopped in step 4, execute "shutdown abort" (when a mode other than "abort" was chosen in step 4).
  6. If Oracle ASM has not stopped in step 4 or 5, terminate the background processes by sending SIGKILL.

Oracle Database 12c R1 PSR12.1.0.2 or later
- When the operator manually shuts down or switches userApplication
  1. su - <Oracle user>
  2. srvctl stop asm -stopoption <immediate/abort/transactional (set with "StopModeStop")> -force
    default : immediate
  3. If Oracle ASM has not stopped in step 2, execute the following (when a mode other than "abort" was chosen in step 2):
    sqlplus /nolog
    connect / as sysdba (In case of Oracle Database 11g or later, "connect / as sysasm")
    shutdown abort
  4. If Oracle ASM has not stopped in step 2 or 3, terminate the background processes by sending SIGKILL.
- When shutting down because of a resource failure (including failures of resources other than Oracle ASM)
  1. su - <Oracle user>
  2. srvctl stop asm -stopoption <immediate/abort (set with "StopModeFail")> -force
    default : abort
  3. If Oracle ASM has not stopped in step 2, execute the following (when a mode other than "abort" was chosen in step 2):
    sqlplus /nolog
    connect / as sysdba (In case of Oracle Database 11g or later, "connect / as sysasm")
    shutdown abort
  4. If Oracle ASM has not stopped in step 2 or 3, terminate the background processes by sending SIGKILL.
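
For Oracle Database 12c R1 PSR12.1.0.2 or later, the `srvctl` invocation in step 2 could be assembled as in the following sketch. The helper name and its parameters are hypothetical; the command-line options are the ones shown in the procedure above.

```python
def asm_stop_argv(stop_mode=None, resource_failed=False):
    """Build the argument list for "srvctl stop asm".

    resource_failed=False models StopModeStop (default: immediate).
    resource_failed=True  models StopModeFail (default: abort).
    """
    if resource_failed:
        allowed, default = ("immediate", "abort"), "abort"
    else:
        allowed, default = ("immediate", "abort", "transactional"), "immediate"
    mode = default if stop_mode is None else stop_mode
    if mode not in allowed:
        raise ValueError("unsupported stop mode: " + mode)
    return ["srvctl", "stop", "asm", "-stopoption", mode, "-force"]


print(" ".join(asm_stop_argv()))
print(" ".join(asm_stop_argv(resource_failed=True)))
```

Returning a list rather than a single string keeps the arguments separate, which suits `subprocess.run` without a shell.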

Monitoring Oracle instances

Monitoring procedure of an Oracle instance is as follows:

1. Check the background processes (PMON, SMON) periodically. Once the processes can be confirmed after the Oracle instance is activated, go to step 2.
2. su - <Oracle user>
3. Connect locally to the Oracle instance as the SYSTEM user.
4. If the database can be confirmed to be in the OPEN state, go to step 5.
5. Check that the background processes (PMON, SMON, DBWn, LGWR, CKPT) are alive.
   The monitoring interval can be changed with the "Interval" setting; its default value is 30 seconds.
6. Check that SQL (INSERT, UPDATE, DELETE and COMMIT) can be executed properly against the monitoring table on the SYSTEM user's default tablespace.
   SQL monitoring follows the "Interval" setting, but the elapsed time since the last SQL monitoring is checked first, and SQL monitoring is executed only when 60 seconds or more have passed.
7. Monitor the PDBs.
   When UsePDB of the Oracle instance resource is set to yes, PDB monitoring is executed in accordance with the "Interval" setting.
   The OPEN_MODE of each PDB is checked in the V$PDBS view.
   In an Oracle Data Guard/Oracle Active Data Guard environment, this step is executed when the CDB has been started to an OPEN state in which the PDBs can be started. For details, see "Starting and Stopping CDB and PDB" in "G.1 Feature Outline".
8. The connection to the Oracle instance is re-established once every 24 hours.

On the standby node, step 1 is executed to confirm that the background processes (PMON, SMON) do not exist.
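
The relationship between "Interval" and the 60-second rule for SQL monitoring can be illustrated with a small simulation. This sketch assumes that the first monitoring cycle runs the SQL check; that assumption, and the function name, are not taken from the product.

```python
SQL_MIN_GAP = 60  # seconds that must pass between two SQL checks


def sql_check_times(interval=30, duration=180):
    """Return the times (in seconds) at which SQL monitoring would run.

    interval: the "Interval" setting (default 30 seconds).
    duration: length of the simulated period in seconds.
    """
    times = []
    last_sql = None
    t = 0
    while t <= duration:
        # Assumption: the very first cycle runs the SQL check.
        if last_sql is None or t - last_sql >= SQL_MIN_GAP:
            times.append(t)
            last_sql = t
        t += interval
    return times


print(sql_check_times(30, 180))
print(sql_check_times(45, 180))
```

With the default interval of 30 seconds, the process check runs every cycle but the SQL check runs only every second cycle. With an interval of 45 seconds, the SQL check runs every 90 seconds.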

SYSTEM user password
PRIMECLUSTER Wizard for Oracle monitors Oracle instances as a SYSTEM user. Register the SYSTEM user’s password. See “4.3 clorapass - Register Password for Monitoring”.
Monitoring table (FAILSAFEORACLE_<ORACLE_SID>)
PRIMECLUSTER Wizard for Oracle creates a monitoring table on the SYSTEM user’s default tablespace if the monitoring table does not exist. The table is only a few bytes in size and will not be deleted.
Warning notification
If the following symptoms are detected, PRIMECLUSTER Wizard for Oracle will notify RMS of the warning state. It is not the Fault state, so a failover will not occur.
- The Oracle instance cannot be connected because the SYSTEM user’s password registered with the “clorapass” command is incorrect. (ORA-01017 detected)
- The Oracle instance cannot be connected because the SYSTEM user's account is locked. (ORA-28000 detected)
- The Oracle instance cannot be connected because the SYSTEM user's password has expired. (ORA-28001 detected)
- The Oracle instance cannot be connected because the maximum number of sessions or processes has been reached. (ORA-00018 or ORA-00020 detected)
- A monitoring timeout occurs because SQL returns no reply for a certain period of time.
  If the monitoring timeout occurs, SQL is executed again. If a reply from SQL is received, the Online state is notified.
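
The connection errors listed above can be represented as a lookup table, as in the following sketch. The mapping and the function name are illustrative only. The product's real error handling is driven by its action definition file.

```python
import re

# ORA codes that are reported as Warning according to the list above.
WARNING_ERRORS = {
    "ORA-01017": "incorrect SYSTEM user password",
    "ORA-28000": "SYSTEM user account locked",
    "ORA-28001": "SYSTEM user password expired",
    "ORA-00018": "maximum number of sessions reached",
    "ORA-00020": "maximum number of processes reached",
}


def warning_reason(message):
    """Return the reason if the message contains a Warning-class ORA code,
    otherwise None."""
    match = re.search(r"ORA-\d{5}", message)
    if match is None:
        return None
    return WARNING_ERRORS.get(match.group(0))


print(warning_reason("ORA-28000: the account is locked"))
```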

Oracle database errors that cause failover
If Oracle database errors are detected, PRIMECLUSTER Wizard for Oracle notifies RMS of the Offline state. The Oracle instance resource then enters the resource failure state and a failover occurs.
If the AutoRecover(A) flag of the Oracle instance resource is selected, the Oracle instance is restarted before failover when the Oracle instance resource failure occurs. For details about AutoRecover(A), refer to "2.2.7.2 Oracle Resource Creation".
In the following cases, the Offline state is notified to RMS:
- The background processes (PMON, SMON, DBWn, LGWR and CKPT) do not exist.
  Example
  The following cases apply:
  - Oracle instance terminates abnormally.
  - Oracle instance is stopped without stopping the monitoring.
- Oracle database errors (ORA-xxxxx) are returned after executing SQL.
  Oracle database errors (ORA-xxxxx) detected during monitoring will be handled in accordance with the action definition file (/opt/FJSVclora/etc/FJSVclorafm.actionlist).
  If the Oracle database errors defined as Of in the action definition file are detected, the Offline state is notified. See "Appendix H (Information) Action Definition File".
  Example
  The following case applies:
  - ORA-04031 (out of memory in the shared pool) occurs.
- The monitoring timeout occurs twice in a row after executing SQL.
  If SQL does not reply within 300 seconds (default), a monitoring timeout occurs and the Oracle instance resource enters the Warning state. PRIMECLUSTER Wizard for Oracle then reconnects to the Oracle instance. If there is again no reply within 300 seconds during reconnection, the Offline state is notified.
  The monitoring timeout can be changed with the "WatchTimeout" setting; its default value is 300 seconds.
  Example
  The following cases apply:
  - Oracle Database hangs up because archive logs run out of space.
  - The system load is too high.
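
The "twice in a row" rule for monitoring timeouts can be modeled as a counter of consecutive timeouts. The following sketch is a simplified model with hypothetical names; it ignores the other Offline conditions listed above.

```python
def states_after_checks(results):
    """Map a sequence of SQL check results to notified states.

    results: iterable of "ok" or "timeout", one entry per SQL check.
    A single timeout gives Warning; two consecutive timeouts give Offline.
    """
    states = []
    consecutive_timeouts = 0
    for result in results:
        if result == "timeout":
            consecutive_timeouts += 1
        else:
            # A reply from SQL resets the counter.
            consecutive_timeouts = 0
        if consecutive_timeouts == 0:
            states.append("Online")
        elif consecutive_timeouts == 1:
            states.append("Warning")
        else:
            states.append("Offline")
    return states


print(states_after_checks(["ok", "timeout", "ok", "timeout", "timeout"]))
```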
Monitoring PDB
The OPEN_MODE of each PDB is checked in the V$PDBS view. If OPEN_MODE is "READ WRITE", the PDB is judged to be normal. If OPEN_MODE is not "READ WRITE", the PDB is judged to be abnormal. PDB monitoring is executed in accordance with the "Interval" setting. When the state has changed since the previous check, a message is output to syslog. A PDB fault does not trigger a restart or a failover.
When the state of a PDB becomes normal, the following message is output.
```
FSP_PCLW-ORACLE_FJSVclora: INFO: 9142: OPEN_MODE of PDB <PDB name> was OPEN. (CDB=<ORACLE_SID of CDB> PDB=<PDB name> OPEN_MODE=<state of PDB>)
```
When the state of a PDB becomes abnormal, the following message is output.
```
FSP_PCLW-ORACLE_FJSVclora: ERROR: 9242: clorapdbmon detected OPEN_MODE of PDB <PDB name> is invalid. (CDB=<ORACLE_SID of CDB> PDB=<PDB name> OPEN_MODE=<state of PDB>)
```
For details about the monitoring PDB in the Oracle Data Guard/Oracle Active Data Guard environment, see "G.1 Feature Outline".
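
The judgement and the log-on-change behavior can be sketched as follows. The query, the function names, and the treatment of a PDB that was not seen in the previous check (it is reported once) are assumptions made for this illustration.

```python
# Hypothetical query; NAME and OPEN_MODE are columns of V$PDBS.
PDB_STATE_SQL = "SELECT NAME, OPEN_MODE FROM V$PDBS"


def judge(open_mode):
    """A PDB is normal only when OPEN_MODE is READ WRITE."""
    return "normal" if open_mode == "READ WRITE" else "abnormal"


def pdb_state_changes(previous, current):
    """Return (PDB name, new judgement) for PDBs whose judgement changed.

    previous, current: dicts mapping PDB name to OPEN_MODE.
    """
    changes = []
    for name in sorted(current):
        now = judge(current[name])
        before = judge(previous[name]) if name in previous else None
        if now != before:
            changes.append((name, now))
    return changes


print(pdb_state_changes(
    {"PDB1": "READ WRITE", "PDB2": "MOUNTED"},
    {"PDB1": "MOUNTED", "PDB2": "READ WRITE"},
))
```

Only the returned entries would produce a syslog message. A PDB that stays in the same state between checks produces none.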
Note
Failover occurs according to the setting of AutoSwitchOver of userApplication (cluster application).
If AutoSwitchOver=ResourceFailure (at resource failure) is selected, a userApplication will fail over when a resource failure occurs.
For details about the settings of userApplication (cluster application), refer to "PRIMECLUSTER Installation and Administration Guide".

Monitoring Listeners

Monitoring procedure of a Listener is as follows:

1. Make sure that a listener process exists.
   The monitoring interval can be changed with the "Interval" setting; its default value is 30 seconds.
2. Make sure that the net service name is valid by using the "tnsping" command.
   tnsping monitoring follows the "Interval" setting, but the elapsed time since the last tnsping is checked first, and tnsping is executed only when 60 seconds or more have passed.
   Note
   tnsping is executed when TNSName is set. For details about TNSName, refer to "2.2.7.2 Oracle Resource Creation".

On the standby node, step 1 is executed to confirm that the Listener processes do not exist.

Monitoring timeout
If the tnsping command does not reply within a certain period of time, a monitoring timeout is detected and the Oracle Listener resource is put into the Warning state. If the monitoring timeout occurs twice in a row, the resource is considered faulty and a failover is performed.
The monitoring timeout (the wait time for a reply from the Oracle Listener) can be changed with WatchTimeout.
Failover
If Oracle listener errors are detected, PRIMECLUSTER Wizard for Oracle notifies RMS of the Offline state. The Oracle listener resource then enters the resource failure state and a failover occurs.
If the AutoRecover(A) flag of the Oracle listener resource is selected, the Oracle listener is restarted before failover when the Oracle listener resource failure occurs. For details about AutoRecover(A), refer to "2.2.7.2 Oracle Resource Creation".
In the following cases, the Offline state is notified to RMS:
- The listener process does not exist.
- The tnsping command fails.
- The monitoring timeout occurs twice in a row.
Note
Failover occurs according to the setting of AutoSwitchOver of userApplication (cluster application).
If AutoSwitchOver=ResourceFailure (at resource failure) is selected, a userApplication will fail over when a resource failure occurs.
For details about the settings of userApplication (cluster application), refer to "PRIMECLUSTER Installation and Administration Guide".
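
The listener conditions described in this section can be combined into a single decision function. This is a simplified sketch with hypothetical names and result labels, not the product's implementation.

```python
def listener_state(process_exists, tnsping_result=None, previous_timeouts=0):
    """Classify the state of a Listener resource on the operating node.

    process_exists:    True if a listener process was found.
    tnsping_result:    None when tnsping was not run (for example, TNSName
                       is not set), otherwise "ok", "fail" or "timeout".
    previous_timeouts: number of consecutive timeouts before this check.
    """
    if not process_exists:
        return "Offline"
    if tnsping_result == "fail":
        return "Offline"
    if tnsping_result == "timeout":
        # The first timeout is a Warning; the second in a row is Offline.
        return "Offline" if previous_timeouts >= 1 else "Warning"
    return "Online"


print(listener_state(True, "timeout"))
print(listener_state(True, "timeout", previous_timeouts=1))
```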

Monitoring Oracle ASM instance

Oracle ASM is not monitored. The NullDetector flag is automatically enabled.
