PRIMECLUSTER Global Link Services Configuration and Administration Guide 4.2 Redundant Line Control Function

D.2.2 Virtual interface or the various functions of Redundant Line Control function cannot be used

D.2.2.1 An interface of NIC switching mode is not activated

Phenomenon:

The following message is output and activation of an interface fails.

hanet: ERROR: 85700: polling information is not defined. Devname = sha0(0)
Cause and how to deal with:

In NIC switching mode, switching of interfaces within a node and between nodes is controlled by a failure monitoring function. Therefore, NIC switching mode does not work if only the virtual interface information is defined with the hanetconfig create command. It is also necessary to set the monitor-to information with the hanetpoll create command. If the monitor-to information is not set, the takeover IP address is not activated either, and in cluster operation, activation of a userApplication fails.
When using the logical address takeover function and also sharing a physical interface, monitor-to information is necessary for each virtual interface. In this case, use the hanetconfig copy command and the hanetpoll copy command to duplicate the virtual interface information and the monitor-to information that were defined initially.
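
To find which virtual interfaces are affected, the 85700 messages can be scanned for their Devname values. The following is a minimal sketch, assuming the messages have the form shown above; the function name and the log file path in the usage comment are assumptions, not part of the product.

```shell
# Print the virtual interface named in each 85700 message read from stdin.
# Assumption: messages follow the "Devname = sha0(0)" form shown above.
devices_without_polling() {
    sed -n 's/.*hanet: ERROR: 85700:.*Devname = \([A-Za-z0-9]*\).*/\1/p' | sort -u
}

# Usage (the log path is an assumption; adjust it to your syslog setup):
#   devices_without_polling < /var/adm/messages
```

Each name printed is a virtual interface that still needs monitor-to information set with the hanetpoll create command.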

D.2.2.2 It does not failback at the time of the restoration detection by standby patrol in NIC switching mode

Phenomenon:

The following messages are displayed during the recovery process of standby patrol in NIC switching mode. As a result, the interface does not switch back immediately from the secondary interface to the primary interface.

hanet: INFO: 88500: standby interface recovered. (sha1)
hanet: INFO: 89700: immediate exchange to primary interface is canceled. (sha1)
Cause and how to deal with:

After switching from the primary interface to the secondary interface due to a transfer path failure, if standby patrol detects recovery before the link up delay time (60 seconds by default) has elapsed, the switching process between the primary and secondary interfaces may loop infinitely. To prevent this, the above messages are displayed and the switchback to the primary interface is canceled. This behavior is described here mainly because it prevents an infinite loop of interface switching when routes, rather than HUBs, are set as the monitoring targets.
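
The timing rule described above can be illustrated as follows. This is only a sketch of the rule as stated in this section, not of the internal GLS logic; the function name and its arguments are hypothetical.

```shell
# failback_allowed SWITCHED_AT RECOVERED_AT [DELAY]
# Times are in seconds. Succeeds when the standby interface recovered
# after the link up delay elapsed; fails when recovery came earlier,
# which is the case where the 89700 message is displayed.
failback_allowed() {
    switched_at=$1
    recovered_at=$2
    delay=${3:-60}   # default link up delay stated above
    [ $((recovered_at - switched_at)) -ge "$delay" ]
}

# Usage:
#   failback_allowed 100 130 || echo "immediate failback would be canceled"
```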

D.2.2.3 Error detection message displays for standby patrol in NIC switching mode

Phenomenon:

The following message is output and activation of an interface fails.

hanet: WARNING: 87500: standby interface failed.
Cause and how to deal with:

On a network where a VLAN switch exists on the transfer path monitored by the standby patrol function, this error occurs in either of the following two circumstances:

1) The redundant NICs are connected to ports with disparate VLAN identifiers.
2) One or both of the redundant NICs are connected to a tagged member port of the switch.

A VLAN switch cannot pass traffic between ports whose VLAN identifiers are disparate. Therefore, when the redundant NICs are connected to ports with disparate VLAN identifiers, transmission of the monitoring frame between the standby NIC and the operating NIC fails, and as a consequence the 875 message is output. In addition, even if the VLAN identifiers of the ports are the same, when a port is set as a tagged member and the NIC does not support tagged VLAN (IEEE 802.1Q), the NIC cannot receive tagged frames from the switch. Transmission of the monitoring frame fails in this case as well, and the 875 message is output. To rectify this problem, check the VLAN configuration of the switch and make sure that the VLAN identifier is identical on the ports connecting the redundant NICs. If the NIC you are using does not support tagged VLAN, set the port of the switch as a non-tag member.
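
The two conditions can be written down as a small check that you fill in from your switch configuration. This is a sketch of the rule described above only; the function name and the yes/no argument convention are assumptions made for illustration.

```shell
# patrol_frames_pass VLAN_A VLAN_B TAGGED_A TAGGED_B NIC_TAG_SUPPORT
#   VLAN_A, VLAN_B    VLAN identifiers of the two switch ports
#   TAGGED_A, TAGGED_B  "yes" if that port is a tagged member, else "no"
#   NIC_TAG_SUPPORT   "yes" if the NICs support tagged VLAN (IEEE 802.1Q)
# Succeeds when neither failure condition applies.
patrol_frames_pass() {
    # Condition 1: ports with disparate VLAN identifiers.
    [ "$1" = "$2" ] || return 1
    # Condition 2: a tagged member port with a NIC lacking tag support.
    if [ "$5" != "yes" ]; then
        [ "$3" = "no" ] && [ "$4" = "no" ] || return 1
    fi
    return 0
}

# Usage:
#   patrol_frames_pass 10 10 yes no no || echo "expect the 875 message"
```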

D.2.2.4 Command aborts and Redundant Line Control function startup fails

Symptom

Executing the hanetconfig create/delete, hanetpoll create/delete, dsphanet, or dsppoll command outputs the following error message, and the command aborts. In addition, the virtual interface fails to activate during system startup.
"hanet: 56100: internal error: daemon process does not exist."

Cause and workaround

The problem has most likely occurred because the symbolic link to the initialization script of Redundant Line Control function was removed, possibly as the result of an invalid operation by a user. As a result, the initialization script of Redundant Line Control function did not run during system startup, which caused the activation failure of the virtual interface as well as the startup failure of the GLS daemon (hanetcltd). The commands abort because the GLS daemon is not running.
To resolve this issue, follow the recovery procedure below to create the symbolic links under /etc/rc2.d and /etc/rc3.d that point to the initialization scripts, and then reboot the system.

[Recovery Procedure]

# ln -s /etc/init.d/hanet /etc/rc2.d/S32hanet
# ln -s /etc/init.d/hanet99 /etc/rc3.d/S99hanet
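
Before and after running the recovery procedure, you can check whether the two links are present. The following is a minimal sketch; the function name and the optional ROOT prefix (useful only for trying the check against a copy of the directory tree) are assumptions.

```shell
# check_hanet_links [ROOT]
# Print each expected symbolic link that is missing and return 1 if any
# is missing. ROOT is a hypothetical prefix; omit it on a live system.
check_hanet_links() {
    root=${1:-}
    missing=0
    for link in /etc/rc2.d/S32hanet /etc/rc3.d/S99hanet; do
        # -h is true for a symbolic link, even one whose target is absent.
        if [ ! -h "$root$link" ]; then
            echo "missing: $link"
            missing=1
        fi
    done
    return $missing
}

# Usage:
#   check_hanet_links || echo "recreate the links listed above"
```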

D.2.2.5 Unable to establish connection using virtual IP address of GS/SURE Linkage mode

Symptom

A connection using a virtual IP address in GS/SURE Linkage mode cannot be established because the routing daemon fails to start during system startup.

Cause and workaround

On Solaris 8 or Solaris 9, if the /etc/defaultrouter file does not exist, the system runs /usr/sbin/in.rdisc(1M) to search for routers using RDISC (router discovery protocol). If a router on the network is running RDISC, the system uses RDISC as the routing protocol instead of RIP, which prevents /usr/sbin/in.routed(1M) from starting. This issue can be resolved by renaming the /usr/sbin/in.rdisc file (for example, to /usr/sbin/in.rdisc.saved) to disable the RDISC search.
If this problem occurs on Solaris 10, use /usr/sbin/routeadm(1M) to change the setting so that /usr/sbin/in.routed(1M) is executed as the routing daemon.
For details on this issue, refer to the Solaris manual.

How to detect this symptom:

Your system has this problem if all of the following are true.

Solaris 8 or Solaris 9:

  1. /etc/notrouter file (empty file) exists

  2. /etc/defaultrouter file does not exist.

  3. Routing daemon (/usr/sbin/in.routed) is not running after system startup.

  4. Routing table contains the default path.

Solaris 10:

  1. The routing daemon is set to "/usr/sbin/in.rdisc" through /usr/sbin/routeadm(1M).

  2. Routing daemon (/usr/sbin/in.routed) is not running after system startup.

  3. Routing table contains the default path.

Detecting routing daemon

If the /usr/sbin/in.routed process name appears when the following command is run, the routing daemon process is running. Ignore any line that shows the grep command itself.

# ps -ef | grep in.routed

Detecting the default path

You can check for the default path by running the following command. If the word "default" is displayed under "Destination", the default path is present.

# netstat -rn | grep default
default              192.168.70.254       UG        1      1  hme0
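
The two checks above can be wrapped in small helpers that read the command output from standard input, so they can also be tried on saved output. This is a minimal sketch; the function names are assumptions.

```shell
# routed_running: succeeds if a /usr/sbin/in.routed process is listed.
#   ps -ef | routed_running
routed_running() {
    # Drop the grep process itself, which also mentions in.routed.
    grep '/usr/sbin/in\.routed' | grep -v grep >/dev/null
}

# has_default_route: succeeds if the Destination column holds "default".
#   netstat -rn | has_default_route
has_default_route() {
    awk '$1 == "default"' | grep default >/dev/null
}

# Usage: report the combination described in this section.
#   if ! ps -ef | routed_running && netstat -rn | has_default_route; then
#       echo "in.routed is not running although a default path exists"
#   fi
```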

D.2.2.6 Solaris container cannot be started

Symptom

If the virtual interface in fast switching or physical interface in NIC switching is specified for the network setting, the following error message will be output and zone startup will fail:

# zoneadm -z zone0 boot
could not verify net address=192.168.80.10 physical=sha0: No such device or address
zoneadm: zone zone0 failed to verify

or

# zoneadm -z zone0 boot
zoneadm: zone 'zone0': hme0:1: could not bring interface up: address in use by zone 'global': Cannot assign requested address
zoneadm: zone 'zone0': call to zoneadmd failed
Cause and workaround

If the interface specified in the zone network setting does not exist, or if the IP address specified in the zone network setting is already in use, the zone cannot be started. Use the "ifconfig(1M)" command to check whether the specified interface exists and whether the IP address is already assigned.
If you are using NIC switching, check if the method of deactivating the standby interface can be used in the zone. For details, see "7.6 hanetparam Command" and "D.1 Changing Methods of Activating and Inactivating Interface".
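
The ifconfig(1M) check can be scripted as shown below. This is a minimal sketch that reads "ifconfig -a" output from standard input; the function names are assumptions, and the awk -v option assumes a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
# iface_exists NAME: succeeds if a line of ifconfig -a output starts
# with "NAME:".
#   ifconfig -a | iface_exists sha0
iface_exists() {
    awk -v name="$1" '$1 == (name ":")' | grep . >/dev/null
}

# ip_in_use ADDR: succeeds if some interface already carries "inet ADDR".
#   ifconfig -a | ip_in_use 192.168.80.10
ip_in_use() {
    awk -v addr="$1" '$1 == "inet" && $2 == addr' | grep . >/dev/null
}
```

If iface_exists fails for the interface named in the zone network setting, or ip_in_use succeeds for the zone address before the zone is booted, expect one of the two zoneadm errors shown above.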

Information

If the interfaces for a zone do not exist when the zone is installed, zone installation will fail. You need to activate the interfaces specified in the zone network settings before installing the zone.

D.2.2.7 Services of Redundant Line Control function cannot be started (when NIC failed)

Phenomenon:

When the system is rebooted with a failed NIC or system board in the Solaris 10 OS, the following messages are output and the services for Redundant Line Control function may not start.

Failed to plumb IPv4 interface(s): hme0
svc.startd[7]: svc:/network/physical:default: Method "/lib/svc/method/net-physical" failed with exit status 96.
svc.startd[7]: network/physical:default misconfigured: transitioned to maintenance (see 'svcs -xv' for details)
Cause and how to deal with:

If the system is rebooted with a failed NIC or system board in the following configuration, the network service (svc:/network/physical) fails to start. In this case, network services including Redundant Line Control function (svc:/network/fjsvhanet) will not start.

Figure D.1 Configuration of NIC switching mode before change

When rebooting the system with a failed NIC or system board, remove the network configuration file (/etc/hostname.interface file or /etc/hostname6.interface file) of the failed physical interface according to the following procedure. Then create the network configuration file for a physical interface that has not failed, and either reboot the system or restore the service using the svcadm(1M) command.

Recovery procedure

  1. Rename the network configuration file of the failed physical interface to the network configuration file of a physical interface that has not failed.

    # mv /etc/hostname.hme0 /etc/hostname.hme3
    # mv /etc/hostname.hme1 /etc/hostname.hme2
  2. Reboot the system or restore the network service.

    # /usr/sbin/shutdown -y -i6 -g0

    Or,

    # svcadm clear svc:/network/physical

See

For details on the svcadm(1M) command, refer to the Solaris manual.
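
To see which services are waiting to be cleared, the svcs(1) listing can be filtered for the maintenance state. This is a minimal sketch that reads "state fmri" pairs from standard input; the function name is an assumption.

```shell
# maintenance_services: read "STATE FMRI" lines and print the FMRI of
# every service that is in the maintenance state.
#   svcs -H -o state,fmri | maintenance_services
maintenance_services() {
    awk '$1 == "maintenance" { print $2 }'
}
```

If svc:/network/physical:default is printed, the network service still needs the svcadm clear step shown above.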

Change process to recommended environment

If, as in Figure D.1 Configuration of NIC switching mode before change, the network configuration files are created for only one system board, it is recommended to change the environment so that each system board has a Primary interface, as shown in Figure D.2 Configuration of NIC switching mode after change.

Figure D.2 Configuration of NIC switching mode after change

The setup procedure is as follows:

  1. Stop the HUB monitoring function.

    # /opt/FJSVhanet/usr/sbin/hanetpoll off
  2. Deactivate all virtual interfaces.

    # /opt/FJSVhanet/usr/sbin/stphanet
  3. Change the configuration information.
    Switch the redundant physical interfaces (Primary:hme1 and Secondary:hme2) in sha1 by executing the hanetconfig modify command.

    # /opt/FJSVhanet/usr/sbin/hanetconfig modify -n sha1 -t hme2,hme1
  4. Rename the /etc/hostname.*** file.
    In accordance with the switch of the redundant physical interfaces, rename the file from /etc/hostname.hme1 to /etc/hostname.hme2. This step is not necessary if the network configuration files were already changed in the recovery procedure.

    # mv /etc/hostname.hme1 /etc/hostname.hme2
  5. Reboot the system.

    # /usr/sbin/shutdown -y -g0 -i6
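
The five steps above can be collected into one function with a dry-run switch, so that the command sequence can be reviewed before anything is changed. This is a minimal sketch; the function names and the DRY_RUN and SKIP_RENAME variables are assumptions, while the commands themselves are the ones listed in the procedure.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# swap_sha1_primary: steps 1 to 5 for sha1 (Primary hme1, Secondary hme2).
# Set SKIP_RENAME=1 if the recovery procedure already renamed the file.
swap_sha1_primary() {
    run /opt/FJSVhanet/usr/sbin/hanetpoll off &&
    run /opt/FJSVhanet/usr/sbin/stphanet &&
    run /opt/FJSVhanet/usr/sbin/hanetconfig modify -n sha1 -t hme2,hme1 || return 1
    if [ "${SKIP_RENAME:-0}" != "1" ]; then
        run mv /etc/hostname.hme1 /etc/hostname.hme2 || return 1
    fi
    run /usr/sbin/shutdown -y -g0 -i6
}

# Usage: review the commands first, then run them for real.
#   DRY_RUN=1; swap_sha1_primary
```

The function stops at the first failing step, so the system is not rebooted with a partly changed configuration.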

D.2.2.8 Services of Redundant Line Control function cannot be started (when inconsistency of file system occurred)

Phenomenon:

When /opt cannot be mounted at system startup in the Solaris 10 OS, the following message will be output and the services for Redundant Line Control function may not start.

hanet: ERROR: 98400: file system is inconsistent. (details)
Cause and how to deal with:

Since the inconsistency of the /opt file system was detected at the startup of GLS service, the startup stopped.
Make sure that the inconsistency of the /opt file system is resolved and /opt is mounted. After that, perform one of the following actions:

  • Restarting the system

  • Starting GLS service

    # svcadm clear fjsvhanet

If operation does not start normally after the GLS service is started, the TCP/IP application that uses Redundant Line Control function may have failed to start. In that case, restart the system.
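
Before clearing the service, you can confirm that /opt is mounted. The following is a minimal sketch that reads mount table entries from standard input; the function name is an assumption, and it presumes /opt is a separate file system listed in /etc/mnttab.

```shell
# opt_mounted: read mnttab-format lines (special, mount point, fstype,
# options, time) and succeed if some entry has /opt as its mount point.
opt_mounted() {
    awk '$2 == "/opt"' | grep /opt >/dev/null
}

# Usage:
#   opt_mounted < /etc/mnttab && svcadm clear fjsvhanet
```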