PCI Hot Plug User's Guide I/O device edition - for Solaris(TM) Operating System -

3.1.2 Replacement of PCI cards on redundant system

On a system made redundant with software such as multipath control, PCI cards can be replaced without stopping services such as user applications.

This section explains how to replace PCI cards on a redundant system that uses one of the following redundancy software products.

- Multipath Disk Control (MPHD)

- GR Multipath Driver (GRMPD)

- ETERNUS Multipath Driver (ETERNUS MPD)

If other redundancy software products are used, see the manual of each product.

  1. Stop the machine administration hardware monitoring daemon

    Use the following command to stop the hardware monitoring daemon of machine administration.

    # /usr/sbin/FJSVmadm/prephp <Return>
  2. If you use the Fibre Channel Card (PW028FC3*/PW028FC4*/PW028FC5*), execute the following.

    The following commands stop the daemons.

    # /etc/rc0.d/K10ElxRMSrv stop <Return>
    # /etc/rc0.d/K10ElxDiscSrv stop <Return>
  3. Identify the PCI card to be replaced

    Follow the instructions below to determine the interface name of the path connecting the target PCI card to the I/O devices, and the connected I/O device. The procedure is described for MPHD/GRMPD. If you use redundancy software other than MPHD/GRMPD, see the documentation for that product.
    If you use ETERNUS MPD, mplbt is displayed instead of hddv.

    1. From the WARNING messages output on the console, determine the interface name of the path connecting the target PCI card to the I/O devices, and the connected I/O device.

      In the example below, fjpfca3 is the interface name of the path connecting the target PCI card to the I/O devices, and hddv1 is the LUN (Logical Unit Number) of the disk array device connected to fjpfca3.

      :
      WARNING: /pci@8d,2000/fibre-channel@1 (fjpfca3):
      Hard Error : PCI DMA error.
      :
      WARNING: /pci@8d,2000/fibre-channel@1/hddv@1,0 (hddv1):
      SCSI transport failed: reason 'reset': giving up
      :
      NOTICE: mphd0: I/O path switchover succeeded.
      /pci@8d,2000/fibre-channel@1/hddv@1,0 => /pci@89,2000/fibre-channel@1/hddv@2,0
      :

      The remaining steps assume the console messages shown above.
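The lookup in this sub-step can be sketched as a small shell filter. This is only an illustration: it assumes the console messages have been captured as text and that each WARNING line ends with the instance name in parentheses followed by a colon, as in the example above. The function name warning_instances is hypothetical.

```shell
# Hypothetical helper: print the instance names that appear in parentheses
# at the end of WARNING lines, one per line, without duplicates.
warning_instances() {
    sed -n 's/^[[:space:]]*WARNING: .*(\([^()]*\)):[[:space:]]*$/\1/p' |
        awk '!seen[$0]++'
}

# Sample input copied from the console messages in this sub-step.
printf '%s\n' \
    'WARNING: /pci@8d,2000/fibre-channel@1 (fjpfca3):' \
    'Hard Error : PCI DMA error.' \
    'WARNING: /pci@8d,2000/fibre-channel@1/hddv@1,0 (hddv1):' \
    "SCSI transport failed: reason 'reset': giving up" |
    warning_instances
```

For the sample input this prints fjpfca3 and hddv1, the interface name and the LUN instance used in the rest of the procedure.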

    2. In the output of the iompadm command, find the entry for hddv1 with the status "offline fail", and determine the logical path name of the LUN connected to the target PCI card.
      For details of the iompadm command, see Chapter 3 of the "Multipath Disk Control Guide".

      The following is an example for MPHD, in which "/dev/rdsk/c3t1d0s2" is the logical path name corresponding to hddv1.

      For GRMPD/ETERNUS MPD, specify "mplb" as the -c option parameter of the iompadm command. For GRMPD that was upgraded from MPHD, specify "mphd" as the -c option parameter.

      # /usr/opt/FJSViomp/bin/iompadm -c mphd -p info <Return>
      :
      IOMP: /dev/FJSVmphd/fiomp/adm2
      -> /devices/pseudo/mphd@2:adm
      Element:
      /dev/rdsk/c3t1d0s2 offline fail block "target completed hard
      reset sequence [GR7104546- 010000-00-00-30] (hddv1)"
      -> /devices/pci@8d,2000/fibre-channel@1/hddv@1,0:c,raw
      /dev/rdsk/c2t2d0s2 online active block "good status with
      active [GR7104546- 010000-01-01-32] (hddv15)"
      -> /devices/pci@89,2000/fibre-channel@1/hddv@2,0:c,raw
      Node:
      /dev/FJSVmphd/rdsk/mphd2s0
      /dev/FJSVmphd/rdsk/mphd2s1
      /dev/FJSVmphd/rdsk/mphd2s2
      /dev/FJSVmphd/rdsk/mphd2s3
      /dev/FJSVmphd/rdsk/mphd2s4
      /dev/FJSVmphd/rdsk/mphd2s5
      /dev/FJSVmphd/rdsk/mphd2s6
      /dev/FJSVmphd/rdsk/mphd2s7
      Function:
      MPmode=false
      AutoPath=true
      Block=true
      NeedSync=false
      :
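As a rough aid for reading long iompadm listings, the "offline fail" entries can be picked out with awk. The sketch below assumes the Element lines are whitespace-separated with the logical path name in the first column and the two status words in the second and third columns, as in the example output above. The function name offline_fail_paths is hypothetical.

```shell
# Hypothetical helper: print the logical path name of every element whose
# status columns read "offline fail".
offline_fail_paths() {
    awk '$1 ~ /^\/dev\/rdsk\// && $2 == "offline" && $3 == "fail" { print $1 }'
}

# Sample input copied from the example output above.
printf '%s\n' \
    'Element:' \
    '/dev/rdsk/c3t1d0s2 offline fail block "target completed hard' \
    'reset sequence [GR7104546- 010000-00-00-30] (hddv1)"' \
    '-> /devices/pci@8d,2000/fibre-channel@1/hddv@1,0:c,raw' \
    '/dev/rdsk/c2t2d0s2 online active block "good status with' \
    'active [GR7104546- 010000-01-01-32] (hddv15)"' |
    offline_fail_paths
# prints /dev/rdsk/c3t1d0s2
```

The instance name (hddv1) appears on the continuation line of the entry, so it still has to be checked by eye against the name found in the console messages.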
    3. Disconnect from the redundant system

      Disconnect the path between the target PCI card and the I/O devices.

      The procedure is described for MPHD/GRMPD/ETERNUS MPD. If you use redundancy software other than MPHD/GRMPD/ETERNUS MPD, see the documentation for that product.

      Execute the following command, specifying the logical path name of the connected LUN determined in sub-step 2 above.
      Then confirm that the state of that logical path is "unconfigured disconnected".

      The command only needs to be executed on one representative LUN; it does not need to be executed on every LUN under the same PCI card.

      If the operating path is disconnected on a redundant system, the standby path automatically becomes the operating path.

      If devices controlled by both MPHD and GRMPD, or by both MPHD and ETERNUS MPD, are connected to the target PCI card, execute the commands for both products.

      MPHD
      # /usr/opt/FJSViomp/bin/iompadm -c mphd change adapter_disconnect /dev/rdsk/c3t14d0s2 <Return>
      8/ 8 LU was controlled.
      # /usr/opt/FJSViomp/bin/iompadm -c mphd info | grep /dev/rdsk/c3 <Return>
      /dev/rdsk/c3t14d0s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d1s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d2s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d3s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d4s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d5s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d6s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"
      /dev/rdsk/c3t14d7s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"

      GRMPD/ETERNUS MPD
      # /usr/opt/FJSViomp/bin/iompadm -c mplb change adapter_disconnect /dev/rdsk/c13t21d0s2 <Return>

      # /usr/opt/FJSViomp/bin/iompadm -c mplb info | grep c13 <Return>
      /dev/rdsk/c13t21d0s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d1s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d2s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d3s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d4s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d5s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d6s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
      /dev/rdsk/c13t21d7s2 unconfigured disconnected unblock "changing parts with power supply charged [GR7304942- 000004-00-00-30]"
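With many LUNs under one card, the confirmation can be done mechanically rather than by reading each line. The following sketch assumes the filtered iompadm output has the layout shown above (path name, then two state words). The function name all_paths_in_state is hypothetical.

```shell
# Hypothetical helper: succeed only if the input contains at least one
# /dev/rdsk line and every such line has the two given state words.
all_paths_in_state() {
    awk -v s1="$1" -v s2="$2" '
        $1 ~ /^\/dev\/rdsk\// { n++; if ($2 != s1 || $3 != s2) bad++ }
        END { exit (n > 0 && bad == 0 ? 0 : 1) }
    '
}

# Sample input copied from the MPHD example above (first two LUNs only).
if printf '%s\n' \
    '/dev/rdsk/c3t14d0s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"' \
    '/dev/rdsk/c3t14d1s2 unconfigured disconnected unblock "changing parts with power supply charged [DF400-0]"' |
    all_paths_in_state unconfigured disconnected
then
    echo "all paths are unconfigured disconnected"
else
    echo "some paths are still connected"
fi
```

On the target system the sample input would be replaced by the output of the iompadm info command shown above.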

  4. Disconnect the PCI card.

    Disconnect the defective PCI card using the following procedure.

    1. Determine the slot position of the PCI card from the interface name (fjpfca3) of the path connecting the target PCI card to the I/O devices, as determined in sub-step 1 of step 3. In this example, the "Ap_Id" is "pcipsy21:R0B01-PCI#slot03".
      # /usr/sbin/FJSVmadm/inst2comp fjpfca3 <Return>
      pcipsy21:R0B01-PCI#slot03
    2. Specify the "Ap_Id" obtained in sub-step 1 as a parameter, and confirm that the slot status of the PCI card to be disconnected is "connected configured".
      # cfgadm pcipsy21:R0B01-PCI#slot03 <Return>
      Ap_Id Type Receptacle Occupant Condition
      pcipsy21:R0B01-PCI#slot03 fibre/hp connected configured ok
    3. Execute the command to disconnect the PCI card, specifying the "Ap_Id" confirmed in sub-step 2, and then confirm that the slot status has changed to "disconnected unconfigured".
      # cfgadm -c disconnect pcipsy21:R0B01-PCI#slot03 <Return>
      # cfgadm pcipsy21:R0B01-PCI#slot03 <Return>
      Ap_Id Type Receptacle Occupant Condition
      pcipsy21:R0B01-PCI#slot03 unknown disconnected unconfigured unknown

      Note:

      In rare cases, an error during disconnection causes the cfgadm command to fail with the following message. If the cfgadm command fails, execute it again.

      cfgadm: Component system is busy, try again: disconnect failed
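The "execute it again" advice can be wrapped in a small retry helper. This is a sketch only: the guide does not say how many times to retry or how long to wait, so the attempt count and delay below are assumptions, and the function name retry is hypothetical. It also retries on any failure, not only the "busy" message.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between attempts; succeed as soon as the command succeeds.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Possible use on the target system (Ap_Id taken from the example above;
# three attempts and a five-second pause are assumed values):
# retry 3 5 cfgadm -c disconnect 'pcipsy21:R0B01-PCI#slot03'
```

If the command still fails after the last attempt, the helper returns a non-zero status and the cfgadm message should be examined before continuing.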

    4. To make the slot easy to locate during the replacement work, blink the ALARM LED of the slot whose Ap_Id was displayed in sub-step 1.
      # cfgadm -x led=fault,mode=blink pcipsy21:R0B01-PCI#slot03 <Return>
  5. Replace the PCI card

    Replace the PCI card disconnected in step 4 with a replacement card, and connect the cables to the devices. This operation is performed by our customer support.

    When exchanging Fibre Channel cards, the following operations are also required.

    [For PCI Fibre Channel (PW008FC3U/PW008FC2U/GP7B8FC1U)]:

    If you use the SAN management function of Systemwalker StorageMGR/Softek SANView for ETERNUS (except for Vixel)/SP5000 SRM Facility

    No procedure is necessary. Go to step 6.

    If you do not use the SAN management function of the above products

    When replacing PCI cards in the following configurations, the SN200 series Fibre Channel switch and the ETERNUS3000/ETERNUS6000/GR700/800 series disk array device must each be reconfigured.

    - Zoning by WWPN (World Wide Port Name) is configured on the SN200 series.

    - The Host Affinity function of the ETERNUS3000/ETERNUS6000/GR700/800 series is used.

    For details, see the "SN200 Series Affinity User's Guide" or the "ETERNUS3000/ETERNUS6000/GR700/800 series GRmgr User's Guide". If you use a Fibre Channel switch or disk array device other than those described above, see the documentation for that product.

    The reconfiguration requires the WWPN (a 16-digit hexadecimal number) of the replacement card. The WWPN can be derived from the eight characters shown on a label on the front plate of the card. These characters are the lower eight hexadecimal digits of the WWPN. The upper eight digits are fixed at 10000000 (hexadecimal).

    For example, if the following label is shown on the front plate of the card, the WWPN of the replacement card is 100000000e244061.

    0e24
    4061
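The derivation is a simple concatenation, which the following sketch spells out. It assumes the label characters are typed in as one or two arguments; the function name wwpn_from_front_label is hypothetical.

```shell
# Hypothetical helper: build the 16-digit WWPN from the eight hexadecimal
# characters on the front-plate label (fixed upper half 10000000).
wwpn_from_front_label() {
    low=$(printf '%s' "$*" | tr -d '[:space:]' | tr 'A-F' 'a-f')
    case $low in
        *[!0-9a-f]*) echo "label must contain only hexadecimal digits" >&2; return 1 ;;
    esac
    [ "${#low}" -eq 8 ] || { echo "label must have eight digits" >&2; return 1; }
    printf '10000000%s\n' "$low"
}

wwpn_from_front_label 0e24 4061   # prints 100000000e244061
```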

    [For Fibre Channel Card (PW028FC3*/PW028FC4*/PW028FC5*)]:

    When replacing PCI cards in the following configurations, the Fibre Channel switch and the disk array device must each be reconfigured.

    - Zoning by WWPN (World Wide Port Name) is configured on the Fibre Channel switch.

    - The Host Zoning function of the disk array device is used.

    For details, see the documentation for each product.

    The reconfiguration requires the WWPN (a 16-digit hexadecimal number) of the replacement card. The WWPN can be derived from the twelve characters shown on a label on the back of the card. These characters are the lower twelve hexadecimal digits of the WWPN. The upper four digits are fixed at 1000 (hexadecimal).

    For example, if the following label is shown on the back of the card, the WWPN of the replacement card is 10000000c9366037.

    IEEE:0000c9366037
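For this card type the label carries an "IEEE:" prefix and twelve digits, so the derivation differs slightly. The sketch below assumes the label is entered exactly as printed, with or without the prefix; the function name wwpn_from_ieee_label is hypothetical.

```shell
# Hypothetical helper: build the 16-digit WWPN from the twelve hexadecimal
# characters on the back label (fixed upper part 1000).
wwpn_from_ieee_label() {
    low=$(printf '%s' "$1" | sed 's/^IEEE://' | tr 'A-F' 'a-f')
    case $low in
        *[!0-9a-f]*) echo "label must contain only hexadecimal digits" >&2; return 1 ;;
    esac
    [ "${#low}" -eq 12 ] || { echo "label must have twelve digits" >&2; return 1; }
    printf '1000%s\n' "$low"
}

wwpn_from_ieee_label IEEE:0000c9366037   # prints 10000000c9366037
```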

    Note:

    Changing the Affinity configuration on an SN200 series or other Fibre Channel switch affects I/O to other devices and may cause temporary errors.

    I/O to disk array devices recovers normally through retry processing, but backup processes on Fibre Channel tape devices may end in errors. Stop backups before changing the Affinity configuration.

  6. Connect the PCI card

    Connect the replacement PCI card using the cfgadm(1M) command with the configure option, or by pushing the button corresponding to the slot in which the card was replaced. Note that the push button is effective only in multiuser mode. After the new PCI card is connected, use the cfgadm(1M) command to confirm that the slot status has changed to "connected configured".

    If a large-scale configuration of I/O devices is connected to the PCI card in the target slot, command execution for status confirmation may take time.

    # cfgadm -c configure pcipsy3:C0M00-PCI#slot02 <Return>
    # cfgadm pcipsy3:C0M00-PCI#slot02 <Return>
    Ap_Id Type Receptacle Occupant Condition
    pcipsy3:C0M00-PCI#slot02 mult/hp connected configured ok
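The status check can also be scripted by reading the Receptacle and Occupant columns for the slot. This sketch assumes the five-column cfgadm listing shown above, with a Type value that contains no spaces; the function name slot_state is hypothetical.

```shell
# Hypothetical helper: print the Receptacle and Occupant columns for the
# given Ap_Id from a cfgadm listing read on standard input.
slot_state() {
    awk -v id="$1" '$1 == id { print $3, $4 }'
}

# Sample input copied from the example output above.
printf '%s\n' \
    'Ap_Id Type Receptacle Occupant Condition' \
    'pcipsy3:C0M00-PCI#slot02 mult/hp connected configured ok' |
    slot_state 'pcipsy3:C0M00-PCI#slot02'
# prints: connected configured
```

The same helper can be used before and after the disconnect operation, where the expected values are "connected configured" and "disconnected unconfigured" respectively.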

    When exchanging Fibre Channel cards, the following operations are also required.

    [For PCI Fibre Channel (PW008FC3U/PW008FC2U/GP7B8FC1U)]:

    - If you do not use the SAN management function of the products listed in step 5

    No procedure is necessary. Go to step 7.

    [For Fibre Channel Card (PW028FC3*/PW028FC4*/PW028FC5*)]:

    No procedure is necessary. Go to step 7.

  7. Connect to redundant system

    The procedure is described for MPHD/GRMPD/ETERNUS MPD. If you use redundancy software other than MPHD/GRMPD/ETERNUS MPD, see the documentation for that product.

    Execute the following command, specifying the logical path name determined in sub-step 2 of step 3.
    Then confirm that the state of that logical path is "online active" or "online standby".
    If the operating path was switched in step 3, it is automatically switched back.

    If devices controlled by both MPHD and GRMPD, or by both MPHD and ETERNUS MPD, are connected to the replaced PCI card, execute the commands for both products.

    MPHD
    # /usr/opt/FJSViomp/bin/iompadm -c mphd restart adapter_connect /dev/rdsk/c3t14d0s2 <Return>
    8/ 8 LU was controlled.

    # /usr/opt/FJSViomp/bin/iompadm -c mphd info | grep /dev/rdsk/c3 <Return>
    /dev/rdsk/c3t14d0s2 online active block "good status with active [DF400-0] (hddv1524)"
    /dev/rdsk/c3t14d1s2 online standby block "good status with standby [DF400-0] (hddv1525)"
    /dev/rdsk/c3t14d2s2 online active block "good status with active [DF400-0] (hddv1526)"
    /dev/rdsk/c3t14d3s2 online standby block "good status with standby [DF400-0] (hddv1527)"
    /dev/rdsk/c3t14d4s2 online active block "good status with active [DF400-0] (hddv1528)"
    /dev/rdsk/c3t14d5s2 online standby block "good status with standby [DF400-0] (hddv1529)"
    /dev/rdsk/c3t14d6s2 online active block "good status with active [DF400-0] (hddv1530)"
    /dev/rdsk/c3t14d7s2 online standby block "good status with standby [DF400-0] (hddv1531)"

    GRMPD/ETERNUS MPD
    # /usr/opt/FJSViomp/bin/iompadm -c mplb restart adapter_connect /dev/rdsk/c13t21d0s2 <Return>

    # /usr/opt/FJSViomp/bin/iompadm -c mplb info | grep c13 <Return>
    /dev/rdsk/c13t21d0s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv272)"
    /dev/rdsk/c13t21d1s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv273)"
    /dev/rdsk/c13t21d2s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv274)"
    /dev/rdsk/c13t21d3s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv275)"
    /dev/rdsk/c13t21d4s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv276)"
    /dev/rdsk/c13t21d5s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv277)"
    /dev/rdsk/c13t21d6s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv278)"
    /dev/rdsk/c13t21d7s2 online active block "good status with active [GR7304942- 000004-CM00-CA00-PORT30] (hddv279)"
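To see at a glance whether every path came back, the state columns can be tallied. The sketch assumes the same line layout as the listings above; the function name path_state_summary is hypothetical.

```shell
# Hypothetical helper: count /dev/rdsk lines per state pair and print
# "<state> <substate> <count>", sorted by state.
path_state_summary() {
    awk '$1 ~ /^\/dev\/rdsk\// { print $2, $3 }' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Sample input abridged from the MPHD example above.
printf '%s\n' \
    '/dev/rdsk/c3t14d0s2 online active block "good status with active [DF400-0] (hddv1524)"' \
    '/dev/rdsk/c3t14d1s2 online standby block "good status with standby [DF400-0] (hddv1525)"' \
    '/dev/rdsk/c3t14d2s2 online active block "good status with active [DF400-0] (hddv1526)"' |
    path_state_summary
# prints:
# online active 2
# online standby 1
```

Any line in the summary other than "online active" or "online standby" indicates a path that has not been restored.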

  8. If you use the Fibre Channel Card (PW028FC3*/PW028FC4*/PW028FC5*), execute the following.

    The following commands start the daemons.

    # /etc/rc2.d/S99ElxRMSrv start <Return>
    # /etc/rc2.d/S99ElxDiscSrv start <Return>
  9. Update the hardware configuration information of machine administration and start the hardware monitoring daemon.

    Execute the following command to update the hardware configuration information of machine administration and to restart the hardware monitoring daemon.

    # /usr/sbin/FJSVmadm/postphp <Return>

All Rights Reserved, Copyright (C) FUJITSU LIMITED 2005