Interstage Big Data Parallel Processing Server V1.0.0 User's Guide

4.1.3 DFS Setup

The DFS setup sequence is as follows:

  1. Create a management partition

  2. Create the file system

  3. Set the user ID for executing MapReduce

  4. Register the slave server, development server and collaboration server (DFS client) information

  5. Create the mount point and set fstab settings

  6. Mount

  7. Generate and distribute the DFS file system configuration information


4.1.3.1 Creating a Management Partition

Refer to "D.3 Management Partition Creation" in "Appendix D DFS Environment Construction" for details on how to create a management partition.

Example

  1. Create a management partition (run on the master server (primary))

    # pdfssetup -c /dev/disk/by-id/scsi-1FUJITSU_300000370105 <Enter>
  2. Add the DFS management server information to the management partition (run on the master server (primary), then repeat on the master server (secondary))

    # pdfssetup -a /dev/disk/by-id/scsi-1FUJITSU_300000370105 <Enter>
  3. Check that the DFS management server information was added (run on the master server (primary), then repeat on the master server (secondary))

    # pdfssetup <Enter>
    HOSTID          CIPNAME         MP_PATH
    80000001        master1RMS      yes
    80000000        master2RMS      yes
    # pdfssetup -p <Enter>
    /dev/disk/by-id/scsi-1FUJITSU_300000370105
  4. Start the pdfsfrmd daemon (run on both the master server (primary) and the master server (secondary))

    # pdfsfrmstart <Enter>
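
The check in step 3 can also be scripted. The following is a minimal sketch, assuming the "pdfssetup" output keeps the three-column layout (HOSTID, CIPNAME, MP_PATH) shown in the example. The function name count_registered is hypothetical, and the sample text is copied from the example rather than produced by a live command.

```shell
#!/bin/sh
# Count the DFS management servers whose MP_PATH column reads "yes".
# Input (stdin): the text printed by "pdfssetup" (header line plus one row per server).
# Assumption: the column layout matches the example in this section.
count_registered() {
    awk 'NR > 1 && NF >= 3 && $3 == "yes" { n++ } END { print n + 0 }'
}

# Sample output copied from the example above.
sample='HOSTID          CIPNAME         MP_PATH
80000001        master1RMS      yes
80000000        master2RMS      yes'

registered=$(printf '%s\n' "$sample" | count_registered)
echo "registered management servers: $registered"
```

On a master server, the same function could read the live output with "pdfssetup | count_registered". A result of 2 would correspond to both the primary and the secondary being registered.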

4.1.3.2 Creating the File System

Refer to "D.4 File System Creation" in "Appendix D DFS Environment Construction" for details on how to create a file system.

Example

  1. Create the file system (run on the master server (primary)), using the following environment:

    • Representative partition:
      /dev/disk/by-id/scsi-1FUJITSU_300000370106

    • File data partitions:
      /dev/disk/by-id/scsi-1FUJITSU_300000370107
      /dev/disk/by-id/scsi-1FUJITSU_300000370108

    • Master server (primary): master1

    • Master server (secondary): master2

    # pdfsmkfs -o dataopt=y,blocksz=8388608,data=/dev/disk/by-id/scsi-1FUJITSU_300000370107,data=/dev/disk/by-id/scsi-1FUJITSU_300000370108,node=master1,master2 /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
  2. Check the file system information

    # pdfsinfo /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
    /dev/disk/by-id/scsi-1FUJITSU_300000370106:
    FSID special                                             size Type mount
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)   25418 META -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)    5120  LOG -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)  232256 DATA -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370107 (880) 7341778 DATA -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370108 (896) 6578704 DATA -----
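
Because the "-o" option string of the pdfsmkfs command in step 1 is long, it may help to assemble it from its parts and review it before running it. The sketch below uses only the options that appear in the example (dataopt, blocksz, data, node). The function name build_mkfs_opts is hypothetical, and the script prints the command instead of executing it.

```shell
#!/bin/sh
# Build the "-o" option string for pdfsmkfs from its parts.
# Only options that appear in the example above are used; nothing is executed.
build_mkfs_opts() {
    blocksz=$1   # value passed to blocksz, e.g. 8388608
    nodes=$2     # comma-separated master servers, e.g. master1,master2
    shift 2      # remaining arguments are the file data partitions
    opts="dataopt=y,blocksz=$blocksz"
    for part in "$@"; do
        opts="$opts,data=$part"
    done
    printf '%s,node=%s\n' "$opts" "$nodes"
}

rep=/dev/disk/by-id/scsi-1FUJITSU_300000370106
opts=$(build_mkfs_opts 8388608 master1,master2 \
    /dev/disk/by-id/scsi-1FUJITSU_300000370107 \
    /dev/disk/by-id/scsi-1FUJITSU_300000370108)

# Print the command for review rather than running it.
echo "pdfsmkfs -o $opts $rep"
```

With the values from the example, the printed command matches the one in step 1. Refer to the appendix cited above for the meaning of each option.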

4.1.3.3 Setting the User ID for Executing MapReduce

The user ID that executes MapReduce must be set in the DFS so that the mapred user can run the Hadoop JobTracker and TaskTracker.

This section describes the procedure for setting the mapred user in the DFS.

  1. If not logged in to the master server, log in with root permissions.

  2. Unmount the DFS if it is mounted.

    # pdfsumntgl /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
  3. Set the user ID.

    Use the pdfsadm command to set the user ID for executing MapReduce in the MAPRED variable.

    # pdfsadm -o MAPRED=mapred /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
  4. Check that the user ID has been set.

    # pdfsinfo -e /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
    MAPRED=mapred
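
The check in step 4 can be automated along the following lines. This is a minimal sketch, assuming "pdfsinfo -e" prints the variable on a line of its own as in the example. The function name mapred_is_set is hypothetical, and the input here is a literal string rather than live command output.

```shell
#!/bin/sh
# Succeed only if the input contains the line "MAPRED=<user>".
# Input (stdin): the text printed by "pdfsinfo -e <representative partition>".
# $1: the expected user ID, e.g. mapred.
mapred_is_set() {
    grep -Eq "^[[:space:]]*MAPRED=$1[[:space:]]*\$"
}

if printf 'MAPRED=mapred\n' | mapred_is_set mapred; then
    echo "MAPRED is set to mapred"
else
    echo "MAPRED is not set to mapred" >&2
fi
```

On the master server, the live output could be piped in instead, for example "pdfsinfo -e /dev/disk/by-id/scsi-1FUJITSU_300000370106 | mapred_is_set mapred".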

See

Refer to "pdfsadm" under "Appendix A Command Reference" in the "Primesoft Distributed File System for Hadoop V1 User's Guide" for how to delete a MAPRED variable that has been set and for other details of the pdfsadm command.


4.1.3.4 Registering the Slave Server, Development Server and Collaboration Server (DFS Client) Information

Register the slave server, development server and collaboration server information in the connection authorization list file. Register this connection authorization list file on the master server (primary), then distribute it to the master server (secondary). Refer to "D.4.3 Registering Slave Server, Development Server and Collaboration Server Information" in "Appendix D DFS Environment Construction" for details on registering slave server, development server and collaboration server (DFS client) information.

Example

  1. Check the file system ID.

    # pdfsinfo /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
    /dev/disk/by-id/scsi-1FUJITSU_300000370106:
    FSID special                                             size Type mount
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)   25418 META -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)    5120  LOG -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)  232256 DATA -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370107 (880) 7341778 DATA -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370108 (896) 6578704 DATA -----
  2. Create the connection authorization list file from the sample file, using the file system ID as the file name suffix.

    # cd /etc/pdfs <Enter>
    # cp -p ./server.conf.sample ./server.conf.1 <Enter>
  3. Edit the connection authorization list file and define the slave server, development server and collaboration server information.
    Although only one slave server is installed initially, define in advance every slave server, development server and collaboration server that will connect.

    #
    # Copyright (c) 2012 FUJITSU LIMITED. All rights reserved.
    #
    #   /etc/pdfs/server.conf.<FSID>
    #
    # List of client hostnames of a file system.
    #
    # Notes:
    #   Do not describe hostnames of management servers.
    #
    # example:
    #CLIENT nodeac1
    #CLIENT nodeac2
    #CLIENT nodeac3
    #CLIENT nodeac4
    #CLIENT nodeac5
    CLIENT collaborate  <--  Collaboration server to be added
    CLIENT slave1        <--  Slave server to be added
    CLIENT slave2        <--  Slave server to be added
    CLIENT slave3        <--  Slave server to be added
    CLIENT slave4        <--  Slave server to be added
    CLIENT slave5        <--  Slave server to be added
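
When many servers have to be listed, the CLIENT lines can be generated rather than typed. The sketch below is an illustration only: the function name client_lines is hypothetical, and the host names are those used in the examples in this section. It also skips any management server host name, since the notes in the sample file state that these must not be listed.

```shell
#!/bin/sh
# Print one "CLIENT <hostname>" line per DFS client.
# $1: space-separated management (master) server host names, which must not be listed.
# Remaining arguments: slave, development and collaboration server host names.
client_lines() {
    masters=$1
    shift
    for host in "$@"; do
        case " $masters " in
            *" $host "*)
                echo "skipping management server: $host" >&2
                continue
                ;;
        esac
        printf 'CLIENT %s\n' "$host"
    done
}

# Host names taken from the examples in this section.
client_lines "master1 master2" collaborate slave1 slave2 slave3 slave4 slave5
```

The generated lines could then be appended to the list file, for example by redirecting the output with ">> /etc/pdfs/server.conf.1".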

4.1.3.5 Creating the Mount Point and Setting fstab Settings

Creating the mount point

Create the mount point for mounting the disk partitions on the storage system used as the DFS.

Create the mount point on both the master server (primary) and the master server (secondary).

Example

Create the mount point "pdfs" under "/mnt".

# mkdir /mnt/pdfs <Enter>

fstab settings

In "/etc/fstab", define the mount point created above and the DFS representative partition.

Configure the fstab settings on both the master server (primary) and the master server (secondary).

Example

This example shows the mount point "/mnt/pdfs" and the DFS representative partition defined in "/etc/fstab".

LABEL=/                 /                       ext3    defaults        1 1
LABEL=/boot             /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
LABEL=SWAP-sda3         swap                    swap    defaults        0 0

/dev/disk/by-id/scsi-1FUJITSU_300000370106    /mnt/pdfs       pdfs    noauto,noatime   0 0
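
Because the same entry has to be added on two servers, a small helper that appends it only when it is missing can avoid duplicate lines. The following is a sketch, not part of the product: the function name add_dfs_fstab_entry is hypothetical, the file system type and options are those of the example above, and the demonstration writes to a scratch file instead of the live "/etc/fstab".

```shell
#!/bin/sh
# Append the DFS entry to an fstab file unless the device is already listed.
# $1: fstab file, $2: representative partition, $3: mount point.
# File system type (pdfs) and options (noauto,noatime) follow the example above.
add_dfs_fstab_entry() {
    fstab=$1
    dev=$2
    mnt=$3
    if awk -v dev="$dev" '$1 == dev { found = 1 } END { exit !found }' "$fstab"; then
        echo "entry for $dev already present in $fstab"
    else
        printf '%s    %s       pdfs    noauto,noatime   0 0\n' "$dev" "$mnt" >> "$fstab"
        echo "entry for $dev added to $fstab"
    fi
}

# Try it against a scratch file rather than the live /etc/fstab.
scratch=$(mktemp)
add_dfs_fstab_entry "$scratch" /dev/disk/by-id/scsi-1FUJITSU_300000370106 /mnt/pdfs
add_dfs_fstab_entry "$scratch" /dev/disk/by-id/scsi-1FUJITSU_300000370106 /mnt/pdfs
cat "$scratch"
rm -f "$scratch"
```

The second call finds the existing entry and leaves the file unchanged, so the scratch file ends up with a single line.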

4.1.3.6 Mounting

Mount the DFS file system on the master server (primary). Refer to "D.4.6 Mount" in "Appendix D DFS Environment Construction" for information on mounting a DFS file system.

Example

Mount the DFS file system on the master server (primary).

# pdfsmntgl /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
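
After mounting, the result can be checked from a script. The sketch below relies only on the general Linux behavior that mounted file systems are listed in "/proc/mounts" with the mount point in the second field and the file system type in the third. The function name is_mounted_as is hypothetical, and the sample line in the scratch file is an illustration, not output captured from a DFS system.

```shell
#!/bin/sh
# Succeed if the mount point appears with the given file system type in a mounts table.
# $1: mount point, $2: file system type, $3: mounts table (defaults to /proc/mounts).
is_mounted_as() {
    awk -v mnt="$1" -v type="$2" \
        '$2 == mnt && $3 == type { found = 1 } END { exit !found }' \
        "${3:-/proc/mounts}" 2>/dev/null
}

# Demonstration against a scratch table holding one hypothetical entry.
scratch=$(mktemp)
echo '/dev/disk/by-id/scsi-1FUJITSU_300000370106 /mnt/pdfs pdfs rw,noatime 0 0' > "$scratch"
if is_mounted_as /mnt/pdfs pdfs "$scratch"; then
    echo "/mnt/pdfs is mounted as pdfs"
else
    echo "/mnt/pdfs is not mounted as pdfs"
fi
rm -f "$scratch"
```

Omitting the third argument makes the function read the live "/proc/mounts". Matching on the mount point and type, rather than on the device path, avoids depending on how the device name is reported.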


4.1.3.7 Generating and Distributing the DFS File System Configuration Information

Generate the DFS file system configuration file on the master server (primary).

The procedure for generating the file system configuration information is explained using the following environment as an example.

  • File system ID: 1

  • Logical file system name: pdfs1

  • Slave server, development server and collaboration server: slave1, slave2, slave3, slave4, slave5, develop, collaborate

  1. Log in to the master server (primary) with root permissions.

  2. Check the file system ID.

    Check the target file system ID in the file system information recorded in the management partition.

    # pdfsinfo /dev/disk/by-id/scsi-1FUJITSU_300000370106 <Enter>
    /dev/disk/by-id/scsi-1FUJITSU_300000370106:
    FSID special                                             size Type mount
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)   25418 META -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)    5120  LOG -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)  232256 DATA -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370107 (880) 7341778 DATA -----
       1 /dev/disk/by-id/scsi-1FUJITSU_300000370108 (896) 6578704 DATA -----
  3. Generate the DFS configuration file with the pdfsmkconf command.
    Execute the pdfsmkconf command on the master server (primary).

    # pdfsmkconf <Enter>
  4. Rename the generated configuration file, replacing the file system ID in the file name with the logical file system name.

    # cd pdfsmkconf_out <Enter>
    # mv ./client.conf.1 client.conf.pdfs1 <Enter>

    Note

    The configuration file is created in the directory where the pdfsmkconf command was executed.

    Change only the file system ID part of the configuration file name. Do not change the "client.conf" part.

  5. Distribute the configuration file to the master server (secondary).

    # scp ./client.conf.pdfs1 root@master2:/etc/pdfs/client.conf.pdfs1 <Enter>

    Note

    Place the configuration file in the "/etc/pdfs" directory on each server.
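
Steps 2 and 4 can be tied together in a script that reads the file system ID from the pdfsinfo output and derives the file names for the rename. This is a minimal sketch, assuming data rows start with a numeric FSID as in the example. The function names fsids and conf_name are hypothetical, the sample text is an excerpt of the example output, and the script only prints the rename command.

```shell
#!/bin/sh
# Print the distinct file system IDs found in "pdfsinfo" output (stdin).
# Assumption: data rows start with a numeric FSID, as in the example above.
fsids() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }' | sort -un
}

# Print the configuration file name for a file system ID or logical name.
conf_name() {
    printf 'client.conf.%s\n' "$1"
}

# Excerpt of the example output above.
sample='/dev/disk/by-id/scsi-1FUJITSU_300000370106:
FSID special                                             size Type mount
   1 /dev/disk/by-id/scsi-1FUJITSU_300000370106 (864)   25418 META -----
   1 /dev/disk/by-id/scsi-1FUJITSU_300000370107 (880) 7341778 DATA -----'

fsid=$(printf '%s\n' "$sample" | fsids)

# Print the rename command for review rather than running it.
echo "mv ./$(conf_name "$fsid") $(conf_name pdfs1)"
```

With the example values, the printed command is the same rename shown in step 4.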

See

Refer to "pdfsmkconf" in the "Appendix A Command Reference" of the "Primesoft Distributed File System for Hadoop V1 User's Guide" for information on the pdfsmkconf command.