This section explains how to set up the DFS. The procedure is as follows.
Create the mount point and set fstab settings
Register the hadoop group and mapred user
Mount
Creating the mount point
Create the mount point for mounting the disk partitions on the storage system used as the DFS.
Example
Create the mount point "pdfs" under "/mnt".
# mkdir /mnt/pdfs <Enter>
fstab settings
In "/etc/fstab", define the mount point created above and the logical file system name.
The logical file system name is used to identify the DFS file system. Use the name defined when generating the configuration information of the file system on the master server.
Example
This example shows the mount point "/mnt/pdfs" and the logical file system name "pdfs1" defined in "/etc/fstab".
LABEL=/         /          ext3    defaults        1 1
LABEL=/home     /home      ext3    defaults        1 2
LABEL=/boot     /boot      ext3    defaults        1 2
tmpfs           /dev/shm   tmpfs   defaults        0 0
devpts          /dev/pts   devpts  gid=5,mode=620  0 0
sysfs           /sys       sysfs   defaults        0 0
proc            /proc      proc    defaults        0 0
LABEL=SWAP-sda3 swap       swap    defaults        0 0
pdfs1           /mnt/pdfs  pdfs    _netdev         0 0
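As a minimal sketch, the entry can be checked after editing with a small shell helper. The function name has_fstab_entry is hypothetical and not part of the product; it only assumes the whitespace-separated fstab layout shown in the example above.

```shell
# Hypothetical helper (not part of the product): succeeds if an fstab-format
# file contains a "pdfs" type entry for the given name and mount point.
has_fstab_entry() {
    fstab_file=$1     # file to inspect, e.g. /etc/fstab
    fs_name=$2        # logical file system name, e.g. pdfs1
    mount_point=$3    # mount point, e.g. /mnt/pdfs
    # Field 1 = name, field 2 = mount point, field 3 = file system type.
    # Commented-out lines start with "#", so field 1 never equals the name.
    awk -v n="$fs_name" -v m="$mount_point" '
        $1 == n && $2 == m && $3 == "pdfs" { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$fstab_file"
}

# Usage (values taken from the example above):
#   has_fstab_entry /etc/fstab pdfs1 /mnt/pdfs && echo "entry found"
```

The helper only reads the file, so it can be run before mounting without changing the system.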
Register the hadoop group and mapred user.
The hadoop group GID and the mapred user UID must match the values registered on the master server.
Check the "/etc/group" and "/etc/passwd" files on the master server and register the same values.
Example
Check the hadoop group among the groups registered on the master server.
# cat /etc/group <Enter>
××× omitted ×××
hadoop:x:123:hbase
hbase:x:503:
bdppgroup:x:1500:
Check the mapred user among the users registered on the master server.
# cat /etc/passwd <Enter>
××× omitted ×××
bdppuser1:x:1500:1500::/home/bdppuser1:/bin/bash
bdppuser2:x:1501:1500::/home/bdppuser2:/bin/bash
mapred:x:202:123:Hadoop MapReduce:/tmp:/bin/bash
hdfs:x:201:123:Hadoop HDFS:/tmp:/bin/bash
hbase:x:203:503::/tmp:/bin/bash
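Instead of reading the full listings, the two values can be extracted directly. The following is a sketch only; lookup_id is a hypothetical helper name, and it relies on the fact that the third colon-separated field holds the GID in "/etc/group" and the UID in "/etc/passwd".

```shell
# Hypothetical helper (not part of the product): prints the numeric ID
# (field 3) recorded for a name in a group- or passwd-format file.
# Prints nothing if the name is not registered.
lookup_id() {
    db_file=$1    # e.g. /etc/group or /etc/passwd
    name=$2       # e.g. hadoop or mapred
    awk -F: -v n="$name" '$1 == n { print $3; exit }' "$db_file"
}

# Usage on the master server:
#   lookup_id /etc/group  hadoop    # GID to pass to groupadd -g
#   lookup_id /etc/passwd mapred    # UID to pass to useradd -u
```

With the listings shown above, these calls would print 123 and 202. On a live server, "getent group hadoop" and "getent passwd mapred" display the corresponding full records.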
Register the hadoop group.
# groupadd -g 123 hadoop <Enter>
Register the mapred user.
# useradd -g hadoop -u 202 -c "Hadoop MapReduce" -d /tmp -s /bin/bash mapred <Enter>
Note
If the GID or UID is already in use on the collaboration server and different values must therefore be used, change the hadoop group GID and the mapred user UID on the master server, slave servers and development server as well, so that the values are identical on all servers.
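Whether the values are already taken can be checked before running groupadd and useradd. The sketch below uses a hypothetical helper, id_owner, that reports which name currently holds a given numeric ID; it is an illustration and not a product command.

```shell
# Hypothetical helper (not part of the product): prints the name that owns
# a numeric ID (field 3) in a group- or passwd-format file.
# Prints nothing if the ID is not in use.
id_owner() {
    db_file=$1    # e.g. /etc/group or /etc/passwd
    id=$2         # numeric GID or UID to look for
    awk -F: -v id="$id" '$3 == id { print $1; exit }' "$db_file"
}

# Usage on the collaboration server (values from the example above):
#   id_owner /etc/group  123    # empty or "hadoop" means GID 123 can be used
#   id_owner /etc/passwd 202    # empty or "mapred" means UID 202 can be used
```

If either call prints a different name, the ID is occupied and the situation described in this note applies.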
The "DFS file configuration information" must be distributed to the collaboration server before the DFS file system is mounted. Refer to "3.1.5 Distributing the DFS File System Configuration Information" if the "DFS file configuration information" has not yet been distributed.
Refer to "D.4.6 Mount" in "Appendix D DFS Environment Construction" for information on mounting a DFS file system.
Example
Mount the DFS file system at the slave server.
# mount pdfs1 <Enter>
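To confirm the result, the mount table can be inspected. This is a sketch under the assumption that the mounted DFS is listed in "/proc/mounts" with its mount point in the second field, as is usual for Linux file systems; is_mounted is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the product): succeeds if the given
# mount point appears in a mounts-format file such as /proc/mounts.
is_mounted() {
    mounts_file=$1    # e.g. /proc/mounts
    mount_point=$2    # e.g. /mnt/pdfs
    # Field 2 of each line is the mount point.
    awk -v m="$mount_point" '
        $2 == m { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}

# Usage (mount point from the example above):
#   is_mounted /proc/mounts /mnt/pdfs && echo "mounted"
```

Where the util-linux package is installed, "mountpoint -q /mnt/pdfs" performs a similar check.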