Applications that perform processing under Hadoop fall into the following types:
MapReduce application
These are Java programs that run in the Hadoop MapReduce framework and are developed using the Hadoop API.
Hive query
These are queries written in HiveQL, an SQL-like language provided by Apache Hive (developed by The Apache Software Foundation), rather than programs written against the Hadoop API.
Pig script
As with Hive queries, these are scripts written in the Pig Latin language rather than programs written against the Hadoop API.
HBase application
These are Java programs, developed using the HBase API, that read and write HBase data and operate on the data stored in HBase.
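To make the MapReduce application type more concrete, the following is a minimal sketch of the map, shuffle, and reduce phases as a word count in plain Java. It deliberately does not use the Hadoop API and runs in a single process; the class name `WordCountModelSketch` and its method names are hypothetical and chosen only for illustration. A real MapReduce application would implement the same two user-defined steps against the Hadoop API and leave the grouping to the framework.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the MapReduce programming model (word count).
// It does NOT use the Hadoop API; all names here are hypothetical.
class WordCountModelSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key (done by the framework in Hadoop).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {hbase=3, hive=1, pig=2}
        System.out.println(run(List.of("hive pig hbase", "pig hbase", "hbase")));
    }
}
```

The sketch keeps the map and reduce steps as separate methods because those are the parts a developer writes; the shuffle step is shown only so the example runs on its own.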
Information
Installation directory
The applications are installed in the following directories on the master servers, slave servers, and development servers.
Application | Installation directory | Master server | Slave server | Development server |
---|---|---|---|---|
MapReduce | /usr/bin (command); /usr/share/hadoop (library); /etc/hadoop (setup file) | Y | Y | Y |
Hive | /usr/lib/hive-version; /etc/hive (setup file) (*1) | N | N | Y |
Pig | /usr/bin (command); /usr/share/pig (library); /etc/pig (setup file) | N | N | Y |
HBase | /usr/lib/hbase-version; /etc/hbase (setup file) (*2) | Y | Y | Y |
Y: Installed.
N: Not installed.
*1: /etc/hive is a symbolic link to /usr/lib/hive-version/conf.
*2: /etc/hbase is a symbolic link to /usr/lib/hbase-version/conf.
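Because /etc/hive and /etc/hbase are symbolic links, it can be useful to confirm where they point on a given server. The following is a minimal sketch using only the standard `java.nio.file` API; the class name `ConfigLinkCheck` and the method `resolveConfigDir` are hypothetical, and the sketch assumes a platform that supports symbolic links.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that reports where a setup-file directory really lives.
class ConfigLinkCheck {

    // Returns the link target if dir is a symbolic link, otherwise dir itself.
    static Path resolveConfigDir(Path dir) throws IOException {
        if (Files.isSymbolicLink(dir)) {
            Path target = Files.readSymbolicLink(dir);
            // A relative target is interpreted relative to the link's parent directory.
            return dir.resolveSibling(target).normalize();
        }
        return dir;
    }

    public static void main(String[] args) throws IOException {
        // The two linked directories named in the notes above.
        for (String name : new String[] {"/etc/hive", "/etc/hbase"}) {
            Path dir = Paths.get(name);
            if (Files.exists(dir)) {
                System.out.println(name + " -> " + resolveConfigDir(dir));
            } else {
                System.out.println(name + " is not present on this server");
            }
        }
    }
}
```

On a server where the application is not installed (an N entry in the table), the sketch only reports that the directory is absent.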
The development of MapReduce applications is described below. For information on developing the other application types, refer to the Apache Hadoop project website and related resources.