Design the application processing logic. Processing that would have to be designed in conventional parallel distributed processing, such as splitting input files and merging results, is executed by the Hadoop framework and does not need to be designed. Developers can therefore concentrate on designing the logic that the job requires.
The application developer must understand the Hadoop API and design the application in accordance with the MapReduce framework. The main design tasks are:
Determining items corresponding to Key and Value
Content of Map processing
Content of Reduce processing
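
The three design tasks above can be illustrated with a minimal sketch. It is written in plain Java and deliberately does not use the Hadoop API, so that it runs on its own. The word-count job, the class name WordCountSketch, and the methods map, reduce, and run are assumptions made for this illustration only. The run method is a simplified stand-in for the splitting, grouping, and merging that the Hadoop framework performs. A real job would implement the Map and Reduce logic through the Hadoop API instead.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop classes are used.
class WordCountSketch {

    // Design task 1: Key = a word, Value = the count 1 for one occurrence.
    // Design task 2 (Map processing): turn one input line into (Key, Value) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Design task 3 (Reduce processing): combine all Values collected for one Key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Simplified stand-in for the framework: feeds each line to map,
    // groups the emitted Values by Key, then calls reduce once per Key.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {big=2, data=2, job=1}
        System.out.println(run(Arrays.asList("big data big", "data job")));
    }
}
```

In this sketch only map and reduce contain job-specific logic. Everything in run corresponds to work that the developer does not design when the job runs on Hadoop.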