• Reduce – This is essentially a group-by phase, in which the values that share a key are aggregated.

  • Combining – The last phase, where the individual result sets produced across the cluster are combined to form the final result.

  • Now Let’s See the Word Count Program in Java

    Fortunately, we don’t have to write all of the above steps. We only need to write the splitting parameter, the Map function logic, and the Reduce function logic; the remaining steps execute automatically.
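To make the division of labor concrete, here is a minimal sketch in plain Java that runs the same phases in a single process, with no Hadoop involved. The class and method names (`WordCountSketch`, `map`, `reduce`, `wordCount`) are hypothetical and chosen only for illustration; this is not the Hadoop API, and the grouping loop stands in for the work the framework would normally do for us.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-process illustration of the word count phases (not Hadoop code).
public class WordCountSketch {

    // Map logic: emit a (word, 1) pair for every word in one line of input.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<String, Integer>(word.toLowerCase(), 1));
            }
        }
        return pairs;
    }

    // Reduce logic: sum all the counts collected for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Stand-in for the framework: split the input into lines, run map on each,
    // group the emitted pairs by key, then run reduce once per key.
    static Map<String, Integer> wordCount(String text) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : text.split("\n")) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                List<Integer> counts = grouped.get(pair.getKey());
                if (counts == null) {
                    counts = new ArrayList<Integer>();
                    grouped.put(pair.getKey(), counts);
                }
                counts.add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "bus car train\ntrain plane car\nbus bus plane";
        // prints {bus=3, car=2, plane=2, train=2}
        System.out.println(wordCount(text));
    }
}
```

Only `map` and `reduce` correspond to logic we would write ourselves; everything inside `wordCount` approximates what runs automatically on a real cluster.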

    Make sure that Hadoop is installed on your system with the Java SDK.

    Steps

    1. Open Eclipse > File > New > Java Project > (Name it – MRProgramsDemo) > Finish.

    2. Right Click > New > Package (Name it - PackageDemo) > Finish.

    3. Right Click on Package > New > Class (Name it - WordCount).

    4. Add the following reference libraries:

      1. Right Click on Project > Build Path> Add External

        1. /usr/lib/hadoop-0.20/hadoop-core.jar

        2. /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar

    5. Type the following code:

    The above program consists of three classes:

    6. Make a jar file

    Right Click on Project > Export > Select export destination as Jar File > Next > Finish.

    7. Take a text file and move it into HDFS:

    To move this into Hadoop directly, open the terminal and enter the following commands:

    8. Run the jar file:

    (hadoop jar jarfilename.jar packageName.ClassName PathToInputTextFile PathToOutputDirectory)

    With the names used in the steps above, packageName.ClassName is PackageDemo.WordCount.

    9. Open the result:

    Opinions expressed by DZone contributors are their own.
