Hadoop Streaming is a utility that ships with the Hadoop distribution. It lets users create and run MapReduce jobs with any executable or script as the mapper and/or the reducer. The mapper and reducer read their input line by line from stdin and write their output to stdout. For example:

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
    -input myInputDirs \
    -output myOutputDir \
    -mapper /bin/cat \
    -reducer /bin/wc
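
As a rough local sketch of what this job computes, the same mapper and reducer can be chained with an ordinary pipe, with sort standing in for the shuffle between the map and reduce phases. This assumes a single reducer and small input. The helper name simulate_streaming is hypothetical and not part of Hadoop, and bare cat and wc are used instead of /bin/cat and /bin/wc only for portability. Counts from a real job may differ slightly, because Hadoop splits each mapper output line into a key and a value at the first tab.

```shell
#!/bin/sh
# Local approximation of a streaming job: mapper | sort | reducer.
# simulate_streaming is a hypothetical helper, not a Hadoop command.
simulate_streaming() {
    mapper="$1"
    reducer="$2"
    # Input records arrive on stdin. The mapper runs first, sort stands in
    # for the shuffle phase, and the reducer consumes the sorted stream.
    "$mapper" | LC_ALL=C sort | "$reducer"
}

# Example input: three lines containing five words in total.
result=$(printf 'b c\na\nd e\n' | simulate_streaming cat wc)

# wc reports the line, word, and byte counts of the reducer's input.
echo "$result"
```

Testing a mapper and reducer this way before submitting the job can catch script errors early, since both programs only need to read stdin and write stdout.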

 
