Monday, 4 January 2016

Running a sample Pi example

To run an application JAR on top of YARN, use the following command syntax:
$ yarn jar <application_jar.jar> <arg0> <arg1>

To run the sample that estimates the value of Pi with 16 maps and 10,000 samples per map, use the following command:
$ yarn jar $YARN_EXAMPLES/hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar pi 16 10000

Note that we are using hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar here.

The version in the JAR file name depends on your installed Hadoop distribution.
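
Because the version string differs between distributions, one option is to look the JAR up by pattern instead of hard-coding its name. The following is a minimal sketch, assuming YARN_EXAMPLES points at the directory that holds the examples JAR; the helper name find_examples_jar is hypothetical and not part of Hadoop.

```shell
# Hypothetical helper: print the first examples JAR found in the given directory.
find_examples_jar() {
  dir="$1"
  for jar in "$dir"/hadoop-mapreduce-examples-*.jar; do
    # Skip source and test archives that some distributions ship alongside.
    case "$jar" in
      *-sources.jar|*-tests.jar) continue ;;
    esac
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  echo "no hadoop-mapreduce-examples JAR found in $dir" >&2
  return 1
}

# Usage (assumes YARN_EXAMPLES is set for your installation):
#   yarn jar "$(find_examples_jar "$YARN_EXAMPLES")" pi 16 10000
```

If more than one version is installed in the same directory, the helper returns whichever the shell lists first, so check the result before relying on it.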

Once you run the preceding command, the application writes its logs to the console, as shown in the output below. The console output follows the default logger configuration.

The default log level is INFO. You can change it by overriding the default logger setting, for example by setting hadoop.root.logger=WARN,console in conf/log4j.properties.
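
As a rough illustration of that change, the sketch below rewrites the hadoop.root.logger line in a scratch copy of a properties file. The helper name set_root_logger and the scratch file are assumptions made for the example; on a real installation you would point it at conf/log4j.properties after backing the file up.

```shell
# Hypothetical helper: set the root logger level in a log4j.properties file.
set_root_logger() {
  file="$1"
  level="$2"
  sed "s/^hadoop\.root\.logger=.*/hadoop.root.logger=${level},console/" "$file" > "$file.new" &&
    mv "$file.new" "$file"
}

# Work on a scratch copy; the real file would be conf/log4j.properties.
props="$(mktemp)"
printf 'hadoop.root.logger=INFO,console\nhadoop.log.dir=.\n' > "$props"

set_root_logger "$props" WARN
grep '^hadoop.root.logger=' "$props"   # prints hadoop.root.logger=WARN,console
```

The output below was produced with the default INFO level.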

Number of Maps  = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15

Starting Job
11/09/14 21:12:02 INFO mapreduce.Job: map 0% reduce 0% 
11/09/14 21:12:09 INFO mapreduce.Job: map 25% reduce 0% 
11/09/14 21:12:11 INFO mapreduce.Job: map 56% reduce 0% 
11/09/14 21:12:12 INFO mapreduce.Job: map 100% reduce 0% 
11/09/14 21:12:12 INFO mapreduce.Job: map 100% reduce 100% 
11/09/14 21:12:12 INFO mapreduce.Job: Job job_1381790835497_0003 completed successfully 
11/09/14 21:12:19 INFO mapreduce.Job: Counters: 44
        File System Counters
                FILE: Number of bytes read=358
                FILE: Number of bytes written=1365080
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4214
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=67
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=16
                Launched reduce tasks=1
                Data-local map tasks=14
                Rack-local map tasks=2
                Total time spent by all maps in occupied slots (ms)=184421
                Total time spent by all reduces in occupied slots (ms)=8542
        Map-Reduce Framework
                Map input records=16
                Map output records=32
                Map output bytes=288
                Map output materialized bytes=448
                Input split bytes=2326
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=448
                Reduce input records=32
                Reduce output records=0
                Spilled Records=64
                Shuffled Maps =16
                Failed Shuffles=0
                Merged Map outputs=16
                GC time elapsed (ms)=195 
                CPU time spent (ms)=7740
                Physical memory (bytes) snapshot=6143396896
                Virtual memory (bytes) snapshot=23142254400
                Total committed heap usage (bytes)=43340769024
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0 
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1848
        File Output Format Counters
                Bytes Written=98
Job Finished in 23.144 seconds 

Estimated value of Pi is 3.14127500000000000000
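
The bundled example arrives at this figure by counting how many sample points fall inside a circle inscribed in a square and multiplying that fraction by 4. The sketch below reproduces the arithmetic for this run. The inside count of 125651 is not printed in the log; it is back-calculated here from the reported estimate and the 16 x 10000 samples, so treat it as an assumption. The function name estimate_pi is hypothetical.

```shell
# Hypothetical helper: 4 * (points inside the circle) / (total points).
estimate_pi() {
  LC_ALL=C awk -v inside="$1" -v total="$2" \
    'BEGIN { printf "%.6f\n", 4 * inside / total }'
}

maps=16
samples_per_map=10000
inside=125651   # assumed: derived from the printed estimate, not taken from the log

estimate_pi "$inside" "$((maps * samples_per_map))"   # prints 3.141275
```

Increasing the number of maps or the samples per map raises the total number of points and generally tightens the estimate.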

You can compare this run with the same example running on Hadoop 1.x. The logs look almost identical, but the difference in performance is noticeable. YARN is backward compatible with MapReduce 1.x, so the example runs without any code change.
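
To put a number on that comparison, one option is to save the console output of each run and pull the elapsed time out of the "Job Finished in" line. This is only a sketch: the helper name job_seconds and the scratch log file are hypothetical, and the sample line is the one from the run above.

```shell
# Hypothetical helper: print the elapsed seconds reported in a saved console log.
job_seconds() {
  sed -n 's/^Job Finished in \([0-9.][0-9.]*\) seconds.*/\1/p' "$1"
}

# Example with a scratch log containing the line from the run above.
log="$(mktemp)"
printf 'Job Finished in 23.144 seconds \n' > "$log"
job_seconds "$log"   # prints 23.144
```

Running the helper against a saved log from each cluster gives two figures that can be compared directly, provided both runs used the same number of maps and samples.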





