Testing matters for verifying a system's correctness and analyzing its performance, yet it is easily overlooked. To understand the system more fully, locate its bottlenecks, and improve its performance, I decided to start with testing and learn Hadoop's main benchmarking tools.
TestDFSIO
TestDFSIO measures HDFS I/O performance. It uses a MapReduce job to perform reads and writes concurrently: each map task reads or writes one file, the map output collects statistics about the file just processed, and the reduce task accumulates those statistics and produces a summary.
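For orientation, the invocations used in this post follow the pattern below (a sketch based on the TestDFSIO 1.7 flags that appear later; exact options can vary between Hadoop versions, and <hadoop-test-jar> stands for the test jar used below):

# Write test: create 10 files of 1000 MB each under /benchmarks/TestDFSIO
hadoop jar <hadoop-test-jar> TestDFSIO -write -nrFiles 10 -fileSize 1000
# Read test: read back the files produced by a previous -write run
hadoop jar <hadoop-test-jar> TestDFSIO -read -nrFiles 10 -fileSize 1000
# Clean up everything under /benchmarks/TestDFSIO
hadoop jar <hadoop-test-jar> TestDFSIO -clean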
The NameNode address is 10.*.*.131:7180.
Running the command hadoop version prints, among other things, the path where the Hadoop jar files live.
Change into that directory and run hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar with no arguments; it returns the following:
An example program must be given as the first argument.
Valid program names are:
  DFSCIOTest: Distributed i/o benchmark of libhdfs.
  DistributedFSCheck: Distributed checkup of the file system consistency.
  MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
  TestDFSIO: Distributed i/o benchmark.
  dfsthroughput: measure hdfs throughput
  filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
  loadgen: Generic map/reduce load generator
  mapredtest: A map/reduce test check.
  minicluster: Single process HDFS and MR cluster.
  mrbench: A map/reduce benchmark that can create many small jobs
  nnbench: A benchmark that stresses the namenode.
  testarrayfile: A test for flat files of binary key/value pairs.
  testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
  testfilesystem: A test for FileSystem read/write.
  testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
  testrpc: A test for rpc.
  testsequencefile: A test for flat files of binary key value pairs.
  testsequencefileinputformat: A test for sequence file input format.
  testsetfile: A test for flat files of binary key/value pairs.
  testtextinputformat: A test for text input format.
  threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill
Run the command hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000.
It returns the following:
19/04/02 16:22:30 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/02 16:22:30 INFO fs.TestDFSIO: nrFiles = 10
19/04/02 16:22:30 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/02 16:22:30 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/02 16:22:30 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/02 16:22:31 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
java.io.IOException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
Error! java.io.IOException: Permission denied: user=root, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
The job was run as root, but the HDFS root directory is owned by hdfs:supergroup with mode drwxr-xr-x, so root cannot write to it. Run su hdfs to switch to the hdfs user.
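If switching users is inconvenient, two common alternatives exist (both assume a cluster without Kerberos, where the HDFS client simply trusts the supplied user name):

# Run a single command as the hdfs user
sudo -u hdfs hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
# Or make the Hadoop client act as hdfs for the rest of the shell session
export HADOOP_USER_NAME=hdfs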
Run the command hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000 again.
It returns the following:
bash-4.2$ hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
19/04/02 16:26:39 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/02 16:26:39 INFO fs.TestDFSIO: nrFiles = 10
19/04/02 16:26:39 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/02 16:26:39 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/02 16:26:39 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/02 16:26:40 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
19/04/02 16:26:40 INFO fs.TestDFSIO: created control files for: 10 files
19/04/02 16:26:40 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/02 16:26:40 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/02 16:26:41 INFO mapred.FileInputFormat: Total input paths to process : 10
19/04/02 16:26:41 INFO mapreduce.JobSubmitter: number of splits:10
19/04/02 16:26:41 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/02 16:26:41 INFO Configuration.deprecation: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
19/04/02 16:26:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0002
19/04/02 16:26:41 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0002
19/04/02 16:26:41 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0002/
19/04/02 16:26:41 INFO mapreduce.Job: Running job: job_1552358721447_0002
19/04/02 16:26:48 INFO mapreduce.Job: Job job_1552358721447_0002 running in uber mode : false
19/04/02 16:26:48 INFO mapreduce.Job: map 0% reduce 0%
19/04/02 16:27:02 INFO mapreduce.Job: map 30% reduce 0%
19/04/02 16:27:03 INFO mapreduce.Job: map 100% reduce 0%
19/04/02 16:27:08 INFO mapreduce.Job: map 100% reduce 100%
19/04/02 16:27:08 INFO mapreduce.Job: Job job_1552358721447_0002 completed successfully
19/04/02 16:27:08 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=379
        FILE: Number of bytes written=1653843
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2310
        HDFS: Number of bytes written=10485760082
        HDFS: Number of read operations=43
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=12
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=128477
        Total time spent by all reduces in occupied slots (ms)=2621
        Total time spent by all map tasks (ms)=128477
        Total time spent by all reduce tasks (ms)=2621
        Total vcore-milliseconds taken by all map tasks=128477
        Total vcore-milliseconds taken by all reduce tasks=2621
        Total megabyte-milliseconds taken by all map tasks=131560448
        Total megabyte-milliseconds taken by all reduce tasks=2683904
    Map-Reduce Framework
        Map input records=10
        Map output records=50
        Map output bytes=784
        Map output materialized bytes=1033
        Input split bytes=1190
        Combine input records=0
        Combine output records=0
        Reduce input groups=5
        Reduce shuffle bytes=1033
        Reduce input records=50
        Reduce output records=5
        Spilled Records=100
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=2657
        CPU time spent (ms)=94700
        Physical memory (bytes) snapshot=7229349888
        Virtual memory (bytes) snapshot=32021716992
        Total committed heap usage (bytes)=6717702144
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1120
    File Output Format Counters
        Bytes Written=82
java.io.FileNotFoundException: TestDFSIO_results.log (Permission denied)
Error! java.io.FileNotFoundException: TestDFSIO_results.log (Permission denied)
The MapReduce job itself succeeded; the failure happens at the very end, when TestDFSIO appends its summary to TestDFSIO_results.log in the current local directory, where the hdfs user has no write permission (see the comments under the referenced post).
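Instead of changing directory permissions, the results log can also be pointed at a writable path. This version of TestDFSIO (1.7) accepts a -resFile option for exactly that; a sketch:

# Write the summary to /tmp instead of the current working directory
hadoop jar hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000 -resFile /tmp/TestDFSIO_results.log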
Solution: create a new directory ** (command: mkdir **), grant the hdfs user access to it (command: sudo chmod -R 777 **), cd into **, and run hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000. It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
19/04/03 10:26:32 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/03 10:26:32 INFO fs.TestDFSIO: nrFiles = 10
19/04/03 10:26:32 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/03 10:26:32 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/03 10:26:32 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/03 10:26:32 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
19/04/03 10:26:33 INFO fs.TestDFSIO: created control files for: 10 files
19/04/03 10:26:33 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:26:33 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:26:33 INFO mapred.FileInputFormat: Total input paths to process : 10
19/04/03 10:26:33 INFO mapreduce.JobSubmitter: number of splits:10
19/04/03 10:26:33 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 10:26:33 INFO Configuration.deprecation: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
19/04/03 10:26:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0006
19/04/03 10:26:34 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0006
19/04/03 10:26:34 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0006/
19/04/03 10:26:34 INFO mapreduce.Job: Running job: job_1552358721447_0006
19/04/03 10:26:39 INFO mapreduce.Job: Job job_1552358721447_0006 running in uber mode : false
19/04/03 10:26:39 INFO mapreduce.Job: map 0% reduce 0%
19/04/03 10:26:53 INFO mapreduce.Job: map 30% reduce 0%
19/04/03 10:26:54 INFO mapreduce.Job: map 90% reduce 0%
19/04/03 10:26:55 INFO mapreduce.Job: map 100% reduce 0%
19/04/03 10:27:00 INFO mapreduce.Job: map 100% reduce 100%
19/04/03 10:27:00 INFO mapreduce.Job: Job job_1552358721447_0006 completed successfully
19/04/03 10:27:00 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=392
        FILE: Number of bytes written=1653853
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2310
        HDFS: Number of bytes written=10485760082
        HDFS: Number of read operations=43
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=12
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=125653
        Total time spent by all reduces in occupied slots (ms)=2636
        Total time spent by all map tasks (ms)=125653
        Total time spent by all reduce tasks (ms)=2636
        Total vcore-milliseconds taken by all map tasks=125653
        Total vcore-milliseconds taken by all reduce tasks=2636
        Total megabyte-milliseconds taken by all map tasks=128668672
        Total megabyte-milliseconds taken by all reduce tasks=2699264
    Map-Reduce Framework
        Map input records=10
        Map output records=50
        Map output bytes=783
        Map output materialized bytes=1030
        Input split bytes=1190
        Combine input records=0
        Combine output records=0
        Reduce input groups=5
        Reduce shuffle bytes=1030
        Reduce input records=50
        Reduce output records=5
        Spilled Records=100
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=1881
        CPU time spent (ms)=78110
        Physical memory (bytes) snapshot=6980759552
        Virtual memory (bytes) snapshot=31983017984
        Total committed heap usage (bytes)=6693060608
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1120
    File Output Format Counters
        Bytes Written=82
19/04/03 10:27:00 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
19/04/03 10:27:00 INFO fs.TestDFSIO:            Date & time: Wed Apr 03 10:27:00 CST 2019
19/04/03 10:27:00 INFO fs.TestDFSIO:        Number of files: 10
19/04/03 10:27:00 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
19/04/03 10:27:00 INFO fs.TestDFSIO:      Throughput mb/sec: 114.77630098937172
19/04/03 10:27:00 INFO fs.TestDFSIO: Average IO rate mb/sec: 115.29634094238281
19/04/03 10:27:00 INFO fs.TestDFSIO:  IO rate std deviation: 7.880011777295818
19/04/03 10:27:00 INFO fs.TestDFSIO:     Test exec time sec: 27.05
19/04/03 10:27:00 INFO fs.TestDFSIO:
bash-4.2$
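The two headline numbers measure different things: Throughput mb/sec is the total data volume divided by the sum of the per-task I/O times, while Average IO rate mb/sec is the mean of the individual per-file rates, which is why they are close but not equal. A rough estimate of cluster-wide write bandwidth is nrFiles × Throughput; a sketch that derives it from the results log (assuming the log sits in the current directory and nrFiles = 10, as above):

# 10 files x ~114.8 MB/s per file ≈ 1148 MB/s estimated aggregate write throughput
awk -F': ' '/Throughput mb\/sec/ {printf "estimated aggregate MB/s: %.1f\n", 10 * $2}' TestDFSIO_results.log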
After the test command completes successfully, a directory is created in HDFS to hold the generated test files, as shown below:
It also contains a series of small files:
Downloading one of the small files shows that it is 1 KB in size.
Opened in Notepad++, its content looks like this:
It is not human-readable.
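That is expected: these control files are binary SequenceFiles. hadoop fs -text (unlike -cat) understands SequenceFiles and prints their key/value pairs; a sketch, assuming the default control-file naming (the exact file names under io_control may differ):

# Decode a TestDFSIO control file instead of dumping raw bytes
hadoop fs -text /benchmarks/TestDFSIO/io_control/in_file_test_io_0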
Run the command hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -read -nrFiles 10 -fileSize 1000.
It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
19/04/03 10:51:05 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/03 10:51:05 INFO fs.TestDFSIO: nrFiles = 10
19/04/03 10:51:05 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
19/04/03 10:51:05 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/03 10:51:05 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/03 10:51:05 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
19/04/03 10:51:06 INFO fs.TestDFSIO: created control files for: 10 files
19/04/03 10:51:06 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:51:06 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 10:51:06 INFO mapred.FileInputFormat: Total input paths to process : 10
19/04/03 10:51:06 INFO mapreduce.JobSubmitter: number of splits:10
19/04/03 10:51:06 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 10:51:06 INFO Configuration.deprecation: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
19/04/03 10:51:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0007
19/04/03 10:51:07 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0007
19/04/03 10:51:07 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0007/
19/04/03 10:51:07 INFO mapreduce.Job: Running job: job_1552358721447_0007
19/04/03 10:51:12 INFO mapreduce.Job: Job job_1552358721447_0007 running in uber mode : false
19/04/03 10:51:12 INFO mapreduce.Job: map 0% reduce 0%
19/04/03 10:51:19 INFO mapreduce.Job: map 100% reduce 0%
19/04/03 10:51:25 INFO mapreduce.Job: map 100% reduce 100%
19/04/03 10:51:25 INFO mapreduce.Job: Job job_1552358721447_0007 completed successfully
19/04/03 10:51:25 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=345
        FILE: Number of bytes written=1653774
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=10485762310
        HDFS: Number of bytes written=81
        HDFS: Number of read operations=53
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=50265
        Total time spent by all reduces in occupied slots (ms)=2630
        Total time spent by all map tasks (ms)=50265
        Total time spent by all reduce tasks (ms)=2630
        Total vcore-milliseconds taken by all map tasks=50265
        Total vcore-milliseconds taken by all reduce tasks=2630
        Total megabyte-milliseconds taken by all map tasks=51471360
        Total megabyte-milliseconds taken by all reduce tasks=2693120
    Map-Reduce Framework
        Map input records=10
        Map output records=50
        Map output bytes=774
        Map output materialized bytes=1020
        Input split bytes=1190
        Combine input records=0
        Combine output records=0
        Reduce input groups=5
        Reduce shuffle bytes=1020
        Reduce input records=50
        Reduce output records=5
        Spilled Records=100
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=1310
        CPU time spent (ms)=35780
        Physical memory (bytes) snapshot=6365962240
        Virtual memory (bytes) snapshot=31838441472
        Total committed heap usage (bytes)=6873415680
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1120
    File Output Format Counters
        Bytes Written=81
19/04/03 10:51:25 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
19/04/03 10:51:25 INFO fs.TestDFSIO:            Date & time: Wed Apr 03 10:51:25 CST 2019
19/04/03 10:51:25 INFO fs.TestDFSIO:        Number of files: 10
19/04/03 10:51:25 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
19/04/03 10:51:25 INFO fs.TestDFSIO:      Throughput mb/sec: 897.4243919949744
19/04/03 10:51:25 INFO fs.TestDFSIO: Average IO rate mb/sec: 898.6844482421875
19/04/03 10:51:25 INFO fs.TestDFSIO:  IO rate std deviation: 33.68623587810037
19/04/03 10:51:25 INFO fs.TestDFSIO:     Test exec time sec: 19.035
19/04/03 10:51:25 INFO fs.TestDFSIO:
bash-4.2$
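Per-file read throughput (~897 MB/s) is far higher than write throughput (~115 MB/s). As a rough worked comparison: 10 × 897.4 ≈ 8974 MB/s aggregate read versus 10 × 114.8 ≈ 1148 MB/s aggregate write. One plausible factor is that each write must be pipelined to three replicas (replication factor 3), while reads can often be served from a local replica or the OS page cache; treat these as rough figures, not a definitive analysis.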
Run the command hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -clean.
It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar TestDFSIO -clean
19/04/03 11:17:25 INFO fs.TestDFSIO: TestDFSIO.1.7
19/04/03 11:17:25 INFO fs.TestDFSIO: nrFiles = 1
19/04/03 11:17:25 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
19/04/03 11:17:25 INFO fs.TestDFSIO: bufferSize = 1000000
19/04/03 11:17:25 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
19/04/03 11:17:26 INFO fs.TestDFSIO: Cleaning up test files
bash-4.2$
The TestDFSIO directory is removed from HDFS as well.
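A quick way to verify the cleanup (a sketch; any HDFS client will do):

# The TestDFSIO subdirectory should no longer appear
hadoop fs -ls /benchmarks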
nnbench
nnbench load-tests the NameNode: it issues a large number of HDFS metadata requests, putting the NameNode under significant pressure. The test can simulate creating, reading, renaming, and deleting files on HDFS.
The nnbench options are as follows:
NameNode Benchmark 0.4
Usage: nnbench <options>
Options:
    -operation <Available operations are create_write open_read rename delete. This option is mandatory>
     * NOTE: The open_read, rename and delete operations assume that the files they operate on, are already available. The create_write operation must be run before running the other operations.
    -maps <number of maps. default is 1. This is not mandatory>
    -reduces <number of reduces. default is 1. This is not mandatory>
    -startTime <time to start, given in seconds from the epoch. Make sure this is far enough into the future, so all maps (operations) will start at the same time. default is launch time + 2 mins. This is not mandatory>
    -blockSize <Block size in bytes. default is 1. This is not mandatory>
    -bytesToWrite <Bytes to write. default is 0. This is not mandatory>
    -bytesPerChecksum <Bytes per checksum for the files. default is 1. This is not mandatory>
    -numberOfFiles <number of files to create. default is 1. This is not mandatory>
    -replicationFactorPerFile <Replication factor for the files. default is 1. This is not mandatory>
    -baseDir <base DFS path. default is /benchmarks/NNBench. This is not mandatory>
    -readFileAfterOpen <true or false. if true, it reads the file and reports the average time to read. This is valid with the open_read operation. default is false. This is not mandatory>
    -help: Display the help statement
To create 1000 files using 12 mappers and 6 reducers, run the command hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation create_write -maps 12 -reduces 6 -blockSize 1 -bytesToWrite 0 -numberOfFiles 1000 -replicationFactorPerFile 3 -readFileAfterOpen true -baseDir /benchmarks/NNBench.
It returns the following:
bash-4.2$ hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation create_write -maps 12 -reduces 6 -blockSize 1 -bytesToWrite 0 -numberOfFiles 1000 -replicationFactorPerFile 3 -readFileAfterOpen true -baseDir /benchmarks/NNBench
NameNode Benchmark 0.4
19/04/03 16:11:22 INFO hdfs.NNBench: Test Inputs:
19/04/03 16:11:22 INFO hdfs.NNBench:            Test Operation: create_write
19/04/03 16:11:22 INFO hdfs.NNBench:                Start time: 2019-04-03 16:13:22,755
19/04/03 16:11:22 INFO hdfs.NNBench:            Number of maps: 12
19/04/03 16:11:22 INFO hdfs.NNBench:         Number of reduces: 6
19/04/03 16:11:22 INFO hdfs.NNBench:                Block Size: 1
19/04/03 16:11:22 INFO hdfs.NNBench:            Bytes to write: 0
19/04/03 16:11:22 INFO hdfs.NNBench:        Bytes per checksum: 1
19/04/03 16:11:22 INFO hdfs.NNBench:           Number of files: 1000
19/04/03 16:11:22 INFO hdfs.NNBench:        Replication factor: 3
19/04/03 16:11:22 INFO hdfs.NNBench:                  Base dir: /benchmarks/NNBench
19/04/03 16:11:22 INFO hdfs.NNBench:      Read file after open: true
19/04/03 16:11:23 INFO hdfs.NNBench: Deleting data directory
19/04/03 16:11:23 INFO hdfs.NNBench: Creating 12 control files
19/04/03 16:11:24 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 16:11:24 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 16:11:24 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 16:11:24 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
19/04/03 16:11:24 INFO mapred.FileInputFormat: Total input paths to process : 12
19/04/03 16:11:24 INFO mapreduce.JobSubmitter: number of splits:12
19/04/03 16:11:24 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
19/04/03 16:11:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0009
19/04/03 16:11:24 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0009
19/04/03 16:11:24 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0009/
19/04/03 16:11:24 INFO mapreduce.Job: Running job: job_1552358721447_0009
19/04/03 16:11:31 INFO mapreduce.Job: Job job_1552358721447_0009 running in uber mode : false
19/04/03 16:11:31 INFO mapreduce.Job: map 0% reduce 0%
19/04/03 16:11:48 INFO mapreduce.Job: map 50% reduce 0%
19/04/03 16:11:49 INFO mapreduce.Job: map 67% reduce 0%
19/04/03 16:13:26 INFO mapreduce.Job: map 100% reduce 0%
19/04/03 16:13:31 INFO mapreduce.Job: map 100% reduce 17%
19/04/03 16:13:32 INFO mapreduce.Job: map 100% reduce 100%
19/04/03 16:13:32 INFO mapreduce.Job: Job job_1552358721447_0009 completed successfully
19/04/03 16:13:32 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=519
        FILE: Number of bytes written=2736365
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2908
        HDFS: Number of bytes written=170
        HDFS: Number of read operations=66
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=12012
    Job Counters
        Launched map tasks=12
        Launched reduce tasks=6
        Data-local map tasks=12
        Total time spent by all maps in occupied slots (ms)=1363711
        Total time spent by all reduces in occupied slots (ms)=18780
        Total time spent by all map tasks (ms)=1363711
        Total time spent by all reduce tasks (ms)=18780
        Total vcore-milliseconds taken by all map tasks=1363711
        Total vcore-milliseconds taken by all reduce tasks=18780
        Total megabyte-milliseconds taken by all map tasks=1396440064
        Total megabyte-milliseconds taken by all reduce tasks=19230720
    Map-Reduce Framework
        Map input records=12
        Map output records=84
        Map output bytes=2016
        Map output materialized bytes=3276
        Input split bytes=1418
        Combine input records=0
        Combine output records=0
        Reduce input groups=7
        Reduce shuffle bytes=3276
        Reduce input records=84
        Reduce output records=7
        Spilled Records=168
        Shuffled Maps =72
        Failed Shuffles=0
        Merged Map outputs=72
        GC time elapsed (ms)=2335
        CPU time spent (ms)=35880
        Physical memory (bytes) snapshot=9088864256
        Virtual memory (bytes) snapshot=52095377408
        Total committed heap usage (bytes)=11191975936
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1490
    File Output Format Counters
        Bytes Written=170
19/04/03 16:13:32 INFO hdfs.NNBench: -------------- NNBench -------------- :
19/04/03 16:13:32 INFO hdfs.NNBench:                                Version: NameNode Benchmark 0.4
19/04/03 16:13:32 INFO hdfs.NNBench:                            Date & time: 2019-04-03 16:13:32,475
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:                         Test Operation: create_write
19/04/03 16:13:32 INFO hdfs.NNBench:                             Start time: 2019-04-03 16:13:22,755
19/04/03 16:13:32 INFO hdfs.NNBench:                            Maps to run: 12
19/04/03 16:13:32 INFO hdfs.NNBench:                         Reduces to run: 6
19/04/03 16:13:32 INFO hdfs.NNBench:                     Block Size (bytes): 1
19/04/03 16:13:32 INFO hdfs.NNBench:                         Bytes to write: 0
19/04/03 16:13:32 INFO hdfs.NNBench:                     Bytes per checksum: 1
19/04/03 16:13:32 INFO hdfs.NNBench:                        Number of files: 1000
19/04/03 16:13:32 INFO hdfs.NNBench:                     Replication factor: 3
19/04/03 16:13:32 INFO hdfs.NNBench:             Successful file operations: 0
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:         # maps that missed the barrier: 0
19/04/03 16:13:32 INFO hdfs.NNBench:                           # exceptions: 0
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:                TPS: Create/Write/Close: 0
19/04/03 16:13:32 INFO hdfs.NNBench: Avg exec time (ms): Create/Write/Close: 0.0
19/04/03 16:13:32 INFO hdfs.NNBench:             Avg Lat (ms): Create/Write: NaN
19/04/03 16:13:32 INFO hdfs.NNBench:                    Avg Lat (ms): Close: NaN
19/04/03 16:13:32 INFO hdfs.NNBench:
19/04/03 16:13:32 INFO hdfs.NNBench:                  RAW DATA: AL Total #1: 0
19/04/03 16:13:32 INFO hdfs.NNBench:                  RAW DATA: AL Total #2: 0
19/04/03 16:13:32 INFO hdfs.NNBench:               RAW DATA: TPS Total (ms): 0
19/04/03 16:13:32 INFO hdfs.NNBench:        RAW DATA: Longest Map Time (ms): 0.0
19/04/03 16:13:32 INFO hdfs.NNBench:                    RAW DATA: Late maps: 0
19/04/03 16:13:32 INFO hdfs.NNBench:              RAW DATA: # of exceptions: 0
19/04/03 16:13:32 INFO hdfs.NNBench:
bash-4.2$
After the job finishes, its details can be viewed on the job history page at http://*.*.*.*:19888/jobhistory/job/job_1552358721447_0009, as shown below:
An NNBench directory is also created in HDFS to store the files produced by the job:
Going into the directory /benchmarks/NNBench/control and viewing the metadata of one of the files shows that the file is stored on three nodes:
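The same information is available from the command line with fsck (a sketch; point it at a concrete file to see per-block detail):

# -files -blocks -locations lists each block and the DataNodes holding its replicas
hdfs fsck /benchmarks/NNBench/control -files -blocks -locations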
Downloading one of them and opening it in Notepad++ again shows unreadable binary content:
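Since open_read, rename, and delete assume that the files from create_write already exist, a full pass over the remaining operations might look like this (a sketch reusing the jar and base directory from above):

hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation open_read -maps 12 -reduces 6 -numberOfFiles 1000 -readFileAfterOpen true -baseDir /benchmarks/NNBench
hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation rename -maps 12 -reduces 6 -numberOfFiles 1000 -baseDir /benchmarks/NNBench
hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar nnbench -operation delete -maps 12 -reduces 6 -numberOfFiles 1000 -baseDir /benchmarks/NNBench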
mrbench
mrbench runs a small job many times in a row. It checks whether small jobs on the cluster run repeatably and efficiently. Its usage is as follows:
Usage: mrbench [-baseDir <base DFS path for output/input, default is /benchmarks/MRBench>] [-jar <local path to job jar file containing Mapper and Reducer implementations, default is current jar file>] [-numRuns <number of times to run the job, default is 1>] [-maps <number of maps for each run, default is 2>] [-reduces <number of reduces for each run, default is 1>] [-inputLines <number of input data lines, default is 1>] [-inputType <type of input to generate, one of ascending (default), descending, random>] [-verbose]
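For example, a slightly larger configuration might look like this (a sketch; by default mrbench uses 2 maps, 1 reduce, and 1 input line):

hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar mrbench -numRuns 10 -maps 4 -reduces 2 -inputLines 100 -inputType random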
Run the command hadoop jar ../jars/hadoop-test-2.6.0-mr1-cdh5.16.1.jar mrbench -numRuns 50.
It returns the following:
……
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3
    File Output Format Counters
        Bytes Written=3
19/04/03 17:10:15 INFO mapred.MRBench: Running job 49: input=hdfs://node1:8020/benchmarks/MRBench/mr_input output=hdfs://node1:8020/benchmarks/MRBench/mr_output/output_299739316
19/04/03 17:10:15 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 17:10:15 INFO client.RMProxy: Connecting to ResourceManager at node1/10.200.101.131:8032
19/04/03 17:10:15 INFO mapred.FileInputFormat: Total input paths to process : 1
19/04/03 17:10:15 INFO mapreduce.JobSubmitter: number of splits:2
19/04/03 17:10:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552358721447_0059
19/04/03 17:10:15 INFO impl.YarnClientImpl: Submitted application application_1552358721447_0059
19/04/03 17:10:15 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1552358721447_0059/
19/04/03 17:10:15 INFO mapreduce.Job: Running job: job_1552358721447_0059
19/04/03 17:10:21 INFO mapreduce.Job: Job job_1552358721447_0059 running in uber mode : false
19/04/03 17:10:21 INFO mapreduce.Job: map 0% reduce 0%
19/04/03 17:10:25 INFO mapreduce.Job: map 100% reduce 0%
19/04/03 17:10:30 INFO mapreduce.Job: map 100% reduce 100%
19/04/03 17:10:30 INFO mapreduce.Job: Job job_1552358721447_0059 completed successfully
19/04/03 17:10:30 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=27
        FILE: Number of bytes written=450422
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=239
        HDFS: Number of bytes written=3
        HDFS: Number of read operations=9
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=2
        Launched reduce tasks=1
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=5134
        Total time spent by all reduces in occupied slots (ms)=2562
        Total time spent by all map tasks (ms)=5134
        Total time spent by all reduce tasks (ms)=2562
        Total vcore-milliseconds taken by all map tasks=5134
        Total vcore-milliseconds taken by all reduce tasks=2562
        Total megabyte-milliseconds taken by all map tasks=5257216
        Total megabyte-milliseconds taken by all reduce tasks=2623488
    Map-Reduce Framework
        Map input records=1
        Map output records=1
        Map output bytes=5
        Map output materialized bytes=39
        Input split bytes=236
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=39
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=196
        CPU time spent (ms)=2550
        Physical memory (bytes) snapshot=1503531008
        Virtual memory (bytes) snapshot=8690847744
        Total committed heap usage (bytes)=1791492096
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3
    File Output Format Counters
        Bytes Written=3
DataLines    Maps    Reduces    AvgTime (milliseconds)
1            2       1          15357
bash-4.2$
The last line shows that the average job completion time over the 50 runs was 15357 ms, i.e. about 15 seconds.
Opening http://*.*.*.*:8088/cluster shows information about the jobs that were executed:
Corresponding directories were also created in HDFS, but their contents are empty, as shown below: