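Testing a jar on a Hadoop cluster (MapReduce)

This article uses the MapReduce word-count example, which counts and outputs the total number of occurrences of each word in a given text file, to show how to test a jar on a cluster.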
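To make the goal concrete, here is a minimal plain-Java sketch of the word-count logic. It does not use the Hadoop API, and the class and method names (LocalWordCountSketch, mapLine, countWords) are hypothetical, not taken from the original project. It assumes words are separated by whitespace, which the original Mapper may or may not do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the word-count idea; no Hadoop classes involved.
class LocalWordCountSketch {

    // "Map" step: split one line on whitespace and emit each non-empty word.
    static List<String> mapLine(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // "Reduce" step: sum the occurrences of each word.
    // A TreeMap keeps the keys sorted, similar to key-ordered reducer output.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            for (String word : mapLine(line)) {
                totals.merge(word, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("hadoop hdfs", "hadoop mapreduce");
        // Print one "word<TAB>count" pair per line.
        countWords(sample).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running a sketch like this on a small local file can be a quick way to know what result to expect from the cluster job.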

1. Add the packaging plugins

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <!-- Replace with the version that matches your setup -->
                <version>3.6.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <!-- Replace with the main class of your own project -->
                            <mainClass>com.lizhengi.mr.WordcountDriver</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

2. Modify WcDriver

Change the hard-coded paths

FileInputFormat.setInputPaths(job, "/Users/marron27/test/input");
FileOutputFormat.setOutputPath(job, new Path("/Users/marron27/test/output"));

to paths read from the command-line arguments:

FileInputFormat.setInputPaths(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
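Because the driver now indexes args[0] and args[1] directly, a missing argument would fail with an ArrayIndexOutOfBoundsException. The sketch below shows one way to check the arguments first. WcArgsSketch and its parse method are hypothetical helpers, not part of the original WcDriver, and the usage message is only an illustration.

```java
// Hypothetical helper (not in the original project): validates the two
// command-line arguments before the driver turns them into Hadoop Paths.
class WcArgsSketch {

    // Returns {inputPath, outputPath}, or throws if the argument count is wrong.
    static String[] parse(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException(
                "Expected exactly two arguments: <input path> <output path>");
        }
        return new String[] { args[0], args[1] };
    }

    public static void main(String[] args) {
        String[] paths = parse(new String[] { "/demo/test/input", "/demo/test/output" });
        // In the driver these values would be passed on as
        // new Path(paths[0]) and new Path(paths[1]).
        System.out.println("input=" + paths[0] + ", output=" + paths[1]);
    }
}
```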

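3. Package the program as a jar, then copy it to the Hadoop cluster

Select the Maven project
Choose Hadoop_API >> Lifecycle >> package

Packaging is complete.

4. Rename the jar without dependencies to wc.jar and copy it to the Hadoop cluster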

mv Hadoop-API-1.0-SNAPSHOT.jar wc.jar
scp wc.jar root@Carlota1:/root/test/input

5. Create a test input file and upload it to HDFS

ssh root@Carlota1
hadoop fs -copyFromLocal hello.txt /demo/test/input

6. Run the WordCount program

hadoop jar wc.jar com.lizhengi.mapreduce.WcDriver /demo/test/input /demo/test/output

Here I ran into a problem: the job hung at INFO mapreduce.Job: Running job: job_1595222530661_0003. I resolved it by modifying mapred-site.xml.

After the job finishes, download the result to the local file system and view it:

hadoop fs -copyToLocal /demo/test/output /root/test/output
cat /root/test/output/part-r-00000

flume	2
hadoop	2
hdfs	1
hive	1
kafka	2
mapreduce	1
spark	1
spring	1
take	2
tomcat	2
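The result file holds one word and its count per line, separated by a tab. If you want to check the downloaded result programmatically, a small parser along these lines may help. WcOutputParserSketch is a hypothetical class written for illustration, and it assumes the count is the text after the last tab on each line.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: parses "word<TAB>count" lines such as those in part-r-00000.
class WcOutputParserSketch {

    static Map<String, Integer> parse(List<String> lines) {
        // LinkedHashMap preserves the order in which the lines were read.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("No tab separator in line: " + line);
            }
            // Trim both parts so stray spaces or extra tabs do not break parsing.
            String word = line.substring(0, tab).trim();
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }
}
```

The lines could be read with java.nio.file.Files.readAllLines on the local copy of the file and then passed to parse.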
