Start HDFS
Omitted here; see: http://blog.csdn.net/zengmingen/article/details/53006541
Start Spark
Omitted here.
Installation: http://blog.csdn.net/zengmingen/article/details/72123717
spark-shell: http://blog.csdn.net/zengmingen/article/details/72162821
Prepare the data
vi wordcount.txt
hello zeng
hello miao
hello gen
hello zeng
hello wen
hello biao
zeng miao gen
zeng wen biao
lu ting ting
zhang xiao zhu
chang sheng xiang qi lai
zhu ye su ai ni
Upload it to HDFS
hdfs dfs -put wordcount.txt /
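As an optional sanity check, the uploaded file can be read back from the spark-shell session started earlier (using the same NameNode address as in the code below); the line count should match the 12 lines written above:

sc.textFile("hdfs://nbdo1:9000/wordcount.txt").count()   // expected: 12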
Write the code
In Scala, at the spark-shell prompt:
sc.textFile("hdfs://nbdo1:9000/wordcount.txt")
.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_)
.saveAsTextFile("hdfs://nbdo1:9000/out")
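Note that the three lines above form a single chained expression; depending on the shell version, it may need to be pasted as one statement or entered via the :paste command. To inspect the word counts directly in the shell instead of writing them to HDFS, a variant along these lines can be used (same input path as above):

sc.textFile("hdfs://nbdo1:9000/wordcount.txt")
  .flatMap(_.split(" "))     // split each line into words
  .map((_, 1))               // pair each word with a count of 1
  .reduceByKey(_ + _)        // sum the counts per word
  .collect()
  .foreach(println)          // prints pairs such as (hello,6)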
Result
Addendum:
To save the result into a single output file, coalesce the RDD to one partition before writing.
Code:
sc.textFile("hdfs://nbdo1:9000/wordcount.txt")
.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_)
.coalesce(1,true).saveAsTextFile("hdfs://nbdo1:9000/out2")
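Because saveAsTextFile writes one part file per partition, coalesce(1, true) ensures the output directory contains a single part-00000 file. With shuffle set to true, coalesce(1, true) is equivalent to repartition(1), so a sketch like the following (writing to a hypothetical /out3 path, since saveAsTextFile fails if the target directory already exists) produces the same single-file result:

sc.textFile("hdfs://nbdo1:9000/wordcount.txt")
  .flatMap(_.split(" "))
  .map((_, 1))
  .reduceByKey(_ + _)
  .repartition(1)            // same effect as coalesce(1, true)
  .saveAsTextFile("hdfs://nbdo1:9000/out3")   // /out3 is just an example path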
Result