Contents
I. Preparation
1. Install the JDK (run on every node)
2. Update host configuration (run on every node)
3. Configure passwordless SSH login (run on every node)
II. Install Hadoop (run on every node)
III. Cluster startup configuration (run on every node)
1. core-site.xml
2. hdfs-site.xml
3. yarn-site.xml
4. mapred-site.xml
5. workers
IV. Start and test the cluster (run on every node)
1. Configure the Java environment
2. Specify root as the startup user
3. Start
3.1. If the cluster is starting for the first time
3.2. Start HDFS on the hadoop1 node
3.3. Start YARN on the hadoop2 node, where the ResourceManager is configured
3.4. Check the HDFS NameNode
3.5. Check the YARN ResourceManager
4. Test
4.1. Basic HDFS operations
4.2. File storage path
4.3. Word count
V. Configure Hadoop scripts
1. Startup script hadoop.sh
2. Process-check script jpsall.sh
3. Copy to the other servers
I. Preparation
     | hadoop1            | hadoop2                      | hadoop3
IP   | 192.168.139.176    | 192.168.139.214              | 192.168.139.215
HDFS | NameNode, DataNode | DataNode                     | SecondaryNameNode, DataNode
YARN | NodeManager        | ResourceManager, NodeManager | NodeManager
1. Install the JDK (run on every node)
# Unpack the JDK and move it into place
tar -zxf jdk-8u431-linux-x64.tar.gz
mv jdk1.8.0_431 /usr/local/java

# Go to /etc/profile.d and edit the environment variables
vim java_env.sh

#java
JAVA_HOME=/usr/local/java
JRE_HOME=/usr/local/java/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$JAVA_HOME/bin:$PATH
export PATH JAVA_HOME CLASSPATH

# Reload the profile
source /etc/profile
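A quick sanity check that the new profile is in effect (the expected version string assumes the jdk-8u431 package used above):

java -version    # expect: java version "1.8.0_431"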
2. Update host configuration (run on every node)
# Add the cluster hosts
vim /etc/hosts

192.168.139.176 hadoop1
192.168.139.214 hadoop2
192.168.139.215 hadoop3

# Change the hostname (use the matching name on each node)
vim /etc/hostname

hadoop1
Note: the hosts file on your local machine must also be updated, because the web UIs below are accessed by hostname; if you skip this step, use the IP addresses instead.
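To confirm that name resolution works, each host should answer by name from every node:

ping -c 1 hadoop1
ping -c 1 hadoop2
ping -c 1 hadoop3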
3. Configure passwordless SSH login (run on every node)
# Generate a key pair
ssh-keygen -t rsa

# Copy the public key to every node
ssh-copy-id hadoop1
ssh-copy-id hadoop2
ssh-copy-id hadoop3
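Each of the following should now print the remote hostname without asking for a password:

ssh hadoop1 hostname
ssh hadoop2 hostname
ssh hadoop3 hostname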
II. Install Hadoop (run on every node)
tar -zxf hadoop-3.4.0.tar.gz
mv hadoop-3.4.0 /usr/local/

# Configure environment variables: go to /etc/profile.d
vim hadoop_env.sh

# Add the following
#hadoop
export HADOOP_HOME=/usr/local/hadoop-3.4.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

# Reload the profile and check the version
source /etc/profile
hadoop version
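If you would rather not repeat the unpacking on every node, a sketch that installs once on hadoop1 and copies the tree over (this assumes the passwordless SSH set up in section I, not a step the guide itself prescribes):

for host in hadoop2 hadoop3; do
    scp -r /usr/local/hadoop-3.4.0 root@$host:/usr/local/
    scp /etc/profile.d/hadoop_env.sh root@$host:/etc/profile.d/
done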
III. Cluster startup configuration (run on every node)
Edit the following files under /usr/local/hadoop-3.4.0/etc/hadoop:
1. core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- Address of the NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1:8020</value>
    </property>
    <!-- Storage directory for Hadoop data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop-3.4.0/data</value>
    </property>
    <!-- Static user for the HDFS web UI: root here; create a dedicated user in production -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>
</configuration>
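Once the file is saved, Hadoop itself can confirm what it parsed; hdfs getconf reads the configuration without needing a running cluster:

hdfs getconf -confKey fs.defaultFS    # should print hdfs://hadoop1:8020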
2. hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- NameNode web UI address -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop1:9870</value>
    </property>
    <!-- SecondaryNameNode web UI address -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop3:9868</value>
    </property>
</configuration>
3. yarn-site.xml
<?xml version="1.0"?>
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- Use shuffle as the MapReduce auxiliary service -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Address of the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop2</value>
    </property>
    <!-- Environment variables inherited by containers -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
    </property>
    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <!-- Log server address: the JobHistory server on hadoop1, matching mapred-site.xml below -->
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop1:19888/jobhistory/logs</value>
    </property>
    <!-- Keep aggregated logs for 7 days -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
</configuration>
4. mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory server address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop1:10020</value>
    </property>
    <!-- JobHistory server web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop1:19888</value>
    </property>
</configuration>
5. workers
hadoop1
hadoop2
hadoop3

Note: lines in this file must not end with trailing spaces, and the file must not contain blank lines.
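Since these five files must be identical on every node, one option is to edit them on hadoop1 only and push them out; a sketch, again relying on the passwordless SSH from section I:

cd /usr/local/hadoop-3.4.0/etc/hadoop
for host in hadoop2 hadoop3; do
    scp core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml workers \
        root@$host:/usr/local/hadoop-3.4.0/etc/hadoop/
done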
IV. Start and test the cluster (run on every node)
1. Configure the Java environment
# Edit /usr/local/hadoop-3.4.0/etc/hadoop/hadoop-env.sh and set:
export JAVA_HOME=/usr/local/java
2. Specify root as the startup user
# Add the following near the top of start-dfs.sh and stop-dfs.sh
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

# Add the following near the top of start-yarn.sh and stop-yarn.sh
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root
Note: by default Hadoop does not support starting as the root account. In production, create a dedicated group and user and grant that user the necessary privileges.
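A sketch of an alternative (not what this guide does above): the same user variables can be exported from hadoop-env.sh, which the start/stop scripts source, so the four scripts stay unmodified; the same root caveat applies.

# Assumption: exporting in hadoop-env.sh instead of patching the scripts
# /usr/local/hadoop-3.4.0/etc/hadoop/hadoop-env.sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root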
3. Start
3.1. If the cluster is starting for the first time
Format the NameNode on the hadoop1 node. (Note: formatting the NameNode generates a new cluster ID. If the NameNode and DataNodes end up with different cluster IDs, the cluster cannot find its existing data. If the cluster fails while running and the NameNode has to be reformatted, be sure to stop the namenode and datanode processes first and delete the data and logs directories on every machine before formatting.)
hdfs namenode -format
3.2. Start HDFS on the hadoop1 node
/usr/local/hadoop-3.4.0/sbin/start-dfs.sh
3.3. Start YARN on the hadoop2 node, where the ResourceManager is configured
/usr/local/hadoop-3.4.0/sbin/start-yarn.sh
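start-dfs.sh and start-yarn.sh do not start the JobHistory server configured in mapred-site.xml; it is started separately on hadoop1 (the hadoop.sh script in section V wraps all three steps). After that, jps on each node should roughly match the role table in section I:

# On hadoop1: start the JobHistory server
/usr/local/hadoop-3.4.0/bin/mapred --daemon start historyserver

# Check the daemons on every node (expected set taken from the role table above)
ssh hadoop1 jps   # NameNode, DataNode, NodeManager, JobHistoryServer
ssh hadoop2 jps   # ResourceManager, NodeManager, DataNode
ssh hadoop3 jps   # SecondaryNameNode, DataNode, NodeManager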
3.4. Check the HDFS NameNode
http://192.168.139.176:9870/
3.5. Check the YARN ResourceManager
http://192.168.139.214:8088
4. Test
4.1. Basic HDFS operations
# Create a directory in HDFS
hadoop fs -mkdir /input

# Create a local file
touch text.txt

# Upload the file
hadoop fs -put text.txt /input

# Remove the output directory (needed before rerunning the job in 4.3)
hadoop fs -rm -r /output
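One caveat: text.txt created by touch is empty, so the wordcount job in 4.3 would have nothing to count. A quick way to make the result meaningful (the sample words are arbitrary):

echo "hello hadoop hello world" > text.txt
hadoop fs -put -f text.txt /input    # -f overwrites the copy already uploaded
hadoop fs -cat /input/text.txt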
4.2. File storage path
On a DataNode, the blocks of the uploaded file end up under the data directory configured in core-site.xml, e.g.:
/usr/local/hadoop-3.4.0/data/dfs/data/current/BP-511066843-192.168.139.176-1734965488199/current/finalized/subdir0/subdir0
4.3. Word count
# Run from /usr/local/hadoop-3.4.0 (the jar path is relative)
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.4.0.jar wordcount /input /output
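When the job finishes, the result is written to /output in HDFS; with the single reducer of this example job the counts land in one part file:

hadoop fs -ls /output
hadoop fs -cat /output/part-r-00000   # e.g. hadoop 1 / hello 2 / world 1 for the sample line above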
V. Configure Hadoop scripts
1. Startup script hadoop.sh
#!/bin/bash
if [ $# -lt 1 ]
then
    echo "No Args Input..."
    exit
fi

case $1 in
"start")
    echo " =================== Starting the Hadoop cluster ==================="
    echo " --------------- starting hdfs ---------------"
    ssh hadoop1 "/usr/local/hadoop-3.4.0/sbin/start-dfs.sh"
    echo " --------------- starting yarn ---------------"
    ssh hadoop2 "/usr/local/hadoop-3.4.0/sbin/start-yarn.sh"
    echo " --------------- starting historyserver ---------------"
    ssh hadoop1 "/usr/local/hadoop-3.4.0/bin/mapred --daemon start historyserver"
;;
"stop")
    echo " =================== Stopping the Hadoop cluster ==================="
    echo " --------------- stopping historyserver ---------------"
    ssh hadoop1 "/usr/local/hadoop-3.4.0/bin/mapred --daemon stop historyserver"
    echo " --------------- stopping yarn ---------------"
    ssh hadoop2 "/usr/local/hadoop-3.4.0/sbin/stop-yarn.sh"
    echo " --------------- stopping hdfs ---------------"
    ssh hadoop1 "/usr/local/hadoop-3.4.0/sbin/stop-dfs.sh"
;;
*)
    echo "Input Args Error..."
;;
esac
# Make the script executable
chmod +x hadoop.sh
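Usage (run from hadoop1, relying on the passwordless SSH configured earlier):

./hadoop.sh start   # brings up HDFS, YARN and the JobHistory server
./hadoop.sh stop    # stops them in reverse order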
2. Process-check script jpsall.sh
#!/bin/bash
for host in hadoop1 hadoop2 hadoop3
do
    echo =============== $host ===============
    ssh $host jps
done
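Make it executable the same way, then run it from any node:

chmod +x jpsall.sh
./jpsall.sh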
3. Copy to the other servers
# Run from hadoop1: copy the two scripts to the other servers
scp hadoop.sh jpsall.sh root@hadoop2:/usr/local/hadoop-3.4.0/
scp hadoop.sh jpsall.sh root@hadoop3:/usr/local/hadoop-3.4.0/