Environment: Windows 7 + VMware 10 + CentOS 7
I. Create three 64-bit CentOS 7 virtual machines

master 192.168.137.100 root/123456
node1  192.168.137.101 root/123456
node2  192.168.137.102 root/123456
II. Disable the firewall on all three VMs by running the following on each one:
systemctl stop firewalld.service
systemctl disable firewalld.service
III. Add these three lines to /etc/hosts on each of the three VMs

192.168.137.100 master
192.168.137.101 node1
192.168.137.102 node2
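The three lines can also be generated from a single snippet, so every VM is guaranteed to get identical entries; a minimal sketch, to be run as root on each VM:

```shell
# Build the /etc/hosts entries from the host table above.
entries='192.168.137.100 master
192.168.137.101 node1
192.168.137.102 node2'
printf '%s\n' "$entries"
# As root on each VM, append them with:
#   printf '%s\n' "$entries" >> /etc/hosts
```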
IV. Set up passwordless SSH login between the three machines

1. CentOS 7 does not enable passwordless (public-key) SSH login by default. In /etc/ssh/sshd_config, uncomment the following line; this must be done on every server:

PubkeyAuthentication yes

Then restart the SSH service:
systemctl restart sshd
2. In /root on the master machine, run ssh-keygen -t rsa and press Enter at every prompt. All three machines must do this.
[root@master ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:aMUO8b/EkylqTMb9+71ePnQv0CWQohsaMeAbMH+t87M root@master
The key's randomart image is:
+---[RSA 2048]----+
| o ... .         |
| = o= . o        |
| + oo=. . .      |
| =.Boo o . .     |
| . OoSoB . o     |
| =.+.+ o. ...    |
| + o o .. +      |
| . o . ..+.      |
| E ....+oo       |
+----[SHA256]-----+
3. On master, merge the public key into the authorized_keys file:
[root@master ~]# cd /root/.ssh/
[root@master .ssh]# ll
total 8
-rw-------. 1 root root 1679 Apr 19 11:10 id_rsa
-rw-r--r--. 1 root root  393 Apr 19 11:10 id_rsa.pub
[root@master .ssh]# cat id_rsa.pub >> authorized_keys
4. Copy master's authorized_keys to the node1 and node2 nodes:
scp /root/.ssh/authorized_keys root@192.168.137.101:/root/.ssh/
scp /root/.ssh/authorized_keys root@192.168.137.102:/root/.ssh/
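If the ssh-copy-id utility is available (it ships with OpenSSH on CentOS 7), the same key distribution can be sketched as a loop. The commands are only printed here as a dry run for review; remove the `echo` to actually run them:

```shell
# Dry run: print an ssh-copy-id command for each worker node.
# ssh-copy-id appends the local public key to the remote
# ~/.ssh/authorized_keys and fixes its permissions.
for node in 192.168.137.101 192.168.137.102; do
  echo ssh-copy-id "root@$node"
done
```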
5. Test:

[root@master ~]# ssh root@192.168.137.101
Last login: Thu Apr 19 11:41:23 2018 from 192.168.137.100
[root@node1 ~]#

[root@master ~]# ssh root@192.168.137.102
Last login: Mon Apr 23 10:40:38 2018 from 192.168.137.1
[root@node2 ~]#
V. Install the JDK on all three machines

1. JDK download link: https://pan.baidu.com/s/1-fhy_zbGbEXR1SBK8V7aNQ
2. Create the directory /home/java:
mkdir -p /home/java
3. Put the downloaded jdk-7u79-linux-x64.tar.gz under /home/java and run:
tar -zxf jdk-7u79-linux-x64.tar.gz
rm -f jdk-7u79-linux-x64.tar.gz
4. Configure the environment variables:

Edit /etc/profile (vi /etc/profile) and add the following:
export JAVA_HOME=/home/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
Then run: source /etc/profile

Test:
[root@master jdk1.7.0_79]# java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
[root@master jdk1.7.0_79]#
VI. Install Hadoop 2.7 (unpack on the master server only, then copy to the slave servers)

1. Create the /home/hadoop directory:
mkdir -p /home/hadoop
2. Put hadoop-2.7.0.tar.gz under /home/hadoop and unpack it:
tar -zxf hadoop-2.7.0.tar.gz
3. Under /home/hadoop, create the data directories tmp, hdfs/data, and hdfs/name:
[root@master hadoop]# mkdir tmp
[root@master hadoop]# mkdir -p hdfs/data
[root@master hadoop]# mkdir -p hdfs/name
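The three directories can also be created in a single command; in this sketch, BASE defaults to the /home/hadoop directory used throughout this guide but can be pointed elsewhere for a trial run:

```shell
# Create the tmp, hdfs/data and hdfs/name directories in one call.
# BASE is an illustrative variable, defaulting to the guide's layout.
BASE="${BASE:-/home/hadoop}"
mkdir -p "$BASE/tmp" "$BASE/hdfs/data" "$BASE/hdfs/name"
ls "$BASE"
```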
4. Configure core-site.xml in the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <!-- The default HDFS path. When several HDFS clusters run at the same time and the user gives no cluster name, this setting decides which one is used; the value matches the configuration in hdfs-site.xml. -->
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.137.100:9000</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.137.100:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
  </property>
  <property>
    <!-- Buffer size for stream files -->
    <name>io.file.buffer.size</name>
    <value>131702</value>
  </property>
</configuration>
?
5. Configure hdfs-site.xml in the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- HTTP address of the secondary namenode -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.137.100:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/hdfs/data</value>
  </property>
  <!-- Number of block replicas kept by the DataNodes -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!-- The nameservice is essentially an alias for this HDFS cluster -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
6. In /home/hadoop/hadoop-2.7.0/etc/hadoop, copy mapred-site.xml.template to mapred-site.xml and configure it:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.137.100:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>192.168.137.100:19888</value>
  </property>
</configuration>
7. Configure yarn-site.xml in the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<?xml version="1.0"?>
<configuration>
  <!-- Auxiliary service run on each NodeManager. Must be set to mapreduce_shuffle for MapReduce jobs to run. Default: "" -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- Address the ResourceManager exposes to clients, used to submit and kill applications. Default: ${yarn.resourcemanager.hostname}:8032 -->
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.137.100:8032</value>
  </property>
  <!-- Address the ResourceManager exposes to ApplicationMasters, used to request and release resources. Default: ${yarn.resourcemanager.hostname}:8030 -->
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.137.100:8030</value>
  </property>
  <!-- Address the ResourceManager exposes to NodeManagers, used for heartbeats and task assignment. Default: ${yarn.resourcemanager.hostname}:8031 -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.137.100:8031</value>
  </property>
  <!-- Address the ResourceManager exposes to administrators for management commands. Default: ${yarn.resourcemanager.hostname}:8033 -->
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.137.100:8033</value>
  </property>
  <!-- Web UI address of the ResourceManager, for viewing cluster information in a browser. Default: ${yarn.resourcemanager.hostname}:8088 -->
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.137.100:8088</value>
  </property>
  <!-- Total physical memory available to the NodeManager. It cannot be changed dynamically once set. The default is 8192 MB, and YARN will assume that much memory even if the machine has less, so be sure to configure it. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>768</value>
  </property>
</configuration>
8. Set JAVA_HOME in hadoop-env.sh and yarn-env.sh under /home/hadoop/hadoop-2.7.0/etc/hadoop; without this, Hadoop will not start:
export JAVA_HOME=/home/java/jdk1.7.0_79
9. Edit the slaves file under /home/hadoop/hadoop-2.7.0/etc/hadoop: remove the default localhost and add the two worker nodes:
192.168.137.101
192.168.137.102
10. Copy the configured Hadoop tree to the corresponding location on every node:
scp -r /home/hadoop 192.168.137.101:/home/
scp -r /home/hadoop 192.168.137.102:/home/
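An alternative sketch, assuming rsync is installed on all nodes: unlike scp, re-running it after a configuration change only transfers the files that changed. The commands are printed as a dry run here; remove the `echo` to execute them:

```shell
# Dry run: print an rsync command for each worker node.
# -a preserves permissions and timestamps across the copy.
for node in 192.168.137.101 192.168.137.102; do
  echo rsync -a /home/hadoop/ "root@$node:/home/hadoop/"
done
```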
11. Start Hadoop on the master server; the worker nodes will be started automatically. Go into /home/hadoop/hadoop-2.7.0:
1) Format the NameNode: bin/hdfs namenode -format
2) Start everything with sbin/start-all.sh, or separately with sbin/start-dfs.sh and sbin/start-yarn.sh:
[root@master hadoop-2.7.0]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/hadoop-2.7.0/logs/hadoop-root-namenode-master.out
192.168.137.101: starting datanode, logging to /home/hadoop/hadoop-2.7.0/logs/hadoop-root-datanode-node1.out
192.168.137.102: starting datanode, logging to /home/hadoop/hadoop-2.7.0/logs/hadoop-root-datanode-node2.out
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.7.0/logs/yarn-root-resourcemanager-master.out
192.168.137.101: starting nodemanager, logging to /home/hadoop/hadoop-2.7.0/logs/yarn-root-nodemanager-node1.out
192.168.137.102: starting nodemanager, logging to /home/hadoop/hadoop-2.7.0/logs/yarn-root-nodemanager-node2.out
3) To stop everything, run sbin/stop-all.sh
4) Run jps to see the running daemons:
[root@master hadoop-2.7.0]# jps
1765 ResourceManager
2025 Jps
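To confirm every node is healthy, jps can be run over the passwordless SSH set up in step IV; a sketch, with the commands printed first as a dry run (remove the `echo` to run them). With the layout configured above, master should show NameNode, SecondaryNameNode and ResourceManager, and each worker DataNode and NodeManager:

```shell
# Dry run: print a remote jps command for every node in the cluster.
for host in master node1 node2; do
  echo ssh "root@$host" jps
done
```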
12. View in a browser
1) The ResourceManager web UI (yarn.resourcemanager.webapp.address): http://192.168.137.100:8088
2) The NameNode web UI (port 50070 by default in Hadoop 2.x): http://192.168.137.100:50070