Hadoop前面安裝的集群是2.6版本,現在升級到2.7版本。
注意,這個集群上有運行Hbase,所以,升級前后,需要啟停Hbase。
更多安裝步驟,請參考:
Hadoop集群(一) Zookeeper搭建
Hadoop集群(二) HDFS搭建
Hadoop集群(三) Hbase搭建
升級步驟如下:
集群IP列表
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | Namenode: 192.168.143.46 192.168.143.103 Journalnode: 192.168.143.101 192.168.143.102 192.168.143.103 Datanode&Hbase?regionserver: 192.168.143.196 192.168.143.231 192.168.143.182 192.168.143.235 192.168.143.41 192.168.143.127 Hbase?master: 192.168.143.103 192.168.143.101 Zookeeper: 192.168.143.101 192.168.143.102 192.168.143.103 |
1. 首先確定hadoop運行的路徑,將新版本的軟件分發到每個節點的這個路徑下,并解壓。
1 2 3 4 5 6 7 8 | #?ll?/usr/ local /hadoop/ total?493244 drwxrwxr-x?9?root?root??????4096?Mar?21??2017?hadoop-release?->hadoop-2.6.0-EDH-0u1-SNAPSHOT-HA-SECURITY drwxr-xr-x?9?root?root??????4096?Oct?11?11:06?hadoop-2.7.1 -rw-r --r--?1?root?root?194690531?Oct??9?10:55?hadoop-2.7.1.tar.gz drwxrwxr-x?7?root?root??????4096?May?21??2016?hbase-1.1.3 -rw-r --r--?1?root?root?128975247?Apr?10??2017?hbase-1.1.3.tar.gz lrwxrwxrwx?1?root?root????????29?Apr?10??2017?hbase-release?->?/usr/ local /hadoop/hbase-1.1.3 |
由于是升級,配置文件完全不變,將原hadoop-2.6.0下的etc/hadoop路徑完全拷貝/替換到hadoop-2.7.1下。
至此,升級前的準備就已經完成了。
?
下面開始升級操作過程。全程都是在一個中轉機上執行的命令,通過shell腳本執行,省去頻繁ssh登陸的操作。
## 停止hbase,hbase用戶執行?
2. 停止Hbase master,hbase用戶執行
狀態檢查,確認master,先停standby master
1 | http://192.168.143.101:16010/master-status |
1 2 3 4 5 | master: ssh?-t?-q?192.168.143.103??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?master" ssh?-t?-q?192.168.143.103??sudo?su?-l?hbase?-c? "jps" ssh?-t?-q?192.168.143.101??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?master" ssh?-t?-q?192.168.143.101??sudo?su?-l?hbase?-c? "jps" |
3. 停止Hbase regionserver,hbase用戶執行
1 2 3 4 5 6 | ssh?-t?-q?192.168.143.196??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?regionserver" ssh?-t?-q?192.168.143.231??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?regionserver" ssh?-t?-q?192.168.143.182??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?regionserver" ssh?-t?-q?192.168.143.235??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?regionserver" ssh?-t?-q?192.168.143.41???sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?regionserver" ssh?-t?-q?192.168.143.127??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?stop\?regionserver" |
檢查運行狀態
1 2 3 4 5 6 | ssh?-t?-q?192.168.143.196??sudo?su?-l?hbase?-c? "jps" ?ssh?-t?-q?192.168.143.231??sudo?su?-l?hbase?-c? "jps" ssh?-t?-q?192.168.143.182??sudo?su?-l?hbase?-c? "jps" ssh?-t?-q?192.168.143.235??sudo?su?-l?hbase?-c? "jps" ssh?-t?-q?192.168.143.41???sudo?su?-l?hbase?-c? "jps" ssh?-t?-q?192.168.143.127??sudo?su?-l?hbase?-c? "jps" |
## 停止服務--HDFS
4. 先確認,active的namenode,網頁確認.后續要先啟動這個namenode
1 | https://192.168.143.46:50470/dfshealth.html#tab-overview |
5. 停止NameNode,hdfs用戶執行
NN: 先停standby namenode
1 2 3 4 5 | ssh?-t?-q?192.168.143.103??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?namenode" ssh?-t?-q?192.168.143.46???sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?namenode" 檢查狀態 ssh?-t?-q?192.168.143.103??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.46???sudo?su?-l?hdfs?-c? "jps" |
6. 停止DataNode,hdfs用戶執行
1 2 3 4 5 6 | ssh?-t?-q?192.168.143.196??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?datanode" ssh?-t?-q?192.168.143.231??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?datanode" ssh?-t?-q?192.168.143.182??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?datanode" ssh?-t?-q?192.168.143.235??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?datanode" ssh?-t?-q?192.168.143.41???sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?datanode" ssh?-t?-q?192.168.143.127??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?datanode" |
7. 停止ZKFC,hdfs用戶執行
1 2 | ssh?-t?-q?192.168.143.46???sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?zkfc" ssh?-t?-q?192.168.143.103??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?zkfc" |
8.停止JournalNode,hdfs用戶執行
1 2 3 4 | JN: ssh?-t?-q?192.168.143.101??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?journalnode" ssh?-t?-q?192.168.143.102??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?journalnode" ssh?-t?-q?192.168.143.103??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?stop\?journalnode" |
### 備份NameNode的數據,由于生產環境,原有的數據需要備份。以備升級失敗回滾。
9. 備份namenode1
1 2 | ssh?-t?-q?192.168.143.46? "cp?-r?/data1/dfs/name????/data1/dfs/name.bak.20171011-2;ls?-al?/data1/dfs/;du?-sm?/data1/dfs/*" ?ssh?-t?-q?192.168.143.46? "cp?-r?/data2/dfs/name????/data2/dfs/name.bak.20171011-2;ls?-al?/data1/dfs/;du?-sm?/data1/dfs/*" |
10. 備份namenode2
1 2 | ssh?-t?-q?192.168.143.103? "cp?-r?/data1/dfs/name /data1/dfs/name.bak.20171011-2;ls?-al?/data1/dfs/;du?-sm?/data1/dfs/*" |
11. 備份journal
1 2 3 | ssh?-t?-q?192.168.143.101? "cp?-r?/data1/journalnode???/data1/journalnode.bak.20171011;ls?-al?/data1/dfs/;du?-sm?/data1/*" ssh?-t?-q?192.168.143.102? "cp?-r?/data1/journalnode???/data1/journalnode.bak.20171011;ls?-al?/data1/dfs/;du?-sm?/data1/*" ssh?-t?-q?192.168.143.103? "cp?-r?/data1/journalnode???/data1/journalnode.bak.20171011;ls?-al?/data1/dfs/;du?-sm?/data1/*" |
journal路徑,可以查看hdfs-site.xml文件
1 2 | dfs.journalnode.edits.dir:?? /data1/journalnode |
### 升級相關
12. copy文件(已提前處理,參考第一步)
切換軟連接到2.7.1版本
1 | ssh?-t?-q?$h? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" |
13. 切換文件軟鏈接,root用戶執行
1 2 3 4 5 6 7 8 9 10 | ssh?-t?-q?192.168.143.46??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.103??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.101??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.102??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.196??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.231??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.182??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.235??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.41???? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" ssh?-t?-q?192.168.143.127??? "cd?/usr/local/hadoop;?rm?hadoop-release;?ln?-s?hadoop-2.7.1?hadoop-release" |
確認狀態
1 2 3 4 5 6 7 8 9 10 | ssh?-t?-q?192.168.143.46???? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.103??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.101??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.102??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.196??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.231??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.182??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.235??? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.41???? "cd?/usr/local/hadoop;?ls?-al" ssh?-t?-q?192.168.143.127??? "cd?/usr/local/hadoop;?ls?-al" |
### 啟動HDFS,hdfs用戶執行
14. 啟動JournalNode?
1 2 3 4 | JN: ssh?-t?-q?192.168.143.101??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?journalnode" ssh?-t?-q?192.168.143.102??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?journalnode" ssh?-t?-q?192.168.143.103??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?journalnode" |
1 2 3 | ssh?-t?-q?192.168.143.101??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.102??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.103??sudo?su?-l?hdfs?-c? "jps" |
15. 啟動第一個NameNode
1 2 3 | ssh?192.168.143.46 su?-?hdfs /usr/ local /hadoop/hadoop-release/sbin/hadoop-daemon.sh?start?namenode?-upgrade |
16. 確認狀態,在狀態完全OK之后,才可以啟動另一個namenode
1 | https://192.168.143.46:50470/dfshealth.html#tab-overview |
17. 啟動第一個ZKFC
1 2 3 | su?-?hdfs /usr/ local /hadoop/hadoop-release/sbin/hadoop-daemon.sh?start?zkfc 192.168.143.46 |
18. 啟動第二個NameNode
1 2 3 4 | ssh?192.168.143.103 su?-?hdfs /usr/ local /hadoop/hadoop-release/bin/hdfs?namenode?-bootstrapStandby /usr/ local /hadoop/hadoop-release/sbin/hadoop-daemon.sh?start?namenode |
19. 啟動第二個ZKFC
1 2 3 | ssh?192.168.143.103 su?-?hdfs /usr/ local /hadoop/hadoop-release/sbin/hadoop-daemon.sh?start?zkfc |
20. 啟動DataNode
1 2 3 4 5 6 | ssh?-t?-q?192.168.143.196??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?datanode" ssh?-t?-q?192.168.143.231??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?datanode" ssh?-t?-q?192.168.143.182??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?datanode" ssh?-t?-q?192.168.143.235??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?datanode" ssh?-t?-q?192.168.143.41???sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?datanode" ssh?-t?-q?192.168.143.127??sudo?su?-l?hdfs?-c? "/usr/local/hadoop/hadoop-release/sbin/hadoop-daemon.sh\?start\?datanode" |
確認狀態
1 2 3 4 5 6 | ssh?-t?-q?192.168.143.196??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.231??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.182??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.235??sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.41???sudo?su?-l?hdfs?-c? "jps" ssh?-t?-q?192.168.143.127??sudo?su?-l?hdfs?-c? "jps" |
21. 一切正常之后,啟動hbase, hbase用戶執行
啟動hbase master,最好先啟動原來的active master。
1 2 | ssh?-t?-q?192.168.143.101??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?master" ssh?-t?-q?192.168.143.103??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?master" |
啟動Hbase regionserver
1 2 3 4 5 6 | ssh?-t?-q?192.168.143.196??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?regionserver" ssh?-t?-q?192.168.143.231??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?regionserver" ssh?-t?-q?192.168.143.182??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?regionserver" ssh?-t?-q?192.168.143.235??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?regionserver" ssh?-t?-q?192.168.143.41???sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?regionserver" ssh?-t?-q?192.168.143.127??sudo?su?-l?hbase?-c? "/usr/local/hadoop/hbase-release/bin/hbase-daemon.sh\?start\?regionserver" |
22. Hbase region需要手動Balance開啟、關閉
需要登錄HBase Shell運行如下命令
開啟
balance_switch true
關閉
balance_switch false
?
23. 本次不執行,系統運行一周,確保系統運行穩定,再執行Final。
注意:這期間,磁盤空間可能會快速增長。在執行完final之后,會釋放一部分空間。
Finallize upgrade: hdfs dfsadmin -finalizeUpgrade ?
http://blog.51cto.com/hsbxxl/1976472
?