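Hadoop Cluster Deployment Tutorial - P6
Hadoop Cluster Deployment Tutorial (Continued)
Chapter 21: Monitoring and Alerting Integration
21.1 Setting Up the Prometheus Monitoring Stack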
- Exporter deployment:

    # Deploy the HDFS Exporter
    wget https://github.com/prometheus/hdfs_exporter/releases/download/v1.1.6/hdfs_exporter-1.1.6.linux-amd64.tar.gz
    tar -xzf hdfs_exporter-1.1.6.linux-amd64.tar.gz
    nohup ./hdfs_exporter --namenode.address=master:9870 &
- Key metrics to monitor:
  - HDFS storage capacity utilization
  - DataNode liveness
  - YARN resource allocation rate
21.2 Grafana Visualization Configuration
- Importing a dashboard template:

    # Import the official Hadoop template (ID: 12239)
    grafana-cli plugins install grafana-piechart-panel
- Example alert rule:

    # alert_rules.yml
    groups:
      - name: HDFS-Alerts
        rules:
          - alert: HDFSStorageCritical
            expr: hdfs_capacity_used_percent > 90
            for: 5m
            labels:
              severity: critical
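To sanity-check the threshold before loading the rule into Prometheus, the comparison in `expr` can be reproduced offline against a saved metrics dump. The following is a minimal sketch, not part of the exporter: the helper names `metric_value` and `exceeds` are hypothetical, and it assumes the dump is in the Prometheus text exposition format without timestamps.

```shell
#!/bin/sh
# Print the first sample value of the named metric from text-format input on stdin.
# Assumes no trailing timestamps; "# HELP" / "# TYPE" lines are skipped.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }
        {
            metric = $1
            brace = index(metric, "{")          # strip a {label="..."} suffix if present
            if (brace > 0) metric = substr(metric, 1, brace - 1)
            if (metric == name) { print $NF; exit }
        }'
}

# Return 0 (true) when VALUE is strictly greater than THRESHOLD.
exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}
```

For example, piping the text scraped from the exporter's metrics endpoint into `metric_value hdfs_capacity_used_percent` and passing the result to `exceeds "$value" 90` mirrors the condition in the rule above, without the `for: 5m` hold period.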
Chapter 22: Backup and Disaster Recovery
22.1 Metadata Backup
- NameNode metadata backup:

    # Create a checkpoint backup
    hdfs dfsadmin -fetchImage /backup/namenode/latest.fsimage
    # Roll the edits log periodically[^1]
    hdfs dfsadmin -rollEdits
- Automated backup script:

    #!/bin/bash
    BACKUP_DIR="/backup/$(date +%Y%m%d)"
    mkdir -p "$BACKUP_DIR"
    hdfs dfsadmin -fetchImage "$BACKUP_DIR"
    scp -r "$BACKUP_DIR" secondary_nn:/remote_backup/
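Before a backup is copied off the host, it can be worth confirming that the fetch actually produced an image. The sketch below is one way to do that; the function name `has_fsimage` is hypothetical, and it assumes the image lands in the target directory under the NameNode's usual `fsimage_<txid>` naming.

```shell
#!/bin/sh
# Return 0 when DIR contains at least one non-empty fsimage_* file.
has_fsimage() {
    for f in "$1"/fsimage_*; do
        case "$f" in
            *.md5) continue ;;               # ignore checksum side files
        esac
        # -s is true only for an existing file with size > 0
        if [ -f "$f" ] && [ -s "$f" ]; then
            return 0
        fi
    done
    return 1
}
```

In the backup script, a guard such as `has_fsimage "$BACKUP_DIR" || exit 1` placed before the `scp` line would stop an empty backup from being shipped to the remote host.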
22.2 Data Recovery Procedure
- Disaster recovery steps:

    graph TD
        A[Stop cluster services] --> B[Restore fsimage]
        B --> C[Apply edits log]
        C --> D[Start NameNode]
        D --> E[Verify data integrity]
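The order of the steps matters: a later step should not run if an earlier one failed. As a rough sketch of that sequencing only, the script below runs five placeholder functions in the order shown in the diagram and stops at the first failure. Every function body is a stand-in (an `echo`), and all function names are hypothetical; the real commands depend on how the cluster is installed.

```shell
#!/bin/sh
# Placeholders: replace each echo with the site-specific command for that step.
stop_cluster()    { echo "1. stop cluster services"; }
restore_fsimage() { echo "2. restore fsimage"; }
apply_edits()     { echo "3. apply edits log"; }
start_namenode()  { echo "4. start NameNode"; }
verify_data()     { echo "5. verify data integrity"; }

# Run the steps in the documented order and stop at the first failure.
run_recovery() {
    for step in stop_cluster restore_fsimage apply_edits start_namenode verify_data; do
        "$step" || { echo "step failed: $step" >&2; return 1; }
    done
}
```

Calling `run_recovery` as written only prints the five step names, which makes it usable as a dry run before the placeholders are filled in.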
Chapter 23: Performance Tuning in Practice
23.1 MapReduce Parameter Optimization
- Memory sizing formula:

    mapreduce.map.memory.mb = min(
        yarn.nodemanager.resource.memory-mb / containers-per-node,
        8GB  # empirical upper limit
    )
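The formula can be worked through with concrete numbers. The helper below is a small sketch (the name `map_memory_mb` is hypothetical) that takes the NodeManager memory in MB and the number of containers per node, uses integer division, and applies the 8 GB ceiling as 8192 MB.

```shell
#!/bin/sh
# map_memory_mb NODE_MEMORY_MB CONTAINERS_PER_NODE
# Prints min(NODE_MEMORY_MB / CONTAINERS_PER_NODE, 8192).
map_memory_mb() {
    cap_mb=8192                          # the 8 GB empirical ceiling from the formula
    per_container=$(( $1 / $2 ))         # integer division, result in MB
    if [ "$per_container" -gt "$cap_mb" ]; then
        echo "$cap_mb"
    else
        echo "$per_container"
    fi
}
```

With 65536 MB per node and 16 containers, `map_memory_mb 65536 16` prints 4096. With 131072 MB and 8 containers the quotient is 16384, so the ceiling applies and the function prints 8192.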
- Shuffle phase optimization:

    <!-- mapred-site.xml -->
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>512</value> <!-- increase sort memory -->
    </property>
    <property>
      <name>mapreduce.reduce.shuffle.parallelcopies</name>
      <value>20</value> <!-- increase the number of parallel copies -->
    </property>
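23.2 Hardware-Level Optimization Recommendations
- Disk configuration options:

| Disk type | Use case | RAID level |
|---|---|---|
| SSD | JournalNode | RAID1 |
| HDD | DataNode storage | JBOD |

- Network topology optimization:

    # Configure rack awareness
    /etc/hadoop/conf/topology.sh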
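The tutorial does not show the contents of `topology.sh`, so the following is only a sketch of what such a script might look like. Hadoop invokes the topology script with one or more host names or IP addresses as arguments and reads one rack path per argument from standard output. The subnet-to-rack table and the function names `rack_of` and `resolve_racks` are made up for illustration.

```shell
#!/bin/sh
# Map one host or IP to a rack path. The subnets below are purely illustrative.
rack_of() {
    case "$1" in
        10.0.1.*) echo "/rack1" ;;
        10.0.2.*) echo "/rack2" ;;
        *)        echo "/default-rack" ;;   # fallback for unknown addresses
    esac
}

# Print one rack path per argument, in the order the arguments were given.
resolve_racks() {
    for host in "$@"; do
        rack_of "$host"
    done
}

# Script entry point: prints nothing when called without arguments.
resolve_racks "$@"
```

Running the script as `topology.sh 10.0.2.7 192.168.0.1` prints `/rack2` and `/default-rack` on separate lines. The script is typically referenced from `core-site.xml` through the `net.topology.script.file.name` property.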
Chapter 24: Version Migration Guide
24.1 Rolling Upgrade Procedure
- Compatibility checklist:

    # Verify the HDFS version
    hdfs dfsadmin -report | grep 'Storage type'
    # Check native library compatibility
    hadoop checknative
- Phased upgrade steps:

    # Stage 1: upgrade the utility nodes
    sudo yum upgrade hadoop-client
    # Stage 2: upgrade the DataNodes
    pdsh -w datanode[1-10] "sudo yum upgrade hadoop-hdfs-datanode"
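The `pdsh` command above targets all ten DataNodes at once. For an upgrade that is rolling in practice, one option is to split the range into smaller batches and upgrade them one after another. The helper below is a sketch of that batching only; the name `batch_hosts` and the batch size are assumptions, not something the tutorial prescribes.

```shell
#!/bin/sh
# batch_hosts PREFIX FIRST LAST SIZE
# Prints one comma-separated batch of host names per line.
batch_hosts() {
    prefix=$1; i=$2; last=$3; size=$4
    batch=""; count=0
    while [ "$i" -le "$last" ]; do
        batch="${batch:+$batch,}$prefix$i"   # append with a comma unless batch is empty
        count=$((count + 1))
        if [ "$count" -eq "$size" ]; then
            echo "$batch"
            batch=""; count=0
        fi
        i=$((i + 1))
    done
    # Emit the final, shorter batch if the range did not divide evenly.
    if [ -n "$batch" ]; then echo "$batch"; fi
}
```

`batch_hosts datanode 1 10 4` prints three lines: `datanode1,datanode2,datanode3,datanode4`, then `datanode5` through `datanode8`, then `datanode9,datanode10`. Each line can be passed to `pdsh -w`, with a health check between batches.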
24.2 Rollback Mechanism
- Version rollback:

    # Stop the service
    systemctl stop hadoop-yarn-resourcemanager
    # Downgrade the package
    yum downgrade hadoop-3.3.1 -y
    # Restore the configuration
    cp /backup/core-site.xml /etc/hadoop/conf/
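Restoring `core-site.xml` from `/backup` only works if a copy was saved before the upgrade. A minimal sketch of that pre-upgrade step follows; the function name `backup_conf` is hypothetical, and it assumes the configuration files of interest are the `*.xml` files directly inside the configuration directory.

```shell
#!/bin/sh
# backup_conf CONF_DIR BACKUP_DIR
# Copies every *.xml file from CONF_DIR into BACKUP_DIR, creating it if needed.
backup_conf() {
    mkdir -p "$2" || return 1
    for f in "$1"/*.xml; do
        [ -f "$f" ] || continue          # skip the literal pattern when nothing matches
        cp -p "$f" "$2/" || return 1     # -p keeps timestamps and permissions
    done
}
```

Running `backup_conf /etc/hadoop/conf /backup` before the first upgrade stage would leave the files in the location that the rollback step reads from.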