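A Complete Guide to Building a Highly Available ETCD Cluster with CFSSL (Including TLS Certificate Management)
Abstract: This article walks through issuing TLS certificates with the CFSSL toolkit and deploying a production-grade, highly available ETCD cluster. It covers certificate lifecycle management, cluster configuration tuning, and security hardening, and applies to Kubernetes and other distributed systems.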
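1. Environment Planning and Architecture Design
1.1 Node Information
Node IP | Role | Hostname | Certificate SAN extension |
---|---|---|---|
192.167.14.228 | ETCD member | etcd-1 | IP: 192.167.14.228, 192.167.14.229, 192.167.14.246; DNS: etcd-cluster.local |
192.167.14.229 | ETCD member | etcd-2 | same as etcd-1 |
192.167.14.246 | ETCD member | etcd-3 | same as etcd-1 |
etcd elects its leader through Raft, so no node is a fixed master or backup. All three nodes share one certificate whose SAN list covers every node address.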
1.2 Port Planning
Port | Protocol | Purpose |
---|---|---|
2379 | HTTPS | Client communication |
2380 | HTTPS | Peer communication between nodes |
2. CFSSL Certificate Management Workflow
2.1 Installing the CFSSL Toolchain
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 \
     https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 \
     https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl*
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
2.2 Generating the Root Certificate Authority (CA)
mkdir -p ~/etcd_tls && cd ~/etcd_tls

# CA configuration file
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {"expiry": "876000h"},
    "profiles": {
      "kubernetes": {
        "expiry": "876000h",
        "usages": ["signing", "key encipherment", "server auth", "client auth"]
      }
    }
  }
}
EOF

# CA certificate signing request (CSR)
cat > ca-csr.json <<EOF
{
  "CN": "Kubernetes",
  "key": {"algo": "rsa", "size": 2048},
  "names": [{"C": "CN", "L": "Xi'an", "O": "k8s", "OU": "Cluster"}]
}
EOF

# Generate the CA certificate (writes ca.pem, ca-key.pem and ca.csr)
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
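The expiry fields in ca-config.json are plain hour counts, so 876000h works out to roughly 100 years. If a shorter lifetime is preferred, the value can be derived as in the sketch below. The helper name hours_for_years is hypothetical, and the calculation assumes 365-day years.

```shell
# Convert a certificate lifetime in years to the hour string used in
# the "expiry" field. Assumes 8760 hours per year (365 days, no leap days).
hours_for_years() {
    echo "$(( $1 * 8760 ))h"
}

hours_for_years 1     # 8760h
hours_for_years 100   # 876000h, the value used in ca-config.json
```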
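2.3 Issuing the ETCD Server Certificate
# "hosts" becomes the certificate SAN list. It must contain every address that
# clients and peers connect to, including 127.0.0.1, because etcd also listens
# on https://127.0.0.1:2379 (section 3.2).
cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "192.167.14.228",
    "192.167.14.229",
    "192.167.14.246",
    "etcd-cluster.local"
  ],
  "key": {"algo": "rsa", "size": 2048},
  "names": [{"C": "CN", "L": "Xi'an", "O": "k8s", "OU": "ETCD"}]
}
EOF

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
  -config=ca-config.json -profile=kubernetes \
  etcd-csr.json | cfssljson -bare etcd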
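Because a missing SAN entry is a common cause of TLS failures, it can help to generate the "hosts" array from one node list rather than editing it by hand. The following is a minimal sketch: json_hosts and the NODES variable are hypothetical names, and the addresses are the ones used in this article.

```shell
# Node IPs from section 1.1 (space-separated).
NODES="192.167.14.228 192.167.14.229 192.167.14.246"

# Print the arguments as a JSON array of strings.
json_hosts() {
    first=1
    printf '['
    for h in "$@"; do
        [ "$first" -eq 1 ] || printf ', '
        printf '"%s"' "$h"
        first=0
    done
    printf ']\n'
}

# $NODES is left unquoted on purpose so each IP becomes its own argument.
json_hosts $NODES etcd-cluster.local
```

Any additional name or address that clients use to reach the cluster can be appended as a further argument before the CSR is regenerated.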
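3. Deploying the ETCD Cluster
3.1 Installing the ETCD Binaries
ETCD_VER=v3.5.9
wget https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar -zxvf etcd-${ETCD_VER}-linux-amd64.tar.gz
mkdir -p /opt/etcd/{bin,cfg,ssl}
mv etcd-${ETCD_VER}-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/
# Copy the certificates from section 2 to the path the service unit expects
cp ~/etcd_tls/{ca,etcd,etcd-key}.pem /opt/etcd/ssl/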
3.2 Node Configuration Template (etcd-1 as the Example)
The file is loaded through systemd's EnvironmentFile, so every setting is written as an ETCD_* environment variable. On etcd-2 and etcd-3, change ETCD_NAME and the node's own IP addresses.
cat > /opt/etcd/cfg/etcd.conf <<EOF
# [Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://192.167.14.228:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.167.14.228:2379,https://127.0.0.1:2379"

# [Cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.167.14.228:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.167.14.228:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.167.14.228:2380,etcd-2=https://192.167.14.229:2380,etcd-3=https://192.167.14.246:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
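Only the member name and the node's own IP differ between the three configuration files, so they can be produced from one template. The sketch below is one way to do that. render_etcd_conf is a hypothetical helper, and it writes the settings as ETCD_* environment variables, the form etcd reads when the file is loaded through a systemd EnvironmentFile.

```shell
# Peer list shared by all members (values from section 1.1).
CLUSTER="etcd-1=https://192.167.14.228:2380,etcd-2=https://192.167.14.229:2380,etcd-3=https://192.167.14.246:2380"

# Print the environment file for one member.
# Usage: render_etcd_conf <member-name> <member-ip>
render_etcd_conf() {
    name=$1
    ip=$2
    cat <<EOF
ETCD_NAME="${name}"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://${ip}:2380"
ETCD_LISTEN_CLIENT_URLS="https://${ip}:2379,https://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://${ip}:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://${ip}:2379"
ETCD_INITIAL_CLUSTER="${CLUSTER}"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
}

# Example: preview the file for the second member.
render_etcd_conf etcd-2 192.167.14.229
```

On the target node the output would be redirected to /opt/etcd/cfg/etcd.conf.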
3.3 Systemd Service Configuration
# The quoted 'EOF' keeps the backslashes literal, so the unit file retains its line continuations
cat > /usr/lib/systemd/system/etcd.service <<'EOF'
[Unit]
Description=ETCD KeyValue Store
Documentation=https://etcd.io
After=network.target

[Service]
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
  --cert-file=/opt/etcd/ssl/etcd.pem \
  --key-file=/opt/etcd/ssl/etcd-key.pem \
  --peer-cert-file=/opt/etcd/ssl/etcd.pem \
  --peer-key-file=/opt/etcd/ssl/etcd-key.pem \
  --trusted-ca-file=/opt/etcd/ssl/ca.pem \
  --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
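The binaries, certificates, configuration, and unit file are needed on every member, not only on etcd-1. The sketch below prints the copy commands for the other two nodes rather than running them. It assumes root SSH access to each node, which this article does not state, and print_sync_commands is a hypothetical helper name.

```shell
# Remaining members from section 1.1.
OTHER_NODES="192.167.14.229 192.167.14.246"

# Print (do not execute) the scp commands for each remaining member.
# Assumption: root SSH login is available on the target nodes.
print_sync_commands() {
    for ip in $OTHER_NODES; do
        echo "scp -r /opt/etcd root@${ip}:/opt/"
        echo "scp /usr/lib/systemd/system/etcd.service root@${ip}:/usr/lib/systemd/system/"
    done
}

print_sync_commands
```

After copying, the member name and the node's own IP addresses in /opt/etcd/cfg/etcd.conf still have to be changed on each target node.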
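4. Cluster Initialization and Verification
4.1 Starting the Cluster
Run the following on all three nodes. The cluster becomes available once a majority of members (two of three) are running.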
systemctl daemon-reload
systemctl enable --now etcd
4.2 Cluster Health Check
ETCDCTL_API=3 /opt/etcd/bin/etcdctl \
  --cacert=/opt/etcd/ssl/ca.pem \
  --cert=/opt/etcd/ssl/etcd.pem \
  --key=/opt/etcd/ssl/etcd-key.pem \
  --endpoints="https://192.167.14.228:2379,https://192.167.14.229:2379,https://192.167.14.246:2379" \
  endpoint health --write-out=table
Expected output (timings will vary):
+-----------------------------+--------+-------------+-------+
|          ENDPOINT           | HEALTH |    TOOK     | ERROR |
+-----------------------------+--------+-------------+-------+
| https://192.167.14.228:2379 |   true | 14.567345ms |       |
| https://192.167.14.229:2379 |   true | 15.234512ms |       |
| https://192.167.14.246:2379 |   true | 16.789123ms |       |
+-----------------------------+--------+-------------+-------+
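The TLS flags and endpoint list are needed for every etcdctl call, so it may be convenient to define them once. The following is a sketch using the paths and addresses from this article. build_endpoints and etcdctl_tls are hypothetical names, not part of etcd.

```shell
# Node IPs from section 1.1 (space-separated).
NODES="192.167.14.228 192.167.14.229 192.167.14.246"

# Join the given IPs into a comma-separated list of client URLs.
build_endpoints() {
    out=""
    for ip in "$@"; do
        out="${out:+${out},}https://${ip}:2379"
    done
    echo "$out"
}

ENDPOINTS=$(build_endpoints $NODES)

# Wrapper that supplies the TLS flags and endpoints, then passes
# the remaining arguments through to etcdctl.
etcdctl_tls() {
    ETCDCTL_API=3 /opt/etcd/bin/etcdctl \
        --cacert=/opt/etcd/ssl/ca.pem \
        --cert=/opt/etcd/ssl/etcd.pem \
        --key=/opt/etcd/ssl/etcd-key.pem \
        --endpoints="$ENDPOINTS" "$@"
}

echo "$ENDPOINTS"
```

With the wrapper defined, `etcdctl_tls endpoint status --write-out=table` shows which member currently holds the leader role, and `etcdctl_tls member list --write-out=table` lists the registered members.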
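5. Production Optimization Recommendations
5.1 Security Hardening
# Require client certificate authentication (add to the etcd startup flags)
--client-cert-auth=true

# Check the certificate validity window and rotate certificates regularly (for example, yearly).
# The profile in section 2.2 issues certificates valid for 876000h (about 100 years);
# shorten that expiry if rotation should be enforced by the certificate itself.
openssl x509 -in /opt/etcd/ssl/etcd.pem -noout -dates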
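Reading the dates by eye does not scale to scheduled checks. A sketch of an automated variant follows, relying on `openssl x509 -checkend`, which exits non-zero when the certificate will have expired within the given number of seconds. The helper names days_to_seconds and cert_expires_within are hypothetical.

```shell
# Convert days to seconds for openssl's -checkend option.
days_to_seconds() {
    echo "$(( $1 * 86400 ))"
}

# Return success (0) if the certificate expires within <days>,
# or if it cannot be read at all.
# Usage: cert_expires_within <cert-file> <days>
cert_expires_within() {
    cert=$1
    days=$2
    if openssl x509 -in "$cert" -noout \
        -checkend "$(days_to_seconds "$days")" >/dev/null; then
        return 1   # still valid at the end of the window
    fi
    return 0       # expires inside the window, or unreadable
}
```

For example, `cert_expires_within /opt/etcd/ssl/etcd.pem 30 && echo "rotate soon"` could be run from a daily cron job.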
5.2 Performance Tuning
# Raise the backend storage quota
--quota-backend-bytes=8589934592   # 8 GiB

# Reduce log verbosity
--log-level=warn
--logger=zap
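The quota flag takes a raw byte count, which is easy to mistype. The value above can be reproduced with shell arithmetic, as in this sketch; gib_to_bytes is a hypothetical helper name.

```shell
# Convert GiB to bytes for --quota-backend-bytes.
gib_to_bytes() {
    echo "$(( $1 * 1024 * 1024 * 1024 ))"
}

gib_to_bytes 8   # 8589934592
```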
6. Firewall Policy (Required in Production)
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.167.14.0/24" port port="2379-2380" protocol="tcp" accept'
firewall-cmd --reload
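7. Troubleshooting Guide
Symptom | Diagnostic command | Resolution |
---|---|---|
Node cannot join the cluster | journalctl -u etcd -f | Check that the certificate SANs match the node IPs |
Client connection times out | telnet <IP> 2379 | Verify the firewall and SELinux policy |
Insufficient storage space | du -sh /var/lib/etcd/member/ | Clean up snapshots or expand storage |
Certificate expired | cfssl-certinfo -cert etcd.pem | Reissue the certificate and perform a rolling restart of the cluster |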
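On hosts where telnet is not installed, the connectivity check from the table can be approximated with bash's /dev/tcp pseudo-device. This is a sketch under that assumption: port_open and check_cluster_ports are hypothetical helpers, and they need bash and the coreutils timeout command.

```shell
# Return success if a TCP connection to <host>:<port> opens within 3 seconds.
# Usage: port_open <host> <port>
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report client-port reachability for every member listed in section 1.1.
check_cluster_ports() {
    for ip in 192.167.14.228 192.167.14.229 192.167.14.246; do
        if port_open "$ip" 2379; then
            echo "$ip:2379 reachable"
        else
            echo "$ip:2379 unreachable"
        fi
    done
}
```

Calling `check_cluster_ports` from a client machine shows at a glance which members are blocked. A port that is unreachable from a client but reachable from the node itself usually points to the firewall rather than to etcd.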
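Recommended tools:
- etcd-browser: web management UI
- etcd-backup-operator: automated backup tool
This article has covered building and maintaining an enterprise-grade ETCD cluster. Running disaster recovery drills at regular intervals helps confirm that the cluster stays highly available.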