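Introduction: From a Single Machine to a Distributed Container Architecture
Traditional web application deployments often suffer from inconsistent environments and slow, error-prone releases. I once maintained a game project that had to be deployed by hand, repeatedly, on five servers, and every release felt like walking a tightrope. This article walks through how to build a complete distributed Docker architecture on CentOS: a three-node CI/CD pipeline made up of GitLab, Jenkins, and a production environment, ending with the full deployment of a web game project.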
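Part 1: Architecture Design and Environment Planning
1.1 Distributed Node Planning
Three-node architecture:
GitLab node: 192.168.1.101 (4 cores, 8 GB RAM, 200 GB storage)
Jenkins node: 192.168.1.102 (4 cores, 8 GB RAM)
Production node: 192.168.1.103 (8 cores, 16 GB RAM, NVIDIA T4 GPU)
# Base environment setup on every node (CentOS 7.9)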
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce docker-ce-cli containerd.io
sudo systemctl enable --now docker
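1.2 Network Topology Design
Figure: three-node Docker Swarm network topology
Key configuration:
Use an overlay network for cross-host container communication
Give each service its own subnet
Use Nginx as the entry point for service discovery and load balancing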
# Initialize the Docker Swarm cluster (on the production node)
docker swarm init --advertise-addr 192.168.1.103
# Join the cluster from the other nodes
docker swarm join --token SWMTKN-1-xxx 192.168.1.103:2377
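The join token in the command above is a placeholder. As a minimal sketch, assuming it runs on the manager (the production node) and that the helper name `print_worker_join_command` is purely illustrative, the real token can be read with `docker swarm join-token -q worker` and turned into the command to run on the other two nodes:

```shell
#!/bin/sh
# Hypothetical helper: prints the join command for worker nodes.
# $1 = manager address (assumed to be the production node).
print_worker_join_command() {
    manager_addr="$1"
    # -q prints only the token, without the surrounding help text.
    token=$(docker swarm join-token -q worker) || return 1
    echo "docker swarm join --token ${token} ${manager_addr}:2377"
}

# Usage on the manager:
#   print_worker_join_command 192.168.1.103
# After the other nodes have joined, list the cluster members:
#   docker node ls
```

Printing the command rather than running it over SSH keeps the sketch free of assumptions about how the nodes reach each other.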
Part 2: Core Component Deployment
2.1 Containerized GitLab Deployment (192.168.1.101)
# Create the data volume directories
mkdir -p /gitlab/{config,logs,data}
# Start the GitLab container
docker run -d \
  --hostname gitlab.example.com \
  --publish 8443:443 --publish 8080:80 --publish 8022:22 \
  --name gitlab \
  --restart always \
  --volume /gitlab/config:/etc/gitlab \
  --volume /gitlab/logs:/var/log/gitlab \
  --volume /gitlab/data:/var/opt/gitlab \
  --shm-size 256m \
  gitlab/gitlab-ce:15.11.8-ce.0
Performance tuning:
Edit /gitlab/config/gitlab.rb:
# Puma is the web server in GitLab 14.0 and later
puma['worker_processes'] = 4
postgresql['shared_buffers'] = "256MB"
sidekiq['concurrency'] = 10
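Edits to gitlab.rb only take effect after Omnibus GitLab re-renders its configuration. A minimal sketch of applying them, assuming the container is named `gitlab` as in the `docker run` command above; the helper name `apply_gitlab_config` is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: apply edits made to /gitlab/config/gitlab.rb on the host.
# $1 = container name (defaults to "gitlab", as in the docker run command).
apply_gitlab_config() {
    container="${1:-gitlab}"
    # Re-render the service configuration from gitlab.rb and restart what changed.
    docker exec "$container" gitlab-ctl reconfigure || return 1
    # Show the state of the bundled services afterwards.
    docker exec "$container" gitlab-ctl status
}

# Usage on the GitLab node:
#   apply_gitlab_config
```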
2.2 Containerized Jenkins Deployment (192.168.1.102)
# Custom Jenkins Dockerfile
FROM jenkins/jenkins:2.414.3-lts-jdk11
USER root
# Install the Docker CLI and docker-compose so pipelines can build images
RUN apt-get update && \
    apt-get install -y docker.io python3-pip && \
    pip3 install docker-compose
COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN jenkins-plugin-cli -f /usr/share/jenkins/ref/plugins.txt
USER jenkins
# Build the image from the Dockerfile above
docker build -t my-jenkins-image .
# Start the Jenkins container
docker run -d \
  --name jenkins \
  -p 8081:8080 -p 50000:50000 \
  -v /jenkins_home:/var/jenkins_home \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --restart unless-stopped \
  my-jenkins-image
Key plugins:
Docker Pipeline
Blue Ocean
GitLab Plugin
SSH Pipeline Steps
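The Dockerfile copies a plugins.txt that is not shown. A minimal sketch of generating it, assuming the usual update-center IDs for the four plugins listed (`docker-workflow` for Docker Pipeline, `blueocean`, `gitlab-plugin`, and `ssh-steps` for SSH Pipeline Steps). Versions are left unpinned, which is a simplification, and the helper name is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: writes a plugins.txt for jenkins-plugin-cli.
# $1 = output path (defaults to ./plugins.txt next to the Dockerfile).
write_plugins_txt() {
    out="${1:-plugins.txt}"
    # One plugin ID per line; no version means "latest available".
    cat > "$out" <<'EOF'
docker-workflow
blueocean
gitlab-plugin
ssh-steps
EOF
}

# Usage, before building the Jenkins image:
#   write_plugins_txt plugins.txt
```

The Jenkinsfile in section 4.2 also calls `sshagent` and `slackSend`, which come from the SSH Agent (`ssh-agent`) and Slack Notification (`slack`) plugins, so those IDs would likely need to be added as well.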
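Part 3: Containerizing the Web Game Project
3.1 Game Architecture Analysis
The project uses a decoupled frontend/backend architecture:
Frontend: Unity WebGL build
Backend: Node.js game server
Database: MongoDB replica set (three members)
Real-time communication: WebSocket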
3.2 Multi-Service Docker Compose Orchestration
version: '3.8'
services:
  game-frontend:
    image: registry.example.com/game-webgl:${TAG}
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
    networks:
      - game-network
  game-server:
    image: registry.example.com/game-server:${TAG}
    environment:
      - NODE_ENV=production
      - MONGO_URI=mongodb://mongo1:27017,mongo2:27017,mongo3:27017/game?replicaSet=rs0
    deploy:
      replicas: 2
    networks:
      - game-network
    depends_on:
      - mongo1
      - mongo2
      - mongo3
  mongo1:
    image: mongo:5.0
    command: mongod --replSet rs0 --bind_ip_all
    volumes:
      - mongo1-data:/data/db
    networks:
      - game-network
  # mongo2 and mongo3 are configured similarly...
  nginx:
    image: nginx:1.23
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - game-frontend
      - game-server
    networks:
      - game-network
networks:
  game-network:
    driver: overlay
volumes:
  mongo1-data:
  mongo2-data:
  mongo3-data:
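The `--replSet rs0` flag only names the replica set; the set still has to be initiated once before `game-server` can connect through the `MONGO_URI` above. A minimal sketch, assuming the host names mongo1 to mongo3 resolve on the overlay network and that `mongosh` is available in the mongo:5.0 image. The helper name `build_rs_initiate_js` is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: builds the rs.initiate() call for the three members.
# Host names mongo1..mongo3 and port 27017 are taken from the MONGO_URI above.
build_rs_initiate_js() {
    members=""
    id=0
    for host in mongo1 mongo2 mongo3; do
        # Separate members with a comma after the first one.
        if [ -n "$members" ]; then
            members="${members},"
        fi
        members="${members}{_id:${id},host:\"${host}:27017\"}"
        id=$((id + 1))
    done
    echo "rs.initiate({_id:\"rs0\",members:[${members}]})"
}

# Usage, on the node where a mongo1 task is running
# (<mongo1-container> is a placeholder for its container ID):
#   docker exec <mongo1-container> mongosh --quiet --eval "$(build_rs_initiate_js)"
```

Note that `docker stack deploy` ignores `depends_on`, so the game server should retry its MongoDB connection rather than rely on start order.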
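3.3 Key Nginx Configuration
# nginx.conf
# The events and http blocks are required because this file replaces /etc/nginx/nginx.conf
events {}
http {
    include /etc/nginx/mime.types;
    upstream game_servers {
        server game-server:3000;
    }
    server {
        listen 80;
        server_name game.example.com;
        # Static WebGL build, falling back to index.html
        location / {
            root /usr/share/nginx/html;
            try_files $uri /index.html;
        }
        # API and WebSocket traffic, with the headers needed for the upgrade handshake
        location /api {
            proxy_pass http://game_servers;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }
}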
Part 4: CI/CD Pipeline Implementation
4.1 GitLab Runner Configuration
# Start a GitLab Runner on the Jenkins node and register it
docker run -d --name gitlab-runner \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner:v15.11.0
docker exec -it gitlab-runner gitlab-runner register
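The `register` command above prompts for its settings interactively. For repeatable setups, a non-interactive variant might look like the sketch below. It assumes the GitLab URL, runner name, and default image that appear elsewhere in this article, and a registration token copied from the GitLab UI; the helper name `register_runner` is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: registers the runner without interactive prompts.
# $1 = registration token copied from the GitLab project or admin settings.
register_runner() {
    token="$1"
    docker exec gitlab-runner gitlab-runner register \
        --non-interactive \
        --url "http://192.168.1.101:8080" \
        --registration-token "$token" \
        --executor docker \
        --docker-image "alpine:3.16" \
        --description "game-runner"
}

# Usage on the Jenkins node (the token value is a placeholder):
#   register_runner "<registration-token>"
```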
4.2 The Complete Jenkinsfile
pipeline {
    agent {
        docker {
            image 'node:18'
            // Mount the npm cache so dependencies are reused between builds
            args '-v $HOME/.npm:/root/.npm'
        }
    }
    environment {
        DOCKER_REGISTRY = 'registry.example.com'
        PROJECT = 'web-game'
        DEPLOY_NODE = '192.168.1.103'
        SSH_CREDS = credentials('prod-ssh-key')
    }
    stages {
        stage('Checkout') {
            steps {
                git branch: 'main',
                    url: 'http://192.168.1.101:8080/game/web-game.git',
                    credentialsId: 'gitlab-cred'
            }
        }
        stage('Build Frontend') {
            steps {
                dir('webgl-build') {
                    sh 'npm install'
                    sh 'npm run build'
                    sh 'docker build -t $DOCKER_REGISTRY/$PROJECT-webgl:$BUILD_NUMBER .'
                }
            }
        }
        stage('Build Server') {
            steps {
                dir('server') {
                    sh 'npm install --production'
                    sh 'docker build -t $DOCKER_REGISTRY/$PROJECT-server:$BUILD_NUMBER .'
                }
            }
        }
        stage('Push Images') {
            steps {
                withCredentials([usernamePassword(
                    credentialsId: 'docker-registry',
                    usernameVariable: 'DOCKER_USER',
                    passwordVariable: 'DOCKER_PASS'
                )]) {
                    sh 'echo $DOCKER_PASS | docker login -u $DOCKER_USER --password-stdin $DOCKER_REGISTRY'
                    sh 'docker push $DOCKER_REGISTRY/$PROJECT-webgl:$BUILD_NUMBER'
                    sh 'docker push $DOCKER_REGISTRY/$PROJECT-server:$BUILD_NUMBER'
                }
            }
        }
        stage('Deploy to Production') {
            steps {
                sshagent(['prod-ssh-key']) {
                    // The build number becomes the image tag used by the stack file
                    sh """
                        ssh -o StrictHostKeyChecking=no ubuntu@$DEPLOY_NODE \
                        "export TAG=$BUILD_NUMBER && \
                        docker stack deploy -c docker-compose.prod.yml game"
                    """
                }
            }
        }
    }
    post {
        failure {
            slackSend channel: '#game-alerts',
                      message: "Build failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}"
        }
        success {
            slackSend channel: '#game-deploy',
                      message: "New version is live: ${env.BUILD_NUMBER}"
        }
    }
}
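4.3 Key Optimizations
Build cache: reuse cached npm dependencies (the Jenkinsfile mounts $HOME/.npm into the build container) to speed up builds
Secure credentials: manage SSH keys with Jenkins Credentials
Rollback mechanism: keep the five most recent usable image versions
Notifications: integrate Slack for real-time build status notifications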
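The retention rule behind the rollback mechanism can be sketched as a small filter. This assumes image tags are the numeric Jenkins build numbers used in the Jenkinsfile. How tags are listed and deleted depends on the registry and is left out, and the helper name `tags_to_prune` is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: reads numeric build tags on stdin (one per line) and
# prints the tags that fall outside the newest N, i.e. candidates for removal.
# $1 = number of versions to keep (defaults to 5).
tags_to_prune() {
    keep="${1:-5}"
    # Sort newest first, then skip the first $keep lines.
    sort -rn | tail -n "+$((keep + 1))"
}

# Usage: print which of builds 101..108 may be removed
#   printf '%s\n' 101 102 103 104 105 106 107 108 | tags_to_prune 5
# Rolling back is then a redeploy with an older, retained tag:
#   export TAG=106 && docker stack deploy -c docker-compose.prod.yml game
```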
Part 5: Monitoring and Operations
5.1 Distributed Monitoring Stack
# docker-compose.monitor.yml
version: '3.8'
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    deploy:
      placement:
        constraints: [node.role == manager]
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - prometheus
  node-exporter:
    image: prom/node-exporter
    deploy:
      mode: global
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
volumes:
  grafana-data:
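The stack mounts a prometheus.yml that is not shown. A minimal sketch of generating one, assuming node-exporter listens on its default port 9100 and that the Swarm DNS name `tasks.node-exporter` resolves to every task of the global service on the stack's network. The helper name and the 15-second interval are illustrative choices:

```shell
#!/bin/sh
# Hypothetical helper: writes a minimal prometheus.yml for the stack above.
# $1 = output path (defaults to ./prometheus.yml, the file mounted by the stack).
write_prometheus_config() {
    out="${1:-prometheus.yml}"
    cat > "$out" <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: node-exporter
    # Assumption: tasks.<service> resolves to every task of the global
    # node-exporter service on the stack's overlay network.
    dns_sd_configs:
      - names: ['tasks.node-exporter']
        type: A
        port: 9100
EOF
}

# Usage, before deploying the monitoring stack:
#   write_prometheus_config prometheus.yml
```

DNS-based discovery is used here instead of a static target list so that nodes added to the Swarm later are picked up without editing the file.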
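Part 6: Lessons Learned and Further Thoughts
6.1 Solutions to Typical Problems
Problem 1: containers cannot communicate across hosts
Symptom: containers in the Swarm cluster cannot reach each other by service name
Solution: check the firewall rules and open the ports Swarm needs: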
sudo firewall-cmd --permanent --add-port=2377/tcp
sudo firewall-cmd --permanent --add-port=7946/tcp
sudo firewall-cmd --permanent --add-port=7946/udp
sudo firewall-cmd --permanent --add-port=4789/udp
sudo firewall-cmd --reload
Verify the overlay network status:
docker network inspect game-network
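Before adding rules, it can help to see which of the four Swarm ports are still closed on a node. A minimal sketch, assuming the output of `firewall-cmd --list-ports` (a space-separated list) is passed in as a single argument; the helper name `missing_swarm_ports` is illustrative:

```shell
#!/bin/sh
# Hypothetical helper: given the output of `firewall-cmd --list-ports`
# (space-separated entries such as "2377/tcp 7946/udp"), prints the Swarm
# ports from the list above that are not open yet.
missing_swarm_ports() {
    # Pad with spaces so each entry can be matched as a whole word.
    open_ports=" $1 "
    for required in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
        case "$open_ports" in
            *" $required "*) ;;        # already open
            *) echo "$required" ;;     # still missing
        esac
    done
}

# Usage on each node:
#   missing_swarm_ports "$(sudo firewall-cmd --list-ports)"
```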
Optimization: adjust the Runner configuration:
[[runners]]
  name = "game-runner"
  url = "http://192.168.1.101:8080"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "alpine:3.16"
    privileged = true
    disable_cache = false
    volumes = ["/cache", "/var/run/docker.sock:/var/run/docker.sock"]
    # shm_size is given in bytes; this is 512 MiB
    shm_size = 536870912
Increase the Runner's concurrency
6.2 Performance Optimization Results
Metric | Before | After |
---|---|---|
Build time | 23 min | 8 min |
Deployment time | 15 min | 45 s |
Image size | 1.8 GB | 420 MB |
Startup time | 30 s | 3 s |
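Conclusion: From Practice to Production
This CentOS-based distributed Docker architecture has been running stably for six months, serving a game that averages 500,000 page views per day. The key takeaways:
Infrastructure as code: all environment configuration is kept under version control
Immutable infrastructure: change the application by shipping a new image rather than modifying the running environment
Automate everything: the whole path from code commit to production deployment is automated
Future plans:
Migrate to Kubernetes for more advanced orchestration
Introduce a service mesh to manage communication between microservices
Implement autoscaling driven by Prometheus metrics
I hope this detailed, hands-on write-up serves as a useful reference on your own path to distributed containerization. Feel free to share the CI/CD challenges you have run into, and how you solved them, in the comments!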