#作者:朱雷
文章目錄
- 一、部署排坑
- 1.1. configmap配置文件
- 1.2. pv文件
- 1.3. sc文件
- 1.4. serviceAccount文件
- 1.5. headless-service文件
- 1.6. sts文件
- 二、RabbitMQ集群部署關鍵問題總結
一、部署排坑
1.1. configmap配置文件
編輯cm.yaml 文件
apiVersion: v1
kind: ConfigMap
metadata:name: rabbitmq-confignamespace: rabbitmq-clu-9
data:rabbitmq.conf: |# 基礎配置listeners.tcp.default = 5672# management.listener.port = 15672# management.listener.ssl = falsedisk_free_limit.absolute = 1GBcluster_formation.peer_discovery_backend = k8s #指定集群發現通過k8s插件# cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8scluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.address_type = hostname
# 后綴中rabbitmq-clu-9 為namespace 名稱保持和sts 文件中的namespace一致cluster_formation.k8s.hostname_suffix = .rabbitmq-headless.rabbitmq-clu-9.svc.cluster.localcluster_formation.discovery_retry_limit = 10
cluster_formation.discovery_retry_interval = 3000
# service_name與headless 中保持一致cluster_formation.k8s.service_name = rabbitmq-headlesscluster_formation.node_cleanup.interval = 30cluster_formation.node_cleanup.only_log_warning = falsecluster_formation.etcd.ssl_options.verify = verify_none# 內存配置vm_memory_high_watermark.relative = 0.6vm_memory_high_watermark_paging_ratio = 0.5# 日志配置log.console = truelog.console.level = debuglog.file = false# 臨時啟用調試日志log.connection.level = debuglog.channel.level = debuglog.queue.level = debug
1.2. pv文件
apiVersion: v1
kind: PersistentVolume
metadata:name: rabbitmq-cluster-pv-0
spec:capacity:storage: 1GiaccessModes:- ReadWriteOncestorageClassName: hostpath-storagehostPath:path: /tmp/rabbitmq/0type: DirectoryOrCreatenodeAffinity:required:nodeSelectorTerms:- matchExpressions:- key: kubernetes.io/hostnameoperator: Invalues:- 192.168.88.201 #修改為綁定的node的hostname
---
apiVersion: v1
kind: PersistentVolume
metadata:name: rabbitmq-cluster-pv-1
spec:capacity:storage: 1GiaccessModes:- ReadWriteOncestorageClassName: hostpath-storagehostPath:path: /tmp/rabbitmq/1type: DirectoryOrCreatenodeAffinity:required:nodeSelectorTerms:- matchExpressions:- key: kubernetes.io/hostnameoperator: Invalues:- 192.168.88.202 #修改為綁定的node的hostname
---
apiVersion: v1
kind: PersistentVolume
metadata:name: rabbitmq-cluster-pv-2
spec:capacity:storage: 1GiaccessModes:- ReadWriteOncestorageClassName: hostpath-storagehostPath:path: /tmp/rabbitmq/2type: DirectoryOrCreatenodeAffinity:required:nodeSelectorTerms:- matchExpressions:- key: kubernetes.io/hostnameoperator: Invalues:- 192.168.88.203 #修改為綁定的node的hostname
坑1:node選擇器綁定時使用的是集群node 的hostname,如node 的IP 和Hostname 不一致,填寫IP 會導致pod 為運行pending狀態。
1.3. sc文件
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:name: hostpath-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
1.4. serviceAccount文件
apiVersion: v1
kind: ServiceAccount
metadata:name: rabbitmqnamespace: rabbitmq-clu-9
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:name: rabbitmq-peer-discovery
rules:
- apiGroups: [""]resources: ["nodes", "pods", "endpoints"]verbs: ["get", "list", "watch"]
- apiGroups: ["discovery.k8s.io"]resources: ["endpointslices"] verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:name: rabbitmq-peer-discovery
subjects:
- kind: ServiceAccountname: rabbitmqnamespace: rabbitmq-clu-9
roleRef:kind: ClusterRolename: rabbitmq-peer-discoveryapiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:name: rabbitmq-configmapnamespace: rabbitmq-clu-9
rules:
- apiGroups: [""]resources: ["configmaps"]verbs: ["get", "update"]resourceNames: ["rabbitmq-config"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:name: rabbitmq-configmapnamespace: rabbitmq-clu-9
subjects:
- kind: ServiceAccountname: rabbitmq
roleRef:kind: Rolename: rabbitmq-configmapapiGroup: rbac.authorization.k8s.io
坑2:在啟動pod 的過程中如果集群角色rabbitmq-peer-discovery未授權nodes資源則pod一直報錯啟動失敗
1.5. headless-service文件
apiVersion: v1
kind: Service
metadata:name: rabbitmq-headlesslabels:app: rabbitmq
spec:clusterIP: Noneports:- name: amqpport: 5672targetPort: 5672- name: managementport: 15672 targetPort: 15672- name: epmdport: 4369targetPort: 4369- name: distport: 25672targetPort: 25672selector:app: rabbitmqpublishNotReadyAddresses: true
坑3:上面這幾個端口都需要暴漏出來,否則集群創建失敗
1.6. sts文件
apiVersion: apps/v1
kind: StatefulSet
metadata:name: rabbitmqnamespace: rabbitmq-clu-9labels:app: rabbitmq
spec:serviceName: rabbitmq-headlessreplicas: 3#podManagementPolicy: "Parallel"selector:matchLabels:app: rabbitmqtemplate:metadata:labels:app: rabbitmqspec:terminationGracePeriodSeconds: 10affinity:podAntiAffinity:preferredDuringSchedulingIgnoredDuringExecution:- weight: 100podAffinityTerm:labelSelector:matchExpressions:- key: appoperator: Invalues:- rabbitmqtopologyKey: kubernetes.io/hostnameserviceAccountName: rabbitmq containers:- name: rabbitmqimage: rabbitmq:3.8.27-managementimagePullPolicy: IfNotPresentports:- containerPort: 5672name: amqp- containerPort: 15672name: httpenv:- name: RABBITMQ_USE_LONGNAMEvalue: "true" - name: RABBITMQ_ERLANG_COOKIEvalue: "secret-cookie"- name: RABBITMQ_DEFAULT_USERvalue: "admin"- name: RABBITMQ_DEFAULT_PASSvalue: "admin123"volumeMounts:- name: configmountPath: /etc/rabbitmq/rabbitmq.confsubPath: rabbitmq.conf- name: datamountPath: /var/lib/rabbitmqreadOnly: false#readinessProbe:# exec:# command: ["rabbitmq-diagnostics", "status"]# initialDelaySeconds: 20# periodSeconds: 30livenessProbe:exec:command: ["rabbitmq-diagnostics", "ping"]initialDelaySeconds: 60periodSeconds: 30volumes:- name: configconfigMap:name: rabbitmq-configvolumeClaimTemplates:- metadata:name: datanamespace: rabbitmq-clu-9spec:accessModes: [ "ReadWriteOnce" ]resources:requests:storage: 1000MstorageClassName: hostpath-storage
坑4:RABBITMQ_USE_LONGNAME 環境變量需要指定為true,指定使用FQDN格式避免截全節點通信失敗,集群建立失敗。
坑5:如果podManagementPolicy 策略不為 “Parallel”, 則readiness 探針需要關閉,避免集群啟動失敗。
二、RabbitMQ集群部署關鍵問題總結
以上總結了RabbitMQ集群部署中的五大核心問題,建議在實施前逐項核查配置,可顯著提升部署成功率。
- 節點選擇器綁定:必須使用集群節點的Hostname而非IP,否則Pod會陷入Pending狀態。
- 角色授權:確保rabbitmq-peer-discovery角色已授權nodes資源,否則Pod啟動報錯。
- 端口暴露:必須開放4369(EPMD)、25672(Erlang分布式通信)等核心端口,否則集群初始化失敗。
- 長名稱配置:環境變量RABBITMQ_USE_LONGNAME需設為true,強制使用FQDN格式避免節點通信截斷。
- Pod管理策略:若未采用Parallel策略,需關閉readiness探針,防止集群啟動阻塞。