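Problem background:
keepalived-master and keepalived-slave both held the VIP at the same time, which caused asymmetric routing and a dual-master (split-brain) condition.
Log analysis:
- The master receives advertisements from the slave, and the master's priority (100) is higher than the slave's (80), so this side is behaving normally.
- The slave's log contains no record of a master switchover, shows repeated smtp errors, and gives no sign that it receives advertisements from the master. Unable to confirm the master's state, the slave assumes the master is down and brings up the VIP on itself.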
[root@test1 ~]# journalctl -fu keepalived
-- Logs begin at Wed 2024-10-30 18:47:13 CST. --
May 28 21:46:28 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:29 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:30 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:31 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:32 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:33 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:34 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
[root@test2 ~]# journalctl -fu keepalived
-- Logs begin at Wed 2024-10-30 18:43:15 CST. --
May 28 21:39:30 test2 Keepalived_healthcheckers[22487]: Removing service [192.168.2.193]:tcp:80 from VS [192.168.2.100]:tcp:80
May 28 21:39:30 test2 Keepalived_healthcheckers[22487]: Lost quorum 1-0=1 > 0 for VS [192.168.2.100]:tcp:80
May 28 21:39:30 test2 Keepalived_healthcheckers[22487]: smtp fd 10 returned write error
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: TCP connection to [192.168.2.192]:tcp:80 success.
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: Adding service [192.168.2.192]:tcp:80 to VS [192.168.2.100]:tcp:80
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: Gained quorum 1+0=1 <= 1 for VS [192.168.2.100]:tcp:80
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: smtp fd 10 returned write error
May 28 21:43:56 test2 Keepalived_healthcheckers[22487]: TCP connection to [192.168.2.193]:tcp:80 success.
May 28 21:43:56 test2 Keepalived_healthcheckers[22487]: Adding service [192.168.2.193]:tcp:80 to VS [192.168.2.100]:tcp:80
May 28 21:43:56 test2 Keepalived_healthcheckers[22487]: smtp fd 10 returned write error
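To confirm the dual-master state suggested by the logs, it can help to check on each node whether the VIP is actually bound. The following is a minimal sketch, not a definitive procedure: `has_vip` is a hypothetical helper, 192.168.2.100 is assumed to be the VIP (it is the VS address in the logs above), and `eth0` in the comment is an assumed interface name.

```shell
#!/bin/sh
# has_vip: hypothetical helper. Reads `ip -o -4 addr show` output on stdin
# and succeeds if the given address is bound on any interface.
has_vip() {
    vip="$1"
    # -F treats the dots literally; the trailing "/" anchors the end of the
    # address so that 192.168.2.100 does not match 192.168.2.1001.
    grep -qF "inet ${vip}/"
}

# Usage, run on both test1 and test2:
#   ip -o -4 addr show | has_vip 192.168.2.100 && echo "VIP is bound here"
#
# To see whether VRRP advertisements reach a node (assumed interface eth0):
#   tcpdump -i eth0 -nn 'ip proto 112'
```

If both nodes report the VIP as bound and the capture on test2 shows no advertisements from the master, that is consistent with the analysis above.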
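Causes and solutions summary:
- Check the firewall status. Given the logs above, the slave's firewall is the main suspect: disable it, or alternatively allow VRRP traffic (IP protocol 112) through it. (In my case the firewall was the cause.)
- After a master switchover, a network fault or latency can delay advertisements and produce dual masters. In that case, stop keepalived manually, delete the VIP from one side, then start the master and slave in descending order of configured priority.
- The vrrp_instance block in the master's or slave's keepalived configuration file may be misconfigured; pay particular attention to whether the network interface name (interface) is correct.
- Check whether the clocks on the master and slave are synchronized; if not, you may need chrony to bring them in line.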
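The first two items above can be sketched as a small script. This is an illustration under stated assumptions, not the exact commands used in the original incident: it assumes firewalld is the active firewall, and the `run`, `allow_vrrp`, `drop_stale_vip` and `start_keepalived` helpers, the `DRY_RUN` switch, the `/32` prefix length and the `eth0` interface name are all hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Allow inbound VRRP (IP protocol 112) rather than disabling the firewall.
# Assumes firewalld; adapt for iptables or nftables setups.
allow_vrrp() {
    run firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
    run firewall-cmd --reload
}

# Stop keepalived and remove a leftover VIP on the node that should be backup.
# $1 = VIP with prefix length, $2 = interface name (both must be adapted).
# keepalived normally removes the VIP when it stops, so the delete may find
# nothing to remove; that failure is ignored.
drop_stale_vip() {
    run systemctl stop keepalived
    run ip addr del "$1" dev "$2" || true
}

start_keepalived() {
    run systemctl start keepalived
}
```

One possible sequence: call `allow_vrrp` on both nodes, call `drop_stale_vip 192.168.2.100/32 eth0` on test2, then call `start_keepalived` on the master first and on the slave afterwards, matching the descending priority order described above.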