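What is Redis Sentinel
Reference documentation: https://redis.io/topics/sentinel
Simply put, Redis Sentinel provides high availability for Redis, mainly in the following respects:
1. Monitoring: Redis Sentinel continuously checks whether the master and slave servers are working as expected.
2. Notification: when something goes wrong, Sentinel can notify the system administrator or other programs through an API.
3. Automatic failover: if the master fails, Sentinel can start a failover in which one of the slaves is promoted to master and the remaining slaves are reconfigured to replicate from the new master.
4. Configuration provider: Sentinel acts as the source of authority for client service discovery. Clients connect to Sentinel and ask for the address of the current Redis master responsible for a given service; if a failover happens, Sentinel reports the new address.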
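The "configuration provider" role can be illustrated with a minimal sketch, assuming a Sentinel is listening on 127.0.0.1:26379 and monitoring a master named mymaster (the values used later in this post). It sends the `SENTINEL get-master-addr-by-name` command over a plain socket using the RESP wire format. The helper names `encode_command`, `parse_master_addr` and `ask_sentinel` are made up for illustration, and the single `recv` call is a simplification that assumes the short reply arrives in one read.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def parse_master_addr(reply: bytes):
    """Parse the reply to SENTINEL get-master-addr-by-name.

    A successful reply is an array of two bulk strings (ip, port).
    Anything else (null reply, error) is reported as None.
    """
    lines = reply.split(b"\r\n")
    if lines[0] != b"*2" or len(lines) < 5:
        return None
    # Layout: *2, $<len>, <ip>, $<len>, <port>, trailing empty element
    return lines[2].decode(), int(lines[4])


def ask_sentinel(host: str, port: int, master_name: str = "mymaster",
                 timeout: float = 1.0):
    """Ask one Sentinel for the current master address (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name",
                                    master_name))
        return parse_master_addr(sock.recv(4096))
```

With the environment described below running, `ask_sentinel("127.0.0.1", 26379)` should return the address of whichever node Sentinel currently considers the master. A real client would normally try each known Sentinel in turn and use a client library instead of hand-written RESP.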
Redis Sentinel test environment
The simulated environment consists of 1 master and 2 slaves:
========redis=================sentinel==========
master: 127.0.0.1 6379        127.0.0.1 26379
slave1: 127.0.0.1 6380        127.0.0.1 26380
slave2: 127.0.0.1 6381        127.0.0.1 26381
Environment setup
redis.conf configuration
6379
# cat redis-6379.conf | grep -Ev "^$|^#"
bind 127.0.0.1
port 6379
daemonize yes
pidfile /var/run/redis_6379.pid
logfile "/root/redis/redis-6379.log"
dbfilename dump-6379.rdb
dir /root/redis
...
#
6380
# cat redis-6380.conf | grep -Ev "^$|^#"
bind 127.0.0.1
port 6380
daemonize yes
pidfile /var/run/redis_6380.pid
logfile "/root/redis/redis-6380.log"
dbfilename dump-6380.rdb
dir /root/redis
...
#
6381
# cat redis-6381.conf | grep -Ev "^$|^#"
bind 127.0.0.1
port 6381
daemonize yes
pidfile /var/run/redis_6381.pid
logfile "/root/redis/redis-6381.log"
dbfilename dump-6381.rdb
dir /root/redis
...
#
sentinel.conf configuration
6379/6380/6381
# cat sentinel-*.conf | grep -Ev "^#|^$"
port 26379
daemonize yes
logfile "/root/redis/sentinel-6379.log"
dir "/tmp"
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
#
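Only the 26379 file is shown above. As a rough sketch, the three Sentinel files could be generated from one template, assuming the other two differ only in the port (26380, 26381, as in the environment table) and the log file name (sentinel-6380.log and sentinel-6381.log, the names that appear in the log analysis). `render_sentinel_conf` and `majority` are hypothetical helpers, not part of Redis.

```python
# Template copied from the sentinel-6379 output above; only the two
# placeholders are assumed to vary between the three files.
SENTINEL_TEMPLATE = """port {sentinel_port}
daemonize yes
logfile "/root/redis/sentinel-{redis_port}.log"
dir "/tmp"
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
"""


def render_sentinel_conf(redis_port: int) -> str:
    """Render the config for the Sentinel paired with the given Redis port."""
    return SENTINEL_TEMPLATE.format(
        sentinel_port=redis_port + 20000,  # 6379 -> 26379, as in the table
        redis_port=redis_port,
    )


def majority(sentinel_count: int) -> int:
    """Smallest number of Sentinels that forms a majority."""
    return sentinel_count // 2 + 1


if __name__ == "__main__":
    for port in (6379, 6380, 6381):
        print(f"--- sentinel-{port}.conf ---")
        print(render_sentinel_conf(port))
```

The trailing 2 in `sentinel monitor mymaster 127.0.0.1 6379 2` is the quorum: the number of Sentinels that must agree the master is unreachable before it is flagged as objectively down. With three Sentinels, `majority(3)` is also 2, so in this setup the quorum coincides with the majority a Sentinel needs in order to be authorized to carry out the failover.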
Start the redis servers and the sentinels
redis:
# redis-server /etc/redis_6379.conf
# redis-server /etc/redis_6380.conf
# redis-server /etc/redis_6381.conf
sentinel:
# redis-sentinel /etc/sentinel-6379.conf
# redis-sentinel /etc/sentinel-6380.conf
# redis-sentinel /etc/sentinel-6381.conf
Configure master-slave replication
# redis-cli -p 6380
127.0.0.1:6380> SLAVEOF 127.0.0.1 6379
OK
127.0.0.1:6380> exit
# redis-cli -p 6381
127.0.0.1:6381> SLAVEOF 127.0.0.1 6379
OK
127.0.0.1:6381> exit
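After issuing SLAVEOF, it is worth confirming that each slave really is attached to the master. One way, sketched here under assumptions, is to inspect the output of `redis-cli -p 6380 INFO replication` and check the `role`, `master_host` and `master_port` fields. The helper names and the `SAMPLE` text are illustrative only and are not output captured from this environment.

```python
def parse_info(text: str) -> dict:
    """Turn INFO output ("key:value" lines, "#" section headers) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def is_slave_of(info_text: str, host: str, port: int) -> bool:
    """True if the INFO replication text describes a slave of host:port."""
    fields = parse_info(info_text)
    return (
        fields.get("role") == "slave"
        and fields.get("master_host") == host
        and fields.get("master_port") == str(port)
    )


# Illustrative sample in the shape of INFO replication output (not captured
# from the servers in this post).
SAMPLE = (
    "# Replication\r\n"
    "role:slave\r\n"
    "master_host:127.0.0.1\r\n"
    "master_port:6379\r\n"
    "master_link_status:up\r\n"
)

if __name__ == "__main__":
    print(is_slave_of(SAMPLE, "127.0.0.1", 6379))
```

The same check can be repeated after a failover to see which master each node is now following.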
Simulating a failover
First, kill the redis master process:
# for n in `ps aux | grep redis-server | grep 6379 | awk '{print $2}'`;do kill -9 $n ;done;
Analyzing the logs
The redis slaves are the first to notice that the redis master can no longer be reached, and report the following errors:
# tail -F redis-63*.log
==> redis-6380.log <==
2851:S 13 Nov 14:48:54.235 # Connection with master lost.
2851:S 13 Nov 14:48:54.235 * Caching the disconnected master state.
==> redis-6381.log <==
3695:S 13 Nov 14:48:54.466 * Connecting to MASTER 127.0.0.1:6379
3695:S 13 Nov 14:48:54.466 * MASTER <-> SLAVE sync started
3695:S 13 Nov 14:48:54.467 # Error condition on socket for SYNC: Connection refused
==> redis-6380.log <==
2851:S 13 Nov 14:48:54.781 * Connecting to MASTER 127.0.0.1:6379
2851:S 13 Nov 14:48:54.782 * MASTER <-> SLAVE sync started
2851:S 13 Nov 14:48:54.782 # Error condition on socket for SYNC: Connection refused
...
Next, redis sentinel carries out the failover. The logs show that once the 6379 master went down, sentinel promoted the slave 6380 to be the new master:
# tail -F sentinel-63*.log
==> sentinel-6379.log <==
3225:X 13 Nov 14:49:24.322 # +sdown master mymaster 127.0.0.1 6379
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:24.327 # +sdown master mymaster 127.0.0.1 6379
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:24.332 # +sdown master mymaster 127.0.0.1 6379
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:24.386 # +odown master mymaster 127.0.0.1 6379 #quorum 2/2
3235:X 13 Nov 14:49:24.386 # +new-epoch 1
3235:X 13 Nov 14:49:24.386 # +try-failover master mymaster 127.0.0.1 6379
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:24.388 # +odown master mymaster 127.0.0.1 6379 #quorum 3/2
3230:X 13 Nov 14:49:24.388 # +new-epoch 1
3230:X 13 Nov 14:49:24.388 # +try-failover master mymaster 127.0.0.1 6379
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:24.409 # +vote-for-leader 06f94705a99df53e468af594737913ce7c6287d5 1
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:24.416 # +vote-for-leader 858e250193e7f985bd7d63569a158f52a9cb9e0c 1
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:24.416 # 858e250193e7f985bd7d63569a158f52a9cb9e0c voted for 858e250193e7f985bd7d63569a158f52a9cb9e0c 1
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:24.417 # 06f94705a99df53e468af594737913ce7c6287d5 voted for 06f94705a99df53e468af594737913ce7c6287d5 1
==> sentinel-6379.log <==
3225:X 13 Nov 14:49:24.422 # +new-epoch 1
3225:X 13 Nov 14:49:24.432 # +vote-for-leader 06f94705a99df53e468af594737913ce7c6287d5 1
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:24.432 # d0e6638165ba8f8186562da586f4e0789dd4abd1 voted for 06f94705a99df53e468af594737913ce7c6287d5 1
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:24.432 # d0e6638165ba8f8186562da586f4e0789dd4abd1 voted for 06f94705a99df53e468af594737913ce7c6287d5 1
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:24.468 # +elected-leader master mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:24.468 # +failover-state-select-slave master mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:24.545 # +selected-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:24.545 * +failover-state-send-slaveof-noone slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:24.608 * +failover-state-wait-promotion slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:25.295 # +promoted-slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:25.295 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:25.345 * +slave-reconf-sent slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
==> sentinel-6379.log <==
3225:X 13 Nov 14:49:25.345 # +config-update-from sentinel 06f94705a99df53e468af594737913ce7c6287d5 127.0.0.1 26381 @ mymaster 127.0.0.1 6379
3225:X 13 Nov 14:49:25.345 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
3225:X 13 Nov 14:49:25.345 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
3225:X 13 Nov 14:49:25.345 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:25.346 # +config-update-from sentinel 06f94705a99df53e468af594737913ce7c6287d5 127.0.0.1 26381 @ mymaster 127.0.0.1 6379
3230:X 13 Nov 14:49:25.346 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
3230:X 13 Nov 14:49:25.346 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
3230:X 13 Nov 14:49:25.346 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:25.561 # -odown master mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:25.814 * +slave-reconf-inprog slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:26.893 * +slave-reconf-done slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:26.954 # +failover-end master mymaster 127.0.0.1 6379
3235:X 13 Nov 14:49:26.954 # +switch-master mymaster 127.0.0.1 6379 127.0.0.1 6380
3235:X 13 Nov 14:49:26.955 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6380
3235:X 13 Nov 14:49:26.955 * +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
==> sentinel-6379.log <==
3225:X 13 Nov 14:49:55.349 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
==> sentinel-6380.log <==
3230:X 13 Nov 14:49:55.397 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
==> sentinel-6381.log <==
3235:X 13 Nov 14:49:57.014 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
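Two things in these logs can be checked mechanically. The sketch below, which assumes the log line layout shown above and that all lines come from the same day, extracts the new master address from a `+switch-master` event and measures the time between two log lines. `parse_line`, `new_master` and `seconds_between` are illustrative helper names.

```python
import re
from datetime import datetime

# pid:role, day, month, time of day, level character, message
LINE_RE = re.compile(r"^\d+:[A-Z] \d+ \w+ (\d\d:\d\d:\d\d\.\d+) \S (.*)$")


def parse_line(line: str):
    """Return (time_of_day, message) for a Redis/Sentinel log line, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Only the time of day is parsed; the date part is ignored on purpose.
    stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
    return stamp, match.group(2)


def new_master(lines):
    """Address announced by the last +switch-master event, or None."""
    result = None
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[1].startswith("+switch-master "):
            # +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
            parts = parsed[1].split()
            result = (parts[4], int(parts[5]))
    return result


def seconds_between(first_line: str, second_line: str) -> float:
    """Elapsed seconds between two log lines written on the same day."""
    first, second = parse_line(first_line), parse_line(second_line)
    return (second[0] - first[0]).total_seconds()


if __name__ == "__main__":
    lost = "2851:S 13 Nov 14:48:54.235 # Connection with master lost."
    sdown = "3225:X 13 Nov 14:49:24.322 # +sdown master mymaster 127.0.0.1 6379"
    switch = ("3225:X 13 Nov 14:49:25.345 # +switch-master mymaster "
              "127.0.0.1 6379 127.0.0.1 6380")
    print(seconds_between(lost, sdown))
    print(new_master([sdown, switch]))
```

The gap between the slave losing its connection (14:48:54.235) and the first +sdown (14:49:24.322) is about 30 seconds, which is consistent with `sentinel down-after-milliseconds mymaster 30000` in the Sentinel configuration. Each Sentinel measures this from its own checks, so the match is approximate rather than exact.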
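Going back to the redis server logs, we can see that 6381, now a slave, has requested synchronization from the new master 6380 and completed it with a partial resynchronization: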
==> redis-6380.log <==
2851:M 13 Nov 14:49:25.823 * Slave 127.0.0.1:6381 asks for synchronization
2851:M 13 Nov 14:49:25.823 * Partial resynchronization request from 127.0.0.1:6381 accepted. Sending 422 bytes of backlog starting from offset 124407.
==> redis-6381.log <==
3695:S 13 Nov 14:49:25.823 * Successful partial resynchronization with master.
3695:S 13 Nov 14:49:25.823 # Master replication ID changed to 0288d040464ebccbb56dc56d54455434a406bcb2
3695:S 13 Nov 14:49:25.823 * MASTER <-> SLAVE sync: Master accepted a Partial Resynchronization.
When we start the 6379 server again, sentinel turns it into a slave and makes it connect to the 6380 server. The logs look like this:
Start the 6379 server:
# redis-server /root/redis/redis-6379.conf
# tail -F sentinel-63*.log
...
==> sentinel-6379.log <==
3225:X 13 Nov 16:05:00.384 * +convert-to-slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 6380
...
# tail -F redis-63*.log
...
==> redis-6379.log <==
7493:S 13 Nov 16:05:00.566 * MASTER <-> SLAVE sync: receiving 194 bytes from master
7493:S 13 Nov 16:05:00.566 * MASTER <-> SLAVE sync: Flushing old data
7493:S 13 Nov 16:05:00.566 * MASTER <-> SLAVE sync: Loading DB in memory
7493:S 13 Nov 16:05:00.566 * MASTER <-> SLAVE sync: Finished with success
==> redis-6381.log <==
3695:S 13 Nov 16:05:36.467 * 1 changes in 900 seconds. Saving...
3695:S 13 Nov 16:05:36.468 * Background saving started by pid 7519
7519:C 13 Nov 16:05:36.486 * DB saved on disk
7519:C 13 Nov 16:05:36.487 * RDB: 8 MB of memory used by copy-on-write
3695:S 13 Nov 16:05:36.569 * Background saving terminated with success
...
To be continued...