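MHA High Availability

manager components
masterha_manager            # start MHA
masterha_check_ssh          # check MHA's SSH configuration
masterha_check_repl         # check MySQL replication status and configuration
masterha_master_monitor     # detect whether the master is down
masterha_check_status       # check the current MHA running state
masterha_master_switch      # control failover (automatic or manual)
masterha_conf_host          # add or remove configured server entries

node components
save_binary_logs            # save and copy the master's binary logs
apply_diff_relay_logs       # identify differential relay log events and apply them to the other slaves
purge_relay_logs            # purge relay logs (without blocking the SQL thread)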

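During a failover, MHA relies on the relay logs to recover the slaves, so automatic relay log purging must be set to OFF and the relay logs purged manually instead. By default, a slave deletes its relay logs automatically once the SQL thread has finished executing them. In an MHA environment, however, those relay logs may be needed to recover other slaves, so automatic deletion has to be disabled. Periodic purging must take replication delay into account: on an ext3 file system, deleting a large file takes a noticeable amount of time and can cause serious replication delay. To avoid that delay, a hard link to the relay log is created temporarily, because on Linux deleting a large file through a hard link is fast. (In MySQL, the same hard-link approach is commonly used when dropping large tables.)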
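The hard-link reasoning above can be tried out on a scratch file. The sketch below is only an illustration under assumed names (the `relay-bin.000001` file and the `link_count` helper are hypothetical, not part of MHA): while a second name still points at the same inode, removing the first name only drops a directory entry, and the data stays readable through the link.

```shell
#!/bin/bash
# Sketch: removing one name of a hard-linked file only drops a directory entry.
# File names are hypothetical stand-ins for a relay log.

# Print the hard-link count of a file (second field of `ls -l`).
link_count() {
    ls -l "$1" | awk '{print $2}'
}

demo_dir=$(mktemp -d)
relay_log="$demo_dir/relay-bin.000001"        # hypothetical relay log
hard_link="$demo_dir/relay-bin.000001.link"   # second name for the same inode

echo "relay log events" > "$relay_log"
ln "$relay_log" "$hard_link"
echo "links after ln: $(link_count "$relay_log")"

# The inode is still referenced by $hard_link, so this rm frees no data blocks.
rm "$relay_log"
echo "links after rm: $(link_count "$hard_link")"
cat "$hard_link"

# The expensive part (freeing the blocks) happens only when the last name goes away.
rm -r "$demo_dir"
```

The slow block release is thereby moved out of the path the SQL thread waits on, which is presumably why the purge is done through hard links.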

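The MHA node package includes the purge_relay_logs command-line tool. It creates hard links for the relay logs, executes SET GLOBAL relay_log_purge=1, waits a few seconds so that the SQL thread switches to a new relay log, and then executes SET GLOBAL relay_log_purge=0.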

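The purge_relay_logs script takes the following parameters:

--user                            MySQL user name
--password                        MySQL password
--port                            port number
--workdir                         where the hard links to the relay logs are created; the default is /var/tmp. Creating a hard link across different partitions fails, so a concrete location on the same partition as the relay logs must be given. After the script runs successfully, the hard-linked relay log files are deleted
--disable_relay_log_purge         by default, if relay_log_purge=1 the script cleans nothing and exits. With this parameter set, when relay_log_purge=1 the script does not exit; it sets relay_log_purge to 0 and carries on with the purge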

[root@192.168.0.60 ~]# cat purge_relay_log.sh 
#!/bin/bash
user=root
passwd=123456
port=3306
log_dir='/data/masterha/log'
work_dir='/data'
purge='/usr/local/bin/purge_relay_logs'

if [ ! -d $log_dir ]
then
   mkdir $log_dir -p
fi

$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1
[root@192.168.0.60 ~]#
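
Since hard links cannot cross partitions, it may be worth checking that the --workdir location and the relay log directory sit on the same filesystem before scheduling the script. The following is a minimal sketch of such a pre-check; the helper name `same_filesystem` and the relay log directory `/data/mysql3306` are assumptions for illustration, and comparing mount points from `df -P` is just one simple way to do it.

```shell
#!/bin/bash
# Return 0 when both paths are on the same mounted filesystem, 1 otherwise.
# Compares the "Mounted on" column of `df -P` (field 6 of the second line).
same_filesystem() {
    mount_a=$(df -P "$1" 2>/dev/null | awk 'NR==2 {print $6}')
    mount_b=$(df -P "$2" 2>/dev/null | awk 'NR==2 {print $6}')
    [ -n "$mount_a" ] && [ "$mount_a" = "$mount_b" ]
}

relay_log_dir=/data/mysql3306   # hypothetical relay log directory
work_dir=/data                  # --workdir value used by the script above

if same_filesystem "$relay_log_dir" "$work_dir"; then
    echo "OK: hard links from $work_dir to $relay_log_dir should work"
else
    echo "WARNING: $work_dir and $relay_log_dir are not on the same filesystem"
fi
```

A path that does not exist is treated as "not the same filesystem", so the check errs on the side of warning.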

#!/bin/sh
BACKUP_BIN=/usr/local/mysql/bin/mysqlbinlog
LOCAL_BACKUP_DIR=/data/backup/binlog_bk
BACKUP_LOG=/data/backup/bakbinlog.log
REMOTE_HOST=192.168.56.100
#REMOTE_PORT=3306
SERVER_ID=20003306
REMOTE_USER=wanbin
REMOTE_PASS=mysql
#time to wait before reconnecting after failure
SLEEP_SECONDS=10
##create local_backup_dir if necessary
##mkdir -p ${LOCAL_BACKUP_DIR}
cd ${LOCAL_BACKUP_DIR}
## Loop forever: after the connection drops, wait the configured time and reconnect
while :
do
  FIRST_BINLOG=$(mysql --host=${REMOTE_HOST} --user=${REMOTE_USER} --password=${REMOTE_PASS} -e 'show binary logs'|grep -v "Log_name"|awk '{print $1}'|head -n 1)
  if [ `ls -A "${LOCAL_BACKUP_DIR}" |wc -l` -eq 0 ];then
     LAST_FILE=${FIRST_BINLOG}   ## no backup files yet: start from the first binlog on the master
  else
     LAST_FILE=`ls -l ${LOCAL_BACKUP_DIR} |tail -n 1 |awk '{print $9}'`   ## resume from the binlog file with the highest sequence number
  fi
  ${BACKUP_BIN} -R --raw --host=${REMOTE_HOST} --user=${REMOTE_USER} --password=${REMOTE_PASS} ${LAST_FILE} --stop-never --stop-never-slave-server-id=${SERVER_ID}
  RC=$?   ## capture the exit code before another command overwrites it
  echo "`date +"%Y/%m/%d %H:%M:%S"` mysqlbinlog stopped, return code: ${RC}" | tee -a ${BACKUP_LOG}
  echo "Reconnecting and resuming the backup in ${SLEEP_SECONDS} seconds" | tee -a ${BACKUP_LOG}
  sleep ${SLEEP_SECONDS}
done
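
The resume logic of the loop above (start from the first binlog on the master when the backup directory is empty, otherwise continue from the newest local file) can be isolated into a small helper. This is a sketch only; the function name `pick_start_binlog` is made up here, and it assumes the backup directory contains nothing but binlog files whose names sort in sequence order.

```shell
#!/bin/bash
# Print the binlog file name the backup should (re)start from.
#   $1: local backup directory
#   $2: first binlog on the master, used when the directory is empty
pick_start_binlog() {
    newest=$(ls -A "$1" 2>/dev/null | LC_ALL=C sort | tail -n 1)
    if [ -z "$newest" ]; then
        echo "$2"
    else
        echo "$newest"
    fi
}

# Small demonstration in a scratch directory.
demo_dir=$(mktemp -d)
echo "empty directory -> $(pick_start_binlog "$demo_dir" mysql-bin.000001)"
touch "$demo_dir/mysql-bin.000007" "$demo_dir/mysql-bin.000008"
echo "with backups    -> $(pick_start_binlog "$demo_dir" mysql-bin.000001)"
rm -r "$demo_dir"
```

Sorting the plain file names avoids depending on the column layout of `ls -l`.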

#!/bin/bash
[ -e /etc/profile ] && source /etc/profile || exit 0

# Local binlog directory
local_binlog_dir=/data/3306/247binlog
[ ! -d "$local_binlog_dir" ] && mkdir -p "$local_binlog_dir"
cd "$local_binlog_dir"

# SSH port of the remote server
ssh_port=22
# IP of the remote server
remote_host=192.168.0.68
# Name of the newest local binlog file
local_logfile=`ls -al "$local_binlog_dir" | grep 'mysql-bin\.[0-9]\+' | tail -n 1 | awk '{print $NF}'`
# Binlog directory on the remote server
remote_binlog_dir=/data/mysql3306/
# First and last binlog file names on the remote server
first_remote_logfile=`ssh -p ${ssh_port} -o StrictHostKeyChecking=no ${remote_host} " cat \${remote_binlog_dir}/mysql-bin.index | head -n 1 | awk -F'/' '{print \\$NF}'"`
last_remote_logfile=`ssh -p ${ssh_port} -o StrictHostKeyChecking=no ${remote_host} " cat \${remote_binlog_dir}/mysql-bin.index | tail -n 1 | awk -F'/' '{print \\$NF}'"`
# Remote MySQL user
remote_user=root
# Remote MySQL password
remote_password=xx

function start() {
    running=`ps uax | grep 'mysqlbinlog -R --raw' | grep -v grep | grep raw | awk '{print $2}'`
    if [ "$running" != "" ]; then
        echo "mysqlbinlog server is running"
        exit
    fi

    if [ "$local_logfile" == "" ]; then
        # First start of the binlog server: begin with the first remote binlog
        mysqlbinlog -R --raw --host=$remote_host --user="$remote_user" --password="$remote_password" --stop-never $first_remote_logfile &
    else
        if ! ssh -p ${ssh_port} -o StrictHostKeyChecking=no ${remote_host} "ls -lh ${remote_binlog_dir}/${local_logfile}" &> /dev/null; then
            # The newest local file no longer exists on the remote server:
            # probe the following sequence numbers until one is found there.
            # 10# forces base 10 so that a suffix such as 000089 is not read as octal.
            local_logfile_num=$((10#${local_logfile##*.}))
            binlogs=`ssh -p ${ssh_port} -o StrictHostKeyChecking=no ${remote_host} "ls -lh ${remote_binlog_dir}/mysql-bin.* | grep -v index | awk -F'/' '{print \\$NF}' | wc -l"`
            for binlog in `seq 1 $binlogs`
            do
                local_logfile_num=`expr $local_logfile_num + 1`
                # Zero-pad to six digits, matching the mysql-bin.NNNNNN naming
                local_logfile=`printf 'mysql-bin.%06d' "$local_logfile_num"`
                if ssh -p ${ssh_port} -o StrictHostKeyChecking=no ${remote_host} "ls -lh ${remote_binlog_dir}/${local_logfile}" &> /dev/null; then
                    break
                fi
            done
        fi
        mysqlbinlog -R --raw --host=$remote_host --user="$remote_user" --password="$remote_password" --stop-never $local_logfile &
    fi
}

function stop() {
    ps uax | grep mysqlbinlog | grep raw | awk '{print $2}' | xargs kill
}

case $1 in
    start)
        start
        ;;
    stop)
        stop
        ;;
    *)
        # usage
        basename=`basename "$0"`
        echo "Usage: $basename  {start|stop}  [ MySQL BinlogServer options ]"
        exit 1
        ;;
esac
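
The script above moves from one binlog file to the next by incrementing the numeric suffix and re-padding it with zeros. A compact sketch of that step is shown here; the helper names (`binlog_seq`, `binlog_name`, `next_binlog`) are invented for illustration, and the `mysql-bin.` prefix with a six-digit suffix is taken from the file names used in the script.

```shell
#!/bin/bash
# Print the sequence number of a binlog file name without leading zeros,
# e.g. mysql-bin.000089 -> 89.
binlog_seq() {
    suffix=${1##*.}                                  # keep the text after the last dot
    seq_no=$(printf '%s\n' "$suffix" | sed 's/^0*//')  # strip zeros so 089 is not read as octal
    echo "${seq_no:-0}"
}

# Print the binlog file name for a sequence number, zero-padded to six digits.
binlog_name() {
    printf 'mysql-bin.%06d\n' "$1"
}

# Print the name of the binlog file that follows the given one.
next_binlog() {
    binlog_name $(( $(binlog_seq "$1") + 1 ))
}

next_binlog mysql-bin.000099
```

Because `%06d` only sets a minimum width, sequence numbers above 999999 are printed with all of their digits.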

Phase 1: Configuration Check Phase
init_config(): initialize the configuration
MHA::ServerManager::init_binlog_server: initialize the binlog server
check_settings()
    a. check_node_version(): check the MHA version
    b. connect_all_and_read_server_status(): verify that MySQL on every node can be connected to
    c. get_dead_servers(), get_alive_servers(), get_alive_slaves(): check the state of the nodes once more
    d. print_dead_servers(): check whether the dead server is the current master
    e. MHA::DBHelper::check_connection_fast_util: quickly verify whether the dead server is really down; with ping_type=insert there is no double check
    f. MHA::NodeUtil::drop_file_if($_failover_error_file|$_failover_complete_file): check the files left by the previous failover
    g. If the previous failover happened less than 8 hours ago, this failover is not performed unless an extra parameter is configured
    h. start_sql_threads_if(): check whether Slave_SQL_Running is Yes on every slave; if not, start the SQL thread
    is_gtid_auto_pos_enabled(): determine whether GTID mode is in use
Phase 2: Dead Master Shutdown Phase..
    force_shutdown($dead_master):
        a. stop_io_thread(): stop the IO thread on all slaves
        b. force_shutdown_internal($dead_master):
            b_1. master_ip_failover_script: if this script is configured, run its logic (for example, switch the VIP)
            b_2. shutdown_script: if this script is configured, run its logic (for example, power off the server)
Phase 3: Master Recovery Phase..
    Phase 3.1: Getting Latest Slaves Phase..
        * check_set_latest_slaves()
            a. read_slave_status(): collect SHOW SLAVE STATUS from all slaves
            b. identify_latest_slaves(): find which slaves are the most up to date
            c. identify_oldest_slaves(): find which slaves are the furthest behind
    Phase 3.2: Saving Dead Master's Binlog Phase..
        * save_master_binlog($dead_master);
            -> If the dead master is reachable over SSH:
                b_1_1. save_master_binlog_internal: use the node-side save_binary_logs script to copy the relevant binlog to the manager
                    diff_binary_log: generate the differential binlog
                b_1_2. file_copy: copy the differential binlog to the manager_workdir directory on the manager node
            -> If the dead master is not reachable over SSH:
                b_1_3. the differential log is lost
    Phase 3.3: Determining New Master Phase..
        b. If GTID auto_pos is not enabled, call find_latest_base_slave()
            b_1. find_latest_base_slave_internal: look for the latest slave that has all relay logs; if there is none, the failover fails
                b_1_1. find_slave_with_all_relay_logs:
                    b_1_1_1. apply_diff_relay_logs: check whether the latest slave has the relay logs that the other slaves are missing

        c. select_new_master: elect the new master
            c_1. MHA::ServerManager::select_new_master:
                #If preferred node is specified, one of active preferred nodes will be new master.
                #If the latest server behinds too much (i.e. stopping sql thread for online backups), we should not use it as a new master, but we should fetch relay log there
                #Even though preferred master is configured, it does not become a master if it's far behind

                get_candidate_masters(): get the candidate nodes from the configuration
                get_bad_candidate_masters(): a server that meets any of the following conditions cannot become a candidate master
                    # dead server
                    # no_master >= 1
                    # log_bin=0
                    # oldest_major_version=0
                    # check_slave_delay: check whether the slave lags badly (can be skipped by setting no_check_delay)
                        {Exec_Master_Log_Pos} + 100000000: acceptable as long as the binlog position is no more than 100000000 behind
                Election order: candidate_master first, then the latest slave, then a random pick
    Phase 3.3(3.4): New Master Diff Log Generation Phase..
        * recover_master_internal
            recover_relay_logs:
                check whether the new master is the latest slave; if not, generate the differential relay logs and send them to the new master
            recover_master_internal:
                send the binlog previously saved from the dead master to the new master
    Phase 3.4: Master Log Apply Phase..
        * apply_diff:
            a. wait_until_relay_log_applied: wait until the new master has applied all of its relay logs
            b. Check whether Exec_Master_Log_Pos == Read_Master_Log_Pos; if they differ, generate the differential log:
                save_binary_logs --command=save
            c. apply_diff_relay_logs --command=apply: recover the new master
                c_1. exec_diff: differential log between Exec_Master_Log_Pos and Read_Master_Log_Pos
                c_2. read_diff: differential relay log between the new master and the latest slave
                c_3. binlog_diff: differential binlog between the latest slave and the dead master
        * If master_ip_failover_script is configured, it is executed here (typically to move the VIP)

        * disable_read_only(): make the new master writable
Phase 4: Slaves Recovery Phase..
    recover_slaves_internal

    Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
        recover_all_slaves_relay_logs: generate the differential logs between each slave and the latest slave, and copy them to each slave's working directory
    Phase 4.2: Starting Parallel Slave Log Apply Phase..
        * recover_slave:
            recover each slave in the same way as apply_diff in Phase 3.4: Master Log Apply Phase above
        * change_master_and_start_slave:
            repoint the slave to the new master and run START SLAVE
Phase 5: New master cleanup phase..
    reset_slave_on_new_master
    run RESET SLAVE ALL on the new master
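
The check_slave_delay rule in Phase 3.3 can be written down as a small predicate. This is a rough sketch modeled on the outline, not MHA's actual code: the function name and argument order are invented, and treating a slave that is still on an older master binlog file as delayed is an assumption added here. Only the 100000000 position margin comes from the notes.

```shell
#!/bin/bash
# Margin taken from the outline: {Exec_Master_Log_Pos} + 100000000
MAX_POS_GAP=100000000

# Return 0 (true) when the target slave is considered too far behind.
#   $1: master binlog file the latest slave has read up to
#   $2: Read_Master_Log_Pos of the latest slave
#   $3: master binlog file the target slave is executing
#   $4: Exec_Master_Log_Pos of the target slave
slave_too_delayed() {
    # Assumption: being on an older binlog file counts as delayed.
    if expr "$1" \> "$3" > /dev/null; then
        return 0
    fi
    # Same file: delayed only when the gap exceeds the margin.
    [ "$2" -gt $(( $4 + MAX_POS_GAP )) ]
}

if slave_too_delayed mysql-bin.000005 200000000 mysql-bin.000005 50000000; then
    echo "target slave is too far behind to become the new master"
fi
```

A slave rejected by this predicate would land in the bad-candidate list unless no_check_delay is set for it.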

Phase 1: Configuration Check Phase
init_config(): initialize the configuration
MHA::ServerManager::init_binlog_server: initialize the binlog server
check_settings()
    a. check_node_version(): check the MHA version
    b. connect_all_and_read_server_status(): verify that MySQL on every node can be connected to
    c. get_dead_servers(), get_alive_servers(), get_alive_slaves(): check the state of the nodes once more
    d. print_dead_servers(): check whether the dead server is the current master
    e. MHA::DBHelper::check_connection_fast_util: quickly verify whether the dead server is really down; with ping_type=insert there is no double check
    f. MHA::NodeUtil::drop_file_if($_failover_error_file|$_failover_complete_file): check the files left by the previous failover
    g. If the previous failover happened less than 8 hours ago, this failover is not performed unless an extra parameter is configured
    h. start_sql_threads_if(): check whether Slave_SQL_Running is Yes on every slave; if not, start the SQL thread
is_gtid_auto_pos_enabled(): determine whether GTID mode is in use
Phase 2: Dead Master Shutdown Phase completed.
    force_shutdown($dead_master):
        a. stop_io_thread(): stop the IO thread on all slaves
        b. force_shutdown_internal($dead_master):
            b_1. master_ip_failover_script: if this script is configured, run its logic (for example, switch the VIP)
            b_2. shutdown_script: if this script is configured, run its logic (for example, power off the server)
Phase 3: Master Recovery Phase..
    Phase 3.1: Getting Latest Slaves Phase..
        * check_set_latest_slaves()
            a. read_slave_status(): collect SHOW SLAVE STATUS from all slaves
            b. identify_latest_slaves(): find which slaves are the most up to date
            c. identify_oldest_slaves(): find which slaves are the furthest behind
    Phase 3.2: Saving Dead Master's Binlog Phase.. (this step does not exist in GTID mode)
    Phase 3.3: Determining New Master Phase..
        get_most_advanced_latest_slave(): get the most up-to-date slave

        c. select_new_master: elect the new master
            c_1. MHA::ServerManager::select_new_master:
                #If preferred node is specified, one of active preferred nodes will be new master.
                #If the latest server behinds too much (i.e. stopping sql thread for online backups), we should not use it as a new master, but we should fetch relay log there
                #Even though preferred master is configured, it does not become a master if it's far behind

                get_candidate_masters(): get the candidate nodes from the configuration
                get_bad_candidate_masters(): a server that meets any of the following conditions cannot become a candidate master
                    # dead server
                    # no_master >= 1
                    # log_bin=0
                    # oldest_major_version=0
                    # check_slave_delay: check whether the slave lags badly (can be skipped by setting no_check_delay)
                        {Exec_Master_Log_Pos} + 100000000: acceptable as long as the binlog position is no more than 100000000 behind
                Election order: candidate_master first, then the latest slave, then a random pick
    Phase 3.3: New Master Recovery Phase..
        * recover_master_gtid_internal:
            wait_until_relay_log_applied: the candidate master waits until all of its relay logs have been applied
            If the candidate master is not the latest slave:
                $latest_slave->wait_until_relay_log_applied($log): the latest slave finishes applying all of its relay logs
                change_master_and_start_slave: make the candidate master replicate from the latest slave until it has caught up
                record the candidate master's current log coordinates for the later switch
            If the candidate master is the latest slave:
                record the candidate master's current log coordinates for the later switch
            save_from_binlog_server:
                if a binlog server is configured and reachable, copy its binlog to the manager and generate the differential log diff_binlog (save_binary_logs --command=save)
            apply_binlog_to_master:
                Applying differential binlog: apply the differential binlog to the new master

Phase 4: Slaves Recovery Phase..
    Phase 4.1: Starting Slaves in parallel..
        * recover_slaves_gtid_internal:
            change_master_and_start_slave: since the master has already been recovered, each slave recovers simply by running CHANGE MASTER with auto_pos=1
            gtid_wait: used to wait until replication has fully caught up
Phase 5: New master cleanup phase..
    reset_slave_on_new_master
    run RESET SLAVE ALL on the new master
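
Step g of Phase 1 (no new failover within 8 hours of the previous one) comes down to a timestamp comparison. The sketch below shows that comparison only; the function name and the use of epoch seconds as inputs are assumptions for illustration, and how the time of the previous failover is obtained (for example from the modification time of the failover-complete file) is left out.

```shell
#!/bin/bash
# Guard interval from the outline: 8 hours, in seconds.
FAILOVER_GUARD_SECONDS=$((8 * 3600))

# Return 0 when a new failover may proceed, 1 when the previous one is too recent.
#   $1: time of the previous failover (epoch seconds)
#   $2: current time (epoch seconds)
failover_allowed() {
    [ $(( $2 - $1 )) -ge "$FAILOVER_GUARD_SECONDS" ]
}

last_failover=1700000000                 # hypothetical timestamp of the previous failover
now=$((last_failover + 3 * 3600))        # three hours later

if failover_allowed "$last_failover" "$now"; then
    echo "failover may proceed"
else
    echo "previous failover is less than 8 hours old: skipping"
fi
```

In a real setup the second argument would typically be `$(date +%s)`.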