本方案說明
- PostgreSQL + repmgr:實現主從自動故障檢測與切換(Failover)。
- PgBouncer:作為連接池,屏蔽后端數據庫變動,提供透明連接。
- 動態配置更新:通過repmgr組件的
promote_command
階段觸發腳本自動更新 PgBouncer 的[databases]
配置,指向新主庫。 - 需要開發語言支持連接串寫多個IP,來連接多個pgbouncer
節點規劃
主機 | hostname | 角色 | 組件 |
---|---|---|---|
10.0.0.41 | repmgr01 | Leader | PostgreSQL 15.5、repmgr 5.5.0、pgbouncer-1.24.0 |
10.0.0.42 | repmgr02 | standby1 | PostgreSQL 15.5、repmgr 5.5.0、pgbouncer-1.24.0 |
10.0.0.43 | repmgr03 | standby2 | PostgreSQL 15.5、repmgr 5.5.0、pgbouncer-1.24.0 |
目錄文件說明
/data/pgsql/data | 后端postgresql的數據目錄 |
---|---|
/data/pgsql/log | 后端postgresql的日志目錄 |
/data/pgsql/data/postgres.conf | 后端postgresql的配置文件 |
/data/pgsql/data/pg_hba.conf | 后端postgresql的訪問控制文件 |
/data/repmgr | 高可用組件repmgr的家目錄 |
/data/repmgr/repmgr.conf | 高可用組件repmgr的配置文件 |
/data/repmgr/promte_standby_pgbouncer.sh | repmgr監測到后端主節點故障后觸發的腳本 |
/data/pgbouncer/pgbouncer.template | pgbouncer配置模板,被觸發腳本引用 |
/data/pgbouncer/pgbouncer.ini | pgbouncer配置文件 |
1.集群準備
準備一套1主2從的repmgr集群,部署過程參考我的另一篇文章PostgreSQL高可用架構Repmgr部署流程
1.1 安裝依賴
# 集群中所有節點root用戶執行安裝# 觸發腳本之后會將新配置文件同步到所有節點
yum install -y install rsync
# PgBouncer是基于Libevent開發的,所以需要先安裝Libevent的開發包
yum install -y install libevent-devel
1.2 安裝PgBouncer
# 集群中所有節點root用戶執行,安裝PgBouncer[root@repmgr01 ~]# cd /opt
[root@repmgr01 opt]# wget http://www.pgbouncer.org/downloads/files/1.24.0/pgbouncer-1.24.0.tar.gz
[root@repmgr01 opt]# tar xvf pgbouncer-1.24.0.tar.gz
[root@repmgr01 opt]# cd pgbouncer-1.24.0
[root@repmgr01 pgbouncer-1.24.0]# ./configure
[root@repmgr01 pgbouncer-1.24.0]# make && make install
# 默認PgBouncer是安裝到 /usr/local/bin 目錄下的
2.配置pgbouncer
2.1 后端數據庫創建業務用戶
# 后端主庫操作
[postgres@postgres-01 data]$ psql
psql (15.5)
Type "help" for help.postgres=#
create database erpdb;
CREATE USER erpuser WITH PASSWORD 'Erp@123';
ALTER USER erpuser WITH LOGIN;
GRANT CONNECT ON DATABASE erpdb TO erpuser;
\c erpdb
GRANT USAGE ON SCHEMA public TO erpuser;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO erpuser;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO erpuser;# 所有數據庫配置pg_hba.conf,追加以下內容
[postgres@repmgr01 ~]$ vim /data/pgsql/data/pg_hba.conf
host erpdb erpuser 127.0.0.1/32 scram-sha-256
host erpdb erpuser 10.0.0.0/24 scram-sha-256# 所有數據庫,重新加載配置
[postgres@postgres-01 data]$ psql
psql (15.5)
Type "help" for help.postgres=# select pg_reload_conf();
2.2 編輯pgbouncer配置文件
用于初次部署集群時,啟動pgbouncer服務
# 所有節點postgres用戶執行
vim /data/pgbouncer/pgbouncer.ini
[databases]
# "postgres=host=localhost..."中的"postgres"表示外部用戶連接PgBouncer時的數據庫名稱
# 這個數據庫名稱與后端的實際數據庫名稱可以不同
# PgBouncer → PostgreSQL:使用此處配置的 user/password
erpdb = host=10.0.0.41 port=5432 dbname=erpdb user=erpuser password=Erp@123
[pgbouncer]
admin_users = admin
logfile = /data/pgbouncer/pgbouncer.log
pidfile = /data/pgbouncer/pgbouncer.pid
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /data/pgbouncer/userlist.txt
pool_mode = session
# 每個(數據庫+用戶)組合的后端連接數
# 例如:用戶A訪問DB1和用戶B訪問DB1會有不同的連接池
default_pool_size = 20
# 最多允許用戶建多少個連接到PgBouncer,示例計算:
# 有3個應用用戶訪問2個數據庫 → 6個用戶數據庫組合,設 default_pool_size=20
# max_client_conn ≈ 1.2 × (20 × 6) = 144 (可設為150)
max_client_conn = 150
server_idle_timeout = 600
2.3 編輯pgbouncer配置模板
此模板是為了在故障轉移時生成pgbouncer配置文件的[pgbouncer]部分
# 所有節點postgres用戶執行
[postgres@repmgr01 ~]$ mkdir /data/pgbouncer
[postgres@repmgr01 ~]$ vim /data/pgbouncer/pgbouncer.template
[pgbouncer]
admin_users = admin
logfile = /data/pgbouncer/pgbouncer.log
pidfile = /data/pgbouncer/pgbouncer.pid
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /data/pgbouncer/userlist.txt
pool_mode = session
# 最多允許用戶建多少個連接到PgBouncer,示例計算:
# 有3個應用用戶訪問2個數據庫 → 6個用戶數據庫組合,設 default_pool_size=20
# max_client_conn ≈ 1.2 × (20 × 6) = 144 (可設為150)
max_client_conn = 100
# 每個(數據庫+用戶)組合的后端連接數
# 例如:用戶A訪問DB1和用戶B訪問DB1會有不同的連接池
default_pool_size = 20
server_idle_timeout = 600
2.4 配置pgbouncer認證文件
# 所有節點編輯認證文件,admin用戶是pgbouncer的管理用戶
vim /data/pgbouncer/userlist.txt
"erpuser" "Erp@123"
"admin" "Admin@123"
2.5 啟動PgBouncer
#所有節點
# 啟動PgBouncer
# "-d" 表示 "daemon" ,也就是讓PgBouncer以后臺的方式運行
pgbouncer -d /data/pgbouncer/pgbouncer.ini# 驗證是否可以通過pgbouncer登錄后端pg數據庫
[postgres@repmgr01 ~]$ PGPASSWORD="Erp@123" psql -h 10.0.0.41 -p 6432 -U erpuser -d erpdb
psql (15.5)
Type "help" for help.erpdb=> SELECT inet_server_addr() AS backend_host,inet_server_port() AS backend_port,current_database(),current_user;
3.編輯repmgr觸發腳本
#所有節點
vim /data/repmgr/promte_standby_pgbouncer.sh
#!/usr/bin/env bash
set -u
set -o xtrace
#pgbouncer服務的主機列表
PGBOUNCER_HOSTS="10.0.0.41 10.0.0.42 10.0.0.43"
#pgbouncer服務的配置文件位置
PGBOUNCER_DATABASE_INI="/data/pgbouncer/pgbouncer.ini"
#pgbouncer連接的后端數據庫的別名
PGBOUNCER_DATABASE="erpdb"
#pgbouncer服務的管理庫
PGBOUNCER_DATABASE_ADMIN_DB="pgbouncer"
#pgbouncer服務的管理用戶
PGBOUNCER_DATABASE_USER="admin"
PGBOUNCER_DATABASE_PASSWORD="Admin@123"
#pgbouncer服務端口
PGBOUNCER_PORT=6432
#后端postgresql的端口
PORT=5432
#pgbouncer連接的后端數據庫名
DBNAME="erpdb"
PG_HOME=/usr/local/pgsql
HOSTNAME=`hostname -i`
REPMGR_DB="repmgr"
REPMGR_USER="repmgr"
REPMGR_PASSWD="repmgr"
REPMGR_CONF="/data/repmgr/repmgr.conf"
STEP1="Promote ${HOSTNAME} from standby to primary"
STEP2="Recreate the pgbouncer config file on node ${HOSTNAME}"
STEP3="Resync the pgbouncer config file"
STEP4="Reload the pgbouncer config file"
PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.ini"
PGBOUNCER_DATABASE_INI_TEMPLATE='/data/pgbouncer/pgbouncer.template'# STEP1. Promote this node from standby to primary
${PG_HOME}/bin/repmgr standby promote -f ${REPMGR_CONF} --log-to-file
if [ $? -ne 0 ]; thenecho promte_standby_pgbounce.sh: ${STEP1} on ${HOSTNAME} failed !!! exit 1
fi
#流復制的標志,f表示主庫,t表示從庫
standby_flg=`PGPASSWORD=${REPMGR_PASSWD} ${PG_HOME}/bin/psql -p ${PORT} -U ${REPMGR_USER} -h localhost -At -c "SELECT pg_is_in_recovery();"`
if [ ${standby_flg} == 'f' ]; thenecho promte_standby_pgbounce.sh: ${STEP1} on ${HOSTNAME} successfully !!!
elif [ ${standby_flg} == 't' ]; thenecho promte_standby_pgbounce.sh: ${STEP1} on ${HOSTNAME} failed !!!exit 1
fi
# STEP2. Reconfigure pgbouncer instances
for HOST in $PGBOUNCER_HOSTS
do# Recreate the pgbouncer config file# 生成pgbouncer配置文件中關于[databases]標簽下的內容echo -e "[databases]\n" > $PGBOUNCER_DATABASE_INI_NEW#生成pgbouncer配置文件中關于后端數據庫的conninfoPGPASSWORD=${REPMGR_PASSWD} ${PG_HOME}/bin/psql -p ${PORT} -U ${REPMGR_USER} -h localhost -At -c "SELECT '${PGBOUNCER_DATABASE} = '|| split_part(conninfo,' ',1) ||' port=${PORT}'||' dbname=${DBNAME} ' ||' application_name=pgbouncer_${HOST}' FROM repmgr.nodes WHERE active = TRUE AND type='primary'" >> $PGBOUNCER_DATABASE_INI_NEW# 生成pgbouncer配置文件中[pgbouncer]標簽下的內容cat $PGBOUNCER_DATABASE_INI_TEMPLATE >> $PGBOUNCER_DATABASE_INI_NEW echo promte_standby_pgbounce.sh: ${STEP2} on ${HOSTNAME} successfully !!!# STEP3. Resync the pgbouncer config filersync $PGBOUNCER_DATABASE_INI_NEW $HOST:$PGBOUNCER_DATABASE_INIif [ $? -ne 0 ]; thenecho promte_standby_pgbounce.sh: ${STEP3} on ${HOSTNAME} failed !!!elseecho promte_standby_pgbounce.sh: ${STEP3} on ${HOSTNAME} successfully !!! fi# STEP4. Reload the pgbouncer config file PGPASSWORD=${PGBOUNCER_DATABASE_PASSWORD} ${PG_HOME}/bin/psql -tc "reload" -h $HOST -p $PGBOUNCER_PORT -d ${PGBOUNCER_DATABASE_ADMIN_DB} -U ${PGBOUNCER_DATABASE_USER}if [ $? -ne 0 ]; thenecho promte_standby_pgbounce.sh: ${STEP4} on ${HOSTNAME} failed !!! elseecho promte_standby_pgbounce.sh: ${STEP4} on ${HOSTNAME} successfully !!! fi
done# Clean up generated file
rm -rf $PGBOUNCER_DATABASE_INI_NEW
echo "Reconfiguration of pgbouncer complete"
# 授予腳本可執行權限
chmod +x /data/repmgr/promte_standby_pgbouncer.sh
4.修改repmgr配置文件
# 修改promote_command參數,執行我們新建的腳本
vim /data/repmgr/repmgr.conf
promote_command='/data/repmgr/promte_standby_pgbouncer.sh >> /data/repmgr/repmgrd.log'
5.重新啟動repmgrd守護進程
kill $(pgrep -f repmgrd)
repmgrd -f /data/repmgr/repmgr.conf --daemonize
6.驗證failover
查看當前集群狀態
[postgres@repmgr01 ~]$ repmgr -f /data/repmgr/repmgr.conf cluster show
查看pgbouncer指向的后端數據庫
[postgres@repmgr01 ~]$ PGPASSWORD="Erp@123" psql -p 6432 -U erpuser -h 127.0.0.1 -d erpdb
psql (15.5)
Type "help" for help.erpdb=> SELECT inet_server_addr() AS backend_host,inet_server_port() AS backend_port,current_database(),current_user;
手動關閉主節點
[postgres@postgres-01 data]$ pg_ctl stop -D $PGDATA
waiting for server to shut down.... done
server stopped
查看關閉主節點后各節點日志
[postgres@repmgr01 ~]$ tail -f /data/repmgr/repmgrd.log
repmgr01節點失去連接
repmgr02節點,在該節點上執行了觸發腳本,成功被提升為主庫,并且node-3節點作為從節點
repmgr03節點,作為repmgr02節點的STANDBY成功連接
再次查看pgbouncer指向的后端數據庫
[postgres@postgres-01 data]$ PGPASSWORD="Erp@123" psql -p 6432 -U erpuser -h 127.0.0.1 -d erpdb
psql (15.5)
Type "help" for help.erpdb=> SELECT inet_server_addr() AS backend_host,inet_server_port() AS backend_port,current_database(),current_user;
所有節點的pgbouncer配置文件已被修改為指向新的主節點
[postgres@repmgr01 data]$ cat /data/pgbouncer/pgbouncer.ini | grep "erpdb"
erpdb = host=10.0.0.42 port=5432 dbname=erpdb application_name=pgbouncer_10.0.0.41[postgres@repmgr02 data]$ cat /data/pgbouncer/pgbouncer.ini | grep "erpdb"
erpdb = host=10.0.0.42 port=5432 dbname=erpdb application_name=pgbouncer_10.0.0.42[postgres@repmgr03 data]$ cat /data/pgbouncer/pgbouncer.ini | grep "erpdb"
erpdb = host=10.0.0.42 port=5432 dbname=erpdb application_name=pgbouncer_10.0.0.43