本文最初于2016年底發表在我的個人微信公眾號里面,現略有修訂后發布在這里。本文的技術信息針對的是mysql-5.7.x和mariadb-10.1.9。
MariaDB和MySQL兩者對分布式事務的支持有所不同,總的來說MySQL的支持更好,是完備的和可靠的(盡管后來也陸續發現了不少bug),而MariaDB的支持還有諸多問題,先舉例說明。
本文測試用例使用MySQL-5.7.16和MariaDB-10.1.9 進行測試。
MySQL/MariaDB對分布式事務的支持是根據 X/Open CAE document Distributed Transaction Processing: The XA Specification (http://www.opengroup.org/public/pubs/catalog/c193.htm) ,主要包括下面這幾個語句:
xa start 'gtid';
xa end 'gtid';
xa prepare 'gtid';
xa commit 'gtid';
xa rollback 'gtid';
xa recover;
外部的分布式事務管理器(transaction manager, TM) 可以使用這套XA命令來操作mysql 數據庫,按照XA標準的兩階段提交算法(2PC) 完成分布式事務提交。各個語句的意義見官方文檔。
其中mariadb的問題在于 xa prepare 'gtid' 之后,這個事務分支(transaction branch) 可能會丟失。詳見下文。
事先創建1個簡單的表并且插入4行數據:
create table t1(a int primary key);
insert into t1 values(1),(2),(3),(4);
一。兩者的公共行為(相同點)
本節的查詢是要說明,xa prepare之前(無論是xa end 之前還是之后),xa事務的改動對外部不可見,也不持久化,退出連接就會丟失修改。xa事務信息本身也會在退出連接后消失,重新連接后xa recover不會顯示它。
這些行為mysql與mariadb相同,都是正確的。
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> xa start '124';
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t1 values(5);
Query OK, 1 row affected (0.00 sec)
mysql> insert into t1 values(6);
Query OK, 1 row affected (0.00 sec)
mysql> quit;
Bye
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ ./bin/mysql -S ../../mysql-instances/a/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.16-debug-log Source distribution
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> xa recover;
Empty set (0.00 sec)
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> xa start '124';
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t1 values(5);
Query OK, 1 row affected (0.00 sec)
mysql> insert into t1 values(6);
Query OK, 1 row affected (0.00 sec)
mysql> xa end '124';
Query OK, 0 rows affected (0.00 sec)
mysql> quit;
Bye
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ ./bin/mysql -S ../../mysql-instances/a/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.16-debug-log Source distribution
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> xa recover;
Empty set (0.00 sec)
mysql> select*from t1;
ERROR 1046 (3D000): No database selected
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select*from t1;
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
+------+
4 rows in set (0.00 sec)
二。MySQL的XA PREPARE 的行為
1. 對MySQL來說,在一個事務中執行xa prepare之后退出連接,xa事務不丟失,它的更新也不會丟失。
mysql> create table t1(a int);
Query OK, 0 rows affected (0.05 sec)
mysql> xa start '123';
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t1 values(1),(2);
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> insert into t1 values(3),(4);
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> xa end '123';
Query OK, 0 rows affected (0.00 sec)
mysql> xa prepare '123';
Query OK, 0 rows affected (0.07 sec)
mysql> quit;
Bye
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ ./bin/mysql -S ../../mysql-instances/a/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.16-debug-log Source distribution
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
| 1 | 3 | 0 | 123 |
+----------+--------------+--------------+------+
1 row in set (0.00 sec)
mysql> xa commit '123';
Query OK, 0 rows affected (0.01 sec)
mysql> select*From test.t1;
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
+------+
4 rows in set (0.00 sec)
2. 對mysql來說,在一個事務中執行xa prepare后,kill掉 mysqld ,xa 事務的改動及其元數據不丟失,mysql binlog系統與innodb做了協調一致的事務恢復,非常完美。重啟mysqld后,客戶端連接上去,執行xa recover可以看到這個prepared事務,執行 xa commit 可以完成該事務的提交。
mysql> xa start '124';
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t1 values(5);
Query OK, 1 row affected (0.00 sec)
mysql> insert into t1 values(6);
Query OK, 1 row affected (0.00 sec)
mysql> xa end '124';
Query OK, 0 rows affected (0.00 sec)
mysql> xa prepare '124';
Query OK, 0 rows affected (0.00 sec)
mysql> quit;
Bye
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ ./bin/mysql -S ../../mysql-instances/a/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.16-debug-log Source distribution
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
| 1 | 3 | 0 | 124 |
+----------+--------------+--------------+------+
1 row in set (0.00 sec)
mysql> quit
Bye
[1]+ Killed ./bin/mysqld_safe --defaults-file=../../mysql-instances/a/my.cnf
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ ./bin/mysqld_safe --defaults-file=../../mysql-instances/a/my.cnf &
[1] 22109
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ 2016-12-02T06:53:49.336023Z mysqld_safe Logging to '/home/david/mysql-instances/a/datadir/david-VirtualBox.err'.
2016-12-02T06:53:49.350561Z mysqld_safe Starting mysqld daemon with databases from /home/david/mysql-instances/a/datadir
david@david-VirtualBox:~/mysql_installs/mysql-5.7.16$ ./bin/mysql -S ../../mysql-instances/a/mysql.sock
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.16-debug-log Source distribution
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
| 1 | 3 | 0 | 124 |
+----------+--------------+--------------+------+
1 row in set (0.00 sec)
mysql> select*from t1;
ERROR 1046 (3D000): No database selected
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select*from t1;
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
+------+
4 rows in set (0.00 sec)
mysql> xa commit '124';
Query OK, 0 rows affected (0.00 sec)
mysql> select*from t1;
+------+
| a |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
+------+
6 rows in set (0.00 sec)
mysql>
3. mysql之所以能做到上述行為,是因為mysql在xa prepare 做了binlog刷盤,并且允許其后的xa commit被其他事務的binlog隔開。
本例中,新啟動一個xa事務 T1,在xa prepare 之后,另啟動一個連接在其中執行2個普通本地事務,然后再做xa commit T1,可以看到這個xa commit 的binlog與xa prepare被那兩個新事務的binlog隔開了。對MySQL來說這不是問題。
三。MariaDB的XA支持
mariadb沒有把xa start到xa prepare這一塊操作作為一個單獨的binlog事務并且在xa prepare時刻做binlog刷盤,因此xa prepare之后一旦數據庫連接斷開或者mysqld重啟,那么binlog子系統中prepared狀態的事務的binlog就消失了,但是innodb當中這個prepared事務是存在的也就是說binlog與innodb數據不一致了。
1. xa prepare 后連接斷開,那么這個prepared事務就消失了,包括事務修改數據和元數據。binlog上面沒有它的binlog,xa recover也不能列出這個事務。這就錯的更加離譜了,可能是連接斷開后mariadb把innodb中prepared的事務也回滾了。
MariaDB [(none)]> xa start '124';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> insert into t1 values(5);
ERROR 1046 (3D000): No database selected
MariaDB [(none)]> use test;
Database changed
MariaDB [test]> insert into t1 values(5);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> insert into t1 values(6);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> xa end '124';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> xa prepare '124';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
| 1 | 3 | 0 | 124 |
+----------+--------------+--------------+------+
1 row in set (0.00 sec)
MariaDB [test]> quit;
Bye
[tdengine@TENCENT64 ~/davie/mysql_installs/tdsql-mariadb-10.1.9-bin-release3/install]$./jmysql.sh 6678
cmd: e
/data/6678/prod/mysql.sock
/data/home/tdengine/davie/mysql_installs/tdsql-mariadb-10.1.9-bin-release3
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.1.9-MariaDBV1.0R030D002-20161123-1511 Source distribution
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> xa recover;
Empty set (0.00 sec)
MariaDB [(none)]> use test;
Database changed
MariaDB [test]> select*from t1;
Empty set (0.00 sec)
MariaDB [test]>
2. xa prepare 后kill掉mysqld,那么這個prepared事務在binlog上面不存在,不過xa recover能列出這個事務,并且它的改動也存在。
在innodb當中,執行了XA PREPARE之后,這個事務已經prepare了,但是它的binlog沒有寫到binlog文件中。mysqld被kill掉重新拉起后,innodb做了事務恢復,所以可以看到它之前的改動;但是binlog上面卻沒有這個事務。不過做xa commit 卻又可以提交這個事務。之后,這個事務的改動就可見了。但是,innodb中的數據與binlog已經不一致了。備機上面沒有事務內容
MariaDB [test]> xa start '125';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> insert into t1 values(9);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> insert into t1 values(10);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> xa end '125';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> xa prepare '125';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> quit
Bye
[tdengine@TENCENT64 ~/davie/mysql_installs/tdsql-mariadb-10.1.9-bin-release3/install]$./jmysql.sh 6678
cmd: e
/data/6678/prod/mysql.sock
/data/home/tdengine/davie/mysql_installs/tdsql-mariadb-10.1.9-bin-release3
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 10.1.9-MariaDBV1.0R030D002-20161123-1511 Source distribution
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> xa recover;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
| 1 | 3 | 0 | 125 |
+----------+--------------+--------------+------+
1 row in set (0.00 sec)
MariaDB [(none)]> select*from t1;
ERROR 1046 (3D000): No database selected
MariaDB [(none)]> use test;
Database changed
MariaDB [test]> select*from t1;
+---+
| a |
+---+
| 5 |
| 6 |
| 7 |
| 8 |
+---+
4 rows in set (0.01 sec)
MariaDB [test]> xa commit '125';
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> select*from t1;
+----+
| a |
+----+
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+----+
6 rows in set (0.00 sec)
MariaDB [test]>
從最初的測例開始,一直沒有任何插入數據行的binlog(write_row)寫入。