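The transaction structure: struct trx_t
Preface
InnoDB is one of MySQL's storage engines. It supports transactions and non-locking consistent reads, and its physical unit of storage is the page. Every transaction produces undo (rollback) log and redo log records, and transactions also involve deadlock detection, many different kinds of locks, and more.
Reading the InnoDB source, I counted 63 member variables that InnoDB has to deal with when it starts a transaction. Their types vary widely: structs, custom typedefs, and so on.
While putting these notes together I read a number of blog posts by other engineers, including the official InnoDB blog. One takeaway is that everyone translates things differently, and even the official posts drop some details for reasons of length. So I have deliberately kept the original English source comments here and added my own reading where I had something to add.
MySQL has a great many source files, and InnoDB has quite a few of its own, but I think the transaction struct is worth understanding thoroughly. Only then can you follow how transactions are actually implemented and appreciate how much work the database system does to keep data consistent.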
File location
storage/innobase/include/trx0trx.h
Variable 1: magic_n
Type: ulint (InnoDB's unsigned long integer type)
Variable 2: mutex
Type: ib_mutex_t
Comment: Mutex protecting the fields state and lock (except some fields of lock, which are protected by lock_sys->mutex)
The ib_mutex_t struct (first fields only):
/** InnoDB mutex */
struct ib_mutex_t {
    os_event_t event;
        /*!< Used by sync0arr.cc for the wait queue */
    volatile lock_word_t lock_word;
        /*!< lock_word is the target of the atomic test-and-set
        instruction when atomic operations are enabled. */
It in turn contains another struct, os_event_t:
/** An asynchronous signal sent between threads */
struct os_event {
#ifdef __WIN__
    HANDLE handle;
        /*!< kernel event object, slow, used on older Windows */
#endif
    os_fast_mutex_t os_mutex;
        /*!< this mutex protects the next fields */
    ibool is_set;
        /*!< this is TRUE when the event is in the signaled state,
        i.e., a thread does not stop if it tries to wait for
        this event */
    ib_int64_t signal_count;
        /*!< this is incremented each time the event becomes
        signaled */
    os_cond_t cond_var;
        /*!< condition variable is used in waiting for the event */
    UT_LIST_NODE_T(os_event_t) os_event_list;
        /*!< list of all created events */
};
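To get a feel for how is_set, signal_count and cond_var work together, here is a minimal sketch of the same idea in standard C++. It is not InnoDB's code: the class name Event and its methods are my own, and std::mutex and std::condition_variable stand in for os_fast_mutex_t and os_cond_t.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for os_event; names and methods are illustrative only.
class Event {
public:
    // Move the event to the signaled state and wake every waiter.
    void set() {
        std::lock_guard<std::mutex> guard(mutex_);
        if (!is_set_) {
            is_set_ = true;
            ++signal_count_;  // counts transitions into the signaled state
            cond_var_.notify_all();
        }
    }

    // Move the event back to the non-signaled state.
    void reset() {
        std::lock_guard<std::mutex> guard(mutex_);
        is_set_ = false;
    }

    // Block until the event is signaled; returns at once if it already is.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_var_.wait(lock, [this] { return is_set_; });
        assert(is_set_);
    }

    bool is_set() {
        std::lock_guard<std::mutex> guard(mutex_);
        return is_set_;
    }

    std::int64_t signal_count() {
        std::lock_guard<std::mutex> guard(mutex_);
        return signal_count_;
    }

private:
    std::mutex mutex_;  // protects the fields below
    bool is_set_ = false;
    std::int64_t signal_count_ = 0;
    std::condition_variable cond_var_;
};
```

In this sketch a second set() on an already signaled event changes nothing, so signal_count only moves when the state really flips, and a thread calling wait() on a signaled event does not stop.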
Variable 3: state
Type: trx_state_t
Comment: the transaction state, one of TRX_STATE_NOT_STARTED, TRX_STATE_ACTIVE, TRX_STATE_PREPARED, TRX_STATE_COMMITTED_IN_MEMORY (alias below COMMITTED)
Source of the trx_state_t enum:
/** Transaction states (trx_t::state) */
enum trx_state_t {
    TRX_STATE_NOT_STARTED,
    TRX_STATE_ACTIVE,
    TRX_STATE_PREPARED, /* Support for 2PC/XA */
    TRX_STATE_COMMITTED_IN_MEMORY
};
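The four states form a small state machine. The sketch below encodes the transitions as I understand them for ordinary and XA transactions; the transition table is my own assumption (it leaves out transactions recovered after a crash), and is_allowed_transition and set_state are hypothetical helpers, not InnoDB functions.

```cpp
#include <cassert>

enum trx_state_t {
    TRX_STATE_NOT_STARTED,
    TRX_STATE_ACTIVE,
    TRX_STATE_PREPARED,  // support for 2PC/XA
    TRX_STATE_COMMITTED_IN_MEMORY
};

// Hypothetical helper: transition table assumed for illustration.
bool is_allowed_transition(trx_state_t from, trx_state_t to) {
    switch (from) {
    case TRX_STATE_NOT_STARTED:
        return to == TRX_STATE_ACTIVE;
    case TRX_STATE_ACTIVE:
        // Plain commit, XA prepare, or a non-locking auto-commit read
        // that simply ends.
        return to == TRX_STATE_COMMITTED_IN_MEMORY
            || to == TRX_STATE_PREPARED
            || to == TRX_STATE_NOT_STARTED;
    case TRX_STATE_PREPARED:
        return to == TRX_STATE_COMMITTED_IN_MEMORY;
    case TRX_STATE_COMMITTED_IN_MEMORY:
        return to == TRX_STATE_NOT_STARTED;
    }
    return false;
}

// Hypothetical helper: checks the transition, then applies it.
void set_state(trx_state_t& state, trx_state_t next) {
    assert(is_allowed_transition(state, next));
    state = next;
}
```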
Variable 4: lock
Type: trx_lock_t
Comment: Information about the transaction locks and state. Protected by trx->mutex or lock_sys->mutex or both
Definition of the trx_lock_t struct:
struct trx_lock_t {
    ulint n_active_thrs;
        /*!< number of active query threads */
    trx_que_t que_state;
        /*!< valid when trx->state == TRX_STATE_ACTIVE:
        TRX_QUE_RUNNING, TRX_QUE_LOCK_WAIT, ... */
    lock_t* wait_lock;
        /*!< if trx execution state is TRX_QUE_LOCK_WAIT, this points
        to the lock request, otherwise this is NULL; set to non-NULL
        when holding both trx->mutex and lock_sys->mutex; set to NULL
        when holding lock_sys->mutex; readers should hold
        lock_sys->mutex, except when they are holding trx->mutex and
        wait_lock==NULL */
    ib_uint64_t deadlock_mark;
        /*!< A mark field that is initialized to and checked against
        lock_mark_counter by lock_deadlock_recursive(). */
    ibool was_chosen_as_deadlock_victim;
        /*!< when the transaction decides to wait for a lock, it sets
        this to FALSE; if another transaction chooses this transaction
        as a victim in deadlock resolution, it sets this to TRUE.
        Protected by trx->mutex. */
    time_t wait_started;
        /*!< lock wait started at this time, protected only by
        lock_sys->mutex */
    que_thr_t* wait_thr;
        /*!< query thread belonging to this trx that is in
        QUE_THR_LOCK_WAIT state. For threads suspended in a lock wait,
        this is protected by lock_sys->mutex. Otherwise, this may only
        be modified by the thread that is serving the running
        transaction. */
    mem_heap_t* lock_heap;
        /*!< memory heap for trx_locks; protected by lock_sys->mutex */
    UT_LIST_BASE_NODE_T(lock_t) trx_locks;
        /*!< locks requested by the transaction; insertions are
        protected by trx->mutex and lock_sys->mutex; removals are
        protected by lock_sys->mutex */
    ib_vector_t* table_locks;
        /*!< All table locks requested by this transaction, including
        AUTOINC locks */
    ibool cancel;
        /*!< TRUE if the transaction is being rolled back either via
        deadlock detection or due to lock timeout. The caller has to
        acquire the trx_t::mutex in order to cancel the locks. In
        lock_trx_table_locks_remove() we check for this cancel of a
        transaction's locks and avoid reacquiring the trx mutex to
        prevent recursive deadlocks. Protected by both the lock sys
        mutex and the trx_t::mutex. */
};
Variable 5: is_recovered
Type: ulint
Comment: 0=normal transaction, 1=recovered, must be rolled back, protected by trx_sys->mutex when trx->in_rw_trx_list holds
Variable 6: op_info
Type: const char*
Variable 7: isolation_level
Type: ulint
Comment: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE
Variable 8: is_registered:1
Type: unsigned (the :1 makes it a one-bit bit-field)
Comment: MySQL has a transaction coordinator to coordinate two phase commit between multiple storage engines and the binary log. When an engine participates in a transaction, it's responsible for registering itself using the trans_register_ha() API. This flag is set to 1 after the transaction has been registered with the coordinator using the XA API, and is set to 0 after commit or rollback
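The flag's life cycle is simple enough to sketch. The struct and both helpers below are hypothetical; in particular, using the flag to avoid registering twice is my own illustration of why such a flag is handy, and the real call to trans_register_ha() is only indicated by a comment.

```cpp
#include <cassert>

struct TrxFlagsSketch {
    unsigned is_registered : 1;  // one-bit flag, declared as in trx_t
};

// Hypothetical: called when the engine joins a transaction. Returns true if
// this call performed the registration, false if it was already registered.
bool register_once(TrxFlagsSketch& trx) {
    if (trx.is_registered) {
        return false;
    }
    // A real engine would call trans_register_ha() at this point.
    trx.is_registered = 1;
    return true;
}

// Hypothetical: called after commit or rollback.
void deregister(TrxFlagsSketch& trx) {
    assert(trx.is_registered);
    trx.is_registered = 0;
}
```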
Variable 9: check_unique_secondary
Type: ulint
Comment: normally TRUE, but if the user wants to speed up inserts by suppressing unique-key checks on secondary indexes, we set this to FALSE when deciding whether the insert buffer can be used for them
Variable 10: support_xa
Type: ulint
Comment: normally we do the XA two-phase commit steps, but by setting this to FALSE, one can save CPU time and about 150 bytes in the undo log size as then we skip XA steps
Variable 11: flush_log_later
Type: ulint
Comment: In 2PC, we hold the prepare_commit mutex across both phases. In that case, we defer flush of the logs to disk until after we release the mutex
Variable 12: must_flush_log_later
Type: ulint
Comment: this flag is set to TRUE in trx_commit() if flush_log_later was TRUE, and there were modifications by the transaction; in that case we must flush the log in trx_commit_complete_for_mysql()
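The interplay between these two flags can be sketched as follows. This is only my reading of the two comments: TrxFlushSketch, has_modifications, log_flushes and both functions are hypothetical stand-ins, and the immediate flush in the non-deferred branch is an assumption for the sake of contrast.

```cpp
#include <cassert>

struct TrxFlushSketch {
    bool flush_log_later = false;       // set by the caller during 2PC
    bool must_flush_log_later = false;  // set at commit, cleared on flush
    bool has_modifications = false;     // hypothetical: trx changed data
    int log_flushes = 0;                // hypothetical: counts log flushes
};

// Stand-in for the commit step.
void commit_sketch(TrxFlushSketch& trx) {
    if (!trx.has_modifications) {
        return;  // nothing was written, so there is nothing to flush
    }
    if (trx.flush_log_later) {
        trx.must_flush_log_later = true;  // defer until the mutex is released
    } else {
        ++trx.log_flushes;  // assumption: flush right away
    }
}

// Stand-in for the step that runs after the mutex has been released.
void commit_complete_sketch(TrxFlushSketch& trx) {
    // The deferred flag is only ever set when flush_log_later was requested.
    assert(!trx.must_flush_log_later || trx.flush_log_later);
    if (trx.must_flush_log_later) {
        ++trx.log_flushes;
        trx.must_flush_log_later = false;
    }
}
```

The point of the split is that the expensive flush happens outside the critical section guarded by the prepare_commit mutex.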
Variable 13: duplicates
Type: ulint
Comment: TRX_DUP_IGNORE | TRX_DUP_REPLACE
Variable 14: has_search_latch
Type: ulint
Comment: TRUE if this trx has latched the search system latch in S-mode
Variable 15: search_latch_timeout
Type: ulint
Comment: If we notice that someone is waiting for our S-lock on the search latch to be released, we wait in row0sel.cc for BTR_SEA_TIMEOUT new searches until we try to keep the search latch again over calls from MySQL; this is intended to reduce contention on the search latch
Variable 16: dict_operation
Type: trx_dict_op_t
Comment: there are three main values: no data dictionary change = 0, table operation = 1, index operation = 2
/** Type of data dictionary operation */
enum trx_dict_op_t {
    /** The transaction is not modifying the data dictionary. */
    TRX_DICT_OP_NONE = 0,
    /** The transaction is creating a table or an index, or
    dropping a table. The table must be dropped in crash
    recovery. This and TRX_DICT_OP_NONE are the only possible
    operation modes in crash recovery. */
    TRX_DICT_OP_TABLE = 1,
    /** The transaction is creating or dropping an index in an
    existing table. In crash recovery, the data dictionary
    must be locked, but the table must not be dropped. */
    TRX_DICT_OP_INDEX = 2
};
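One way to read these comments together with the table_id field of trx_t (documented as the table to drop iff dict_operation == TRX_DICT_OP_TABLE, or 0) is the small sketch below. Both helpers are hypothetical, and treating table_id_t as a 64-bit integer is an assumption.

```cpp
#include <cassert>
#include <cstdint>

enum trx_dict_op_t {
    TRX_DICT_OP_NONE = 0,
    TRX_DICT_OP_TABLE = 1,
    TRX_DICT_OP_INDEX = 2
};

typedef std::uint64_t table_id_t;  // assumption: a 64-bit id

// Hypothetical helper: does this transaction touch the data dictionary?
bool modifies_dictionary(trx_dict_op_t op) {
    return op != TRX_DICT_OP_NONE;
}

// Hypothetical helper: id of the table crash recovery must drop, or 0.
table_id_t table_to_drop_in_recovery(trx_dict_op_t op, table_id_t table_id) {
    if (op == TRX_DICT_OP_TABLE) {
        assert(table_id != 0);  // assumption: a table operation records its id
        return table_id;
    }
    return 0;  // index operations must not drop the table
}
```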
Variable 17: declared_to_be_inside_innodb
Type: ulint
Comment: this is TRUE if we have declared this transaction in srv_conc_enter_innodb to be inside the InnoDB engine
Variable 18: n_tickets_to_enter_innodb
Type: ulint
Comment: this can be > 0 only when declared_to_... (that is, declared_to_be_inside_innodb) is TRUE; when we come to srv_conc_innodb_enter, if the value here is > 0, we decrement this by 1
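The ticket mechanism described here can be sketched like this. TrxConcSketch, admit and try_enter_with_ticket are hypothetical names; how InnoDB actually grants tickets and queues threads is not shown in the comment, so admit simply takes a ticket count as a parameter.

```cpp
#include <cassert>

struct TrxConcSketch {
    bool declared_to_be_inside_innodb = false;
    unsigned long n_tickets_to_enter_innodb = 0;
};

// Hypothetical: the slow path admitted this transaction and gave it tickets.
void admit(TrxConcSketch& trx, unsigned long tickets) {
    trx.declared_to_be_inside_innodb = true;
    trx.n_tickets_to_enter_innodb = tickets;
}

// Returns true if a ticket was spent and the thread may enter directly;
// false means the caller has to go through admission again.
bool try_enter_with_ticket(TrxConcSketch& trx) {
    // Tickets can only be > 0 while the trx is declared to be inside InnoDB.
    assert(trx.declared_to_be_inside_innodb
           || trx.n_tickets_to_enter_innodb == 0);
    if (trx.n_tickets_to_enter_innodb > 0) {
        --trx.n_tickets_to_enter_innodb;
        return true;
    }
    return false;
}
```

With two tickets, for example, the next two entries skip admission and the third has to queue up again.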
Variable 19: dict_operation_lock_mode
Type: ulint
Comment: 0, RW_S_LATCH, or RW_X_LATCH: the latch mode trx currently holds on dict_operation_lock. Protected by dict_operation_lock
Variable 20: no
Type: trx_id_t
storage/innobase/include/univ.i
typedef ib_uint64_t ib_id_t;
storage/innobase/include/trx0types.h
/** Transaction identifier (DB_TRX_ID, DATA_TRX_ID) */
typedef ib_id_t trx_id_t;
Comment: transaction serialization number: max trx id shortly before the transaction is moved to COMMITTED_IN_MEMORY state. Protected by trx_sys_t::mutex when trx->in_rw_trx_list. Initially set to TRX_ID_MAX
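Variable 21: start_time
Type: time_t
Comment: time the trx state last time became TRX_STATE_ACTIVE (in other words, the start time)
Variable 22: id
Type: trx_id_t
Comment: transaction id
Variable 23: xid
Type: XID
sql/handler.h
typedef struct xid_t XID;
xid_t is a struct; its definition is long, so it is not reproduced here.
Comment: X/Open XA transaction identification to identify a transaction branch (XA is the X/Open standard for distributed transactions, and this identifier is what tells the branches of one transaction apart)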
Variable 24: commit_lsn
Type: lsn_t
/* Type used for all log sequence number storage and arithmetics */
typedef ib_uint64_t lsn_t;
Comment: lsn at the time of the commit
Variable 25: table_id
Type: table_id_t
Comment: Table to drop iff dict_operation == TRX_DICT_OP_TABLE, or 0
Variable 26: mysql_thd
Type: THD* (THD is a very large struct; see the source for details)
Comment: MySQL thread handle corresponding to this trx, or NULL
Variable 27: mysql_log_file_name
Type: const char*
Comment: if MySQL binlog is used, this field contains a pointer to the latest file name; this is NULL if binlog is not used
Variable 28: mysql_log_offset
Type: ib_int64_t
Comment: if MySQL binlog is used, this field contains the end offset of the binlog entry
Variable 29: n_mysql_tables_in_use
Type: ulint
Comment: number of Innobase tables used in the processing of the current SQL statement in MySQL
Variable 30: mysql_n_tables_locked
Type: ulint
Comment: how many tables the current SQL statement uses, except those in consistent read
Variable 31: trx_list
Type: UT_LIST_NODE_T(trx_t)
#define UT_LIST_NODE_T(TYPE) \
struct { \
    TYPE* prev; /*!< pointer to the previous node, \
                NULL if start of list */ \
    TYPE* next; /*!< pointer to next node, NULL if end of list */\
}
Comment: list of transactions; protected by trx_sys->mutex. The same node is used for both trx_sys_t::ro_trx_list and trx_sys_t::rw_trx_list
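UT_LIST_NODE_T is an intrusive list node: the prev and next pointers live inside the element itself rather than in a separate list cell. A minimal sketch of that pattern in plain C++ follows. ListNode, ListBase, TrxSketch and the two functions are my own illustrative names; the base struct with count, start and end is an assumption modeled on what a list head needs, not a copy of UT_LIST_BASE_NODE_T.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
struct ListNode {
    T* prev = nullptr;  // NULL if start of list
    T* next = nullptr;  // NULL if end of list
};

template <typename T>
struct ListBase {
    std::size_t count = 0;
    T* start = nullptr;
    T* end = nullptr;
};

struct TrxSketch {
    unsigned long id = 0;
    ListNode<TrxSketch> trx_list;  // node embedded in the element
};

// Append elem at the end of the list.
void list_add_last(ListBase<TrxSketch>& base, TrxSketch& elem) {
    elem.trx_list.prev = base.end;
    elem.trx_list.next = nullptr;
    if (base.end != nullptr) {
        base.end->trx_list.next = &elem;
    } else {
        base.start = &elem;
    }
    base.end = &elem;
    ++base.count;
}

// Unlink elem from the list it is currently on.
void list_remove(ListBase<TrxSketch>& base, TrxSketch& elem) {
    assert(base.count > 0);
    TrxSketch* prev = elem.trx_list.prev;
    TrxSketch* next = elem.trx_list.next;
    if (prev != nullptr) { prev->trx_list.next = next; } else { base.start = next; }
    if (next != nullptr) { next->trx_list.prev = prev; } else { base.end = prev; }
    elem.trx_list.prev = nullptr;
    elem.trx_list.next = nullptr;
    --base.count;
}
```

Because an element carries exactly one node of this kind, it can sit on only one such list at a time, which fits the comment's remark that the same node serves both ro_trx_list and rw_trx_list.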
Variable 32: mysql_trx_list
Type: UT_LIST_NODE_T(trx_t)
Comment: list of transactions created for MySQL; protected by trx_sys->mutex
Variable 33: error_state
Type: dberr_t
Comment: 0 if no error, otherwise error number; NOTE That ONLY the thread doing the transaction is allowed to set this field: this is NOT protected by any mutex
Variable 34: error_info
Type: const dict_index_t*
Comment: if the error number indicates a duplicate key error, a pointer to the problematic index is stored here
Variable 35: error_key_num
Type: ulint
Comment: if the index creation fails to a duplicate key error, a mysql key number of that index is stored here
Variable 36: sess
Type: sess_t*
/* The session handle. This data structure is only used by purge and is
not really necessary. We should get rid of it. */
struct sess_t{
    ulint state;
        /*!< state of the session */
    trx_t* trx;
        /*!< transaction object permanently assigned for the session:
        the transaction instance designated by the trx id changes,
        but the memory structure is preserved */
    UT_LIST_BASE_NODE_T(que_t) graphs;
        /*!< query graphs belonging to this session */
};
Comment: session of the trx, NULL if none
Variable 37: graph
Type: que_t*
Comment: query currently run in the session, or NULL if none; NOTE that the query belongs to the session, and it can survive over a transaction commit, if it is a stored procedure with a COMMIT WORK statement, for instance
Variable 38: global_read_view_heap
Type: mem_heap_t*
/* A memory heap is a nonempty linear list of memory blocks */
typedef mem_block_t mem_heap_t;
/* A block of a memory heap consists of the info structure
followed by an area of memory */
typedef struct mem_block_info_t mem_block_t;
mem_block_info_t is yet another complex struct.
Comment: memory heap for the global read view
Variable 39: global_read_view
Type: read_view_t* (another fairly complex struct)
Comment: consistent read view associated with the transaction, or NULL
Variable 40: read_view
Type: read_view_t*
Comment: consistent read view used in the transaction or NULL, this read view if defined can be normal read view associated to a transaction (i.e. same as global_read_view) or read view associated to a cursor
Variable 41: trx_savepoints
Type: UT_LIST_BASE_NODE_T(trx_named_savept_t)
Comment: savepoints set with SAVEPOINT ..., oldest first
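To make "oldest first" concrete, here is a small sketch of a savepoint list. It follows the SQL-level behavior as I understand it: setting a savepoint with an existing name replaces the old one, and rolling back to a savepoint discards every savepoint set after it. The types and functions are hypothetical, and a std::vector replaces InnoDB's intrusive list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct NamedSavepointSketch {
    std::string name;
    std::uint64_t undo_no;  // undo number at the time the savepoint was set
};

// Savepoints in creation order, oldest first.
typedef std::vector<NamedSavepointSketch> SavepointList;

// SAVEPOINT name: an existing savepoint with the same name is replaced.
void set_savepoint(SavepointList& list, const std::string& name,
                   std::uint64_t undo_no) {
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (list[i].name == name) {
            list.erase(list.begin() + static_cast<std::ptrdiff_t>(i));
            break;
        }
    }
    list.push_back(NamedSavepointSketch{name, undo_no});
}

// ROLLBACK TO SAVEPOINT name: on success stores the undo number to roll
// back to and drops all savepoints that were set later.
bool rollback_to_savepoint(SavepointList& list, const std::string& name,
                           std::uint64_t* undo_no_out) {
    assert(undo_no_out != nullptr);
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (list[i].name == name) {
            *undo_no_out = list[i].undo_no;
            list.erase(list.begin() + static_cast<std::ptrdiff_t>(i) + 1,
                       list.end());
            return true;
        }
    }
    return false;
}
```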
Variable 42: undo_mutex
Type: ib_mutex_t
Comment: mutex protecting the fields in this section (down to undo_no_arr), EXCEPT last_sql_stat_start, which can be accessed only when we know that there cannot be any activity in the undo logs!
Variable 43: undo_no
Type: undo_no_t
/** Undo number */
typedef ib_id_t undo_no_t;
Comment: next undo log record number to assign; since the undo log is private for a transaction, this is a simple ascending sequence with no gaps; thus it represents the number of modified/inserted rows in a transaction
Variable 44: last_sql_stat_start
Type: trx_savept_t
Comment: undo_no when the last sql statement was started: in case of an error, trx is rolled back down to this undo number; see note at undo_mutex
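undo_no and last_sql_stat_start together give statement-level rollback: remember the undo number when a statement starts, and if the statement fails, undo everything above that number. The sketch below is a simplification under my own assumptions: last_sql_stat_start is reduced to a bare undo number (the real field is a trx_savept_t), undo records are plain strings, and all names ending in Sketch, as well as the functions, are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

typedef std::uint64_t undo_no_t;

struct TrxUndoSketch {
    std::vector<std::string> undo_records;  // index equals undo number
    undo_no_t undo_no = 0;                  // next undo number to assign
    undo_no_t last_sql_stat_start = 0;      // undo_no when the statement began
};

void begin_statement(TrxUndoSketch& trx) {
    trx.last_sql_stat_start = trx.undo_no;
}

// Each modified or inserted row gets the next undo number, without gaps.
void modify_row(TrxUndoSketch& trx, const std::string& description) {
    trx.undo_records.push_back(description);
    ++trx.undo_no;
}

// Undo every record numbered >= limit, newest first; returns how many.
undo_no_t rollback_to(TrxUndoSketch& trx, undo_no_t limit) {
    assert(limit <= trx.undo_no);
    undo_no_t undone = 0;
    while (trx.undo_no > limit) {
        trx.undo_records.pop_back();
        --trx.undo_no;
        ++undone;
    }
    return undone;
}

// On a statement error, only the failing statement's changes are undone.
undo_no_t rollback_last_statement(TrxUndoSketch& trx) {
    return rollback_to(trx, trx.last_sql_stat_start);
}
```

Because the sequence has no gaps, undo_no also doubles as the count of rows the transaction has modified or inserted so far.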
Variable 45: rseg
Type: trx_rseg_t* (another fairly complex struct; a rollback segment is ultimately tied to pages)
Comment: rollback segment assigned to the transaction, or NULL if not assigned yet
Variable 46: insert_undo
Type: trx_undo_t*
Comment: pointer to the insert undo log, or NULL if no inserts performed yet
Variable 47: update_undo
Type: trx_undo_t*
Comment: pointer to the update undo log, or NULL if no update performed yet
Variable 48: roll_limit
Type: undo_no_t
Comment: least undo number to undo during a rollback
Variable 49: pages_undone
Type: ulint
Comment: number of undo log pages undone since the last undo log truncation
Variable 50: undo_no_arr
Type: trx_undo_arr_t*
Comment: array of undo numbers of undo log records which are currently processed by a rollback operation
Variable 51: n_autoinc_rows
Type: ulint
Comment: no. of AUTO-INC rows required for an SQL statement. This is useful for multi-row INSERTs
Variable 52: autoinc_locks
Type: ib_vector_t*
Comment: AUTOINC locks held by this transaction. Note that these are also in the lock list trx_locks. This vector needs to be freed explicitly when the trx instance is destroyed. Protected by lock_sys->mutex
Variable 53: read_only
Type: ibool
Comment: TRUE if transaction is flagged as a READ-ONLY transaction. if !auto_commit || will_lock > 0 then it will added to the list trx_sys_t::ro_trx_list. A read only transaction will not be assigned an UNDO log. Non-locking auto-commit read-only transaction will not be on either list
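The rule in this comment is easy to get wrong, so here it is as a tiny decision function. The enum and both functions are hypothetical; the read-only branches follow the comment directly, while sending every non-read-only transaction to rw_trx_list is my assumption.

```cpp
#include <cassert>

enum TrxListKind { TRX_LIST_NONE, TRX_LIST_RO, TRX_LIST_RW };

// Hypothetical helper: which trx_sys list a starting transaction joins.
TrxListKind choose_trx_list(bool read_only, bool auto_commit,
                            unsigned long will_lock) {
    if (!read_only) {
        return TRX_LIST_RW;  // assumption: read-write trx go to rw_trx_list
    }
    if (!auto_commit || will_lock > 0) {
        return TRX_LIST_RO;  // read-only, but long-lived or locking
    }
    return TRX_LIST_NONE;  // non-locking auto-commit read-only
}

// The cases spelled out in the source comment.
void check_comment_cases() {
    assert(choose_trx_list(true, false, 0) == TRX_LIST_RO);
    assert(choose_trx_list(true, true, 1) == TRX_LIST_RO);
    assert(choose_trx_list(true, true, 0) == TRX_LIST_NONE);
}
```

The last case is the cheap one: a plain auto-commit SELECT that takes no locks never has to be linked into either list.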
Variable 54: auto_commit
Type: ibool
Comment: TRUE if it is an autocommit
Variable 55: will_lock
Type: ulint
Comment: Will acquire some locks. Increment each time we determine that a lock will be acquired by the MySQL layer
Variable 56: ddl
Type: bool
Comment: true if it is a transaction that is being started for a DDL operation
Variable 57: fts_trx
Type: fts_trx_t*
Comment: FTS information, or NULL if transaction hasn't modified tables with FTS indexes (yet)
Variable 58: fts_next_doc_id
Type: doc_id_t
Comment: The document id used for updates
Variable 59: flush_tables
Type: ulint
Comment: if "covering" the FLUSH TABLES", count of tables being flushed
Variable 60: api_trx
Type: bool
Comment: trx started by InnoDB API
Variable 61: api_auto_commit
Type: bool
Comment: automatic commit
Variable 62: read_write
Type: bool
Comment: if read and write operation
Variable 63: detailed_error[256]
Type: char
Comment: detailed error message for last error, or empty