本文討論FIL_PAGE_INDEX頁的可回收垃圾記錄(Garbage/Deleted Records),當我們刪除某一條記錄(delete from …)時,通常InnoDB并不會在物理存儲上進行完全刪除,而是在記錄上置一個刪除標志位,我們稱這些行記錄為垃圾記錄,刪除標志位位于對應記錄的Extra Bytes中。與正常的記錄(User Records)類似,InnoDB在頁內也有一個單向鏈表將可回收垃圾記錄串在一起,用戶記錄是從Infimum Record開始,到Supremum Record結束, 而可回收垃圾記錄則是從"FIL_PAGE_INDEX"頁中的"INDEX Header"中的PAGE_FREE (First Garbage Record Offset) 開始,到最后一條垃圾數據結束(next_offset=0,指向自己)。記錄遍歷的方式都是一樣的。
sakila數據庫是MySQL官方的案例庫,這里介紹一下獲取方法,打開鏈接: https://dev.mysql.com/doc/index-other.html,然后找到"Example Databases"章節,即可獲取下載和相關文檔鏈接。
上代碼,解析Garbage Records:
/*
在數據庫中執行刪除操作:
root@localhost [sakila]> set foreign_key_checks = 0; --> 因為sakila有外鍵約束, 為了簡化刪除, 先在會話中禁用外鍵檢查;
Query OK, 0 rows affected (0.00 sec)root@localhost [sakila]> delete from sakila.film where film_id in (646, 656, 666); --> 刪除3條記錄;
Query OK, 3 rows affected (0.01 sec)
*/public class IdxPage5 {public static void main(String[] args) throws Exception {String fileName = "D:\\Data\\mysql\\8.0.18\\data\\sakila\\film.ibd";try (IbdFileParser parser = new IbdFileParser(fileName)) {// 通過上文"InnoDB文件物理結構解析4"可知,page(14)包含film_id=(565, 668)之間的數據;Page page = parser.getPage(14);// System.out.println(ParserHelper.hexDump(page.getPageRaw()));ClusteredKeyLeafPage leafPage = new ClusteredKeyLeafPage(page.getPageRaw(), page.getPageSize());List<ClusteredKeyLeafRecord> garbageRecords = leafPage.getUserRecords(IdxPage3.getFilmTableMeta());StringBuilder buff = new StringBuilder();for (ClusteredKeyLeafRecord record : garbageRecords) {List<RecordField> fields = record.getRecordFields();buff.append("\n ==> Extra: deleted = ").append(record.getDeletedFlag()).append("; next offset = ").append(record.getNextRecordOffset()).append(" <==\n");for (RecordField field : fields) {buff.append(String.format("%20s", field.getName())).append(": ").append(field.getContent()).append("\n");}}System.out.println(buff);}}
}/*
程序執行結果:==> Extra: deleted = true; next offset = -1432 <==film_id: 666DB_TRX_ID: 00000000843cDB_ROLL_PTR: 01000001321526title: PAYCHECK WAITdescription: A Awe-Inspiring Reflection of a Boy And a Man who must Discover a Moose in The Sahara Desertrelease_year: 2006language_id: 1
original_language_id: nullrental_duration: 4rental_rate: 4.99length: 145replacement_cost: 27.99rating: PG-13special_features: [Commentaries, Deleted Scenes]last_update: 2006-02-15T05:03:42==> Extra: deleted = true; next offset = -1489 <==film_id: 656DB_TRX_ID: 00000000843cDB_ROLL_PTR: 010000013214c7title: PAPI NECKLACEdescription: A Fanciful Display of a Car And a Monkey who must Escape a Squirrel in Ancient Japanrelease_year: 2006language_id: 1
original_language_id: nullrental_duration: 3rental_rate: 0.99length: 128replacement_cost: 9.99rating: PGspecial_features: [Trailers, Deleted Scenes, Behind the Scenes]last_update: 2006-02-15T05:03:42==> Extra: deleted = true; next offset = 0 <==film_id: 646DB_TRX_ID: 00000000843cDB_ROLL_PTR: 01000001321466title: OUTBREAK DIVINEdescription: A Unbelieveable Yarn of a Database Administrator And a Woman who must Succumb a A Shark in A U-Boatrelease_year: 2006language_id: 1
original_language_id: nullrental_duration: 6rental_rate: 0.99length: 169replacement_cost: 12.99rating: NC-17special_features: [Trailers, Deleted Scenes, Behind the Scenes]last_update: 2006-02-15T05:03:42
*/
程序輸出和我們預想的一樣,記錄雖然被delete語句刪除了,但是數據還是保留在頁內的。只是Extra Bytes的delete flag被置為true,最后一條被刪除的記錄指向的不是"Supremum Record",而是自己(next offset = 0)。
案例中獲取刪除數據用到了ClusteredKeyLeafPage的getGarbageRecords()方法,與獲取普通用戶記錄的getUserRecords()方法的唯一不同是遍歷記錄的開始位置不同:
public class ClusteredKeyLeafPage {public List<ClusteredKeyLeafRecord> getUserRecords(TableMeta tableMeta) {int pos = getSystemRecords().getInfimumNextRecordPos(); return iterateRecordInPage(tableMeta, pos);}public List<ClusteredKeyLeafRecord> getGarbageRecords(TableMeta tableMeta) {int pos = getIndexHeader().getFirstGarbageOffset();return iterateRecordInPage(tableMeta, pos);}private List<ClusteredKeyLeafRecord> iterateRecordInPage(TableMeta tableMeta, int firstRecordPos) {//...// 遍歷結束的條件if (recCount > maxRecs || nextOffset == 0 || nextRecord == SUPREMUM_EXTRA_END_POS) {break;}//}
}
更多測試情況:
-
如果執行的是"truncate sakila.film",該方法是無效的,因為整個ibd文件的存儲空間會被"重置"(文件會變小,沒有page(14)),全表刪除(“delete from sakila.film”)通常不會,但也有特例,當一個表的數據量非常小(索引深度小于1),所有的行都在一個(Leaf) Page時,觀察到全表刪除和truncate一樣,整個頁的記錄數據會被清掉(置為00),可以通過hexdump確認。
-
在刪除整個頁內的記錄時,記錄雖然不會被清掉,但觀察到會有部分刪除記錄在User Record鏈表內的情況。
最后,介紹一個通過hexdump命令查看某個頁內容的方法:
# 假設我們要看page(4),Page的大小為16KB(16384字節);
# 那么page(4)的起始位置為 4 * 16384=65536,讀取長度為16384;
# 所以命令hexdump的命令為:
[think@TP-T470 sakila]$ hexdump --skip 65536 --length 16384 -C -v film2.ibd