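I. Scenario

An engineer at the company ran db.giveget_card.drop() and accidentally dropped a production collection.


Fortunately we take a backup every day, and this is exactly where backups prove their worth, haha.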


II. Simulating the failure


Size of the backed-up collection:

rs_test01:PRIMARY> use ycsb
switched to db ycsb
rs_test01:PRIMARY> db.giveget_card.count();
3173391


Before the drop, this collection received new writes:

rs_test01:PRIMARY> db.giveget_card.insert({id:1});
WriteResult({ "nInserted" : 1 })
rs_test01:PRIMARY> db.giveget_card.insert({id:2});
WriteResult({ "nInserted" : 1 })
rs_test01:PRIMARY> db.giveget_card.insert({id:3});
WriteResult({ "nInserted" : 1 })
rs_test01:PRIMARY> db.giveget_card.insert({id:4});
WriteResult({ "nInserted" : 1 })


Other collections were being written to as well:

rs_test01:PRIMARY> db.tab.find();
{ "_id" : ObjectId("59354ba202d9a99ab2f879c6"), "name" : "a" }
{ "_id" : ObjectId("59354ba602d9a99ab2f879c7"), "name" : "b" }
{ "_id" : ObjectId("59354ba802d9a99ab2f879c8"), "name" : "c" }
{ "_id" : ObjectId("59354baa02d9a99ab2f879c9"), "name" : "d" }


After the drop, both this collection and the other collections kept receiving writes:

rs_test01:PRIMARY> db.giveget_card.find();
{ "_id" : ObjectId("59354c28d905432aeaccd53c"), "id" : 5 }
{ "_id" : ObjectId("59354c2bd905432aeaccd53d"), "id" : 6 }
rs_test01:PRIMARY> db.tab.find();
{ "_id" : ObjectId("59354ba202d9a99ab2f879c6"), "name" : "a" }
{ "_id" : ObjectId("59354ba602d9a99ab2f879c7"), "name" : "b" }
{ "_id" : ObjectId("59354ba802d9a99ab2f879c8"), "name" : "c" }
{ "_id" : ObjectId("59354baa02d9a99ab2f879c9"), "name" : "d" }
{ "_id" : ObjectId("59354ccfd905432aeaccd542"), "name" : "e" }
{ "_id" : ObjectId("59354cd2d905432aeaccd543"), "name" : "f" }


III. Recovery steps


1. Copy the dropped collection's giveget_card.bson and giveget_card.metadata.json files from the backup into /tmp/restore/ycsb (a directory you create yourself; ycsb is the database name).

# cp /data/backup/rs07/ycsb/giveget_card.* /tmp/restore/ycsb


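2. Dump the oplog covering the period after the backup and before the accidental drop; it will be replayed to restore the collection.

# mongodump --port 2203 -d local -c oplog.rs -q '{"ts" : {$gte : Timestamp(1496664480, 10430), $lte : Timestamp(1496665113, 10430)}}' -o /tmp/oplog

-- The two Timestamp values are Unix epoch seconds, obtained by converting the wall-clock times with a conversion tool.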
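
The conversion mentioned above can be done with any epoch converter. As a minimal sketch, here is one way to do it in Python. The function names are made up for illustration, and the UTC+8 offset is an assumption taken from the +0800 suffix in the tool logs further down.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server's local time is UTC+8, as the "+0800" in the logs suggests.
LOCAL_TZ = timezone(timedelta(hours=8))


def to_epoch_seconds(local_time: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS' in LOCAL_TZ to Unix epoch seconds."""
    parsed = datetime.strptime(local_time, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=LOCAL_TZ).timestamp())


def to_local_time(epoch_seconds: int) -> str:
    """Convert Unix epoch seconds back to a wall-clock string in LOCAL_TZ."""
    return datetime.fromtimestamp(epoch_seconds, LOCAL_TZ).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(to_epoch_seconds("2017-06-05 20:08:00"))  # 1496664480, the $gte bound
    print(to_local_time(1496665113))                # 2017-06-05 20:18:33, the $lte bound
```

Converting in both directions is a cheap sanity check that the dump window really brackets the moment of the drop.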


3. Use bsondump to inspect the dumped oplog and find the timestamp of the drop operation: 1496665069.

# bsondump /tmp/oplog/local/oplog.rs.bson
{"ts":{"$timestamp":{"t":1496664760,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"7079172056815894727"},"v":2,"op":"i","ns":"ycsb.giveget_card","o":{"_id":{"$oid":"59354ab8c5308d8c7a9da8b5"},"id":1.0}}
{"ts":{"$timestamp":{"t":1496664762,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"-1797107728294067016"},"v":2,"op":"i","ns":"ycsb.giveget_card","o":{"_id":{"$oid":"59354abac5308d8c7a9da8b6"},"id":2.0}}
{"ts":{"$timestamp":{"t":1496664765,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"8604646791509150392"},"v":2,"op":"i","ns":"ycsb.giveget_card","o":{"_id":{"$oid":"59354abdc5308d8c7a9da8b7"},"id":3.0}}
{"ts":{"$timestamp":{"t":1496664768,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"9018614066505371436"},"v":2,"op":"i","ns":"ycsb.giveget_card","o":{"_id":{"$oid":"59354ac0c5308d8c7a9da8b8"},"id":4.0}}
{"ts":{"$timestamp":{"t":1496664994,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"-4471524661347063602"},"v":2,"op":"c","ns":"ycsb.$cmd","o":{"create":"tab"}}
{"ts":{"$timestamp":{"t":1496664994,"i":2}},"t":{"$numberLong":"12"},"h":{"$numberLong":"-4215905958456607246"},"v":2,"op":"i","ns":"ycsb.tab","o":{"_id":{"$oid":"59354ba202d9a99ab2f879c6"},"name":"a"}}
{"ts":{"$timestamp":{"t":1496664998,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"6170506962401844481"},"v":2,"op":"i","ns":"ycsb.tab","o":{"_id":{"$oid":"59354ba602d9a99ab2f879c7"},"name":"b"}}
{"ts":{"$timestamp":{"t":1496665000,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"-8071456063660489895"},"v":2,"op":"i","ns":"ycsb.tab","o":{"_id":{"$oid":"59354ba802d9a99ab2f879c8"},"name":"c"}}
{"ts":{"$timestamp":{"t":1496665002,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"4387884836668659146"},"v":2,"op":"i","ns":"ycsb.tab","o":{"_id":{"$oid":"59354baa02d9a99ab2f879c9"},"name":"d"}}
{"ts":{"$timestamp":{"t":1496665069,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"-6913449254950935781"},"v":2,"op":"c","ns":"ycsb.$cmd","o":{"drop":"giveget_card"}}
2017-06-05T20:27:25.552+0800	10 objects found
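
With only ten entries the drop is easy to spot by eye, but on a busy system the dump can be much longer. The sketch below scans bsondump-style JSON lines for the drop command and prints its timestamp in the seconds:ordinal form. The helper name find_drop is hypothetical, and the two sample lines are copied from the output above.

```python
import json


def find_drop(lines, db, collection):
    """Return (t, i) of the first oplog entry that drops db.collection, else None."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON lines such as the "objects found" summary
        entry = json.loads(line)
        is_command = entry.get("op") == "c" and entry.get("ns") == db + ".$cmd"
        if is_command and entry.get("o", {}).get("drop") == collection:
            ts = entry["ts"]["$timestamp"]
            return ts["t"], ts["i"]
    return None


# Two lines copied from the bsondump output above, plus the summary line.
SAMPLE = [
    '{"ts":{"$timestamp":{"t":1496665002,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"4387884836668659146"},"v":2,"op":"i","ns":"ycsb.tab","o":{"_id":{"$oid":"59354baa02d9a99ab2f879c9"},"name":"d"}}',
    '{"ts":{"$timestamp":{"t":1496665069,"i":1}},"t":{"$numberLong":"12"},"h":{"$numberLong":"-6913449254950935781"},"v":2,"op":"c","ns":"ycsb.$cmd","o":{"drop":"giveget_card"}}',
    "2017-06-05T20:27:25.552+0800\t10 objects found",
]

if __name__ == "__main__":
    t, i = find_drop(SAMPLE, "ycsb", "giveget_card")
    print("%d:%d" % (t, i))  # 1496665069:1
```

In practice the lines would come from the bsondump output rather than from a hard-coded list.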


4. Copy the oplog BSON file into the restore directory, renaming it to oplog.bson.

# cp /tmp/oplog/local/oplog.rs.bson /tmp/restore/oplog.bson


Directory layout for the restore at this point:

# pwd
/tmp/restore
# ls
oplog.bson  ycsb

5. All the preparation is now done; restore the data.

[root@ops-db-test02 restore]# mongorestore --port 2203 --oplogReplay --oplogLimit=1496665069:1 /tmp/restore
2017-06-05T20:36:45.361+0800	building a list of dbs and collections to restore from /tmp/restore dir
2017-06-05T20:36:45.364+0800	reading metadata for ycsb.giveget_card from /tmp/restore/ycsb/giveget_card.metadata.json
2017-06-05T20:36:45.364+0800	restoring ycsb.giveget_card from /tmp/restore/ycsb/giveget_card.bson
2017-06-05T20:36:48.362+0800	[........................]  ycsb.giveget_card  15.4MB/475MB  (3.2%)
2017-06-05T20:36:51.362+0800	[#.......................]  ycsb.giveget_card  31.1MB/475MB  (6.6%)
2017-06-05T20:36:54.362+0800	[##......................]  ycsb.giveget_card  46.6MB/475MB  (9.8%)
2017-06-05T20:36:57.362+0800	[###.....................]  ycsb.giveget_card  62.1MB/475MB  (13.1%)
2017-06-05T20:37:00.362+0800	[###.....................]  ycsb.giveget_card  76.4MB/475MB  (16.1%)
2017-06-05T20:37:03.362+0800	[####....................]  ycsb.giveget_card  90.7MB/475MB  (19.1%)
2017-06-05T20:37:06.362+0800	[#####...................]  ycsb.giveget_card  105MB/475MB  (22.0%)
2017-06-05T20:37:09.362+0800	[######..................]  ycsb.giveget_card  120MB/475MB  (25.2%)
2017-06-05T20:37:12.362+0800	[######..................]  ycsb.giveget_card  133MB/475MB  (28.0%)
2017-06-05T20:37:15.362+0800	[#######.................]  ycsb.giveget_card  146MB/475MB  (30.8%)
2017-06-05T20:37:18.363+0800	[########................]  ycsb.giveget_card  163MB/475MB  (34.3%)
2017-06-05T20:37:21.362+0800	[########................]  ycsb.giveget_card  178MB/475MB  (37.4%)
2017-06-05T20:37:24.362+0800	[#########...............]  ycsb.giveget_card  196MB/475MB  (41.3%)
2017-06-05T20:37:27.362+0800	[##########..............]  ycsb.giveget_card  214MB/475MB  (45.0%)
2017-06-05T20:37:30.362+0800	[###########.............]  ycsb.giveget_card  231MB/475MB  (48.6%)
2017-06-05T20:37:33.362+0800	[############............]  ycsb.giveget_card  245MB/475MB  (51.5%)
2017-06-05T20:37:36.362+0800	[#############...........]  ycsb.giveget_card  261MB/475MB  (54.8%)
2017-06-05T20:37:39.362+0800	[##############..........]  ycsb.giveget_card  279MB/475MB  (58.7%)
2017-06-05T20:37:42.362+0800	[###############.........]  ycsb.giveget_card  297MB/475MB  (62.5%)
2017-06-05T20:37:45.362+0800	[###############.........]  ycsb.giveget_card  312MB/475MB  (65.8%)
2017-06-05T20:37:48.362+0800	[################........]  ycsb.giveget_card  328MB/475MB  (69.0%)
2017-06-05T20:37:51.362+0800	[#################.......]  ycsb.giveget_card  341MB/475MB  (71.8%)
2017-06-05T20:37:54.362+0800	[#################.......]  ycsb.giveget_card  356MB/475MB  (74.9%)
2017-06-05T20:37:57.362+0800	[##################......]  ycsb.giveget_card  373MB/475MB  (78.5%)
2017-06-05T20:38:00.362+0800	[###################.....]  ycsb.giveget_card  388MB/475MB  (81.7%)
2017-06-05T20:38:03.362+0800	[####################....]  ycsb.giveget_card  405MB/475MB  (85.2%)
2017-06-05T20:38:06.362+0800	[#####################...]  ycsb.giveget_card  419MB/475MB  (88.2%)
2017-06-05T20:38:09.362+0800	[#####################...]  ycsb.giveget_card  434MB/475MB  (91.4%)
2017-06-05T20:38:12.362+0800	[######################..]  ycsb.giveget_card  442MB/475MB  (93.1%)
2017-06-05T20:38:15.362+0800	[#######################.]  ycsb.giveget_card  459MB/475MB  (96.6%)
2017-06-05T20:38:18.362+0800	[#######################.]  ycsb.giveget_card  475MB/475MB  (99.9%)
2017-06-05T20:38:18.427+0800	[########################]  ycsb.giveget_card  475MB/475MB  (100.0%)
2017-06-05T20:38:18.427+0800	restoring indexes for collection ycsb.giveget_card from metadata
2017-06-05T20:38:44.680+0800	finished restoring ycsb.giveget_card (3173391 documents)
2017-06-05T20:38:44.680+0800	replaying oplog
2017-06-05T20:38:44.739+0800	done
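
Why pass 1496665069:1, the timestamp of the drop itself? As far as I understand the mongorestore documentation, --oplogLimit stops replay at entries whose timestamp is equal to or newer than the limit, so the drop is the first entry left out. The sketch below models only that comparison on (seconds, ordinal) pairs taken from the dump above. The function names are hypothetical; this is not mongorestore's actual code.

```python
def parse_limit(oplog_limit):
    """Split a 'seconds:ordinal' string into an (int, int) tuple."""
    seconds, ordinal = oplog_limit.split(":")
    return int(seconds), int(ordinal)


def replayed(timestamps, oplog_limit):
    """Keep only entries strictly older than the limit (assumed semantics)."""
    limit = parse_limit(oplog_limit)
    # Tuples compare by seconds first, then by ordinal.
    return [ts for ts in timestamps if ts < limit]


# (t, i) pairs from the dump: last insert into giveget_card, last insert into tab, the drop.
OPLOG_TS = [(1496664768, 1), (1496665002, 1), (1496665069, 1)]

if __name__ == "__main__":
    print(replayed(OPLOG_TS, "1496665069:1"))  # [(1496664768, 1), (1496665002, 1)]
```

A limit of 1496665069:2 would let the drop through and undo the restore, so it is worth double-checking the ordinal as well as the seconds.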


6. Check the restored data

rs_test01:PRIMARY> db.giveget_card.find({id : {$gte : 1 }});
{ "_id" : ObjectId("59354cb9d905432aeaccd540"), "id" : 5 }
{ "_id" : ObjectId("59354cc0d905432aeaccd541"), "id" : 6 }
{ "_id" : ObjectId("59354ab8c5308d8c7a9da8b5"), "id" : 1 }
{ "_id" : ObjectId("59354abac5308d8c7a9da8b6"), "id" : 2 }
{ "_id" : ObjectId("59354abdc5308d8c7a9da8b7"), "id" : 3 }
{ "_id" : ObjectId("59354ac0c5308d8c7a9da8b8"), "id" : 4 }


The data is the same, but the documents are now stored in a different order than before.

rs_test01:PRIMARY> db.giveget_card.count();
3173397


The resulting count = 3173391 documents from the backup + 6 documents written afterwards.

7. The dumped oplog also contains operations on other collections, so check whether the restore affected any of them.

rs_test01:PRIMARY> db.tab.find();
{ "_id" : ObjectId("59354ba202d9a99ab2f879c6"), "name" : "a" }
{ "_id" : ObjectId("59354ba602d9a99ab2f879c7"), "name" : "b" }
{ "_id" : ObjectId("59354ba802d9a99ab2f879c8"), "name" : "c" }
{ "_id" : ObjectId("59354baa02d9a99ab2f879c9"), "name" : "d" }
{ "_id" : ObjectId("59354ccfd905432aeaccd542"), "name" : "e" }
{ "_id" : ObjectId("59354cd2d905432aeaccd543"), "name" : "f" }

-- The tab collection still matches the original data, so nothing was affected; the replayed oplog entries for other collections had no visible effect.
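
One way to picture why replaying the tab entries changed nothing: if a replayed insert behaves like an upsert keyed on _id, re-applying it to a document that already exists leaves the collection as it was. That is my simplifying assumption about how oplog replay works, not something this test proves. The toy model below uses a plain dict as the collection; all names are made up for illustration.

```python
def replay_inserts(collection, inserts):
    """Apply insert entries as upserts keyed on _id (simplifying assumption)."""
    for doc in inserts:
        collection[doc["_id"]] = dict(doc)
    return collection


def make_docs(pairs):
    """Build documents from (_id, name) pairs."""
    return [{"_id": oid, "name": name} for oid, name in pairs]


# tab as it looked on the live server before the restore (documents a to f).
LIVE_TAB = make_docs([
    ("59354ba202d9a99ab2f879c6", "a"), ("59354ba602d9a99ab2f879c7", "b"),
    ("59354ba802d9a99ab2f879c8", "c"), ("59354baa02d9a99ab2f879c9", "d"),
    ("59354ccfd905432aeaccd542", "e"), ("59354cd2d905432aeaccd543", "f"),
])

# Only a to d appear in the dumped oplog before the limit.
OPLOG_INSERTS = LIVE_TAB[:4]

if __name__ == "__main__":
    before = {doc["_id"]: dict(doc) for doc in LIVE_TAB}
    after = replay_inserts(dict(before), OPLOG_INSERTS)
    print(after == before)  # True
```

Documents e and f were written after the drop timestamp, so they are outside the replayed range and are left alone.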


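That is the whole recovery procedure using a backup combined with the oplog.


Backups matter!!! Backups matter!!! Backups matter!!! Important things get said three times~~~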