Oracle非歸檔模式遇到文件損壞怎么辦?

昨天夜里基地夜班的兄弟,打電話說有個報表庫連不上了,趕緊起來連上VPN查看一下,看到實例宕機了,先趕緊startup起來。

1.查看報錯信息

環境介紹:Redhat 6.9 Oracle 11.2.0.4? ?No Archive Mode

查看alert log 關鍵報錯信息如下

Thread 1 advanced to log sequence 4231012 (LGWR switch)Current log# 2 seq# 4231012 mem# 0: /oradata/rtp/redo02.log
Thu May 08 23:22:56 2025
KCF: read, write or open error, block=0x240ab online=1file=118 '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'error=27072 txt: 'Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1'
Errors in file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc:
Errors in file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_dbw0_3300.trc:
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 118 (block # 147627)
ORA-01110: data file 118: '/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1
DBW0 (ospid: 3300): terminating the instance due to error 63999
Thu May 08 23:22:57 2025
System state dump requested by (instance=1, osid=3300 (DBW0)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_diag_3292_20250508232257.trc
Instance terminated by DBW0, pid = 3300

排查路徑??查看報錯的trc文件

Trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp\_dbw0\_3300.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE\_HOME = /u01/app/oracle/product/11.2.0/db\_1
System name:    Linux
Node name:      rtpdb
Release:        2.6.32-696.el6.x86\_64
Version:        #1 SMP Tue Feb 21 00:53:17 EST 2017
Machine:        x86\_64
VM name:        VMWare Version: 6
Instance name: rtp
Redo thread mounted by this instance: 1
Oracle process number: 10
Unix process pid: 3300, image: oracle\@rtpdb (DBW0)\*\*\* 2025-05-08 23:22:56.680
\*\*\* SESSION ID:(1521.1) 2025-05-08 23:22:56.680
\*\*\* CLIENT ID:() 2025-05-08 23:22:56.680
\*\*\* SERVICE NAME:(SYS\$BACKGROUND) 2025-05-08 23:22:56.680
\*\*\* MODULE NAME:() 2025-05-08 23:22:56.680
\*\*\* ACTION NAME:() 2025-05-08 23:22:56.680KCF: read, write or open error, block=0x240ab online=1
file=118 '/oradata2/rtp/RTP/datafile/o1\_mf\_tbs\_ods\_n1qx02j0\_.dbf'
error=27072 txt: 'Linux-x86\_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1'
Encountered write error
DDE rules only execution for: ORA 1110
\----- START Event Driven Actions Dump ----
\---- END Event Driven Actions Dump ----
\----- START DDE Actions Dump -----
Executing SYNC actions
\----- START DDE Action: 'DB\_STRUCTURE\_INTEGRITY\_CHECK' (Async) -----
Successfully dispatched
\----- END DDE Action: 'DB\_STRUCTURE\_INTEGRITY\_CHECK' (SUCCESS, 0 csec) -----
Executing ASYNC actions
\----- END DDE Actions Dump (total 0 csec) -----
error 63999 detected in background process
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 118 (block # 147627)
ORA-01110: data file 118: '/oradata2/rtp/RTP/datafile/o1\_mf\_tbs\_ods\_n1qx02j0\_.dbf'
ORA-27072: File I/O error
Linux-x86\_64 Error: 5: Input/output error
Additional information: 4
Additional information: 147627
Additional information: -1
kjzduptcctx: Notifying DIAG for crash event
\----- Abridged Call Stack Trace -----
ksedsts()+465<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+63<-ksuitm()+5570<-ksbrdp()+3507<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai\_real()+250<-ssthrdmain()+265<-main()+201<-\_\_libc\_start\_main()+253
\----- End of Abridged Call Stack Trace -----\*\*\* 2025-05-08 23:22:56.750
DBW0 (ospid: 3300): terminating the instance due to error 63999
ksuitm: waiting up to \[5] seconds before killing DIAG(3292)
\[oracle\@rtpdb \~]\$

2. OS層面檢查IO報錯問題

2.1查看/oradata2掛載點是否正常,發現有較多的io錯誤

[oracle@rtpdb ~]$ dmesg | grep -i error
end_request: I/O error, dev sdc, sector 716711160
Buffer I/O error on device dm-0, logical block 89588639
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588640
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588641
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588642
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588643
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588644
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588645
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588646
lost page write due to I/O error on dm-0
Buffer I/O error on device dm-0, logical block 89588647
lost page write due to I/O error on dm-0
JBD2: Detected IO errors while flushing file data on dm-0-8
[root@rtpdb ~]# tail -f /var/log/messages
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588645
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588646
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: Buffer I/O error on device dm-0, logical block 89588647
May  8 23:22:50 rtpdb kernel: lost page write due to I/O error on dm-0
May  8 23:22:50 rtpdb kernel: JBD2: Detected IO errors while flushing file data on dm-0-8
May  9 03:40:04 rtpdb rhsmd: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
May  9 12:38:21 rtpdb kernel: NET: Unregistered protocol family 36
May  9 12:38:21 rtpdb kernel: NET: Registered protocol family 36

3.Rman檢查報錯的文件是否有壞塊

RMAN> VALIDATE DATAFILE 118;Starting validate at 09-MAY-25
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=647 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00118 name=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:15
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
118  FAILED 0              85421        268800          74401038924File Name: /oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbfBlock Type Blocks Failing Blocks Processed---------- -------------- ----------------Data       0              99872           Index      0              82856           Other      49             651             validate found one or more corrupt blocks
See trace file /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc for details
Finished validate at 09-MAY-25RMAN> list backup summary;specification does not match any backup in the repositoryRMAN> 

從/u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc中檢查具體的有哪些block損壞, 一下檢查到這么多corrupt block

而且還沒有物理備份(非歸檔模式的庫)?該如何處理

[oracle@rtpdb ~]$ cat /u01/app/oracle/diag/rdbms/rtp/rtp/trace/rtp_ora_22883.trc | grep -i "Corrupt"
Corrupt block relative dba: 0x1d82e5cf (file 118, block 189903)
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189903, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data。
Corrupt block relative dba: 0x1d82e5ff (file 118, block 189951)
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data
Reread of blocknum=189951, file=/oradata2/rtp/RTP/datafile/o1_mf_tbs_ods_n1qx02j0_.dbf. found same corrupt data

這里出現連續的block 189903 到 block 189918?至少有 16壞塊

4.查看壞塊對應的object

檢查這些壞塊是輸入對應的哪個object,看到是一個表

SELECT tablespace_name, segment_type, owner, segment_name 
FROM dba_extents 
WHERE file_id = 118 AND (block_id BETWEEN 189903 AND 189918 OR(block_id + blocks - 1) BETWEEN 189903 AND 189918 ORblock_id < 189903 AND (block_id + blocks - 1) > 189918);
TABLESPACE_NAME                SEGMENT_TYPE       OWNER                          SEGMENT_NAME
------------------------------ ------------------ ------------------------------ ---------------------------------------------------------------------------------
TBS_ODS                        TABLE              ODS                            LOT_MATERIAL_MASTER

5.標記壞塊,防止操作失敗

標記這些壞塊并跳過,這個不算是標準處理流程,因為是報表庫,元數據都是從另外一個庫拉取的,這里標記跳過,聯系報表的同事重建這個表

BEGINDBMS_REPAIR.SKIP_CORRUPT_BLOCKS (schema_name   => 'ODS',object_name   => 'LOT_MATERIAL_MASTER',object_type   => DBMS_REPAIR.TABLE_OBJECT,flags         => DBMS_REPAIR.SKIP_FLAG);
END;
/
PL/SQL procedure successfully completed.

5.1 DBMS_REPAIR.SKIP_CORRUPT_BLOCKS包介紹

DBMS_REPAIR.SKIP_CORRUPT_BLOCKS? 用于告訴數據庫在訪問特定表或索引時跳過已知的壞塊(corrupt blocks),從而避免訪問錯誤中斷操作


??主要作用

  • 標記指定對象中的壞塊為可跳過,當應用或查詢訪問這些壞塊時,Oracle 會跳過它們,而不是報錯。

  • 適用于:

    • 表(TABLE_OBJECT

    • 索引(INDEX_OBJECT

  • 常用于數據庫文件損壞、硬盤故障、備份文件不完整等情況下臨時繞過問題塊繼續業務運行或數據導出


🧠 使用場景舉例:

  • 表中某些數據塊損壞,導致全表掃描失敗。

  • 臨時需要導出未損壞的數據,用于轉移或恢復。

  • 配合?DBMS_REPAIR.CHECK_OBJECT?檢測壞塊后,繼續運行業務邏輯。


📌 工作機制

啟用后,對象上的查詢或操作:

  • 遇到壞塊 → Oracle?跳過不訪問這些壞塊

  • 這樣能?最大程度保留/導出/訪問完好數據

  • 不影響數據塊的實際內容(不會修復壞塊,僅跳過)


?常用調用格式

BEGIN DBMS_REPAIR.SKIP_CORRUPT_BLOCKS ( schema_name => 'SCOTT', object_name => 'EMP', object_type => DBMS_REPAIR.TABLE_OBJECT, flags => DBMS_REPAIR.SKIP_FLAG -- 開啟跳過 ); END;

關閉跳過功能:

BEGIN DBMS_REPAIR.SKIP_CORRUPT_BLOCKS ( schema_name => 'SCOTT', object_name => 'EMP', object_type => DBMS_REPAIR.TABLE_OBJECT, flags => DBMS_REPAIR.NOSKIP_FLAG -- 關閉跳過 ); END;


?? 注意事項

  1. 此操作不會修復壞塊,只是忽略它們

  2. 配合?DBMS_REPAIR.CHECK_OBJECT?使用,先識別出壞塊。

  3. 通常用于應急,不應長期依賴。

  4. 處理后建議盡快進行數據恢復或表重建
    ?

總結

?因為這個是庫是非歸檔模式的,所以沒有物理備份,這樣遭遇了block corrupt確實非常麻煩,建議重要的庫還是一定要啟用歸檔并使用RMAN備份。

SQL> archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /u01/app/oracle/product/11.2.0/db_1/dbs/arch
Oldest online log sequence     4231143
Current log sequence           4231147

expdp備份部分表的腳本 供參考

[oracle@rtpdb ~]$ cat $HOME/jobs/expback.sh
#!/bin/bash
#backup table on noarchive db
#create by norton.fan 20220729
PATH=$PATH:$HOME/bin
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
ORACLE_SID=rtp
PATH=$ORACLE_HOME/bin:$PATH
export ORACLE_BASE ORACLE_HOME ORACLE_SID
export PATH
NLS_LANG=AMERICAN_AMERICA.UTF8
export NLS_LANG
#export DELTIME=`date -d "15 days ago" +%Y%m%d`
export BACKUPTIME=`date +%Y%m%d%H%M%S`
expdp ods/ods dumpfile=ods$BACKUPTIME.dmp logfile=ods$BACKUPTIME.log parfile=/home/oracle/jobs/exp.par
#echo "Delete backup cycle before 15 days"
find /oradata/backup/ -mtime +1 -name  *.dmp -exec rm -f {} ';'   
find /oradata/backup/ -mtime +7 -name  *.log -exec rm -f {} ';'
[oracle@rtpdb ~]$ 
[oracle@rtpdb ~]$ cat /home/oracle/jobs/exp.par
DIRECTORY = dmpdir
SCHEMAS = ods
INCLUDE = TABLE:"IN (select table_name from exptab)" ##將需要備份的表名放入到exptab表中

本文來自互聯網用戶投稿,該文觀點僅代表作者本人,不代表本站立場。本站僅提供信息存儲空間服務,不擁有所有權,不承擔相關法律責任。
如若轉載,請注明出處:http://www.pswp.cn/web/79359.shtml
繁體地址,請注明出處:http://hk.pswp.cn/web/79359.shtml
英文地址,請注明出處:http://en.pswp.cn/web/79359.shtml

如若內容造成侵權/違法違規/事實不符,請聯系多彩編程網進行投訴反饋email:809451989@qq.com,一經查實,立即刪除!

相關文章

關于一些平時操作系統或者軟件的步驟轉載

關于一些平時操作系統或者軟件的步驟轉載 關于python環境搭建 關于Ubuntu 1. 雙系統之Ubuntu快速卸載 2. VMware安裝Ubuntu虛擬機實現COpenCV代碼在虛擬機下運行教程 3. ubuntu 下 opencv的安裝以及配置&#xff08;親測有效&#xff09; 4. Ubuntu將c編譯成.so文件并測試 5…

hz2新建Keyword頁面

新建一個single-keywords.php即可&#xff0c;需要篩選項再建taxonomy-knowledge-category.php 參考&#xff1a;https://www.tkwlkj.com/customize-wordpress-category-pages.html WordPress中使用了ACF創建了自定義產品分類products&#xff0c;現在想實現自定義產品分類下的…

VRRP協議-IP地址冗余配置

有兩個服務器172.16.42.1和172.16.42.121&#xff0c;通過VRRP協議使兩臺設備共用一個虛擬地址172.16.42.100&#xff0c;當 172.16.42.1 可用時&#xff0c;它會作為主路由器使用虛擬 IP 地址&#xff1b;當它不可用時&#xff0c;172.16.42.121 會接管虛擬 IP 地址&#xff0…

21、DeepSeekMath論文筆記(GRPO)

DeepSeekMath論文筆記 0、研究背景與目標1、GRPO結構GRPO結構PPO知識點**1. PPO的網絡模型結構****2. GAE&#xff08;廣義優勢估計&#xff09;原理****1. 優勢函數的定義**2.GAE&#xff08;廣義優勢估計&#xff09; 2、關鍵技術與方法3、核心實驗結果4、結論與未來方向關鍵…

卡爾曼濾波算法(C語言)

此處感謝華南虎和互聯網的眾多大佬的無償分享。 入門常識 先簡單了解以下概念&#xff1a;疊加性&#xff0c;齊次性。 用大白話講&#xff0c;疊加性&#xff1a;多個輸入對輸出有影響。齊次性&#xff1a;輸入放大多少倍&#xff0c;輸出也跟著放大多少倍 卡爾曼濾波符合這…

SolidWork-2023 鼠標工程

地址 https://github.com/MartinxMax/SW2023-Project/tree/main/mouse 鼠標

vue 組件函數式調用實戰:以身份驗證彈窗為例

通常我們在 Vue 中使用組件&#xff0c;是像這樣在模板中寫標簽&#xff1a; <MyComponent :prop"value" event"handleEvent" />而函數式調用&#xff0c;則是讓我們像調用一個普通 JavaScript 函數一樣來使用這個組件&#xff0c;例如&#xff1a;…

Vite Proxy配置詳解:從入門到實戰應用

Vite Proxy配置詳解&#xff1a;從入門到實戰應用 一、什么是Proxy代理&#xff1f; Proxy&#xff08;代理&#xff09;是開發中常用的解決跨域問題的方案。Vite內置了基于http-proxy的代理功能&#xff0c;可以輕松配置API請求轉發。 二、基礎配置 在vite.config.js中配置…

圖像畫質算法記錄(前言)

一、背景介紹 本篇主要是對圖像畫質增強相關&#xff0c;進行簡單整理和記錄。 二、整體流程 整體效果主要受到兩部分影響&#xff1a; 1、前端isp處理。 2、后端畫質增強。 三、isp常規流程 可以參考&#xff1a;劉斯寧&#xff1a;Understanding ISP Pipeline 四、后端畫質…

Qt 中信號與槽(signal-slot)機制支持 多種連接方式(ConnectionType)

Qt 中信號與槽&#xff08;signal-slot&#xff09;機制支持 多種連接方式&#xff08;ConnectionType&#xff09; Qt 中信號與槽&#xff08;signal-slot&#xff09;機制支持 多種連接方式&#xff08;ConnectionType&#xff09;&#xff0c;用于控制信號發出后如何調用槽…

卷積神經網絡實戰(4)代碼詳解

目錄 一、導包 二、數據準備 1.數據集 2. 標準化轉換(Normalize) 3.設置dataloader 三、定義模型 四、可視化計算圖&#xff08;不重要&#xff09; 五、評估函數 六、Tensorboard 一、導包 import matplotlib as mpl import matplotlib.pyplot as plt %matplotlib i…

深入解析進程地址空間:從虛擬到物理的奇妙之旅

深入解析進程地址空間&#xff1a;從虛擬到物理的奇妙之旅 前言 各位小伙伴&#xff0c;還記得我們之前探討的 fork 函數嗎&#xff1f;當它返回兩次時&#xff0c;父子進程中同名變量卻擁有不同值的現象&#xff0c;曾讓我們驚嘆于進程獨立性與寫時拷貝的精妙設計。但你是否…

opencv處理圖像(二)

接下來進入到程序線程設計部分 我們主線程負責圖形渲染等操作&#xff0c;OpenGL的限制&#xff0c;opencv技術對傳入圖像加以處理&#xff0c;輸出預期圖像給主線程 QThread 我之前也是在想給opencv開一個專門的線程&#xff0c;但經過了解有幾個弊端&#xff0c;第一資源浪…

學習threejs,使用Physijs物理引擎

&#x1f468;??? 主頁&#xff1a; gis分享者 &#x1f468;??? 感謝各位大佬 點贊&#x1f44d; 收藏? 留言&#x1f4dd; 加關注?! &#x1f468;??? 收錄于專欄&#xff1a;threejs gis工程師 文章目錄 一、&#x1f340;前言1.1 ??Physijs 物理引擎1.1.1 ??…

ARCGIS PRO DSK 選擇坐標系控件(CoordinateSystemsControl )的調用

在WPF窗體上使用 xml&#xff1a;加入空間命名引用 xmlns:mapping"clr-namespace:ArcGIS.Desktop.Mapping.Controls;assemblyArcGIS.Desktop.Mapping" 在控件區域加入&#xff1a; <mapping:CoordinateSystemsControl x:Name"CoordinateSystemsControl&q…

LangGraph(三)——添加記憶

目錄 1. 創建MemorySaver檢查指針2. 構建并編譯Graph3. 與聊天機器人互動4. 問一個后續問題5. 檢查State參考 1. 創建MemorySaver檢查指針 創建MemorySaver檢查指針&#xff1a; from langgraph.checkpoint.memory import MemorySavermemory MemorySaver()這是位于內存中的檢…

深入理解Mysql

BufferPool和Changebuffer是如何加快讀寫速度的? BufferPool 在Mysql啟動的時候 Mysql會申請連續的空間來存儲BufferPool 每個頁16kb 當控制塊不足以存儲信息的時候就會向后申請一個新的頁 每個控制塊都對應了一個緩存頁 控制塊占chunk的百分之5左右 LRU鏈表 Changebuffer …

Python核心編程深度解析:作用域、遞歸與匿名函數的工程實踐

引言 Python作為現代編程語言的代表&#xff0c;其作用域管理、遞歸算法和匿名函數機制是構建高質量代碼的核心要素。本文基于Python 3.11環境&#xff0c;結合工業級開發實踐&#xff0c;深入探討變量作用域的內在邏輯、遞歸算法的優化策略以及匿名函數的高效應用&#xff0c…

《用MATLAB玩轉游戲開發》貪吃蛇的百變玩法:從命令行到AI對戰

《用MATLAB玩轉游戲開發&#xff1a;從零開始打造你的數字樂園》基礎篇&#xff08;2D圖形交互&#xff09;-&#x1f40d; 貪吃蛇的百變玩法&#xff1a;從命令行到AI對戰 &#x1f3ae; 歡迎來到這篇MATLAB貪吃蛇編程全攻略&#xff01;本文將帶你從零開始&#xff0c;一步步…

Android平臺FFmpeg音視頻開發深度指南

一、FFmpeg在Android開發中的核心價值 FFmpeg作為業界領先的多媒體處理框架&#xff0c;在Android音視頻開發中扮演著至關重要的角色。它提供了&#xff1a; 跨平臺支持&#xff1a;統一的API處理各種音視頻格式完整功能鏈&#xff1a;從解碼、編碼到濾鏡處理的全套解決方案靈…