Testing was in full swing when the database suddenly became unreachable. I logged on to the host and found the following in the database's alert_sid log:
KCF: write/open error block=0x9a6 online=1
file=2 /oracle_data1/UNDOTBS3.dbf
error=27072 txt: 'Linux Error: 5: Input/output error
Additional information: 2469'
Thu Dec  4 12:56:39 2008
Errors in file /opt/ora9/admin/tax/bdump/orcl_dbw0_9605.trc:
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
ORA-01114: IO error writing block to file 2 (block # 2470)
ORA-01110: data file 2: '/oracle_data1/UNDOTBS3.dbf'
ORA-27072: skgfdisp: I/O error
Linux Error: 5: Input/output error
Additional information: 2469
DBW0: terminating instance due to error 1242
Instance terminated by DBW0, pid = 9605
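To read an excerpt like this quickly, it helps to pull out the ORA- codes and the data file they point at. The following is only a minimal sketch: the function name `summarize_alert` and the regular expressions are my own assumptions based on the lines shown above, not an Oracle tool, and other alert log formats may need different patterns.

```python
import re

# Lines copied from the alert log excerpt above.
ALERT_EXCERPT = """\
ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
ORA-01114: IO error writing block to file 2 (block # 2470)
ORA-01110: data file 2: '/oracle_data1/UNDOTBS3.dbf'
ORA-27072: skgfdisp: I/O error
Linux Error: 5: Input/output error
"""

# Assumed patterns: "ORA-" plus five digits, and "data file N: 'path'".
ORA_LINE = re.compile(r"^(ORA-\d{5}): (.*)$")
DATAFILE = re.compile(r"data file (\d+): '([^']+)'")


def summarize_alert(text):
    """Return ([(code, message), ...], {file_number: path}) for ORA- lines."""
    errors = []
    datafiles = {}
    for line in text.splitlines():
        match = ORA_LINE.match(line.strip())
        if not match:
            continue  # skip timestamps, OS errors, and other lines
        code, message = match.group(1), match.group(2)
        errors.append((code, message))
        named_file = DATAFILE.search(message)
        if named_file:
            datafiles[int(named_file.group(1))] = named_file.group(2)
    return errors, datafiles


errors, datafiles = summarize_alert(ALERT_EXCERPT)
for code, message in errors:
    print(code, message)
print(datafiles)
```

Here the summary points at data file 2, the undo tablespace file on /oracle_data1, which is the file DBW0 could not write.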
The database was down. At first glance the cause was a disk I/O error, so the next step was to look at the host's log, /var/log/messages:
Dec  4 12:52:10 tax smartd[2924]: Device: /dev/sdb, 2 Currently unreadable (pending) sectors
Dec  4 12:52:10 tax smartd[2924]: Device: /dev/sdb, 2 Offline uncorrectable sectors
Dec  4 12:56:39 tax kernel: ata1: command 0xca timeout, stat 0xd0 host_stat 0x61
Dec  4 12:56:39 tax kernel: ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
Dec  4 12:56:39 tax kernel: ata1: status=0xd0 { Busy }
Dec  4 12:56:39 tax kernel: SCSI error : <0 0 1 0> return code = 0x8000002
Dec  4 12:56:39 tax kernel: Info fld=0x5b4b38b, Current sdb: sense key Aborted Command
Dec  4 12:56:39 tax kernel: Additional sense: Scsi parity error
Dec  4 12:56:39 tax kernel: end_request: I/O error, dev sdb, sector 95728523
Dec  4 12:56:39 tax kernel: Buffer I/O error on device sdb6, logical block 1483645
Dec  4 12:56:39 tax kernel: lost page write due to I/O error on sdb6
Dec  4 12:56:39 tax kernel: Aborting journal on device sdb6.
Dec  4 12:56:39 tax kernel: ext3_abort called.
Dec  4 12:56:39 tax kernel: EXT3-fs error (device sdb6): ext3_journal_start_sb: Detected aborted journal
Dec  4 12:56:39 tax kernel: Remounting filesystem read-only
Dec  4 12:57:09 tax kernel: ata1: command 0xca timeout, stat 0xd0 host_stat 0x61
Dec  4 12:57:09 tax kernel: ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
Dec  4 12:57:09 tax kernel: ata1: status=0xd0 { Busy }
Dec  4 12:57:09 tax kernel: SCSI error : <0 0 1 0> return code = 0x8000002
Dec  4 12:57:09 tax kernel: Info fld=0x5b4b38b, Current sdb: sense key Aborted Command
Dec  4 12:57:09 tax kernel: Additional sense: Scsi parity error
Dec  4 12:57:09 tax kernel: end_request: I/O error, dev sdb, sector 41934794
Dec  4 12:57:09 tax kernel: Buffer I/O error on device sdb3, logical block 643
Dec  4 12:57:09 tax kernel: lost page write due to I/O error on sdb3
Dec  4 12:57:44 tax kernel: ata1: command 0xca timeout, stat 0xd0 host_stat 0x61
Dec  4 12:57:44 tax kernel: ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
Dec  4 12:57:44 tax kernel: ata1: status=0xd0 { Busy }
Dec  4 12:57:44 tax kernel: SCSI error : <0 0 1 0> return code = 0x8000002
Dec  4 12:57:44 tax kernel: Info fld=0x5b4b38b, Current sdb: sense key Aborted Command
Dec  4 12:57:44 tax kernel: Additional sense: Scsi parity error
Dec  4 12:57:44 tax kernel: end_request: I/O error, dev sdb, sector 83864507
Dec  4 12:57:44 tax kernel: Buffer I/O error on device sdb6, logical block 643
Dec  4 12:57:44 tax kernel: lost page write due to I/O error on sdb6
Dec  4 12:57:44 tax sshd(pam_unix)[11222]: session opened for user oracle by (uid=0)
Dec  4 12:58:03 tax sshd(pam_unix)[11276]: session opened for user oracle by (uid=0)
Dec  4 12:59:25 tax kernel: ata1: command 0xc8 timeout, stat 0xd0 host_stat 0x61
Dec  4 12:59:25 tax kernel: ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
Dec  4 12:59:25 tax kernel: ata1: status=0xd0 { Busy }
Dec  4 12:59:25 tax kernel: SCSI error : <0 0 1 0> return code = 0x8000002
Dec  4 12:59:25 tax kernel: Info fld=0x5b4b38b, Current sdb: sense key Aborted Command
Dec  4 12:59:25 tax kernel: Additional sense: Scsi parity error
Dec  4 12:59:25 tax kernel: end_request: I/O error, dev sdb, sector 41934794
Dec  4 12:59:25 tax kernel: EXT3-fs error (device sdb3): ext3_get_inode_loc: unable to read inode block - inode=12, block=643
Dec  4 12:59:25 tax kernel: Aborting journal on device sdb3.
Dec  4 12:59:55 tax kernel: ata1: command 0xca timeout, stat 0xd0 host_stat 0x61
Dec  4 12:59:55 tax kernel: ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
Dec  4 12:59:55 tax kernel: ata1: status=0xd0 { Busy }
The operating system was logging severe I/O errors in the background.
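With this many repeated kernel messages it is easy to lose track of which disk and partitions are actually affected. Below is a rough sketch that tallies them; the function name `summarize_io_errors` and the two regular expressions are assumptions derived from the message wording in this log, and other kernel versions may phrase these lines differently.

```python
import re
from collections import Counter

# A few lines copied from the /var/log/messages excerpt above.
SAMPLE = """\
Dec  4 12:56:39 tax kernel: end_request: I/O error, dev sdb, sector 95728523
Dec  4 12:56:39 tax kernel: Buffer I/O error on device sdb6, logical block 1483645
Dec  4 12:57:09 tax kernel: end_request: I/O error, dev sdb, sector 41934794
Dec  4 12:57:09 tax kernel: Buffer I/O error on device sdb3, logical block 643
Dec  4 12:57:44 tax kernel: end_request: I/O error, dev sdb, sector 83864507
Dec  4 12:57:44 tax kernel: Buffer I/O error on device sdb6, logical block 643
Dec  4 12:59:25 tax kernel: end_request: I/O error, dev sdb, sector 41934794
"""

# Assumed message formats, taken from the log lines shown above.
END_REQUEST = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
BUFFER_ERR = re.compile(r"Buffer I/O error on device (\w+), logical block (\d+)")


def summarize_io_errors(text):
    """Return ({disk: set of failing sectors}, Counter of buffer errors per partition)."""
    disk_sectors = {}
    partition_hits = Counter()
    for line in text.splitlines():
        match = END_REQUEST.search(line)
        if match:
            disk_sectors.setdefault(match.group(1), set()).add(int(match.group(2)))
            continue
        match = BUFFER_ERR.search(line)
        if match:
            partition_hits[match.group(1)] += 1
    return disk_sectors, partition_hits


sectors, hits = summarize_io_errors(SAMPLE)
print(sectors)
print(hits)
```

On the sample lines every failing sector is on sdb, and the buffer errors land on sdb6 and sdb3, which matches the two journals the kernel aborted.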
Oddly, when I changed into one of the partitions I could not even create a file; the error said the file system was read-only.
[oracle@tax oracle_data2]$ touch aa
touch: cannot touch `aa': Read-only file system
Yet mount reports every file system as rw; all of them appear to be mounted read-write.
oracle_data1]# mount
/dev/sda5 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/sda1 on /boot type ext3 (rw)
none on /dev/shm type tmpfs (rw)
/dev/sda9 on /opt type ext2 (rw)
/dev/sdb6 on /oracle_data1 type ext3 (rw)
/dev/sdb5 on /oracle_data2 type ext3 (rw)
/dev/sdb3 on /oracle_data3 type ext3 (rw)
/dev/sdb2 on /oracle_data4 type ext3 (rw)
/dev/sdb1 on /oracle_data5 type ext3 (rw)
/dev/sda8 on /oracle_index type ext3 (rw)
/dev/sda7 on /oracle_iot type ext3 (rw)
/dev/sda6 on /oracle_tmp type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
You have new mail in /var/spool/mail/root
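A likely explanation for the mismatch: on systems of this vintage the mount command prints /etc/mtab, which is not updated when the kernel itself remounts a file system read-only after an error, while /proc/mounts reflects the kernel's current view. The sketch below filters text in /proc/mounts format for entries carrying the ro option. The function name `read_only_mounts` is my own, and the sample text is hypothetical, written for illustration rather than captured from this host.

```python
def read_only_mounts(proc_mounts_text):
    """Return [(device, mount_point)] for entries whose options include 'ro'."""
    result = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete "device mountpoint fstype options ..." line
        device, mount_point, _fstype, options = fields[:4]
        if "ro" in options.split(","):
            result.append((device, mount_point))
    return result


# Hypothetical /proc/mounts content for illustration only.
SAMPLE = """\
/dev/sda5 / ext3 rw 0 0
/dev/sdb6 /oracle_data1 ext3 ro 0 0
/dev/sdb5 /oracle_data2 ext3 rw 0 0
"""

print(read_only_mounts(SAMPLE))
```

On a live Linux host the same function can be fed the contents of /proc/mounts, for example `read_only_mounts(open("/proc/mounts").read())`.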
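Since the file system was the problem, the fix was to repair the file system. I booted into single-user mode, which means adding the single option to the boot entry at system startup.
Then I used fsck to repair the file systems. After the repair the system booted normally. The database is configured to start automatically, but it only got as far as the mount state and reported that data files needed recovery, so I ran recover database. Once the recovery completed, I opened the database directly.
Why do the disks keep having problems lately?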