A Case of Redo Corruption Caused by a Disk I/O Failure

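A few days ago, the disk under one of our databases developed a fault. After it was reformatted it worked normally again, but today the same disk failed once more.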

This time the damage hit the redo logs. The database alert log reports redo-related errors:

Mon Nov 13 11:42:54 2006
Errors in file /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc:
ORA-00333: redo log read error block 186498 count 6144
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186497
Mon Nov 13 11:42:58 2006
Errors in file /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc:
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
Mon Nov 13 11:43:03 2006
Errors in file /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc:
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498

The corresponding trace file records similar errors:


[oracle@gdmstest bdump]$ cat /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc
/opt/oracle/admin/mydb/udump/mydb_ora_16682.trc
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0
System name: Linux
Node name: gdmstest.hurray.com.cn
Release: 2.4.21-15.EL
Version: #1 Thu Apr 22 00:27:41 EDT 2004
Machine: i686
Instance name: mydb
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 16682, image: oracle@gdmstest.hurray.com.cn (TNS V1-V3)

*** SESSION ID:(9.3) 2006-11-13 11:41:23.555
Thread checkpoint rba:0x00001d.00000002.0010 scn:0x0000.000f94cd
On-disk rba:0x00001d.0002dc60.0000 scn:0x0000.000f9b4e
Use incremental checkpoint cache-low RBA
Thread 1 recovery from rba:0x00001d.00029082.0000 scn:0x0000.00000000
*** 2006-11-13 11:42:54.830
ORA-00333: redo log read error block 186498 count 6144
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186497
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
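
The alert log and the trace file repeat the same ORA-00333/ORA-00312 pairs many times, so it can help to collapse them into a list of distinct failed reads. The following is only a minimal sketch, not part of the original diagnosis: the function name `summarize_redo_read_errors` and the `SAMPLE` text are hypothetical, and the regular expressions assume the message layout shown above.

```python
import re

# Patterns assume the message layout seen in the alert log above.
ORA_00333 = re.compile(r"ORA-00333: redo log read error block (\d+) count (\d+)")
ORA_00312 = re.compile(r"ORA-00312: online log (\d+) thread (\d+): '([^']+)'")


def summarize_redo_read_errors(log_text):
    """Pair each ORA-00333 with the ORA-00312 that follows it.

    Returns the distinct (log_file, first_block, block_count) tuples
    in the order they first appear.
    """
    seen = []
    pending = None  # (block, count) from the most recent ORA-00333
    for line in log_text.splitlines():
        match = ORA_00333.search(line)
        if match:
            pending = (int(match.group(1)), int(match.group(2)))
            continue
        match = ORA_00312.search(line)
        if match and pending is not None:
            entry = (match.group(3), pending[0], pending[1])
            if entry not in seen:
                seen.append(entry)
            pending = None
    return seen


# Hypothetical sample, abbreviated from the trace file above.
SAMPLE = """\
ORA-00333: redo log read error block 186498 count 6144
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27072: skgfdisp: I/O error
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
"""

if __name__ == "__main__":
    for log_file, block, count in summarize_redo_read_errors(SAMPLE):
        print(f"{log_file}: read of {count} blocks starting at block {block} failed")
```

On the messages in this case, every failed read points at the same file, redo02.log, which is consistent with a single damaged area on the disk rather than a wider problem.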

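Checking the system messages shows that the failing sector is the same one as last time (sector=14266880). This looks like genuine physical damage, so the only option is to replace the disk: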

[oracle@gdmstest bdump]$ dmesg
or=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
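
The conclusion above rests on one observation: every I/O error in dmesg names the same sector. As a rough sketch of how that check could be automated, the snippet below tallies `end_request` errors per disk and sector. The function name `count_failed_sectors` and the `SAMPLE` text are hypothetical, and the regular expression assumes the 2.4-kernel message format shown above; other kernels word these messages differently.

```python
import re
from collections import Counter

# Assumes 2.4-kernel style lines such as:
#   end_request: I/O error, dev 03:06 (hda), sector 14266880
END_REQUEST = re.compile(
    r"end_request: I/O error, dev \S+ \((?P<disk>\w+)\), sector (?P<sector>\d+)"
)


def count_failed_sectors(dmesg_text):
    """Return a Counter mapping (disk, sector) to the number of I/O errors logged."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = END_REQUEST.search(line)
        if match:
            counts[(match.group("disk"), int(match.group("sector")))] += 1
    return counts


# Hypothetical sample, abbreviated from the dmesg output above.
SAMPLE = """\
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
"""

if __name__ == "__main__":
    for (disk, sector), hits in count_failed_sectors(SAMPLE).most_common():
        print(f"{disk} sector {sector}: {hits} I/O errors")
```

A single (disk, sector) key with a growing count, seen again after a reformat, is the pattern that suggests a physical defect rather than a filesystem problem.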

-The End-



By eygle on 2006-11-13 14:51 | Comments (7) | Case | 965 |

7 Comments

Did the database go down on its own?

Yes, it crashed:

Linux Error: 4: Interrupted system call
Additional information: 187487
LGWR: terminating instance due to error 340
Instance terminated by LGWR, pid = 15045

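So mirroring the redo logs really is useful.

Giving up a little performance pays off.

That said, with a disk array this kind of problem is generally less likely, and a database without an array is usually a less important one anyway.

Right! Safety comes first, after all!

I have not set up redo mirroring on many databases, but one of them later truly saved my neck.

Yes, sometimes it is necessary, especially for the more important databases.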

