eygle.com
 


Disk I/O Failure Causes Redo Log Corruption: A Case Study

Author: eygle | When reposting, please credit the article and author with a hyperlink and include this notice.

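A few days ago a disk on one of our database servers developed a problem. After reformatting, it returned to normal, but today the same disk failed again.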

This time the damage hit the redo logs. The database alert log reports redo-related errors:

Mon Nov 13 11:42:54 2006
Errors in file /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc:
ORA-00333: redo log read error block 186498 count 6144
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186497
Mon Nov 13 11:42:58 2006
Errors in file /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc:
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
Mon Nov 13 11:43:03 2006
Errors in file /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc:
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498

The associated trace file records similar errors:


[oracle@gdmstest bdump]$ cat /opt/oracle/admin/mydb/udump/mydb_ora_16682.trc
/opt/oracle/admin/mydb/udump/mydb_ora_16682.trc
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0
System name: Linux
Node name: gdmstest.hurray.com.cn
Release: 2.4.21-15.EL
Version: #1 Thu Apr 22 00:27:41 EDT 2004
Machine: i686
Instance name: mydb
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 16682, image: oracle@gdmstest.hurray.com.cn (TNS V1-V3)

*** SESSION ID:(9.3) 2006-11-13 11:41:23.555
Thread checkpoint rba:0x00001d.00000002.0010 scn:0x0000.000f94cd
On-disk rba:0x00001d.0002dc60.0000 scn:0x0000.000f9b4e
Use incremental checkpoint cache-low RBA
Thread 1 recovery from rba:0x00001d.00029082.0000 scn:0x0000.00000000
*** 2006-11-13 11:42:54.830
ORA-00333: redo log read error block 186498 count 6144
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186497
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
ORA-00333: redo log read error block 184450 count 8192
ORA-00312: online log 2 thread 1: '/opt/oracle/oradata/mydb/redo02.log'
ORA-27091: skgfqio: unable to queue I/O
ORA-27072: skgfdisp: I/O error
Linux Error: 2: No such file or directory
Additional information: 186498
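
The RBA values in the trace header can be decoded to see how the failing blocks relate to the range that recovery was working through. A redo byte address is conventionally written as three hexadecimal fields: log sequence, block number, and byte offset within the block. The following is a minimal sketch in Python; `parse_rba` and `block_in_recovery_range` are hypothetical helper names, and treating the span from the "recovery from" RBA to the on-disk RBA as the range recovery must read is an assumption about this trace, not something the trace states.

```python
from typing import NamedTuple


class RBA(NamedTuple):
    sequence: int  # log sequence number
    block: int     # redo block number within that log
    offset: int    # byte offset within the block


def parse_rba(text: str) -> RBA:
    """Parse an RBA such as '0x00001d.0002dc60.0000' (all fields are hex)."""
    seq_hex, block_hex, offset_hex = text.strip().split(".")
    return RBA(int(seq_hex, 16), int(block_hex, 16), int(offset_hex, 16))


def block_in_recovery_range(block: int, start: RBA, end: RBA) -> bool:
    """True if `block` lies between two RBAs of the same log sequence."""
    if start.sequence != end.sequence:
        raise ValueError("RBAs belong to different log sequences")
    return start.block <= block <= end.block


# Values copied from the trace file above.
recovery_start = parse_rba("0x00001d.00029082.0000")
on_disk = parse_rba("0x00001d.0002dc60.0000")

print(recovery_start)
print(on_disk)
# Assumption: recovery reads from recovery_start up to on_disk.
print(block_in_recovery_range(186498, recovery_start, on_disk))
```

Both addresses decode to log sequence 29, with blocks 168066 and 187488. The blocks named in the ORA-00333 messages (184450 and 186498) fall inside that span, which is consistent with recovery failing while reading redo02.log.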

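Checking the kernel messages shows that the failing sector is the same as last time (sector=14266880). This really does look like physical damage, so the only option is to replace the disk: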

[oracle@gdmstest bdump]$ dmesg
or=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
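
To confirm that every error points at the same location, the sector numbers can be pulled out of the kernel messages and counted. The following is a minimal sketch; `failing_sectors` is a hypothetical helper, and the regular expression assumes the `end_request: I/O error, dev ..., sector N` line format shown above, which other kernel versions may not use.

```python
import re
from collections import Counter

# Matches lines like:
#   end_request: I/O error, dev 03:06 (hda), sector 14266880
END_REQUEST = re.compile(
    r"end_request: I/O error, dev (?P<dev>\S+) \((?P<name>\w+)\), "
    r"sector (?P<sector>\d+)"
)


def failing_sectors(dmesg_text: str) -> Counter:
    """Count I/O errors per (device name, sector) in dmesg output."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = END_REQUEST.search(line)
        if match:
            key = (match.group("name"), int(match.group("sector")))
            counts[key] += 1
    return counts


# A short excerpt of the dmesg output above, used as sample input.
sample = """\
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=58847319, high=3, low=8515671, sector=14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
end_request: I/O error, dev 03:06 (hda), sector 14266880
"""

for (device, sector), count in failing_sectors(sample).items():
    print(f"{device} sector {sector}: {count} error(s)")
```

A single (device, sector) key with a repeated count suggests one persistently unreadable sector rather than scattered failures, which matches the conclusion drawn here.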

-The End-

By eygle on 2006-11-13 14:51 | Comments (7) | Posted to Case


    Comments (7)

    Did the database go down on its own?

    Posted by: 香猪 at November 13, 2006 5:13 PM

    Yes, it crashed:

    Linux Error: 4: Interrupted system call
    Additional information: 187487
    LGWR: terminating instance due to error 340
    Instance terminated by LGWR, pid = 15045

    Posted by: eygle at November 13, 2006 5:41 PM

    It looks like mirroring the redo logs really is worthwhile.

    Posted by: 香猪 at November 14, 2006 9:11 AM

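    Giving up a little performance pays off.

    That said, with a disk array this kind of problem is generally less likely, and a database that runs without an array tends to be a less important one anyway.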

    Posted by: eygle at November 14, 2006 9:58 AM

    Right! After all, safety comes first!

    Posted by: 香猪 at November 14, 2006 11:56 AM

    I have not set up redo mirroring on many databases, but one of them later truly saved me.

    Posted by: eygle at November 15, 2006 9:17 AM

    Yes, sometimes it is necessary, especially for the more important databases.

    Posted by: sunchao at December 19, 2007 2:24 PM


    CopyRight © 2004 eygle.com, All rights reserved.