Recovering a database after SCSI read/write errors left the file system read-only

With the holiday just around the corner, a customer's database ran into trouble.

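Two instances had terminated abnormally, and the file system had become read-only, so even a SYSDBA login failed because no audit trail file could be created: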
PCLERPDB2:[10g]:/DBMS/PCMK/admin/PCMK> sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Jan 19 09:08:05 2012

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

ERROR:
ORA-09925: Unable to create audit trail file
Linux Error: 30: Read-only file system
Additional information: 9925
ORA-01075: you are currently logged on
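
One way to confirm which file systems have gone read-only is to inspect the mount options in /proc/mounts. The following is a minimal Python sketch and was not part of the original incident handling; the function name read_only_mounts is hypothetical. It only reports mounts whose option list contains ro, and some pseudo file systems may be read-only by design, so the result needs to be read with that in mind.

```python
import os


def read_only_mounts(mounts_text):
    """Return the mount points whose options include 'ro'.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


if __name__ == "__main__":
    # /proc/mounts exists on Linux only; do nothing elsewhere.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            for mount_point in read_only_mounts(f.read()):
                print(mount_point)
```

If the mount point that holds the Oracle admin and audit directories shows up in the output, that would be consistent with the ORA-09925 / "Read-only file system" error above.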
The system log showed that SCSI I/O errors had occurred earlier that morning:
Jan 19 07:56:00 PCLERPDB2 kernel: SCSI error : <0 0 0 1> return code = 0x10000
Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sda, sector 26480696
Jan 19 07:56:00 PCLERPDB2 kernel: Buffer I/O error on device sda1, logical block 3310083
Jan 19 07:56:00 PCLERPDB2 kernel: lost page write due to I/O error on sda1
Jan 19 07:56:00 PCLERPDB2 kernel: SCSI error : <0 0 0 9> return code = 0x10000
Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sdh, sector 60052680
Jan 19 07:56:00 PCLERPDB2 kernel: SCSI error : <0 0 0 4> return code = 0x10000
Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sdc, sector 20042688
Jan 19 07:56:00 PCLERPDB2 kernel: SCSI error : <0 0 0 9> return code = 0x10000
Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sdh, sector 26747408
Jan 19 07:56:00 PCLERPDB2 kernel: Buffer I/O error on device sdh2, logical block 843074
Jan 19 07:56:00 PCLERPDB2 kernel: lost page write due to I/O error on sdh2
Jan 19 07:56:00 PCLERPDB2 kernel: SCSI error : <0 0 0 1> return code = 0x10000
Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sda, sector 32606944
Jan 19 07:56:00 PCLERPDB2 kernel: Buffer I/O error on device sda1, logical block 4075864
The database then crashed.
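
To see at a glance which block devices were hit, the "end_request: I/O error" lines in the kernel log can be tallied per device. This is a small Python sketch under the assumption that the log uses exactly the message format shown in the excerpt above; the function name io_errors_by_device is hypothetical.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   end_request: I/O error, dev sda, sector 26480696
IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def io_errors_by_device(log_text):
    """Count 'end_request: I/O error' lines per block device."""
    counts = Counter()
    for match in IO_ERROR.finditer(log_text):
        counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the log excerpt above.
    sample = (
        "Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sda, sector 26480696\n"
        "Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sdh, sector 60052680\n"
        "Jan 19 07:56:00 PCLERPDB2 kernel: end_request: I/O error, dev sdc, sector 20042688\n"
    )
    for device, count in sorted(io_errors_by_device(sample).items()):
        print(device, count)
```

In the excerpt the errors are spread over sda, sdh and sdc within the same second, which hints at a problem shared by several disks (for example a controller or path) rather than one failing drive, although the log alone does not prove that.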

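We had the customer reboot the database host to find out whether this was a transient (soft) hardware fault.
Fortunately, after the reboot the database completed crash recovery and opened normally: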
Thu Jan 19 09:55:09 2012
Completed redo application
Thu Jan 19 09:55:09 2012
Completed crash recovery at
 Thread 1: logseq 18735, block 5214, scn 5965501404211
 59 data blocks read, 59 data blocks written, 609 redo blocks read
Thu Jan 19 09:55:09 2012
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=23, OS id=14599
Thu Jan 19 09:55:09 2012
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=24, OS id=14601
Thu Jan 19 09:55:09 2012
Thread 1 advanced to log sequence 18736
Thread 1 opened at log sequence 18736
  Current log# 3 seq# 18736 mem# 0: /DBMS/DCERP/dcerpdata/log03a.dbf
  Current log# 3 seq# 18736 mem# 1: /DBMS/DCERP/dcerpdata/log03b.dbf
Successful open of redo thread 1
Thu Jan 19 09:55:09 2012
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Jan 19 09:55:09 2012
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Thu Jan 19 09:55:09 2012
ARC1: Becoming the heartbeat ARCH
Thu Jan 19 09:55:09 2012
SMON: enabling cache recovery
Thu Jan 19 09:55:11 2012
Successfully onlined Undo Tablespace 368.
Thu Jan 19 09:55:11 2012
SMON: enabling tx recovery
Thu Jan 19 09:55:11 2012
Database Characterset is UTF8
Thu Jan 19 09:55:11 2012
Incremental checkpoint up to RBA [0x4930.3.0], current log tail at RBA [0x4930.43.0]
Thu Jan 19 09:55:11 2012
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=25, OS id=14626
Thu Jan 19 09:55:25 2012
Completed: ALTER DATABASE OPEN
Thu Jan 19 10:15:13 2012
Incremental checkpoint up to RBA [0x4930.100d.0], current log tail at RBA [0x4930.107d.0]
Thu Jan 19 10:35:16 2012
Incremental checkpoint up to RBA [0x4930.150f.0], current log tail at RBA [0x4930.155b.0]
Thu Jan 19 10:55:17 2012
Incremental checkpoint up to RBA [0x4930.1724.0], current log tail at RBA [0x4930.175a.0]
Thu Jan 19 11:15:18 2012
Incremental checkpoint up to RBA [0x4930.1edf.0], current log tail at RBA [0x4930.1f13.0]
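
The RBA (redo byte address) values in the alert log above are written as hexadecimal sequence.block.offset triples. As a rough Python sketch (the function name parse_rbas is hypothetical), decoding them shows that 0x4930 is 18736, the same log sequence the thread was opened at, and how far the incremental checkpoint trails the log tail.

```python
import re

# Alert log format: RBA [0x<sequence>.<block>.<offset>], all fields in hex.
RBA = re.compile(r"RBA \[0x([0-9a-f]+)\.([0-9a-f]+)\.([0-9a-f]+)\]", re.IGNORECASE)


def parse_rbas(line):
    """Return every RBA on the line as (log sequence, block, byte offset)."""
    return [tuple(int(part, 16) for part in m.groups()) for m in RBA.finditer(line)]


if __name__ == "__main__":
    # Line copied from the alert log above (10:15:13 entry).
    line = ("Incremental checkpoint up to RBA [0x4930.100d.0], "
            "current log tail at RBA [0x4930.107d.0]")
    checkpoint, tail = parse_rbas(line)
    print("checkpoint:", checkpoint)
    print("log tail:  ", tail)
    if checkpoint[0] == tail[0]:
        # Same log sequence, so the block numbers can be compared directly.
        print("redo blocks between checkpoint and tail:", tail[1] - checkpoint[1])
```

A small, steady gap between the checkpoint and the log tail, as in these entries, is what a normally running instance is expected to show after recovery.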

The hardware has probably reached the end of its service life and is due for replacement.





By eygle on 2012-01-19 09:18 | Comments (0) | Case | 2947 |

