Oracle数据恢复:ORA-600 4097错误解决案例一则

« Oracle队列等待: enq: IS - contention 阻塞启动 | Blog首页 | Oracle等待事件: resmgr:cpu quantum引发CPU冲高 »

在客户数据库恢复过程中,遇到ORA-600 4097号错误,这个错误是比较少见的.

在Oracle进行延迟块清除时,会去查询回滚段头,以确认事务状态,然而,由于一些损坏或强制性恢复手段的采取,获得的回滚段信息来自与未来,超前于当前数据库状态.这样就出现了4097号错误.

这个错误明确的由于事务清除和回滚段问题导致.

在观察以下错误时,我们注意到一个重要的信息:

Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

具体内容如下:

Tue Jul 12 09:26:09 2011
Errors in file /oracle/socl/admin/bdump/socl_smon_909326.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 12 09:26:10 2011
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

Tue Jul 12 09:31:12 2011
Errors in file /oracle/socl/admin/bdump/socl_smon_909326.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 12 09:31:13 2011
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 2 out of maximum 100 non-fatal internal errors.

我们知道数据库中scn->time的映射关系是由smon_scn_time表来维护的:

rem
rem create the scn<->time tracking table that smon will maintain
rem as a circular queue - notice that we populate the entire
rem table with at least 144000 entries (enough for 5 days).
rem
rem -"thread" is for backward compatibility and is always 0
rem -"orig_thread" is for upgrade/downgrade
rem - scn_wrp, scn_bas, and time_dp are for backward compatibility
rem and not queried by the ktf layer.
rem

create cluster smon_scn_to_time (
thread number /* thread, compatibility */
)
/
create index smon_scn_to_time_idx on cluster smon_scn_to_time
/
create table smon_scn_time (
thread number, /* thread, compatibility */
time_mp number, /* time this recent scn represents */
time_dp date, /* time as date, compatibility */
scn_wrp number, /* scn.wrp, compatibility */
scn_bas number, /* scn.bas, compatibility */
num_mappings number,
tim_scn_map raw(1200),
scn number default 0, /* scn */
orig_thread number default 0 /* for downgrade */
) cluster smon_scn_to_time (thread)
/

create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp)
/

create unique index smon_scn_time_scn_idx on smon_scn_time(scn)
/

这其中的数据并不重要,可以安全清除.随后数据库会继续自动向该表记录映射信息.

不能直接truncate 表 smon_scn_time :

soclrac01:oracle:[/oracle/socl/admin/bdump]#sqlplus "/ as sysdba"

SQL*Plus: Release 10.2.0.4.0 - Production on Tue Jul 12 09:41:09 2011

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select count(*) from smon_scn_time;

COUNT(*)
----------
1743

SQL> truncate table smon_scn_time;
truncate table smon_scn_time
*
ERROR at line 1:
ORA-03292: Table to be truncated is part of a cluster

ORA-03292提示该表是Cluster的一部分.

我们尝试删除数据,发现存在错误,索引和数据不一致:

SQL> delete from smon_scn_time ;
delete from smon_scn_time
*
ERROR at line 1:
ORA-08102: index key not found, obj# 577, file 1, block 69410 (2)

直接truncate cluster即可清除数据:

SQL> truncate cluster smon_scn_to_time;

Cluster truncated.

SQL> select count(*) from smon_scn_time;

COUNT(*)
----------
0

SQL>

重新启动数据库后,数据库一切恢复正常,不存在任何告警和错误信息了:

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1224736768 bytes
Fixed Size 2083664 bytes
Variable Size 318768304 bytes
Database Buffers 889192448 bytes
Redo Buffers 14692352 bytes
Database mounted.
Database opened.
SQL> select count(*) from smon_scn_time;

COUNT(*)
----------
1

至此这个数据库恢复圆满解决了.

历史上的今天...
>> 2008-07-19文章:

Google Book上的Oracle图书

>> 2006-07-19文章:

Oracle10gR2 ORA-3136 错误解决

Oracle10g v$database视图SCN增强

>> 2005-07-19文章:

瑞典游记-正章-第一个周末（游览市中心）

By eygle on 2011-07-19 08:48 | Comments (0) | Backup&Recovery | 2838 |