eygle.com   eygle.com
eygle.com eygle
eygle.com  
 

« Exadata现场演示与锁机制-ACOUG 7月活动小记 | Blog首页 | 《循序渐进Oracle》第二版一书相关代码 »

Oracle数据恢复:SYSTEM回滚段损坏案例一则
modb.pro

前一段时间,接收到一次用户报告,用户因为断电导致了数据库故障.启动时遇到了01555的错误.

通常ORA-01555错误并不可怕,但是如果出现在SYSTEM回滚段上,则问题就严重了,因为SYSTEM回滚段无法Offline,也无法重建.以下是错误的主要信息:

Thu Jul 07 15:18:20 CST 2011
ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
Thu Jul 07 15:18:20 CST 2011
ORA-01555 caused by SQL statement below (SQL ID: 7bd391hat42zk, Query Duration=0 sec, SCN: 0x000a.79ed044d):
Thu Jul 07 15:18:20 CST 2011
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1
Thu Jul 07 15:18:20 CST 2011
Errors in file /home/oracle/oracle/admin/EDB01/udump/edb01_ora_1208.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 0 with name "SYSTEM" too small
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 1208
ORA-1092 signalled during: alter database open...

注意,以下一段SQL非常著名:
select /*+ rule */ name,file#,block#,status$,user#,undosqn,xactsqn,scnbas,scnwrp,DECODE(inst#,0,NULL,inst#),ts#,spare1 from undo$ where us#=:1

这是启动过程中对于undo$的递归读取,获得其中的回滚段信息.如果某个回滚段上存在活动事务,则该事务必须被读取回滚,以便保证事务的一致性.

以下是Oracle 9i的SYSTEM回滚段空间分配,通常这些数据块损坏会非常复杂:
SQL> select segment_name,block_id,blocks from dba_extents where segment_name='SYSTEM';

SEGMENT_NAME                     BLOCK_ID     BLOCKS
------------------------------ ---------- ----------
SYSTEM                                  9          8
SYSTEM                                 17          8
SYSTEM                                385          8
SYSTEM                                393          8
SYSTEM                                401          8
SYSTEM                                409          8

对于SYSTEM回滚段,其为Oracle数据库第一个创建的回滚段,主要用于数据库的内部事务或SYS的事务信息记录。如果数据库创建了其他用户的回滚段,则SYSTEM回滚段将近用于UNDO$的信息记录,这也是为什么在出现问题时,我们看到的是在undo$读取时抛出的异常。
在sql.bsq文件中,记录了数据库创建第一个步骤中的SYSTEM回滚段信息:
create tablespace SYSTEM datafile "D_DBFN" 
  "D_DSTG" online
/
create rollback segment SYSTEM tablespace SYSTEM
  storage (initial 50K next 50K)
/

系统回滚段的作用如下:
When a database is first created using the CREATE DATABASE command, only a single rollback segment is created.  
This is the system rollback segment and it is created in the system tablespace.  

The system rollback segment has one basic difference from any other rollback segment, including any other rollback segments that are created in the system tablespace.  
This difference is that the system rollback segment can only be used for transactions that occur on objects inside the system tablespace.  
This is done because the main purpose of the system rollback segment is to handle rollback for DDL transactions - that is transactions against the data dictionary tables themselves.  Making the system rollback usable only for the system tablespace was simply an easy way to enforce that.  

It is possible for the system rollback segment to be used for non-data dictionary tables, but only if those tables are created inside the system tablespace (which is very bad development practice).

Steve对此的重要注解
When other rollback segments are available, the SYSTEM rollback segment is only used for the changes to UNDO$ associated with bringing other rollback segments online, or taking them offline.

至于SYSTEM回滚段损坏,你最好有备份,否则就只能通过BBED去修改相关的数据块,在《Oracle DBA手记 2》上曾经有此案例。



历史上的今天...
    >> 2007-08-05文章:
    >> 2005-08-05文章:
           10个想法

By eygle on 2011-08-05 14:14 | Comments (1) | Backup&Recovery | 2847 |

1 Comment

大师,我觉的解决该问题的核心思想应该是增大这个数据库的scn或者减小system回滚段对应这条递归SQL的Itl槽的commit scn来解决该问题。前者比后者处理起来简单一些。当然我还没有进行测试,如果思路上有问题的话,请大师指正。


CopyRight © 2004~2020 云和恩墨,成就未来!, All rights reserved.
数据恢复·紧急救援·性能优化 云和恩墨 24x7 热线电话:400-600-8755 业务咨询:010-59007017-7040 or 7037 业务合作: marketing@enmotech.com