
Why the Heartbeat? Oracle Heartbeat Research, Part 2

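In the previous article (Why the Heartbeat? Oracle Heartbeat Research), I briefly introduced the heartbeat mechanism. Now let's dig a little further.

First, start the database in the MOUNT state: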

[oracle@jumper bdump]$ sqlplus '/ as sysdba'
SQL*Plus: Release 9.2.0.4.0 - Production on Tue Jan 24 14:10:02 2006
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area   97588504 bytes
Fixed Size                   451864 bytes
Variable Size              33554432 bytes
Database Buffers           62914560 bytes
Redo Buffers                 667648 bytes
Database mounted.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production 

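Dump the control file twice in the MOUNT state and compare the two dumps. The heartbeat keeps advancing: here it increased by 3 in a little under 10 seconds, which is consistent with one update about every 3 seconds: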

[oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Tue Jan 24 14:10:33 2006
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
SQL> alter session set events 'immediate trace name CONTROLF level 10' ;
Session altered.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
[oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Tue Jan 24 14:10:46 2006
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
SQL> alter session set events 'immediate trace name CONTROLF level 10' ;
Session altered.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
[oracle@jumper udump]$ ls
conner_ora_31841.trc  conner_ora_31846.trc
[oracle@jumper udump]$ diff conner_ora_31841.trc conner_ora_31846.trc
....
16c16
< *** SESSION ID:(9.5) 2006-01-24 14:10:39.076
---
> *** SESSION ID:(9.7) 2006-01-24 14:10:48.822
63c63
< heartbeat: 580556236 mount id: 3192501183
---
> heartbeat: 580556239 mount id: 3192501183 

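This shows that the heartbeat is not updated only while the database is open. It also suggests that the heartbeat is used to detect whether an instance still has the database mounted.

Q: The mount id here presumably identifies the instance and is different on every mount. Where does this id come from? It is probably system-related, but I have not found its exact meaning. If anyone knows, please tell me.

I verified this because Steve's site ( www.ixora.com.au ) contains the following statement: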
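
The arithmetic behind the diff above can be checked with a short script. This is only a rough sketch in Python: the function names `parse_dump` and `seconds_per_tick` are made up for illustration, and it assumes the `*** SESSION ID` timestamp is close enough to the moment the control file was dumped. With only three ticks between the dumps, the result is a coarse estimate, not a measurement of the exact interval.

```python
import re
from datetime import datetime

# Patterns for the two trace lines that differ between the CONTROLF dumps.
STAMP_RE = re.compile(
    r"\*\*\* SESSION ID:\(\d+\.\d+\) (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
)
HEARTBEAT_RE = re.compile(r"heartbeat:\s*(\d+)\s+mount id:\s*(\d+)")


def parse_dump(text):
    """Return (timestamp, heartbeat, mount_id) from the text of one trace dump."""
    stamp = STAMP_RE.search(text)
    beat = HEARTBEAT_RE.search(text)
    if stamp is None or beat is None:
        raise ValueError("trace text lacks a SESSION ID or heartbeat line")
    when = datetime.strptime(stamp.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return when, int(beat.group(1)), int(beat.group(2))


def seconds_per_tick(first, second):
    """Average seconds between heartbeat increments across two dumps."""
    t1, hb1, mount1 = parse_dump(first)
    t2, hb2, mount2 = parse_dump(second)
    if mount1 != mount2:
        raise ValueError("dumps come from different mounts")
    if hb2 == hb1:
        raise ValueError("heartbeat did not advance between dumps")
    return (t2 - t1).total_seconds() / (hb2 - hb1)


if __name__ == "__main__":
    # The differing lines from the two trace files shown above.
    older = ("*** SESSION ID:(9.5) 2006-01-24 14:10:39.076\n"
             "heartbeat: 580556236 mount id: 3192501183\n")
    newer = ("*** SESSION ID:(9.7) 2006-01-24 14:10:48.822\n"
             "heartbeat: 580556239 mount id: 3192501183\n")
    print(round(seconds_per_tick(older, newer), 2))
```

For the two dumps above this works out to 9.746 seconds over 3 ticks, a little more than 3 seconds per tick.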

The checkpoint RBA is copied into the checkpoint progress record of the controlfile by the checkpoint heartbeat once every 3 seconds.

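This statement needs to be reinterpreted for different database versions.

We can also see that, in the MOUNT state, the database holds the following two locks to track the state of the instance: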

SQL> select * from v$lock;
ADDR     KADDR           SID TYPE        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
-------- -------- ---------- ---- ---------- ---------- ---------- ---------- ---------- ----------
562E35F4 562E3604          3 RT            1          0          6          0       1616          0
562E34C4 562E34D4          4 XR            4          0          1          0       1619          0 

The RT lock is the redo thread global enqueue, held by the LGWR process.

Oracle describes the XR lock as:

"acquired for ALTER SYSTEM QUIESCE RESTRICTED command (or alter database open) in RAC mode."

Here it is held by the CKPT process.

With the database in the MOUNT state, we can also watch the heartbeat change by querying X$KCCCP:

[oracle@jumper oracle]$ sqlplus '/ as sysdba'
SQL*Plus: Release 9.2.0.4.0 - Production on Wed Jan 25 00:08:06 2006
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
SQL> select cphbt from X$KCCCP;
     CPHBT
----------
 580567934
SQL> /
     CPHBT
----------
 580567935
SQL> /   
     CPHBT
----------
 580567936
SQL> select open_mode from v$database;
OPEN_MODE
----------
MOUNTED
SQL>  
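
Re-running the query by hand with `/` gives heartbeat values but no timing. One way to pair each reading with a clock time is a small poller, sketched below in Python. Everything here is an assumption for illustration: `sample_heartbeat` and `average_tick_seconds` are hypothetical helpers, and `fetch` stands for whatever callable the reader supplies to run `select cphbt from X$KCCCP` and return the number. The demo replays the three CPHBT values from the session above with an invented 3-second spacing and a fake clock, so it demonstrates the arithmetic only and measures nothing.

```python
import time


def sample_heartbeat(fetch, count, interval, sleep=time.sleep, clock=time.monotonic):
    """Call fetch() `count` times, `interval` seconds apart.

    Returns a list of (seconds_since_first_sample, value) pairs.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    start = clock()
    samples = []
    for i in range(count):
        if i:
            sleep(interval)  # wait between samples, not before the first
        samples.append((clock() - start, int(fetch())))
    return samples


def average_tick_seconds(samples):
    """Seconds per heartbeat increment between the first and last sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if v1 <= v0:
        raise ValueError("heartbeat did not advance during sampling")
    return (t1 - t0) / (v1 - v0)


if __name__ == "__main__":
    # Stand-in for the real query: replays the CPHBT values seen above.
    readings = iter([580567934, 580567935, 580567936])
    fake_now = [0.0]  # fake clock, so the demo does not actually sleep
    samples = sample_heartbeat(
        fetch=lambda: next(readings),
        count=3,
        interval=3.0,  # assumed spacing, not taken from the session
        sleep=lambda s: fake_now.__setitem__(0, fake_now[0] + s),
        clock=lambda: fake_now[0],
    )
    print(samples)
    print(average_tick_seconds(samples))
```

The clock and sleep functions are injectable so the helper can be exercised without a database or real delays. Against a real instance the defaults (`time.sleep`, `time.monotonic`) would be used.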

 



By eygle on 2006-01-27 21:04 | Comments (5) | Internal | 659 |

5 Comments

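Hello, Master eygle,
Your test above was done in the mount state, where only the heartbeat value changes between control file dumps, which is expected. In the open state, however, if the buffer cache holds many dirty buffers waiting to be written to the data files, dumping the control file again after an interval may show more than just a change in the heartbeat: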

13c13
Unix process pid: 16613, image: oracle@www (TNS V1-V3)
15,16c15
*** SESSION ID:(13.19) 2006-09-18 17:22:38.334
59,61c58,60
THREAD #1 - status:0x2 flags:0x0 dirty:226
> low cache rba:(0x1.362b.0) on disk rba:(0x1.1ca06.0)
> on disk scn: 0x0000.0158ee4b 09/18/2006 17:22:31
63c62
heartbeat: 601429873 mount id: 76487130

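As shown above, both the on disk scn and the heartbeat changed, so I don't think one can say that the 3-second heartbeat updates only the "heartbeat". Also, the interval between your test dumps was not 3 seconds!

Just a different point of view. Please correct me if I'm wrong. Thanks!

>'Also, the interval between your test dumps was not 3 seconds!'

How do you know it isn't?
Trace the background process and you will see the timing.

>'As shown above, both the on disk scn and the heartbeat changed, so I don't think one can say that the 3-second heartbeat updates only the "heartbeat"'

The heartbeat keeps ticking, so it is not at all surprising that other data changes in the meantime.

See also the link I gave earlier: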

http://www.eygle.com/archives/2006/01/why_oracle_heartbeat.html

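Could we put it this way: the heartbeat and the updates to low cache rba, on disk RBA, on disk scn and similar fields in the control file are all CKPT's job. On each 3-second timeout, if none of these RBAs and SCNs needs updating, CKPT records only the heartbeat.

X$KCCRT looks like a typo here. Shouldn't it be X$KCCCP?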

