Digest Net: March 2009 Archives

March 2009 Archives

Original source: http://www.osxcn.com/ubuntu/ext3-and-reiserfs.html

This article is follow-up reading to "Choosing Partitions and File Systems for Ubuntu" and is aimed at beginners.

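Linux has many file systems, such as ext3, ReiserFS, XFS, and JFS, but desktop users mostly use ext3 and ReiserFS. As far as I know, the one advantage unique to ext3 is ease of conversion: it is easy to convert between ext2 and ext3, which gives it good compatibility. ReiserFS has all of ext3's other advantages and does them better. Its efficient use of disk space and its distinctive lookup method are things ext3 lacks, and ext3 cannot match ReiserFS or XFS for speed. In practical use ReiserFS is also safer and more efficient, and its undelete capability is said to be good as well.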

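Before comparing ext3 and ReiserFS, it helps to understand journaling file systems. A journaling file system adds a log of file system changes to a non-journaling file system, so it can track changes and write them to the journal. A write operation first updates the journal; if the whole write is interrupted for some reason (such as a power failure), then on restart the system uses the journal to recover the write that was in progress, and this recovery takes very little time. Both ext3 and ReiserFS are journaling file systems of this kind.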
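The journaling idea can be sketched as a toy in Python. This is only an illustration under simplifying assumptions: the class `ToyJournal`, its transaction ids, and the commit marker are hypothetical and do not reflect the actual on-disk formats of ext3 or ReiserFS.

```python
class ToyJournal:
    """Toy write-ahead journal (hypothetical, for illustration only)."""

    def __init__(self):
        self.journal = []  # records written before the main data is touched
        self.data = {}     # stands in for the main file system state

    def write(self, txn_id, key, value):
        # A write goes to the journal first.
        self.journal.append((txn_id, "SET", key, value))

    def commit(self, txn_id):
        # A commit marker says the transaction was fully journaled.
        self.journal.append((txn_id, "COMMIT", None, None))

    def recover(self):
        # On restart, replay only transactions that reached their commit marker.
        committed = {txn for txn, op, _, _ in self.journal if op == "COMMIT"}
        for txn, op, key, value in self.journal:
            if op == "SET" and txn in committed:
                self.data[key] = value
        self.journal.clear()
        return self.data


fs = ToyJournal()
fs.write(1, "a.txt", "hello")
fs.commit(1)
fs.write(2, "b.txt", "partial")  # simulated power failure before commit
print(fs.recover())
```

Recovery only scans the journal rather than the whole disk, which is why it can be fast.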

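ext3 and ReiserFS are the default file systems of Red Hat and SuSE Linux respectively. The advantage of ReiserFS is that it is built on the B*Tree, an efficient fast balanced-tree algorithm; for example, it handles files smaller than 1k 10 times faster than ext3. ReiserFS also wastes less space: it does not allocate an inode for some small files but packs them into the same disk block (cluster), whereas ext2/ext3 store each of them in a separate cluster. With a 4k cluster size, two 100-byte files occupy two clusters on ext2/ext3 but only one on ReiserFS. ReiserFS has a drawback too: each upgrade to a new version requires reformatting the disk.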
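The space arithmetic in the example above can be checked with a short sketch. The function names are hypothetical, and the packing model is a deliberate simplification (files laid back to back with no metadata overhead), not a description of ReiserFS internals.

```python
import math


def blocks_one_file_per_block(file_sizes, block_size=4096):
    """Each file occupies at least one whole block (the ext2/ext3 case in the text)."""
    return sum(max(1, math.ceil(size / block_size)) for size in file_sizes)


def blocks_with_packing(file_sizes, block_size=4096):
    """Simplified packing: small files share blocks back to back."""
    if not file_sizes:
        return 0
    return max(1, math.ceil(sum(file_sizes) / block_size))


files = [100, 100]  # two 100-byte files, 4k blocks
print(blocks_one_file_per_block(files))  # 2
print(blocks_with_packing(files))        # 1
```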

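Because a journaling file system records a journal while writing data, it needs more disk I/O, which inevitably costs performance (although ext3 optimizes disk head movement, so its overall throughput is no lower than ext2's). Frequent journaling also produces more disk fragmentation than a non-journaling file system such as ext2 (though compared with fat32 this fragmentation is negligible). Some references therefore recommend mixing file systems, for example ext2 for read-only directories such as /usr and ext3 for frequently written directories such as /var. For desktop users, though, I think ReiserFS is the better choice: it is faster than ext3 and fragments less.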

ext2, ext3, xfs, reiserfs file system performance tests
ReiserFS file system in practice
Linux journaling file systems and performance analysis
Using the ReiserFS file system on Linux

What is Oracle consistent gets?

Quote From : http://www.dba-oracle.com/m_consistent_gets.htm

The consistent gets Oracle metric is the number of times a consistent read (a logical RAM buffer I/O) was requested to get data from a data block. Part of Oracle tuning is to increase logical I/O by reducing the expensive disk I/O (physical reads), but a high consistent gets count presents its own tuning challenges, especially when we see very high CPU consumption (i.e. in the "top 5 timed events" in an AWR report).

Tuning Consistent Gets

Many shops with super-high consistent gets have high CPU consumption, and this is quickly fixed by adding more CPUs to the server. Note that Oracle expert Kevin Closson sees "buffer chains latch" thrashing (latch overhead) as a major contributor to high CPU consumption on highly-buffered Oracle databases (e.g. 64-bit Oracle with a 50 gig db_cache_size):

"The closer a system gets to processor saturation, the more troublesome latch gets become--presuming the chain is hot.

While cache buffers chains latch thrashing may seem like a nebulous place to put blame for high processor utilization, trust me, it isn't."

Types of Consistent Gets

Not all buffer touches are created equal, and Oracle has several types of "consistent gets", the term used by Oracle to describe an Oracle I/O that is done exclusively from the buffer cache.  Oracle AWR and STATSPACK reports mention several types of consistent gets, all undocumented:

  • consistent gets

  • consistent gets from cache

  • consistent gets - examination

  • consistent gets direct

Some Oracle experts claim that these undocumented underlying mechanisms can be revealed and that these consistent gets metrics may tell us about data clustering. Mladen Gogala, author of "Easy Oracle PHP", makes these observations about consistent gets:

"The [consistent gets] overhead is the time spent by Oracle to maintain its own structures + the time spent by OS to maintain its own structures. So, what exactly happens during a consistent get in the situation described? As I don't have access to the source code, I cannot tell precisely, with 100% of certainty, but based on my experience, the process goes something like this:

1) Oracle calculates the hash value of the block and searches the SGA hash table for the place where the block is located.

2) Oracle checks the SCN of the block and compares it with the SCN of the current transaction. Here, I'll assume that this check will be OK and that no read consistent version needs to be constructed.

3) If the instance is a part of RAC, check the directory and see whether any other instance has modified the block. It will require communication with the GES process using the IPC primitives (MSG system calls). MSG system calls are frequently implemented using device driver which brings us to the OS overhead (context switch, scheduling)

4) If everything is OK, the block is paged in the address space of the requesting process. For this step I am not exactly sure when does it happen, but it has to happen at some point. Logically, it would look as the last step, but my logic may be flawed. Here, of course, I assume a soft fault. Hard fault would mean that a part of SGA was swapped out.

All of this is an overhead of a consistent get and it is the simplest case. How much is it in terms of microseconds, depends on many factors, but the overhead exists and is strictly larger than zero. If your SQL does a gazillion of consistent gets, it will waste significant CPU power and time to perform that."

For more insight into consistent gets, Oracle expert Kevin Closson gives a detailed description of the internal mechanisms of a consistent get:

"The routine is kcbget() (or one of his special purpose cousins). It doesn't really "search" a hash *table* if you will.  A hash table would be more of a "perfect hash" structure and to implement that, every possible hash value has to be known when the table is set up. That would mean knowing every possible database block address.

Instead, it hashes to a bucket that has similar hashed dbas chained off of it in a linked list. So it is more of a scan of the linked list looking for the right dba and right version of it.

The particulars of the structures under a get are not as important as remembering that before walking that chain, the process has to obtain the latch on the chain. "
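The bucket-and-chain lookup described in this quote can be sketched roughly as follows. This is a conceptual toy, not Oracle's kcbget(): the class `BufferCacheSketch`, the use of a Python list as the chain, and a `threading.Lock` standing in for the latch are all assumptions made for illustration.

```python
import threading


class BufferCacheSketch:
    """Toy model: hash buckets, each with a chain of buffers and its own latch."""

    def __init__(self, n_buckets=8):
        # Each chain holds (dba, version, block) tuples; a list stands in
        # for the linked list mentioned in the quote.
        self.chains = [[] for _ in range(n_buckets)]
        self.latches = [threading.Lock() for _ in range(n_buckets)]

    def _bucket(self, dba):
        return hash(dba) % len(self.chains)

    def put(self, dba, version, block):
        i = self._bucket(dba)
        with self.latches[i]:
            self.chains[i].append((dba, version, block))

    def get(self, dba, version):
        i = self._bucket(dba)
        # The latch on the chain must be obtained before walking it.
        with self.latches[i]:
            for chained_dba, chained_version, block in self.chains[i]:
                if chained_dba == dba and chained_version == version:
                    return block
        return None


cache = BufferCacheSketch(n_buckets=4)
cache.put(10, 1, "block 10 v1")
cache.put(14, 1, "block 14 v1")  # hashes to the same bucket as dba 10
print(cache.get(14, 1))
```

Because every lookup on a hot chain serializes on the same latch, many sessions reading the same few blocks contend with each other even though no disk I/O occurs.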

Consistent gets - examination

Mike Ault notes that "consistent gets - examinations" are related to buffer management overhead and data access overhead such as index reads and undo writes:

"consistent gets - examination is from reading something like undo blocks...

Other examples of "consistent gets - examination" are: reading the root block of an index, reading an undo block while creating a consistent read data block, reading a block in a single table hash cluster - unless it is found to have the 'collision flag' set."

Steve Karam, OCM notes about "consistent gets - examination":

"Consistent gets - examination are a different kind of consistent get  that only requires a single latch, saving CPU.  The most common use of a consistent get - examination is to read undo blocks for consistent read purposes, but they also do it for the first part of an index read and in certain cases for hash clusters.

So if you're doing a query on a couple tables that are mostly cached, but one of them has uncommitted DML against it at the time, you'll do consistent gets for the standard data in the cache, and the query will do consistent gets - examination to read the undo blocks and create read consistent blocks; this doesn't necessarily save CPU unfortunately, because while the consistent gets - examination only acquire one latch,  creating the read consistent data block also takes a latch.

However, I think that when you use single table hash clusters (or the new 10g Sorted Hash Clusters I mentioned once that automatically sort by a key so they don't need order by) you can get a performance gain, because reads from the blocks of a hash cluster are usually consistent get - examination, therefore they only need one latch instead of two. "

Interpreting consistent gets in reports

Here is how a STATSPACK (or AWR) report displays "consistent gets" and "consistent gets - examination":

Statistic                                      Total     per Second    per Trans
--------------------------------- ------------------ -------------- ------------
consistent gets                           35,024,284        9,718.2      3,703.9
consistent gets - examination             12,148,672        3,370.9      1,284.8
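One way to read these two lines together is to compute what share of all consistent gets were examinations. The sketch below uses only the totals shown in the report; the helper name `parse_count` is hypothetical.

```python
def parse_count(text):
    """Convert a report figure such as '35,024,284' to an integer."""
    return int(text.replace(",", ""))


total_gets = parse_count("35,024,284")
examination_gets = parse_count("12,148,672")

share = examination_gets / total_gets
print(f"examination share of consistent gets: {share:.1%}")
```

In this report roughly a third of the consistent gets are of the examination type.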


On the Oracle FAQ forum, a user reported trouble "when we run set autotrace on or similar execution statistics." The problem was resolved in part with this advice: "consistent gets is the blocks in consistent mode (sometimes reconstructed using information from RBS). So this reconstruction from RBS takes more resources (reads actually), which will end up as high consistent gets."

Quote From ASKTOM: http://asktom.oracle.com

Consistent gets is based upon re-constructing a block for consistent read.
Hence it is a function of only the number of db_blocks to be read.
If you say that it is altered by the arraysize, do you suggest that,
due to arraysize, some blocks are read multiple times and hence some
blocks have > 1 consistent read in the process?

No, you are wrong in your statement.
A consistent get is a block gotten in read consistent mode (point in time mode).
It MAY or MAY NOT involve reconstruction (rolling back).
Db Block Gets are CURRENT mode gets -- blocks read "as of right now".

Some blocks are processed more than once, yes, the blocks will have more than 1
consistent read in the process. Consider:

ops$tkyte@ORA817DEV.US.ORACLE.COM> create table t as select * from all_objects;
Table created.
ops$tkyte@ORA817DEV.US.ORACLE.COM> exec show_space( 'T')
Free Blocks.............................0
Total Blocks............................320
Total Bytes.............................2621440
Unused Blocks...........................4
Unused Bytes............................32768
Last Used Ext FileId....................7
Last Used Ext BlockId...................40969
Last Used Block.........................60
PL/SQL procedure successfully completed.
The table has 316 blocks and 22,908 rows.

ops$tkyte@ORA817DEV.US.ORACLE.COM> set autotrace traceonly statistics;
ops$tkyte@ORA817DEV.US.ORACLE.COM> set arraysize 15
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t;
22908 rows selected.
here with an array size of 15, we expect
22908/15 + 316 = 1843 consistent mode gets. db block gets -- they were for
performing the FULL SCAN, they had nothing to do with the data itself we

0 recursive calls
12 db block gets
1824 consistent gets

170 physical reads
0 redo size
2704448 bytes sent via SQL*Net to client
169922 bytes received via SQL*Net from client
1529 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22908 rows processed
ops$tkyte@ORA817DEV.US.ORACLE.COM> set arraysize 100
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t;
22908 rows selected.
Now, with 100 as the arraysize, we expect
22908/100 + 316 = 545 consistent mode gets.

0 recursive calls
12 db block gets
546 consistent gets
180 physical reads
0 redo size
2557774 bytes sent via SQL*Net to client
25844 bytes received via SQL*Net from client
231 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22908 rows processed
ops$tkyte@ORA817DEV.US.ORACLE.COM> set arraysize 1000
ops$tkyte@ORA817DEV.US.ORACLE.COM> select * from t;
22908 rows selected.
now, with arraysize = 1000, we expect:
22908/1000+316 = 338 consistent mode gets...

0 recursive calls
12 db block gets
342 consistent gets
222 physical reads
0 redo size
2534383 bytes sent via SQL*Net to client
2867 bytes received via SQL*Net from client
24 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22908 rows processed

so yes, the blocks are gotten in consistent mode MORE THAN ONCE when the array
fetch size is lower than the number of rows to be retrieved in this case

This is because we'll be 1/2 way through processing a block -- have enough rows
to return to the client -- and we'll give UP that block. When they ask for the
next N rows, we need to get that halfway processed block again and pick up where
we left off.
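The estimate used in the three runs above (rows / arraysize + blocks) can be reproduced with a few lines of Python. The function name is hypothetical, and integer division is an assumption that matches the rounded figures quoted in the transcript; the formula is an approximation, as the measured values show.

```python
def expected_consistent_gets(rows, blocks, arraysize):
    """Rough estimate from the example: one extra get per fetch call, plus one per block."""
    return rows // arraysize + blocks


ROWS, BLOCKS = 22908, 316  # figures from the show_space output above
measured = {15: 1824, 100: 546, 1000: 342}  # consistent gets reported by autotrace

for arraysize, actual in measured.items():
    estimate = expected_consistent_gets(ROWS, BLOCKS, arraysize)
    print(f"arraysize={arraysize}: expected {estimate}, measured {actual}")
```

The estimates (1843, 545, 338) track the measured values closely, and both fall toward the block count as the arraysize grows.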

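On some operating system platforms we can pin the Oracle SGA in memory, which avoids paging and so improves Oracle performance. On AIX, the operating system parameter v_pinshm must be set to 1; otherwise setting LOCK_SGA to TRUE in Oracle has no effect. Knowing these two parameters is far from enough, however; you also need some understanding of AIX memory management. This article requires AIX 5.3 ML01 or later and Oracle 9.2.0.4 or later.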


XXIBM:#oslevel -r




XXIBM:#bootinfo -r



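Next, use rmss -p to check whether the currently available memory matches the physical memory. For testing purposes, memory may sometimes have been simulated down to a particular size with rmss (it can only be simulated smaller, of course).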

XXIBM:#rmss -p

Simulated memory size is 63231.9375 Mb.

If the output above is smaller than the actual memory, consider using rmss -r to restore memory to its real size.



XXIBM:#vmo -L lru_file_repage



lru_file_repage   1    1     1     0    1    boolean  D



XXIBM:#vmo -o lru_file_repage=0

Setting lru_file_repage to 0


XXIBM:#vmo -L v_pinshm



v_pinshm   1    0     0     0    1    boolean  D

XXIBM:#vmo -o v_pinshm=1

Setting v_pinshm to 1


XXIBM:#vmo -o minperm%=10

Setting minperm% to 10

XXIBM:#vmo -o maxperm%=90

Setting maxperm% to 90


XXIBM:#su - oracle

$sqlplus /nolog

SQL*Plus: Release9. - Production on Fri Sep 19 08:40:10 2008

Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.

SQL>conn / as sysdba


SQL>show parameter lock_sga

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
lock_sga                             boolean     FALSE


SQL>alter system set lock_sga=true scope=spfile;

System altered.


SQL>select sum(value)/1024/1024 from v$sga;







$su -

root's Password:

XXIBM:#svmon -P -t 100|grep -p Pid|head


     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
  225546 oracle         9313207  9270407     2232  9308982      Y     N     N

     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
  119692 oracle         9312614  9270438     2232  9308978      Y     N     N

     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd  16MB
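To compare the Pin column with the SGA size in MB returned by the v$sga query, the page counts can be converted to megabytes. The sketch below assumes svmon is reporting 4 KB pages (the 16MB column shows N for these processes); the function name is hypothetical.

```python
PAGE_SIZE_BYTES = 4096  # assumption: svmon counts are in 4 KB pages


def pages_to_mb(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count from svmon into megabytes."""
    return pages * page_size / 1024 / 1024


pin_pages = 9270407  # Pin value for pid 225546 in the output above
print(f"pinned: {pages_to_mb(pin_pages):.1f} MB")
```

If the SGA is pinned, the pinned size should be at least as large as the SGA total.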



XXIBM:#vmo -p -o v_pinshm=1

Setting v_pinshm to 1 in nextboot file

Setting v_pinshm to 1

Original article: http://space.itpub.net/78033/viewspace-462686









About this Archive

This page is an archive of entries from March 2009 listed from newest to oldest.
