Thursday, July 16, 2009

More than one way to save blocks

In these days, disk capacity is often not a big issue today. If you have at least a decent load on the database, you will hit the IOPS limit much sooner than you run out of disk space.

Well, almost. First, you will still have a lot of inactive data that consumes the space but does not require any IOPS. And second, in some applications (like ETL in DWH) you are bound by throughput. Let me talk about this case.

Don't expect any exceptional thoughts, this post just inspired by a real production case and tries to pinpoint that there is always one more way to consider.

The optimal plan for many ETL queries is a lot of fullscans with hash joins. And often, you read one table more times, to join it in different ways. Such queries benefit if you make your tables smaller - you save on I/O.

(1) In ETL, your source tables are often imported from a different system, and you actually don't need all columns from the tables. So, first of all - don't load the data you don't need. However, usually you just can't drop the columns - this would change the data model, you would have to update it in the ETL tool, and you would have to do a lot of work when the list of used columns change.
How to tackle this? Use views, and select NULL for the columns you don't need. Use FGA (Fine-grained auditing) (at least on test) to verify you don't access any of those non-loaded columns. (Just beware, that things like dbms_stats access all columns.)
(Bonus: depending on the source system, transferring less data may take less time due to limits of the transfer channel - ODBC, network, etc.)

(2) As the source data is usually loaded only once and truncated before each load, PCTFREE should be 0, so no space is lost for allocations that will never come.

(3) Now, with the (1) implemented, the tables contain (a lot of) NULL columns. It's just one byte for each such column, but interestingly, it still makes a difference. Just recreate the tables while putting the NULL columns at the end. (No proper application depend on column order, right?). On a 1.2GB table, we got a 35% saving just by using (2) and (3) - it's really worth trying.

(4) And of course, use APPEND hint and use the 10g table direct-load data compression (COMPRESS in table definition). Another 50% for us...

Please note that the only problem you usually face here is to get a list of used columns - fortunately, most ETL tools (like ODI) can provide it, even it means accessing directly their repository (snp_txt_crossr in ODI). The rest is easy to automate.

Wednesday, July 8, 2009

A latteral view quirk

This quest started with the usual question: why is this query so slow? To put it in the picture, it was a query loading one DWH table by reading one source table from a legacy system (already loaded to Oracle, so no heterogenous services were involved at this step), joining it several times to several tables.
(It's the usual badly-designed legacy system: if flag1 is I, join table T1 by C1, if flag1 is N, join table T1 by C2... 20 times.)

If I simplify the query, we are talking about something like:
SELECT T1.m, 
case
when T1.h = 'I' then T2_I.n
when T1.h = 'G' then T2_G.n
else null
end
FROM T1
LEFT OUTER JOIN T2 T2_I
ON (T1.h = 'I' and T1.y = T2_I.c1)
LEFT OUTER JOIN T2 T2_G
ON (T1.h = 'G' and T1.z = T2_G.c2)

We even know, that the query always return number of rows identical to number of rows in T2. However, ommiting the T1.h = 'I'/'G' conditions in join clause would duplicate the rows, so the conditions are necessary there. Ofcourse it's not possible to move the conditions to WHERE clause, as this would elimitate all rows from result query.

To make the test case query even shorter, we can use for the demonstration just:
SELECT count(*)
FROM T1
LEFT OUTER JOIN T2
ON (T1.h = 'I' and T1.y = T2.c1)

(This query makes almost no business sense now, but the the lateral view issue I want to demonstrate is still there.)

The query plan looks like:

---------------------------------------------------------------
|Id|Operation |Name|Rows |Bytes| Cost | Time |
---------------------------------------------------------------
|0 |SELECT STATEMENT | | 1 | 43 | 1804M|999:59:59 |
|1 | SORT AGGREGATE | | 1 | 43 | | |
|2 | NESTED LOOPS OUTER | | 9805M| 392G| 1804M|999:59:59 |
|3 | TABLE ACCESS FULL | T1 | 188K|7899K| 718 | 00:00:09 |
|4 | VIEW | |52124 | | 9593 | 00:01:56 |
|*5| FILTER | | | | | |
|*6| TABLE ACCESS FULL| T2 |52124 | 356K| 9593 | 00:01:56 |
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

5 - filter("T1"."H"='I')
6 - filter("T1"."Y"="T2"."C1")


This is awful! Cost of 1804M just for joining two tables (T1: 188K rows, T2: 5M rows). And yes, the execution proves the plan is not good (I did not have the patience to wait many hours (days?) for the query to finish).


However, a colleague suggested modifying the query as follows:

SELECT count(*)
FROM T1
LEFT OUTER JOIN T2
ON (T1.h = nvl('I',T2.c1) and T1.y = T2.c1)

This does not change the result set - the 'I' is always not null and thus the nvl is superfluos. However, we get a different execution plan!


-------------------------------------------------------------
|Id|Operation |Name|Rows |Bytes| Cost | Time |
-------------------------------------------------------------
|0 |SELECT STATEMENT | | 1 | 54 | 5409K| 18:01:45 |
|1 | SORT AGGREGATE | | 1 | 54 | | |
|*2| HASH JOIN OUTER | | 9805M| 493G| 5409K| 18:01:45 |
|3 | TABLE ACCESS FULL| T1 | 188K|7899K| 718 | 00:00:09 |
|4 | TABLE ACCESS FULL| T2 | 5212K| 54M| 9585 | 00:01:55 |
-------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T1"."Y"="T2"."C1"(+) AND
"T1"."H"=NVL('I',"T2"."C1"(+)))

The cost is now 5409K, the operation is a nice hash join, and the query really finishes in few minutes.

The question now is: WHY?

Well, this is a matter of query optimization and plan generation, so the first person to ask is directly the CBO. So, I enabled the 10053 event for the two queries and dived into the two trace files, mainly to see the differences.

Both queries had the main query block initially rewritten as:
SQL:******* UNPARSED QUERY IS *******
SELECT "T1"."Y" "Y","T1"."H" "H",
"from$_subquery$_004"."C1_0" "C1"
FROM "SCOTT"."T1" "T1",
LATERAL( (SELECT "T2"."C1" "C1_0" FROM "SCOTT"."T2" "T2" WHERE "T1"."H"='G' AND "T1"."Y"="T2"."C1"))(+) "from$_subquery$_004"

(The second query with the added NVL in "T1"."H"=NVL('G',"T2"."C1") ).

So, for Oracle, it is a lateral (correlated) view. That's not nice, but at this stage of CBO processing, normal. CBO will try to get rid of it.

However only for the NVL case the CBO trace shows:
CVM:   Merging SPJ view SEL$1 (#0) into SEL$2 (#0)

Followed by:
SQL:******* UNPARSED QUERY IS *******
SELECT COUNT(*) "COUNT(*)"
FROM "SCOTT"."T1" "T1","SCOTT"."T2" "T2"
WHERE "T1"."Y"="T2"."C1"(+)
AND "T1"."H"=NVL('G',"T2"."C1"(+))

Thus, CBO was able to rewrite is as the old-fashioned (+) outer join; however, it was not able to do it for the non-NVL query. And these result are passed to the next stage, and as no constraints or predicate move-around changes the query, they are verbatim passed for actual plan generation. And understandably, a lateral (correlated) view is not considered for hash join.

Anyway, should you read the Inside the Oracle Optimizer blog, you would already know that this is the classical example of the lateral non-mergeable view. Still, why the second one worked as we wanted?

Well, the quirk is in the fact that there is no way how to write the non-NVL query using (+) syntax - there is just no place to put the (+) sign to the t1.y='I' predicate to change it from filter to join predicate. However, adding artifically a column from T2 makes it possible, and the CBO did it. The CBO internally uses the old Oracle syntax, and thus if you can't rewrite your query using that syntax, neither CBO can.

Just a note - the same applies for example for predicate length(t1.q)=10, you can save the day by using length(nvl(t1.q,t2.c1))=10.

Tested on: Windows 64-bit (EM64T), Oracle 10.2.0.4.

Sunday, July 5, 2009

Gentle introduction to opimization

I was just asked to prepare a short, one-hour workshop/presentation about optimization on Oracle. As this topis is so huge, and everyone had already read something, I decided to concept this workshop as an overview of the concepts (starting with db design) and the tools available.
The .pdf version is thus a kind of checklist - have you read about all of these issues? Have you thought them out when designing your system?
I hope you will find at least one new thing there:-)
The PDF is available for download on my website download area.

Tuesday, September 23, 2008

Unique and non-unique indexes

An interesting questions is: what is in fact the difference between unique and non-unique indexes? For a long discussion, see Richard Foote's blog. Here, we look at the on-disk differences.

Let's start with environment setup and block dump creation:
connect system

create user itest identified by itest;

grant dba to itest;

create tablespace ITEST;

alter user itest default tablespace itest;

connect itest/itest

create table TDATA (pk varchar2(20));

begin
for i in 1..10000 loop
insert into TDATA values ('VAL'||i);
end loop;
commit;
end;

create index TIDX1 on TDATA(pk);

(I chose varchar2 type so that the actual characters are clearly seen in the dump. Also, I created a fresh new tablespace so the block numbers are small and probably consecutive.)
Now, find the extents involved:
select * from dba_extents where owner=user;
And dump the blocks (take numbers from the query above - relative_fno, block_id):
alter system dump datafile 14 block min 33 block max 62;
Save the trace, create a unique index, dump it:
drop index TIDX1;
create unique index TIDX2 on TDATA(pk);
select * from dba_extents where owner=user;
alter system dump datafile 14 block min 33 block max 62;
Now, take look at the dumped blocks. For the sake of the explanation flow, let's start with the leaf blocks:

Non-unique:
Leaf block dump
===============
header address 146678372=0x8be2264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 380
kdxcofbo 796=0x31c
kdxcofeo 1617=0x651
kdxcoavs 821
kdxlespl 0
kdxlende 0
kdxlenxt 58720295=0x3800027
kdxleprv 58720293=0x3800025
kdxledsz 0
kdxlebksz 8036
row#0[8020] flag: ------, lock: 0, len=16
col 0; len 6; (6): 56 41 4c 31 33 34
col 1; len 6; (6): 03 80 00 0d 00 85
row#1[8003] flag: ------, lock: 0, len=17
col 0; len 7; (7): 56 41 4c 31 33 34 30
col 1; len 6; (6): 03 80 00 0f 00 8a
...
What is interesting: the index has 2 colums! This is indicated by kdxconco, and is of course clearly seen in the entries themselves. The second column is the rowid of the table row. So after all, the entries are unique, by considering rowid as an additional column.

Interestingly, the unique index looks a bit differently:
Leaf block dump
===============
header address 152117860=0x9112264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 401
kdxcofbo 838=0x346
kdxcofeo 1664=0x680
kdxcoavs 826
kdxlespl 0
kdxlende 0
kdxlenxt 58720295=0x3800027
kdxleprv 58720293=0x3800025
kdxledsz 6
kdxlebksz 8036
row#0[8020] flag: ------, lock: 0, len=16, data:(6): 03 80 00 0f 00 9d
col 0; len 7; (7): 56 41 4c 31 33 35 39
row#1[8005] flag: ------, lock: 0, len=15, data:(6): 03 80 00 0d 00 87
col 0; len 6; (6): 56 41 4c 31 33 36
Oracle now does not need to artifically add rowid to column list, as the 1 column suffices to uniquely identify the entry. However, it still needs it to find the row. The difference: 1 byte saved (in fact, the length byte for the rowid).

Now, let's have a look at the branch block (the root, for this small table). First, the non-unique:
Branch block dump
=================
header address 146678348=0x8be224c
kdxcolev 1
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 26
kdxcofbo 80=0x50
kdxcofeo 7724=0x1e2c
kdxcoavs 7644
kdxbrlmc 58720293=0x3800025
kdxbrsno 0
kdxbrbksz 8060
kdxbr2urrc 0
row#0[8048] dba: 58720294=0x3800026
col 0; len 6; (6): 56 41 4c 31 33 34
col 1; TERM
row#1[8035] dba: 58720295=0x3800027
col 0; len 7; (7): 56 41 4c 31 36 38 32
col 1; TERM
In the branch blocks, only the information needed to find the correct leaf block is storred. Thus, the column contents is truncated as much as possible (see more below). However, we have two columns in the index - and we have to indicate that the 2nd column (rowid) is not needed - and that's the "col 1; TERM" entry.

Now, for the unique case:
Branch block dump
=================
header address 152117836=0x911224c
kdxcolev 1
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 24
kdxcofbo 76=0x4c
kdxcofeo 7775=0x1e5f
kdxcoavs 7699
kdxbrlmc 58720293=0x3800025
kdxbrsno 0
kdxbrbksz 8060
kdxbr2urrc 4
row#0[8048] dba: 58720294=0x3800026
col 0; len 7; (7): 56 41 4c 31 33 35 39
row#1[8037] dba: 58720295=0x3800027
col 0; len 6; (6): 56 41 4c 31 37 32
Here, we have only one column, so no need to write that no other column data is needed. The net difference: 1 bytes per index entry.

Please note, that Oracle fills the blocks end-to-start (so the header can grow), and thus the dump usually starts with high addresses. However, this is only physical location, and does not represent the index ordering. For example, after index block split, it can (and will) change. Just reverse the example order: first create index, then insert data. Then compare first and last leaf block.

One final note regarding the branch blocks and inclusion only of the prefix needed to identify the correct leaf block.
Let's change the example slightly: insert the values 'VAL'||i||'0000000'. Now, the leaf blocks has to contain these values:
row#1[4367] flag: ----S-, lock: 2, len=24
col 0; len 14; (14): 56 41 4c 33 37 35 30 30 30 30 30 30 30 30
col 1; len 6; (6): 03 80 00 0f 00 4f
However, the branch block does not have to:
row#0[7989] dba: 58720300=0x380002c
col 0; len 6; (6): 56 41 4c 31 31 34
col 1; TERM

Thursday, May 8, 2008

Who am I?

Quite often, a procedure needs to know some information about the session it is running in - at least for logging some information.

It seems to be easy to find this information - just look at SYS_CONTEXT documentation, and find the impressive list of attributes of USERENV context. And if something is missing, just grab the SESSIONID, SID or AUDSID and query v$session or other view.

Well, this works nicely, as long as you don't do it in a job. Unfortunately, in a job you always get 0.
You can try to battle with FG_JOB_ID and BG_JOB_ID, but I think this deserves a more elegant solution - and that's why I always query v$mystat to get the SID of my process. Just fetch the first line:
select * from v$session where sid = (select sid from v$mystat where rownum=1)
You need a select grant on v_$mystat by SYS, but as this view does not expose anything important, it's not a security issue.

The next question that comes is: if I am run by a job (let's talk about dbms_job now), what's my job id?

The universal solution is to query v$lock and check for the JQ lock you are holding - and id2 is jor id:
select id2 from v$lock where type = 'JQ' and sid = (select sid from v$mystat where rownum=1)
However, there is one more elegant solution - the dbms_job engine does not run just the "what" code specified by you - it wraps it in a short anonymous block. In this block, a variable called "job" is defined and binded to contain just the id you need. So, in your "what" code, just use this variable:
dbms_job.submit(j, 'my_proc(job, my_param1, my_param2);'
If you are curious, this block will be run:
DECLARE job BINARY_INTEGER := :job; next_date DATE := :mydate;
broken BOOLEAN := FALSE; BEGIN my_proc(job, my_param1, my_param2);
:mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0;
END IF; END;

For dbms_scheduler, see for example discussion at http://forums.oracle.com/forums/thread.jspa?messageID=2335238. I still personally don't like it, it's too cumbersone for the things I do (a workflow engine, for example).

Tuesday, April 15, 2008

A list partitiong bug

Last week, I found a not-so-nice bug in 10gR2. Just try this short test case, creating a table with two list partitions, default and null:
alter session set nls_territory=America;
alter session set nls_language=American;

select version from product_component_version;

drop table th3;

prompt Creating TH3...
create table TH3
(
DF DATE,
DT DATE,
N NUMBER
)
partition by list (DT)
(
partition THTBL_CURRENT values (NULL)
tablespace USERS,
partition THTBL_OLD values (default)
tablespace USERS
)
;

prompt Loading TH3...
insert into TH3 (DF, DT, N)
values (to_date('01-01-2000', 'dd-mm-yyyy'), to_date('10-04-2008', 'dd-mm-yyyy'), 1);
insert into TH3 (DF, DT, N)
values (to_date('01-01-1000', 'dd-mm-yyyy'), to_date('01-03-3000', 'dd-mm-yyyy'), 2);
insert into TH3 (DF, DT, N)
values (to_date('10-04-2008', 'dd-mm-yyyy'), to_date('10-04-2008', 'dd-mm-yyyy'), 10);
insert into TH3 (DF, DT, N)
values (to_date('11-04-2008', 'dd-mm-yyyy'), to_date('11-04-2008', 'dd-mm-yyyy'), 11);
insert into TH3 (DF, DT, N)
values (to_date('10-04-2008', 'dd-mm-yyyy'), to_date('11-04-2008', 'dd-mm-yyyy'), 1011);
commit;
prompt 5 records loaded
prompt Done.

SELECT to_date('10.04.2008', 'dd.mm.yyyy'), N, df, dt, rowid
FROM th3
WHERE to_date('10.04.2008', 'dd.mm.yyyy')
BETWEEN DF AND dt;

SELECT to_date('10.04.2008', 'dd.mm.yyyy'), N, df, dt, rowid
FROM th3
WHERE to_date('10.04.2008', 'dd.mm.yyyy')
BETWEEN DF AND
nvl(dt,to_date('10.04.9999', 'dd.mm.yyyy'));

exit;

Although the selects should not differ in the results (there is no NULL value in dt column), they do:

SQL*Plus: Release 10.2.0.1.0 - Production on Sat Apr 12 14:59:05 2008

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected To:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options


Session altered.


Session altered.


VERSION
--------------------------------------------------------------------------------
10.2.0.1.0
10.2.0.1.0
10.2.0.1.0
10.2.0.1.0

drop table th3
*
ERROR at line 1:
ORA-00942: table or view does not exist


Creating TH3...

Table created.

Loading TH3...

1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


Commit complete.

5 records loaded
Done.

no rows selected


TO_DATE(' N DF DT ROWID
--------- ---------- --------- --------- ------------------
10-APR-08 1 01-JAN-00 10-APR-08 AAAM1CAAEAAAAGtAAA
10-APR-08 2 01-JAN-00 01-MAR-00 AAAM1CAAEAAAAGtAAB
10-APR-08 10 10-APR-08 10-APR-08 AAAM1CAAEAAAAGtAAC
10-APR-08 1011 10-APR-08 11-APR-08 AAAM1CAAEAAAAGtAAE

Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

The same results were obtained for 10.2.0.3. However, in the 10.2.0.4 (and 11.1.0.6), it seems fixed:

SQL*Plus: Release 10.2.0.4.0 - Production on Sat Apr 12 18:45:30 2008

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


Session altered.


Session altered.


VERSION
--------------------------------------------------------------------------------
10.2.0.4.0
10.2.0.4.0
10.2.0.4.0
10.2.0.4.0

drop table th3
*
ERROR at line 1:
ORA-00942: table or view does not exist


Creating TH3...

Table created.

Loading TH3...

1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


Commit complete.

5 records loaded
Done.

TO_DATE(' N DF DT ROWID
--------- ---------- --------- --------- ------------------
10-APR-08 1 01-JAN-00 10-APR-08 AAAM7pAAEAAAAGlAAA
10-APR-08 2 01-JAN-00 01-MAR-00 AAAM7pAAEAAAAGlAAB
10-APR-08 10 10-APR-08 10-APR-08 AAAM7pAAEAAAAGlAAC
10-APR-08 1011 10-APR-08 11-APR-08 AAAM7pAAEAAAAGlAAE


TO_DATE(' N DF DT ROWID
--------- ---------- --------- --------- ------------------
10-APR-08 1 01-JAN-00 10-APR-08 AAAM7pAAEAAAAGlAAA
10-APR-08 2 01-JAN-00 01-MAR-00 AAAM7pAAEAAAAGlAAB
10-APR-08 10 10-APR-08 10-APR-08 AAAM7pAAEAAAAGlAAC
10-APR-08 1011 10-APR-08 11-APR-08 AAAM7pAAEAAAAGlAAE

Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


If you create the table as nonpartitioned, you get correct results irrespective of version.

Tested on Windows 32-bit (on 10.2.0.1, 10.2.0.3 and 10.2.0.4) and RHEL 5.1 x86_64 (10.2.0.3).

If you are curious, here's the explain plan (10.2.0.3) for the first select:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4063410327

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 43 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION LIST EMPTY| | 1 | 43 | 2 (0)| 00:00:01 |INVALID|INVALID|
|* 2 | TABLE ACCESS FULL | TH3 | 1 | 43 | 2 (0)| 00:00:01 |INVALID|INVALID|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


2 - filter("DF"<=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "DT">=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

On 10.2.0.4, I get:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3503314195

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 43 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 1 | 43 | 3 (0)| 00:00:01 |KEY |KEY |
|* 2 | TABLE ACCESS FULL | TH3 | 1 | 43 | 2 (0)| 00:00:01 | 2 | 2|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


2 - filter("DF"<=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "DT">=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

So it looks like a problem in partition elimination - on 10.2.0.3, Oracle didn't evaluate the involved partitions properly.

No patch was provided by Oracle, as we have two luxurious options: a workaround and 10.2.0.4. And of course, the second select is after all the correct one for the initial problem...

10gR2 RAC on RHEL5.1 (x86_64)

Just a few issues you should be aware of when trying to install 10gR2 RAC on RHEL 5.1:

See Metalink 465001.1 for raw device configuration for RHEL 5. It's not explicitly said, but all disks used must be partitioned! Otherwise, if you don't partition the OCR disk, the root.sh from clusterware installation will fail (fail with “Failed to upgrade Oracle Cluster Registry configuration” error, with “Failed to call clsssinit” in log.)
The ASMLib will also refuse to stamp whole disk, a partition is required.

For OS configuration, see Metalink 421308.1. However, the sysctl parameters listed there do not exist on RHEL 5.1, you will have to use:

kernel.shmmni = 4096

kernel.sem = 250 32000 100 128

fs.file-max = 65536

net.ipv4.ip_local_port_range = 1024 65000

net.core.rmem_default = 262144

net.core.rmem_max = 262144

net.core.wmem_default = 262144

net.core.wmem_max = 262144

net.ipv4.tcp_rmem = 4194304 4194304 4194304
net.ipv4.tcp_wmem = 262144 262144 262144


You will also need to install one more rpm package (kernel-headers from CD1), but this will indicate the rpm command anyway.

As for network, remember that the localhost entries must be at the end of hosts file, so OUI will resolve current server with its name, instead of "localhost". (You would see this in the "Specify cluster configuration" screen.) If the node is listed as "localHost" (note the capital H), check your /etc/hosts for errors and that you can ping all the nodes listed (check also for typing errors).

Before running the root.sh script, read Metalink 414163.1. You will have to edit vipca and srvctl scripts (remove LD_ASSUME_KERNEL setting) and manually configure public/interconnect interfaces, finally running vipca interactively from root X session.
If you ran root.sh without these changes, remove the /etc/oracle/ocr.loc, remove the entries from /etc/inittab and run telinit so the root.sh can configure the OCR from scratch again.

Hope this helps:-)