Tuesday, April 15, 2008

A list partitiong bug

Last week, I found a not-so-nice bug in 10gR2. Just try this short test case, creating a table with two list partitions, default and null:
alter session set nls_territory=America;
alter session set nls_language=American;

select version from product_component_version;

drop table th3;

prompt Creating TH3...
create table TH3
(
DF DATE,
DT DATE,
N NUMBER
)
partition by list (DT)
(
partition THTBL_CURRENT values (NULL)
tablespace USERS,
partition THTBL_OLD values (default)
tablespace USERS
)
;

prompt Loading TH3...
insert into TH3 (DF, DT, N)
values (to_date('01-01-2000', 'dd-mm-yyyy'), to_date('10-04-2008', 'dd-mm-yyyy'), 1);
insert into TH3 (DF, DT, N)
values (to_date('01-01-1000', 'dd-mm-yyyy'), to_date('01-03-3000', 'dd-mm-yyyy'), 2);
insert into TH3 (DF, DT, N)
values (to_date('10-04-2008', 'dd-mm-yyyy'), to_date('10-04-2008', 'dd-mm-yyyy'), 10);
insert into TH3 (DF, DT, N)
values (to_date('11-04-2008', 'dd-mm-yyyy'), to_date('11-04-2008', 'dd-mm-yyyy'), 11);
insert into TH3 (DF, DT, N)
values (to_date('10-04-2008', 'dd-mm-yyyy'), to_date('11-04-2008', 'dd-mm-yyyy'), 1011);
commit;
prompt 5 records loaded
prompt Done.

SELECT to_date('10.04.2008', 'dd.mm.yyyy'), N, df, dt, rowid
FROM th3
WHERE to_date('10.04.2008', 'dd.mm.yyyy')
BETWEEN DF AND dt;

SELECT to_date('10.04.2008', 'dd.mm.yyyy'), N, df, dt, rowid
FROM th3
WHERE to_date('10.04.2008', 'dd.mm.yyyy')
BETWEEN DF AND
nvl(dt,to_date('10.04.9999', 'dd.mm.yyyy'));

exit;

Although the selects should not differ in the results (there is no NULL value in dt column), they do:

SQL*Plus: Release 10.2.0.1.0 - Production on Sat Apr 12 14:59:05 2008

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected To:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options


Session altered.


Session altered.


VERSION
--------------------------------------------------------------------------------
10.2.0.1.0
10.2.0.1.0
10.2.0.1.0
10.2.0.1.0

drop table th3
*
ERROR at line 1:
ORA-00942: table or view does not exist


Creating TH3...

Table created.

Loading TH3...

1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


Commit complete.

5 records loaded
Done.

no rows selected


TO_DATE(' N DF DT ROWID
--------- ---------- --------- --------- ------------------
10-APR-08 1 01-JAN-00 10-APR-08 AAAM1CAAEAAAAGtAAA
10-APR-08 2 01-JAN-00 01-MAR-00 AAAM1CAAEAAAAGtAAB
10-APR-08 10 10-APR-08 10-APR-08 AAAM1CAAEAAAAGtAAC
10-APR-08 1011 10-APR-08 11-APR-08 AAAM1CAAEAAAAGtAAE

Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

The same results were obtained for 10.2.0.3. However, in the 10.2.0.4 (and 11.1.0.6), it seems fixed:

SQL*Plus: Release 10.2.0.4.0 - Production on Sat Apr 12 18:45:30 2008

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


Session altered.


Session altered.


VERSION
--------------------------------------------------------------------------------
10.2.0.4.0
10.2.0.4.0
10.2.0.4.0
10.2.0.4.0

drop table th3
*
ERROR at line 1:
ORA-00942: table or view does not exist


Creating TH3...

Table created.

Loading TH3...

1 row created.


1 row created.


1 row created.


1 row created.


1 row created.


Commit complete.

5 records loaded
Done.

TO_DATE(' N DF DT ROWID
--------- ---------- --------- --------- ------------------
10-APR-08 1 01-JAN-00 10-APR-08 AAAM7pAAEAAAAGlAAA
10-APR-08 2 01-JAN-00 01-MAR-00 AAAM7pAAEAAAAGlAAB
10-APR-08 10 10-APR-08 10-APR-08 AAAM7pAAEAAAAGlAAC
10-APR-08 1011 10-APR-08 11-APR-08 AAAM7pAAEAAAAGlAAE


TO_DATE(' N DF DT ROWID
--------- ---------- --------- --------- ------------------
10-APR-08 1 01-JAN-00 10-APR-08 AAAM7pAAEAAAAGlAAA
10-APR-08 2 01-JAN-00 01-MAR-00 AAAM7pAAEAAAAGlAAB
10-APR-08 10 10-APR-08 10-APR-08 AAAM7pAAEAAAAGlAAC
10-APR-08 1011 10-APR-08 11-APR-08 AAAM7pAAEAAAAGlAAE

Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


If you create the table as nonpartitioned, you get correct results irrespective of version.

Tested on Windows 32-bit (on 10.2.0.1, 10.2.0.3 and 10.2.0.4) and RHEL 5.1 x86_64 (10.2.0.3).

If you are curious, here's the explain plan (10.2.0.3) for the first select:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4063410327

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 43 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION LIST EMPTY| | 1 | 43 | 2 (0)| 00:00:01 |INVALID|INVALID|
|* 2 | TABLE ACCESS FULL | TH3 | 1 | 43 | 2 (0)| 00:00:01 |INVALID|INVALID|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


2 - filter("DF"<=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "DT">=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

On 10.2.0.4, I get:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3503314195

---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 43 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 1 | 43 | 3 (0)| 00:00:01 |KEY |KEY |
|* 2 | TABLE ACCESS FULL | TH3 | 1 | 43 | 2 (0)| 00:00:01 | 2 | 2|
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


2 - filter("DF"<=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "DT">=TO_DATE('2008-04-10 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

So it looks like a problem in partition elimination - on 10.2.0.3, Oracle didn't evaluate the involved partitions properly.

No patch was provided by Oracle, as we have two luxurious options: a workaround and 10.2.0.4. And of course, the second select is after all the correct one for the initial problem...

10gR2 RAC on RHEL5.1 (x86_64)

Just a few issues you should be aware of when trying to install 10gR2 RAC on RHEL 5.1:

See Metalink 465001.1 for raw device configuration for RHEL 5. It's not explicitly said, but all disks used must be partitioned! Otherwise, if you don't partition the OCR disk, the root.sh from clusterware installation will fail (fail with “Failed to upgrade Oracle Cluster Registry configuration” error, with “Failed to call clsssinit” in log.)
The ASMLib will also refuse to stamp whole disk, a partition is required.

For OS configuration, see Metalink 421308.1. However, the sysctl parameters listed there do not exist on RHEL 5.1, you will have to use:

kernel.shmmni = 4096

kernel.sem = 250 32000 100 128

fs.file-max = 65536

net.ipv4.ip_local_port_range = 1024 65000

net.core.rmem_default = 262144

net.core.rmem_max = 262144

net.core.wmem_default = 262144

net.core.wmem_max = 262144

net.ipv4.tcp_rmem = 4194304 4194304 4194304
net.ipv4.tcp_wmem = 262144 262144 262144


You will also need to install one more rpm package (kernel-headers from CD1), but this will indicate the rpm command anyway.

As for network, remember that the localhost entries must be at the end of hosts file, so OUI will resolve current server with its name, instead of "localhost". (You would see this in the "Specify cluster configuration" screen.) If the node is listed as "localHost" (note the capital H), check your /etc/hosts for errors and that you can ping all the nodes listed (check also for typing errors).

Before running the root.sh script, read Metalink 414163.1. You will have to edit vipca and srvctl scripts (remove LD_ASSUME_KERNEL setting) and manually configure public/interconnect interfaces, finally running vipca interactively from root X session.
If you ran root.sh without these changes, remove the /etc/oracle/ocr.loc, remove the entries from /etc/inittab and run telinit so the root.sh can configure the OCR from scratch again.

Hope this helps:-)

dbms_alert on RAC

Not long time ago, I came across a usage of dbms_alert to manage running jobs. As the solution implemented must work also for RAC, I wanted to know whether dbms_alert works on RAC across instances.

The documentation nor Metalink does not say anything (contrary to dbms_pipe, which does NOT work on RAC). So, if they don't warn, it should work... However, Julian Dyke says, that dbms_alert does not work and is the same as dbms_pipe (sources: http://juliandyke.com/Presentations/Presentations.html#ARoughGuideToRAC, page 17, or Pro Oracle Database 10g RAC on Linux, page 426).

You know, never trust anybody, so I conducted a test case (10.2.0.3 on Linux x86_64, VMware ESX server, 2-node RAC):

You will need two simultaneous sessions, I mark them with DWH1> and DWH2> here.
DWH1> select instance_name from v$instance;

INSTANCE_NAME
----------------
DWH1

DWH2> select instance_name from v$instance;

INSTANCE_NAME
----------------
DWH2

DWH2> exec dbms_alert.register('TST');

DWH2> set serveroutput on


DWH2> declare
message varchar2(2000);
status integer;
name varchar2(2000);
begin
DBMS_ALERT.WAITANY (
name ,
message,
status);
dbms_output.put_line(name);
dbms_output.put_line(message);
end;
/

DWH1> exec dbms_alert.signal('TST','msg1');

DWH2> TST
msg1

PL/SQL procedure successfully completed.

DWH2> declare
message varchar2(2000);
status integer;
begin
DBMS_ALERT.WAITONE (
'TST',
message,
status );
dbms_output.put_line(message);
end;
/

DWH1> exec dbms_alert.signal('TST','msg1');

msg1

PL/SQL procedure successfully completed.

So, as you see, it works even on RAC. Just test it yourself, and never trust anybody anything without proper testing. Everyone does mistakes.