Column projections

The lower half of the plan

In the previous part, we've seen the data behind the execution plan table, i.e. all the operations, their cost, and the tree structure of the plan. Although interesting to investigate, it's all already available in the dbms_xplan output. Possibly you can find cases where the memory structures show more, perhaps with some more complex parallel queries, partitioning and so on.

The more complex part is the "lower half" of the plan, i.e. the column projections, filters and access predicates. These can contain arbitrary operations/functions, and it's also here that we can follow where a value comes from, starting from a single table, going through joins, aggregations, set functions and so on, up to the level where it is used (be it a top-level column projection, i.e. in the select output, or a filter or access predicate).

In this blog post, let's look at the column projections. In other words, which columns are produced by the resulting select, as well as by every step of the execution plan.

Execution plan tree

Obviously, the execution plan is a tree - each row has a depth and exactly one parent (except for the top one, of course). Usually there are just one or two children - but in general, there can be more, for example with UNIONs.

When looking around kxscio, we find the nodes of a tree at 0x320. We see a list of pointers there - and if we follow them, we again see the row id, and the pointers form a tree structure.
(Obviously having a more complex plan would show a more complex tree, making the following insights more obvious.)

It might be a little surprising that each node always has just one child pointer, and then a pointer to the next sibling... but this is actually a very standard way to convert an arbitrary tree to a binary tree (the left-child, right-sibling representation).

Also, if we try some other examples, we may find that the tree has extra nodes, not just those listed at 0x320. These are part of the tree structure, have a row id of -1, and look like any other valid tree entry. Apparently Oracle sometimes wants/needs to describe intermediate steps in column projections. The lesson is that we really need to start at the root (first entry) and traverse the tree in order to see all the nodes.
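To make the traversal concrete, here is a minimal sketch of walking such a one-child-plus-next-sibling tree from the root. The Node class and its field names are hypothetical stand-ins for the in-memory entries, and the sample tree is made up; only the shape (first child, next sibling, row id of -1 for extra nodes) comes from the observations above.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    # Hypothetical stand-in for one tree entry found in memory.
    row_id: int                            # -1 marks an extra, unlisted node
    first_child: Optional["Node"] = None   # pointer to the first child
    next_sibling: Optional["Node"] = None  # pointer to the next sibling


def walk(root: Optional[Node], depth: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (depth, row_id) for every node in pre-order, starting at root."""
    node = root
    while node is not None:
        yield depth, node.row_id
        # Descend into the children first, then move on to the next sibling.
        yield from walk(node.first_child, depth + 1)
        node = node.next_sibling


# Made-up example: row 0 -> row 1 -> (extra node -1 -> row 2, row 3)
n2 = Node(2)
n3 = Node(3)
extra = Node(-1, first_child=n2, next_sibling=n3)
n1 = Node(1, first_child=extra)
n0 = Node(0, first_child=n1)

all_nodes = list(walk(n0))
plan_rows = [row_id for _, row_id in all_nodes if row_id != -1]
```

Only a walk from the root reaches the -1 node here; a flat list of the plan rows would miss it.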


The tree nodes hold more information than just the pointers of the tree structure. The very next pointer points to something interesting:
0000000 0000000000010002 0000000000000000 
0000020 0000000065fa2ac8 0000000065fa2190

0000000 0000000000010002 0000000000000000
0000020 0000000065fa2bc8 0000000065fa2ac8

0000000 0000000000010001 0000000000000000
0000020 0000000065fa2190

Well, we actually need to follow the pointers one more time, for example the first ones:
0000000: 0b 00 00 00 01 01 00 00 00 00 00 00 1e 00 00 00 
0000010: 69 03 01 00 00 00 00 00 20 00 00 00 d8 02 00 00
0000050: 90 2a fa 65 00 00 00 00

0000000: 0b 00 00 00 02 01 04 00 00 00 00 00 16 00 00 00
0000010: 06 00 00 00 00 00 00 00 20 00 00 00 d8 03 00 00
0000050: 58 21 fa 65 00 00 00 00

Hmm... still not convincing. Note that we are looking at the first row, whose column projections are "FOOBAR"."KEY"[VARCHAR2,30], "PRODUCTS"."PROD_ID"[NUMBER,22]. The datatypes are 01 and 02 (again, it's better to use something more unique, like INTERVALs... the numbers are the Oracle internal codes, the same ones you see, for example, in a 10046 trace when looking at binds). The lengths are 0x1e (30) and 0x16 (22), and we indeed see them here!
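As a small sketch, the first 16 bytes of the two dumps can be decoded like this. The field positions (datatype at byte 4, length as a 4-byte little-endian integer at byte 12) are only inferred from these two samples, so treat them as assumptions rather than a documented layout; parse_descriptor is a made-up helper name.

```python
import struct

# Oracle internal datatype codes for the two types seen in this example.
TYPE_NAMES = {1: "VARCHAR2", 2: "NUMBER"}


def parse_descriptor(hexdump: str):
    """Return (datatype name, length) from the first 16 bytes of a dump."""
    raw = bytes.fromhex(hexdump)
    dtype = raw[4]                                  # assumed: datatype code at offset 4
    (length,) = struct.unpack_from("<I", raw, 12)   # assumed: 4-byte length at offset 12
    return TYPE_NAMES.get(dtype, str(dtype)), length


first = parse_descriptor("0b 00 00 00 01 01 00 00 00 00 00 00 1e 00 00 00")
second = parse_descriptor("0b 00 00 00 02 01 04 00 00 00 00 00 16 00 00 00")
print(first, second)
```

The two results line up with the [VARCHAR2,30] and [NUMBER,22] projections of the first plan row.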

The data at 0x50 are actually ordinary pointers - it's a reminder that we are working on a little-endian architecture and that it makes a difference whether we look at the data as bytes or as 8-byte chunks. If this is not obvious to you, I recommend stopping here for a while and reading - and understanding - the Wikipedia explanation of endianness. This switching back and forth between byte output and 64-bit pointers is something one has to do automatically, at almost every dump.
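For readers who want to check the byte-to-pointer conversion, here is a short sketch using the two 8-byte sequences from offset 0x50 above. It only relies on Python's built-in int.from_bytes; the variable names are mine.

```python
# The 8 bytes at offset 0x50 of each dump, exactly as shown byte by byte.
raw_first = bytes.fromhex("90 2a fa 65 00 00 00 00")
raw_second = bytes.fromhex("58 21 fa 65 00 00 00 00")

# Little endian: the least significant byte is stored first.
ptr_first = int.from_bytes(raw_first, "little")
ptr_second = int.from_bytes(raw_second, "little")

# Reading the same bytes as big endian gives a meaningless value.
wrong = int.from_bytes(raw_first, "big")

print(hex(ptr_first), hex(ptr_second))
```

Read as little-endian 64-bit values, the bytes turn into addresses in the same 0x65fa.... range as the other pointers in the dumps.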

0000000 0000000000000000 0000000065fa2130
0000020 0000000065fa2108

And following two more pointers, we arrive at the names of the table and the column! (Again, it's useful to be able to spot ASCII values in hex, in order to know when to switch to text output.)
1900 0000 0800 5052 4f44 5543 5453 0000  ......PRODUCTS..
0000 0000 0700 5052 4f44 5f49 4400 0000  ......PROD_ID...

In other test cases, we'd see that even the first pointer might not be null and can point to the schema name. We would also confirm that the 08 and 07 are the lengths of the name strings.
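A minimal sketch of decoding these name structures follows. It assumes the length is a 2-byte little-endian value at offset 4 and that the name starts right after it at offset 6; that fits both samples, but it is a guess from two dumps, and the meaning of the first four bytes is left open. parse_name is a hypothetical helper.

```python
import struct


def parse_name(hexdump: str) -> str:
    """Extract the length-prefixed ASCII name from one 16-byte dump line."""
    raw = bytes.fromhex(hexdump)
    (length,) = struct.unpack_from("<H", raw, 4)   # assumed: 2-byte length at offset 4
    return raw[6:6 + length].decode("ascii")       # assumed: name starts at offset 6


table_name = parse_name("1900 0000 0800 5052 4f44 5543 5453 0000")
column_name = parse_name("0000 0000 0700 5052 4f44 5f49 4400 0000")
print(table_name, column_name)
```

Using the stored length rather than scanning for a zero byte keeps the sketch independent of whether the names are null-terminated.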


Now we know where to find the plan as a tree - and we can even see the columns, their datatypes, lengths and which table they come from. Next time we will look at the filters and predicates - and see that there is more structure to all of this than first meets the eye.

