Sunday, February 28, 2016

PDB saving state does not save its state on shutdown

When multitenant first came out, one of the gripes was that upon a CDB start, all the PDBs were in mounted mode. The DBA had to open them manually, or use a database trigger to do that. A later release introduced SAVE STATE - according to the docs:
For example, if a PDB is in open read/write mode before the CDB is restarted, then the PDB is in open read/write mode after the CDB is restarted; if a PDB is in mounted mode before the CDB is restarted, then the PDB is in mounted mode after the CDB is restarted.

The trouble is that this is simply wrong; it does not work like this. Oracle has a table, externalized as DBA_PDB_SAVED_STATES, that stores the state. The table is updated only by the SAVE STATE command, and it reflects the status when SAVE STATE was issued, not when the database goes down.
It simply stores the open mode of the PDB, and the CDB will open the PDB in this mode when the CDB opens. Lack of a row implies MOUNTED mode, i.e. the CDB won't do anything.
The row is deleted by the DISCARD STATE command - or by issuing SAVE STATE when the PDB is mounted.

Let's see a short example: P2 does not have a saved state. We open it read only, save the state, open it read write and restart the database. P2 comes up as read only - the state it was in when we saved the state, not the state it was in when we shut the CDB down. The saved state is still OPEN READ ONLY.

SQL> select name, open_mode from v$pdbs;

NAME          OPEN_MODE
------------- ----------
P2            MOUNTED

SQL> alter pluggable database p2 open read only;

Pluggable database altered.

SQL> alter pluggable database p2 save state;

Pluggable database altered.

SQL> select con_name, state from DBA_PDB_SAVED_STATES;

---------- --------------

SQL>  alter pluggable database p2 close;

Pluggable database altered.

SQL>  alter pluggable database p2 open;

Pluggable database altered.

SQL>  select name, open_mode from v$pdbs;

NAME          OPEN_MODE
------------- ----------
P2            READ WRITE

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  838860800 bytes
Fixed Size      2929936 bytes
Variable Size    335547120 bytes
Database Buffers   494927872 bytes
Redo Buffers      5455872 bytes
Database mounted.
Database opened.
SQL>  select name, open_mode from v$pdbs;

NAME          OPEN_MODE
------------- ----------
P2            READ ONLY

SQL>  select con_name, state from DBA_PDB_SAVED_STATES;

---------- --------------
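
Because the saved state only changes when SAVE STATE is issued, one way to keep it in line with the actual open mode is to re-issue SAVE STATE right after every change. Below is a minimal sketch of that idea, as a shell function that only prints the statements. The function name gen_open_and_save is made up for illustration, and it assumes the PDBs passed in are currently mounted.

```shell
#!/bin/sh
# Print SQL that opens each named PDB and immediately saves its state,
# so the saved state matches the mode the PDB was just put into.
# gen_open_and_save is a hypothetical helper; it only prints, it does not
# connect to any database.
gen_open_and_save() {
    for pdb in "$@"; do
        printf 'alter pluggable database %s open;\n' "$pdb"
        printf 'alter pluggable database %s save state;\n' "$pdb"
    done
}

# Example: statements for the P2 PDB used in this post
gen_open_and_save p2
```

The output can be reviewed and then fed to sqlplus connected to the CDB root.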

Friday, February 19, 2016

Small addendum to the lying (Data Guard) broker

A friend of mine (Deiby Gómez) pointed me to an interesting article on MOS 1956103.1 - Warning: standby redo logs not configured for thread on db_unique_name/db_unique_name.

It essentially describes the same issue I described in "Don't trust the lying (Data Guard) broker" - the newly created SRLs are not assigned to a particular thread and the VALIDATE command does not like it, although the standby is perfectly happy and will grab the SRLs as necessary, as it always did before 12c.

The Metalink note adds a solution - to assign the SRLs to the threads manually during creation. The syntax is:
alter database add standby logfile thread 1 group 1 'file spec' size ....

This thus disables the auto-assign to the thread that needs it, but that should not matter. We usually size all the threads uniformly and assign enough SRLs to all of them, in other words we expect even distribution of SRLs to threads. Thus doing it manually is not a bad thing.
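
For an even distribution, the statements can be generated instead of typed. The following is a rough sketch, not taken from the MOS note: gen_srl_sql is a hypothetical helper, and the directory, file naming, size and starting group number are all placeholders to adapt (the group numbers must not clash with existing log groups).

```shell
#!/bin/sh
# gen_srl_sql THREADS SRLS_PER_THREAD SIZE FIRST_GROUP DIR
# Prints one "add standby logfile" statement per SRL, each bound to a thread.
# All arguments are illustrative; nothing is executed against a database.
gen_srl_sql() {
    threads=$1; per_thread=$2; size=$3; group=$4; dir=$5
    t=1
    while [ "$t" -le "$threads" ]; do
        i=1
        while [ "$i" -le "$per_thread" ]; do
            printf "alter database add standby logfile thread %d group %d '%s/srl_t%d_g%d.log' size %s;\n" \
                "$t" "$group" "$dir" "$t" "$group" "$size"
            group=$((group + 1))
            i=$((i + 1))
        done
        t=$((t + 1))
    done
}

# Example: 1 thread with 3 online redo log groups, so 4 SRLs, starting at group 11
gen_srl_sql 1 4 512m 11 /u01/oradata/CDB5
```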

Tuesday, February 9, 2016

How many columns in a query

Everybody knows that the limit for the number of columns in an Oracle table is 1000. It is actually the limit for all columns in the table, including internal ones, virtual ones, unused but not yet dropped ones, and so on.

But what is the limit for a query?

Let's start with a simple table, called many_columns. It has 1000 columns, all NUMBERs, to make things easy. Columns are named COLUMN_0001 to COLUMN_1000.
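
Nobody wants to type 1000 column definitions by hand, so the DDL is best generated. This is a minimal sketch of one way to do it in shell; gen_many_columns_ddl is a made-up helper name, and using the same COLUMN_nnnn naming for the smaller table is an assumption.

```shell
#!/bin/sh
# gen_many_columns_ddl TABLE NCOLS
# Prints a CREATE TABLE statement with NCOLS NUMBER columns
# named COLUMN_0001, COLUMN_0002, ...
gen_many_columns_ddl() {
    table=$1; ncols=$2
    printf 'create table %s (\n' "$table"
    i=1
    while [ "$i" -le "$ncols" ]; do
        # every column but the last is followed by a comma
        if [ "$i" -lt "$ncols" ]; then sep=','; else sep=''; fi
        printf '  COLUMN_%04d number%s\n' "$i" "$sep"
        i=$((i + 1))
    done
    printf ');\n'
}

# Example: a small 10-column table; use 1000 for many_columns
gen_many_columns_ddl many_columns2 10
```

Redirect the output to a file and run it from SQL*Plus to create the table.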

And I insert 1 row into the table:

insert into many_columns(COLUMN_0001) values (1);

So what happens with an innocent query?

select m.*, n.* from many_columns m, many_columns n;

Well, nothing special - SQL*Plus is happy to return 2000 columns.

Obviously, there must be an upper limit, right? At the very maximum, OCI specifies the value for column count as ub2, i.e. max 65535.
However, SQL*Plus complains much sooner: the limit seems to be 8150. I added one more table - many_columns2 with just ten columns. The first query to go over the limit, with 8151 columns, fails with:

many_columns m01,
many_columns m02,
many_columns m03,
many_columns m04,
many_columns m05,
many_columns m06,
many_columns m07,
many_columns m08,
many_columns2 m10,
many_columns2 m11,
many_columns2 m12,
many_columns2 m13,
many_columns2 m14,
many_columns2 m15,
many_columns2 m16,
many_columns2 m17,
many_columns2 m18,
many_columns2 m19,
many_columns2 m20,
many_columns2 m21,
many_columns2 m22,
many_columns2 m23,
many_columns2 m24,
many_columns2 m25,

ERROR at line 1:
ORA-00913: too many values

However, in more complex situations, Oracle will complain much sooner.
select *
from   many_columns
right outer join (select count(*) c, count(*) c2 from dual) on (c=column_0001);
ERROR at line 2:
ORA-01792: maximum number of columns in a table or view is 1000

However, this is version dependent. Running the same test against the same tables on another version in my environment, Oracle does not complain about this.

Tuesday, February 2, 2016

Docker machine - wonderful idea, too many bugs?

When doing various experiments with docker, I painfully realized that btrfs leaves a lot to be desired.

Wonderful idea, terrible user experience. First of all, df lies, and you are supposed to run btrfs balance often. Maybe it's because of the way docker uses it - it creates a ton of large images.
Eventually you touch all chunks and rebalance stops working completely. Now you desperately delete things, hoping to get a chunk free and let rebalance get things back in order. Or not - and you end up nuking the server and reinstalling.
Or you perhaps end up crashing the server and the btrfs won't mount anymore...

So after going through 5 servers (OL7.1 in VBox), I moved onto docker-machine. Wonderful idea - and it does not use btrfs, yay!

However, it also has its bugs... and pretty ugly ones. First of all, the latest stable boot2docker 1.9 has a kernel bug that causes Java processes to become zombies, so the docker container never finishes. See . In my case, it means that Oracle database software installation never finishes.

Ok, the link says it's fixed in the upcoming 1.10 image. And indeed it is - and it's very easy to switch to it, just add --virtualbox-boot2docker-url= or similar to docker-machine create. Oracle then installs fine.
However, another bug emerges: nobody can ptrace a process. This includes gdb - it cannot attach to a running process, which makes it completely useless.

Attaching to process 24

ptrace: Operation not permitted.

Let's hope this is fixed soon...
(See my post at )

Thursday, January 21, 2016

Docker: Handling multiple copies of the same database/container

Inspired by Frits Hoogland's excellent article on Oracle running in Docker, I started building a lot of Oracle containers. It's nice to have multiple different Oracle versions available at your fingertips for research, product testing and so on.

However, one thing annoys me with Docker: if you want any usable IPC, you need to use --ipc=host. This means that all the containers share the same IPC namespace and, furthermore, when a container exits it sometimes does not clean up the IPC entries.

As you probably know, Oracle uses IPC for SGA memory and semaphore sets. It identifies which ones belong to which instance by combining SID and ORACLE_HOME.

This in turn means that you cannot run two databases with the same SID and ORACLE_HOME at the same time... which is usually fine, but not so with Docker and --ipc=host. In this case we do want to run multiple containers built off the same image, or perhaps have multiple similar images with the same ORACLE_HOME, differing in minor details only, such as patchset level.

Fortunately it is actually pretty easy to change the ORACLE_SID, without altering the name of the database. The only thing you really need to change is the name of the spfile (or you can specify the name explicitly when starting the database). You should also change the name of the password file, if you use one, and add an entry to /etc/oratab for convenience.

This has to happen when the container is started, not in the image. And you also have to decide how you handle container stop/start: do you want to generate a new name, or do you remember the new names? (Because, as you know, you need the SID to start up the database in the start scripts.)

I decided to go with the first approach, so that on every start I generate a new name. And I just copy the files instead of renaming them, so that the original name is always there and the copy script always finds it, even when executed repeatedly.

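export OLD_SID=SRC
export NEW_SID=`perl -e 'my @c=("A".."Z","a".."z","0".."9");my $s; $s.=$c[rand @c] for 1..8;print $s;'`
# set the environment for the original SID to get ORACLE_HOME
# (assumption: ORAENV_ASK=NO so that oraenv does not prompt)
export ORAENV_ASK=NO
export ORACLE_SID=$OLD_SID
. oraenv
# assumption: spfile and password file are in the default location
cd $ORACLE_HOME/dbs
# copy, not move, so the originals are still there on the next start
cp spfile$OLD_SID.ora spfile$NEW_SID.ora
cp orapw$OLD_SID orapw$NEW_SID
echo "$NEW_SID:$ORACLE_HOME:N" >> /etc/oratab
echo "Generated: $NEW_SID:$ORACLE_HOME:N"
# switch the environment to the new SID
export ORACLE_SID=$NEW_SID
. oraenv
cd -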

You can also see that the names of some files will change. For example, the alert log changed from diag/rdbms/src/SRC/trace/alert_SRC.log to diag/rdbms/src/081b59ce/trace/alert_081b59ce.log.

So, to conclude, note that the purpose of this script is to have a quick and easy way to spin up multiple containers - and it leaves much room for improvement. There are other possibilities, such as statically registering the new SIDs in listener.ora so you can connect to start the instances without knowing the SID, or writing the new SIDs to disk and using them on container restart.
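
The last option, writing the SID to disk and reusing it on restart, could look roughly like the sketch below. It is only an illustration under a few assumptions: get_or_create_sid and the SID file path are hypothetical, the file has to live on storage that survives a container stop/start, and the name is generated with od from /dev/urandom (8 hex characters) rather than with the perl one-liner above.

```shell
#!/bin/sh
# get_or_create_sid FILE
# Prints the SID recorded in FILE; if FILE is missing or empty,
# generates a new 8-character SID, records it in FILE and prints it.
get_or_create_sid() {
    sid_file=$1
    if [ -s "$sid_file" ]; then
        cat "$sid_file"
    else
        # 4 random bytes printed as hex give an 8-character name
        new_sid=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
        printf '%s\n' "$new_sid" > "$sid_file"
        printf '%s\n' "$new_sid"
    fi
}

# Example: the default path here is a placeholder only
NEW_SID=$(get_or_create_sid "${SID_FILE:-${TMPDIR:-/tmp}/oracle_new_sid}")
echo "Using SID: $NEW_SID"
```

With this in place, the copy steps only need to run when the SID was newly generated; on a restart the recorded SID is simply picked up again.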

Wednesday, December 23, 2015

A few thoughts about OCM 12c upgrade

Yesterday I sat for the 12c OCM upgrade exam, which I mentioned in a few blog posts before. The first step after checking your ID is of course signing the NDA, and thus you won't find much real information here.

This time I chose Utrecht as the place to take the exam. Not that I have any special preference; I took each of the exams in a different place so far. The only requirements were a convenient time and a location defined as 'somewhere in Europe'. But in the end, Utrecht turned out to be a good place. The Oracle NL headquarters are easily accessible, it's a very new building, the lunch was good:-)
And the city is nice to see.

Regarding the exam, the usual important notes still hold true:

  1. Arrive on time. It's a long day and you will have a lot of things to do.
  2. You will work hard the whole day. Get a good sleep before, be well rested.
  3. Review the exam topics well. Note that they may have changed over time. There is for example an update as of January 1, 2016: Flex ASM was added.
  4. Learn how to work with the docs - with no search available. You will need the docs, nobody can remember all the syntax and all the arcane settings.
  5. Love your command line. "GUI is not available for every segment of the exam." And anyway, it's much faster to do things in sqlplus. And you will struggle for time.

Now I just have to wait for the results... And for any of you who want to take the exam: Good luck!

Monday, December 21, 2015

Don't trust the lying (Data Guard) broker

One of the new 12c features is the "VALIDATE DATABASE" command. According to the documentation it should do many thorough checks and tell you if all is configured well and correctly. However, there is one catch - or to put it a little more bluntly - a bug. Or two.

You know that you need standby redo logs for SYNC (or the new FASTSYNC) transport mode. The validate command knows that, too. And you know that you should have one more standby redo log than online redo logs. The validate command seems to know this one as well.

However, the checks appear to have one flaw: they test whether the threads (and let's talk here about a single instance, so we have only thread #1) have enough standby redo logs (SRLs) assigned. But when you create SRLs with 'alter database add standby logfile', they are not assigned to any thread. In fact, you get 0 as thread#:

select thread#, sequence# from V$STANDBY_LOG;

THREAD# SEQUENCE#
------- ---------
      0         0
      0         0
      0         0
      0         0

Which is perfectly fine - Oracle waits until the instance actually needs the SRL and only then assigns it to a thread. This makes administration easier.

But the guys responsible for VALIDATE DATABASE do not seem to realize this. So if you have just set up your SRLs and run the validate command - just to see if the config is all ok (e.g. because you just want to change the LogXptMode and protection mode) then you will get a result like this:
Thread #  Online Redo Log Groups  Standby Redo Log Groups Status
              (CDB5)                  (CDB5SBY)
    1         3                       0                       Insufficient SRLs
    Warning: standby redo logs not configured for thread 1 on CDB5SBY

WTF? Yes, the validate command did not understand that we have plenty of SRLs, only that they have not yet been assigned to any thread.

So.. we do a switchover, back and forth, to let both databases touch the SRLs and...

Thread #  Online Redo Log Groups  Standby Redo Log Groups Status
              (CDB5)                  (CDB5SBY)
    1         3                       2                       Insufficient SRLs

And we still receive a warning - although we have created 4 SRLs, only two of which Oracle has required so far...with the other two currently unassigned. Again, VALIDATE DATABASE is not aware of this and complains.

The moral? Don't just trust the command, especially in the beginning, when your configuration is fresh and still settling down. Although that's exactly the time you want to use checks like this.