Tuesday, July 10, 2012

inserted value too large for column - in a delete?

One of the well-known errors everyone meets in Oracle is ORA-01401: inserted value too large for column (while cursing Oracle for not returning ORA-12899: value too large for column ... with the details instead).
The error is usually right - you tried to insert a value too large for the column definition, so Oracle rejects the statement. Although it mentions insert, one can understand such an error for updates, too.

But recently I got this error for a delete statement. And no, there was no trigger nor any other advanced feature doing an insert in the background. So why on Earth, why?

The application (the apply process of Dbvisit Replicate) was using OCI, and the error indeed came from the OCIStmtExecute call. As it turns out, however, the devil is somewhere else.

One of the confusing topics in OCI is string handling - there are four ways to pass a string to Oracle, depending on whether it is null-terminated or length-specified and whether it should be space-padded (think CHAR vs. VARCHAR2 semantics):
  • SQLT_CHR: varchar2-style, length-specified
  • SQLT_STR: varchar2-style, null-terminated
  • SQLT_AFC: char-style, length-specified
  • SQLT_AVC: char-style, null-terminated
  • (and a few more for LONGs, RAWs and a different API for LOBs)
The application uses SQLT_AFC, as the data it gets is fed directly from the source database, so no padding magic is required (even worse, with SQLT_CHR the values do not match for some data combinations, due to the padding). However, for some strange historical reason, the CHAR() datatype is limited to 2000 bytes - and this limit carries over to SQLT_AFC, although it is not mentioned in the OCI documentation. What's even more interesting, this happened only when the value was specified as UTF-16.

And yes, this is exactly the reason for the ORA-01401 - binding a 4000-byte string to SQLT_AFC and setting OCI_ATTR_CHARSET_ID to OCI_UTF16ID resulted in "inserted value too large for column" when the statement was executed...
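
For illustration, a minimal sketch of the failing kind of bind (the function name, bind position and buffer are mine; handle setup and statement preparation are assumed to happen elsewhere):

#include <oci.h>

/* Bind a UTF-16 buffer with CHAR semantics (SQLT_AFC, length-specified).
   Assumes stmthp/errhp are valid and the statement (e.g. a DELETE with
   a :1 placeholder) is already prepared. */
sword bind_utf16_afc(OCIStmt *stmthp, OCIError *errhp,
                     utext *value, ub4 value_bytes)
{
    OCIBind *bindhp = NULL;
    ub2 csid = OCI_UTF16ID;
    sword rc;

    /* SQLT_AFC: char-style, length given in bytes, no null terminator */
    rc = OCIBindByPos(stmthp, &bindhp, errhp, 1,
                      (void *)value, (sb4)value_bytes, SQLT_AFC,
                      NULL, NULL, NULL, 0, NULL, OCI_DEFAULT);
    if (rc != OCI_SUCCESS)
        return rc;

    /* declare the bind buffer as UTF-16; with value_bytes close to 4000,
       this combination raised ORA-01401 on OCIStmtExecute, even for a DELETE */
    return OCIAttrSet(bindhp, OCI_HTYPE_BIND, &csid, 0,
                      OCI_ATTR_CHARSET_ID, errhp);
}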

Versions: Linux 64-bit, Oracle 10.2 and 11.2, using Oracle Instant Client 11.1.0.7.

Wednesday, April 27, 2011

Manual archiving of redo logs gone with 9i – or is it not?

Originally published on Dbvisit blog at http://blog.dbvisit.com/manual-archiving-of-redo-logs-gone-with-9i-or-is-it-not/.

If your memory dates back to the pre-10g times, you may recall the steps needed to enable archivelog mode on a database back then. Perhaps the greatest catch was that the online logs did not get archived automatically by default – you had to do it manually.

Or, you could let Oracle come to the rescue and have the ARCH process(es) do it for you, by starting them with log_archive_start = true (or even dynamically with alter system archive log start/stop). You could even start ARCH in noarchivelog mode, in which case it did completely nothing, just wrote error messages to the alert.log.
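
For reference, a sketch of those 9i-era commands (from memory):

-- init.ora: have ARCH pick up filled online logs automatically (obsolete in 10g)
log_archive_start = true

-- or start/stop automatic archiving dynamically:
ALTER SYSTEM ARCHIVE LOG START;
ALTER SYSTEM ARCHIVE LOG STOP;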

One of the new features of 10g was getting rid of all of this – starting ARCH is governed solely by the (no)archivelog mode, and the DBA does not need to fiddle with anything else to get it right. Great applause, many thanks, scene over.

Well, all of the above is just a short transcript of my memory of those great times, and I lived with it happily and well until this week.

As it turns out, the story with 10g is not so straightforward – although it is true that ARCH works as expected if you just set no/archivelog mode, the manual way of archiving was in fact not removed. Only the way to enable it is new:

To enable manual archiving mode, use alter database archivelog manual. The rest is the same as with 9i – use alter system archive log commands.

One consequence of this change – the log_mode in v$database can be archivelog, noarchivelog (as in 9i and before) or manual (from 10g on). This also means that if you want to detect whether a database runs in archivelog mode, you have to check for manual log_mode as well.
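
Putting it together, a short sketch of the 10g+ manual mode (statements as documented; run from a mounted instance):

-- enable manual archiving mode (database mounted, not open)
ALTER DATABASE ARCHIVELOG MANUAL;

-- then archive the filled online logs by hand, as in 9i
ALTER SYSTEM ARCHIVE LOG ALL;

-- detection: log_mode now reports MANUAL, not ARCHIVELOG
SELECT log_mode FROM v$database;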

Sunday, April 24, 2011

Installing RAC on Oracle VM

Originally published on Dbvisit blog at http://blog.dbvisit.com/installing-rac-on-oracle-vm/.

One of the nice features of Oracle VM (and perhaps the most compelling one when you are evaluating different virtualization products for testing and development) is the comprehensive list of prepackaged templates available at Oracle e-delivery.

The list is not complete and probably cannot ever be, as Oracle is expanding its product offerings very fast. Of the missing basic configurations, there is for example no Oracle Database on Solaris template (although plain Solaris is available).

However, let’s look at the RAC templates. The choice there is comprehensive: you can pick 11.1 or 11.2 on Oracle Enterprise Linux, both 32- and 64-bit.

The first-time setup is quite quick and easy if you follow the supplied PDF documentation. The usual setup is a 2-node RAC, but you can semi-manually add further nodes as well.

Still, there is a catch: the PDF declares that one of the prerequisites is to have two physical network cards. Strange, isn’t it? After all, a RAC node needs just one network card for the outside world; the second one connects only to the other nodes and should be virtual only, right? Well, not according to the docs. You have to set the second network interface of each RAC node to use xenbr1 (the second network bridge) to connect to the other nodes, but this bridge is simply not present in Oracle VM if you have just one physical network interface.

If you ignore the requirement and go on with the VM creation, you will face problems very soon: after you specify the primary/secondary node (the first question after startup), the nodes don’t see each other. Or they do see each other, but the subsequent connectivity tests fail on eth1. (This depends on the template used.)

The solution is in fact simple: if RACs need a separate xenbr1 bridge interface, just give it to them:

  • create the bridge: brctl addbr xenbr1
  • bring the bridge up: ifconfig xenbr1 up
And then, according to the docs, set the VIF1 interfaces to use xenbr1. If you do it by trial and error as I did, you may end up with VM assigning the VIFs incorrectly to the bridges (both to the same one, etc.). In that case, edit vm.cfg for your virtual machine (in /OVS/running_pool/) and check that VIF0 is really set to xenbr0 and VIF1 to xenbr1. But if you first set up xenbr1 and only then start creating your RAC, you should be fine.

As usual, to make the configuration permanent, you must make sure it survives reboots. Just adding the two aforementioned commands to /etc/rc.local should do the trick, as sketched below.
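
A sketch of that persistent setup (the vm.cfg vif line is an assumption; check your template’s actual format):

# append to /etc/rc.local so xenbr1 is recreated on every boot
brctl addbr xenbr1
ifconfig xenbr1 up

# and in /OVS/running_pool/<your VM>/vm.cfg, each VIF should name its bridge,
# e.g. (exact line depends on your template):
# vif = ['bridge=xenbr0', 'bridge=xenbr1']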

Tuesday, July 20, 2010

How to move flashback data archive to a different tablespace?

The documentation says the following about ALTER FLASHBACK ARCHIVE REMOVE TABLESPACE:
If the tablespace to be removed contains any data within the retention period of the flashback archive, then that data will be dropped as well. Therefore, you should move your data to another tablespace before removing the tablespace with this clause.


However, the question is: how do you move the data?
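
The obvious idea is a plain ALTER TABLE ... MOVE on the internal history tables (a sketch, with a hypothetical target tablespace fda_new):

ALTER TABLE "SCOTT"."SYS_FBA_DDL_COLMAP_69926" MOVE TABLESPACE fda_new;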

Any such attempt to move the internal tables stored in the flashback archive raises:
ORA-55622: DML, ALTER and CREATE UNIQUE INDEX operations are not allowed on table "SCOTT"."SYS_FBA_DDL_COLMAP_69926"

And the response from Oracle Support is simple:
What you are trying to do is not currently possible. You cannot move the FDA history tables from one tablespace to another, nor can you move the FDA to another tablespace using ALTER FLASHBACK ARCHIVE.

Development, however, acknowledges that this is a desired feature that they will look into adding in a future release.

Wednesday, December 2, 2009

ORA-27048: skgfifi: file header information is invalid

I was asked to analyze a situation where an attempt to recover an 11g (standby) database resulted in a bunch of "ORA-27048: skgfifi: file header information is invalid" errors.

I tried to reproduce the error on my test system, using different versions (EE, SE, 11.1.0.6, 11.1.0.7), but to no avail. Fortunately, I finally got to the failing system:

SQL> recover standby database;
ORA-00279: change 9614132 generated at 11/27/2009 17:59:06 needed for thread 1
ORA-00289: suggestion :
/u01/flash_recovery_area/T1/archivelog/2009_11_27/o1_mf_1_208_%u_.arc
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-00280: change 9614132 for thread 1 is in sequence #208


Interestingly, nothing is written to the alert.log or the trace file - just the ORA-27048, with no detail, so you cannot see which file is affected from there.

However, don't lose hope - there is always one more thing to try: strace the process (fortunately, this was a Linux box, not a Windows one).


sqlplus / as sysdba
SQL> !ps -ef
SQL> !strace -o ~/strace.trc -p <PID> &
SQL> recover standby database;
SQL> exit

That is, run your sqlplus, get the PID of your server process and strace it. (With the usual caveats: that you are running sqlplus locally and that you have a dedicated server process... well, you have to have one, you're doing a recovery.)
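
If you do not want to pick the PID out of the ps listing by eye, you can also ask the database for it directly (a standard dictionary query, shown as a sketch):

SQL> select p.spid from v$process p, v$session s
     where p.addr = s.paddr
     and s.sid = sys_context('userenv', 'sid');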

And voilà, it's there! The server process does a lot of gettimeofday() calls (the Oracle wait interface measuring the time of code pieces, perhaps), opens datafiles, opens the alert.log/xml and trace files, writes to them... and traverses the FRA! (The listing below is just a very short sample.)

access("/u01/flash_recovery_area/T1/archivelog/", R_OK) = 0
stat64("/u01/flash_recovery_area/T1/archivelog/", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
readlink("/u01/flash_recovery_area/T1/archivelog/", 0xbfe0c5c0, 512) = -1 EINVAL (Invalid argument)
open("/u01/flash_recovery_area/T1/archivelog/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 21
fstat64(21, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
fcntl64(21, F_SETFD, FD_CLOEXEC) = 0
getdents(21, /* 20 entries */, 4096) = 476
access("/u01/flash_recovery_area/T1/archivelog/2007_08_15", R_OK) = 0
access("/u01/flash_recovery_area/T1/archivelog/2007_08_15", R_OK) = 0
stat64("/u01/flash_recovery_area/T1/archivelog/2007_08_15", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
readlink("/u01/flash_recovery_area/T1/archivelog/2007_08_15", 0xbfe0c100, 512) = -1 EINVAL (Invalid argument)
open("/u01/flash_recovery_area/T1/archivelog/2007_08_15", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 22
fstat64(22, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
fcntl64(22, F_SETFD, FD_CLOEXEC) = 0
getdents(22, /* 2 entries */, 4096) = 32
getdents(22, /* 0 entries */, 4096) = 0
close(22) = 0

This was the top-level directory and one of its children - an empty one (2 entries in the directory, i.e. "." and "..").

When we get to a non-empty dir, the server process analyzes the actual files there:

access("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_19_3d9wj20f_.arc", R_OK) = 0
access("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_19_3d9wj20f_.arc", R_OK) = 0
stat64("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_19_3d9wj20f_.arc", {st_mode=S_IFREG|0775, st_size=47616, ...}) = 0


Well, interesting, but nothing groundbreaking. However, for some of the files, the server process does something different:

stat64("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_21_3d9wspto_.arc.gz", {st_mode=S_IFREG|0775, st_size=24228, ...}) = 0
statfs("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_21_3d9wspto_.arc.gz", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=10119516, f_bfree=4882707, f_bavail=4368756, f_files=5144576, f_ffree=4885233, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_21_3d9wspto_.arc.gz", O_RDONLY|O_LARGEFILE) = 21
fstatfs(21, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=10119516, f_bfree=4882707, f_bavail=4368756, f_files=5144576, f_ffree=4885233, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
lseek(21, 0, SEEK_SET) = 0
read(21, "\37\213\10\0106\363\304F\0\3o1_mf_1_21_3d9wspto_.a"..., 512) = 512
close(21) = 0


Yes, it opens them and reads their first 512 bytes - the file header! And if the file header is invalid (as it is for the file above - being gzipped, it is no longer a valid Oracle file), the skgfifi function signals an error (as can be confirmed by removing all the .gz files and trying the recovery again).

Just a final note - the server process skipped some of the .gz files and read the rest of them; I could not find a simple rule for how it decided.
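
If you run into this, the straightforward fix is to move the compressed files out of the FRA before recovering (the target directory below is hypothetical):

find /u01/flash_recovery_area/T1/archivelog -name '*.arc.gz' \
  -exec mv {} /u01/arch_gz_backup/ \;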

Just a few concluding notes:

  • The reaction of Oracle Support to this was: Oracle does not recommend compressing files within the FRA.
  • If you use "set logsource" in SQL*Plus, the scanning does not happen, as there is simply just one directory to look into (note that set logsource will not work with the FRA itself, due to its YYYY_MM_DD directories); see the example after this list.
  • For the background info (who came up with the problem and how they solved it), visit the Dbvisit blog - they stumbled upon it while supporting their "standby on Standard Edition" solution.
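
For completeness, the set logsource workaround mentioned above (the flat archive directory is hypothetical):

SQL> set logsource /u01/arch_flat
SQL> recover standby database;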

Monday, November 9, 2009

11g OCM update

Just a quick note: Oracle has set the 11g OCM upgrade exam release date to November 30, 2009. However, the schedule is still "coming soon"...

Friday, October 23, 2009

OCM 11g

After all, I did learn one new thing I did not expect to find out at OOW: at the OCP Lounge (where, contrary to promises, no networking took place), I was told some news regarding the 11g OCM exam.
While it comes from a supposedly trustworthy source, don't take the following as 100% sure; still, it's worth knowing that:
- the 11g OCM Upgrade exam is scheduled to start late November this year, and will be delivered on 11gR1.
- the full 2-day 11g OCM exam has no schedule yet, and will be delivered on 11gR2 from its start.

The only trouble I see is the attendance - with the 10g OCM, every local Oracle office would cancel it if only one person signed up for it - two persons was the minimum. With the upgrade, how many people will sign up?

I plan to take it quite soon after it's available, somewhere in Europe (the UK usually has the densest exam schedule, while countries like the Czech Republic/Slovakia/Italy etc. have lower prices for the exam).
Drop me a line if you would be interested in taking the exam, so we can push some local Oracle office to actually DO the exam instead of cancelling it again.