Skip to main content

ORA-27048: skgfifi: file header information is invalid

I was asked to analyze a situation, when an attempt to recover a 11g (standby) database resulted in bunch of "ORA-27048: skgfifi: file header information is invalid" errors.

I tried to reproduce the error on my test system, using different versions (EE, SE, 11.1.0.6, 11.1.0.7), but to no avail. Fortunately, I finally got to the failing system:

SQL> recover standby database;
ORA-00279: change 9614132 generated at 11/27/2009 17:59:06 needed for thread 1
ORA-00289: suggestion :
/u01/flash_recovery_area/T1/archivelog/2009_11_27/o1_mf_1_208_%u_.arc
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-27048: skgfifi: file header information is invalid
ORA-00280: change 9614132 for thread 1 is in sequence #208


Interestingly, nothing interesting is written to alert.log nor the trace file - just the ORA-27048, with no detail, so you cannot see which file is affected from there.

However, don't lose hope, there is always one thing to try: strace the process (fortunately this was a Linux box, not a Windows one).


sqlplus / as sysdba
SQL> !ps -ef
SQL> !strace -o ~/strace.trc -p <PID> &
SQL> recover standby database;
SQL> exit

That is, run you sqlplus, get the PID of your server process and strace it. (With the usual caveat that you are running the sqlplus locally, that you have a dedicated server process... well you have to, you're doing a recovery.)

And voilĂ , it's there! The server process does a lot of gettimeofday() calls (Oracle wait interface measuring time of code pieces, perhaps), opens datafiles, opens alert.log/xml and trace, writes to them... and traverses the FRA! (the listing below is just a very short sample)

access("/u01/flash_recovery_area/T1/archivelog/", R_OK) = 0
stat64("/u01/flash_recovery_area/T1/archivelog/", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
readlink("/u01/flash_recovery_area/T1/archivelog/", 0xbfe0c5c0, 512) = -1 EINVAL (Invalid argument)
open("/u01/flash_recovery_area/T1/archivelog/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 21
fstat64(21, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
fcntl64(21, F_SETFD, FD_CLOEXEC) = 0
getdents(21, /* 20 entries */, 4096) = 476
access("/u01/flash_recovery_area/T1/archivelog/2007_08_15", R_OK) = 0
access("/u01/flash_recovery_area/T1/archivelog/2007_08_15", R_OK) = 0
stat64("/u01/flash_recovery_area/T1/archivelog/2007_08_15", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
readlink("/u01/flash_recovery_area/T1/archivelog/2007_08_15", 0xbfe0c100, 512) = -1 EINVAL (Invalid argument)
open("/u01/flash_recovery_area/T1/archivelog/2007_08_15", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 22
fstat64(22, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
fcntl64(22, F_SETFD, FD_CLOEXEC) = 0
getdents(22, /* 2 entries */, 4096) = 32
getdents(22, /* 0 entries */, 4096) = 0
close(22) = 0

This was the top-level directory and one of its children - an empty one (2 entries in directory, i.e. "." and "..").

When we get to a non-empty dir, the server process analyzes the actual files there:

access("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_19_3d9wj20f_.arc", R_OK) = 0
access("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_19_3d9wj20f_.arc", R_OK) = 0
stat64("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_19_3d9wj20f_.arc", {st_mode=S_IFREG|0775, st_size=47616, ...}) = 0


Well, interesting, but nothing groundbreaking. However, for some of the files, the server process does something different:

stat64("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_21_3d9wspto_.arc.gz", {st_mode=S_IFREG|0775, st_size=24228, ...}) = 0
statfs("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_21_3d9wspto_.arc.gz", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=10119516, f_bfree=4882707, f_bavail=4368756, f_files=5144576, f_ffree=4885233, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open("/u01/flash_recovery_area/T1/archivelog/2007_08_17/o1_mf_1_21_3d9wspto_.arc.gz", O_RDONLY|O_LARGEFILE) = 21
fstatfs(21, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=10119516, f_bfree=4882707, f_bavail=4368756, f_files=5144576, f_ffree=4885233, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
lseek(21, 0, SEEK_SET) = 0
read(21, "\37\213\10\0106\363\304F\0\3o1_mf_1_21_3d9wspto_.a"..., 512) = 512
close(21) = 0


Yes, it opens them and reads first 512 bytes of them - the file header! And if the file header is invalid (as is for the file above, as it is a gzipped file, so no longer an Oracle file), the skgfifi function signals an error (as can be confirmed by removing all .gz files and trying the recovery again).

Just a final note - the server process skipped some of the .gz files, read the rest of them - I could not find a simple rule how it decided.

Just a few conclusing note:

  • The reaction of Oracle Support for this was: Oracle does not recommend compressing files within FRA.
  • If you use "set logsource" in SQL*plus, the scanning does not happen, as there is simply just one directory to look into (set logsource will not work with FRA, due to the YYYY_MM_DD dirs in FRA).
  • For the background info (who came up with the problem, how do they solve it), visit Dbvisit blog - they stumbled upon it while supporting their "standby on Standard Edition" solution.

Comments

Popular posts from this blog

PDB saving state does not save its state on shutdown

When 12.1.0.1 came out, one of the gripes was that upon a CDB start, all the PDBs were in the mounted mode. The DBA had to open them manually, or use a database trigger to do that.

12.1.0.2 introduced SAVE STATE - according to the docs:
For example, if a PDB is in open read/write mode before the CDB is restarted, then the PDB is in open read/write mode after the CDB is restarted; if a PDB is in mounted mode before the CDB is restarted, then the PDB is in mounted mode after the CDB is restarted.
The trouble is that this is simply wrong, it does not work like this. Oracle has a table externalized as  DBA_PDB_SAVED_STATES and this stores the state. The table is updated only by the SAVE STATE command - and reflects the status when the SAVE STATE was issued, not when the database goes down.
It simply stores the open mode of the database and the CDB will open the database in this mode when the CDB opens. Lack of a row implies MOUNTED mode, i.e. the CDB won't do anything.
The row is dele…

A few thoughts about OCM 12c upgrade

Yesterday I sat for the 12c OCM upgrade exam, which I mentioned in few blog posts before. The first step after checking your ID is of course signing the NDA, and thus you won't find much real information here.

This time I chose Utrecht as the place to take the exam. Not that I have any special preference, I took each of the exams in a different place so far. The only requirements were convenient time and location defined as 'somewhere in Europe'. But in the end, Utrecht turned out to be a good place. Oracle NL headquarters are easy accessible, it's a very new building, the lunch was good:-)
And the city is nice to see.

Regarding the exam, the usual important notes still hold true:

Arrive on time. It's a long day and you will have a lot of things to do.You will work hard the whole day. Get a good sleep before, be well rested.Review the exam topics well. Note that they may have change over time. There is for example an update as of January 1, 2016: Flex ASM was added…