Planning for Birmingham….

Posted July 25, 2008 by mbobak
Categories: Uncategorized

Or should I say ‘Brum’?

Well, I’ve just been notified that one of my abstract submissions, “Introduction to Locks and Enqueues”, has been accepted by the UKOUG for the 2008 Annual Conference, coming up in December.  I’m really looking forward to it.  This will be my 4th year attending.  It’s also the first year the conference will be expanded to a full 5-day week.  There’s bound to be a ton of great material.  I have to say, even for someone coming from overseas, this conference is well worth your time and money.

See you in Birmingham, er, Brum!

So, there’s two posts, guess I’m on the blogging bandwagon…

Posted June 9, 2008 by mbobak
Categories: Uncategorized

As the subject says, there’s my first two real posts, so, I guess I’m blogging.  I won’t guarantee how active I’ll be here, or how much of what I write will be Oracle, as opposed to other stuff, but, for what it’s worth, here I am.

11g is more deadlock sensitive than 10g?

Posted June 9, 2008 by mbobak
Categories: Uncategorized

I ran into a situation over the weekend, where an application and schema, which were stable under 10.2.0.3, started hitting ORA-00060 deadlocks in 11.1.0.6, in spite of the fact that no application code changes had occurred.  It seems that 11g was more sensitive to deadlocks in this situation than 10gR2 was.

The situation developed this way.  We are starting to work with 11g.  We have a brand new application, built from the ground up, so we thought we’d give 11g a try: there is no legacy code base, and it seemed like a good opportunity to get our feet wet.  We have an RMAN-based backup system, so, to be able to back up the new 11g development database, I needed to upgrade the RMAN catalog and catalog database to 11g.  When attempting to do so, I ran into an upgrade bug, which, without getting into the ugly details, meant that I needed to restore my catalog database from backup.

I opened an SR with Oracle, and after much going round and round, they determined that my best course of action would be to upgrade the 10g database to 11g, and then NOT upgrade the 10g catalog to 11g (which is where I hit the bug initially).  Instead, I would leave the catalog owned by the ‘RMAN’ user at 10.2.0.3, create a new user, ‘RMAN11G’, and create a brand new 11g catalog there.  Now I can continue to back up all my pre-11g databases to the RMAN user, and when I have an 11g database, I connect to the same RMAN catalog database, but as RMAN11G rather than as RMAN.  So, I did the upgrade on Friday night, and everything seemed to go well.  I tried a few archive log backups, just to make sure everything was fine.

So, that’s a bit of the background.

On Saturday, I start getting pages: backups are failing.  And they’re failing with ORA-00060, deadlock detected.  Huh?  I’ve never heard of this happening before.  This makes no sense to me.  A quick look at the backup log, and it appears to happen on catalog resync.  So, I look at a sampling of the trace files, and they all seem to be deadlocking on TM enqueues with mode held ‘SX’ and mode waited on ‘SSX’.  So, this is definitely due to unindexed foreign keys, that is, foreign key columns in the child tables that have no index on them.  No doubt in my mind about that.  So, I go over to Steve Adams’ website, and find the script that identifies all the columns used in foreign key relationships that are missing indexes, and I slightly modify that script to generate DDL to create the missing indexes.  I run that, create all the missing indexes, and the deadlock problem goes away.
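For anyone who wants to see the idea behind that kind of check, here is a minimal sketch in Python.  To be clear, this is NOT Steve Adams’ script, and it is not the modified version I ran.  It assumes you have already pulled the foreign key columns and the index columns out of the data dictionary (for example, from USER_CONS_COLUMNS and USER_IND_COLUMNS) into plain tuples.  The function name, the tuple layout, the table names in the example, and the “_IX” index naming are all made up for illustration.  A foreign key is treated as covered when its columns are the leading columns of some index on the same table, in any order.

```python
from collections import defaultdict


def missing_fk_index_ddl(fk_columns, index_columns):
    """Return CREATE INDEX statements for foreign keys with no covering index.

    fk_columns:    iterable of (table, constraint, column, position) tuples
    index_columns: iterable of (table, index, column, position) tuples
    Both layouts are assumptions made for this sketch.
    """
    # Group foreign key columns by (table, constraint).
    fks = defaultdict(list)
    for table, constraint, column, position in fk_columns:
        fks[(table, constraint)].append((position, column))

    # Group index columns by table, then by index.
    indexes = defaultdict(lambda: defaultdict(list))
    for table, index, column, position in index_columns:
        indexes[table][index].append((position, column))

    ddl = []
    for (table, constraint), cols in sorted(fks.items()):
        fk_cols = [c for _, c in sorted(cols)]
        covered = False
        for idx_cols in indexes[table].values():
            # Compare the FK columns with the index's leading columns.
            leading = [c for _, c in sorted(idx_cols)][:len(fk_cols)]
            if set(leading) == set(fk_cols):
                covered = True
                break
        if not covered:
            # Hypothetical naming scheme; real names must respect length limits.
            ddl.append(
                f"CREATE INDEX {constraint}_IX ON {table} ({', '.join(fk_cols)});"
            )
    return ddl


if __name__ == "__main__":
    example_fks = [
        ("CHILD_A", "CHILD_A_F1", "PARENT_KEY", 1),
        ("CHILD_B", "CHILD_B_F1", "PARENT_KEY", 1),
    ]
    example_indexes = [("CHILD_B", "CHILD_B_I1", "PARENT_KEY", 1)]
    for statement in missing_fk_index_ddl(example_fks, example_indexes):
        print(statement)
```

In the example, only CHILD_A gets a CREATE INDEX statement, since CHILD_B already has an index leading with its foreign key column.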

But, the mystery, in my mind, is why this happened at all.  Note that the deadlocks were happening when a 10.2.0.3 database tried to back up archive logs, running the 10.2.0.3 version of RMAN, connecting to a 10.2.0.3 catalog living in an 11.1.0.6 database.  The same scenario worked fine for months and months, when the 10.2.0.3 catalog was in a 10.2.0.3 database.  So, I upgraded the database, but not the catalog or the RMAN binary that was being used.  That is, the “application” (RMAN binary and RMAN catalog) was not upgraded.

This would seem to imply that 11g is somehow more deadlock sensitive than 10gR2 is?  That strikes me as troublesome, and definitely concerns me, if that’s really the case.  I’m not sure I have enough information to prove that this is the case, but I’m definitely cautious and suspicious at the moment.  It wouldn’t be the first time that an Oracle upgrade exacerbated a situation, rather than improving it….

Bit of a stumper….

Posted June 9, 2008 by mbobak
Categories: Uncategorized

Well, I finally decided I have something noteworthy to blog about.  This was a bit of a stumper, that we ran into the other day….I did finally get to the bottom of it, and I thought it worth a mention, here.

We have a three node RAC running 10.2.0.3 on DL-585s.

This is a reasonably busy system, but these boxes have lots of horsepower, so no serious I/O or CPU bottlenecks are observed.  We seem to be humming along when we hit ORA-257 (archiver error, connect internal only until freed).  I think “Ok, the archivelog backup process is failing to run, and the archivelog area is full”.  But, I find that the 100GB archivelog area is nearly empty.  And we’re still stuck on ORA-257.  What the ….?  Some weird 10.2.0.3 bug, perhaps?  We notice that bouncing the stuck instance clears the problem…..very strange indeed.

To make a long story short, we opened an SR, uploaded lots and lots of logs, and discovered…..when this server was set up, the PROCESSES parameter was set to 500, which is too small.  The process table filled up, and at archive time, apparently the archiver spawns a process to talk to ASM (the archivelog area is under ASM), and since the process table was full, it couldn’t do that, so it reported the (misleading) ORA-257 error.  We bumped PROCESSES to 1000 on all three nodes, and Voila! Problem solved.

So, anyhow, just thought I’d mention that scenario.  If you hit ORA-257 and your ASM managed archivelog area is not full, think PROCESSES parameter…
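If you want to catch this sort of thing before it bites, one option is to keep an eye on how close each instance gets to its limits.  Here is a rough sketch in Python, assuming you have already fetched rows shaped like V$RESOURCE_LIMIT (RESOURCE_NAME, CURRENT_UTILIZATION, MAX_UTILIZATION, LIMIT_VALUE) from each instance.  The function name and the 90% threshold are my own assumptions, not anything Oracle Support gave us.

```python
def resources_near_limit(rows, threshold=0.9):
    """Return (name, max_used, limit) for resources whose high-water mark
    reached at least `threshold` of the configured limit.

    rows: iterable of (resource_name, current_utilization,
                       max_utilization, limit_value) tuples.
    """
    flagged = []
    for name, _current, max_used, limit_value in rows:
        limit_text = str(limit_value).strip()
        if not limit_text.isdigit():
            # Skip non-numeric limits such as 'UNLIMITED'.
            continue
        limit = int(limit_text)
        if limit > 0 and max_used >= threshold * limit:
            flagged.append((name, max_used, limit))
    return flagged


if __name__ == "__main__":
    sample = [
        ("processes", 480, 500, "       500"),
        ("sessions", 300, 420, "555"),
        ("enqueue_locks", 10, 50, "UNLIMITED"),
    ]
    for name, max_used, limit in resources_near_limit(sample):
        print(f"{name}: high-water mark {max_used} of {limit}")
```

With the made-up sample data, only “processes” is flagged, because its high-water mark has reached the limit of 500.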

Hello world!

Posted April 30, 2008 by mbobak
Categories: Uncategorized

Just getting started here at WordPress.com.  Soon this blog will be filled with my insights into the Oracle database.