Unintended Consequences

Hi all,

My apologies for my extended absence. I ran into something interesting today, and I thought it would make a good blog post; apparently, I got inspired. No startling revelation here, just a bit of a cautionary tale about the unintended consequences of using a new feature. A quick search of the Oracle documentation confirms that the skip locked clause of the select for update statement was introduced in Oracle 11g, version 11.1. But before we dive into that, let’s review how select for update behaved in the versions of Oracle preceding 11g.

Select for update

First, a bit about select for update in general. The select for update functionality has been available since some very ancient release of Oracle. It allows a user to lock specific rows in a table without changing them, and it is generally used as part of a pessimistic locking strategy. (See here for an explanation of optimistic and pessimistic locking.) The default behavior of select for update is to wait for locks. That is, Oracle will do a read-consistent view of the data to identify the set of rows that will be locked, and then do current mode gets to impose row-level locks on that set of rows. If, when attempting to lock the set of rows, a row is encountered that is already locked by another session, the session will wait indefinitely, on a TX enqueue, for a commit or rollback to free the lock, and then execution will continue.
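As a minimal sketch, here is what that default behavior looks like. The orders table and its columns are hypothetical, purely for illustration:

```sql
-- Session 1: lock all PENDING rows without changing them.
-- (orders and its columns are hypothetical.)
select order_id, status
  from orders
 where status = 'PENDING'
   for update;

-- Session 2: the same statement issued now will block on a
-- TX enqueue until session 1 commits or rolls back.
```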

Select for Update with nowait

Oracle also provides a nowait option. If the developer codes select ... from ... for update nowait, then the select for update will proceed in much the same way described above, except that if it encounters a row locked by a different session, rather than waiting, it will raise an ORA-00054 resource busy and NOWAIT specified error, and a statement-level rollback will occur.
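A minimal sketch of trapping that error in PL/SQL, again using the hypothetical orders table; -54 is the error code behind ORA-00054:

```sql
declare
  resource_busy exception;
  pragma exception_init(resource_busy, -54);
  l_id orders.order_id%type;
begin
  -- Fails immediately with ORA-00054 if another session
  -- holds a lock on this row.
  select order_id
    into l_id
    from orders
   where order_id = 42
     for update nowait;
exception
  when resource_busy then
    dbms_output.put_line('Row is locked; will retry later.');
end;
/
```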

Select for Update with wait n

The functionality of select for update, with the optional nowait, was unchanged from some very ancient version of Oracle up until Oracle 9i Release 2 (version 9.2.0), which introduced the wait n clause. This allows the developer to modify the behavior of select for update to wait up to n seconds. If a lock cannot be acquired on a particular row within n seconds, Oracle will raise an ORA-30006 resource busy; acquire with WAIT timeout expired error, and a statement-level rollback will occur.
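For example, to give up after five seconds (again with the hypothetical orders table):

```sql
-- Raises ORA-30006 if the row locks cannot all be
-- acquired within 5 seconds.
select order_id
  from orders
 where status = 'PENDING'
   for update wait 5;
```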

New in 11g!

Starting with 11g (version 11.1.0), Oracle introduced the skip locked clause. This clause allows the select for update to skip any rows it encounters that are already locked. In this way, it is neither subject to waiting for locks nor to the ORA-00054 or ORA-30006 errors. It simply locks what it can and moves on. This can be quite useful when the application design calls for one or more producers of data (programs that add rows to a table) and multiple consumers (concurrently executing programs that consume, or delete, rows from the table). In this scenario, and absent any strategy to partition the data so that the consumers do not interfere with each other, implementing either a wait or nowait strategy with select for update simply won’t work. But the skip locked functionality fits the bill very well. Each consumer can query the table, locking the rows that are not locked and skipping over any rows already locked by someone else. All the consumers can work, and none of them serialize behind each other. Note that this functionality has actually been available in Oracle since at least version 8.0 (the Advanced Queuing feature is built on it), but it was not documented, and therefore not supported by Oracle, until version 11.1.

Unintended Consequences

So, here begins my tale of unintended consequences. We have an application in which there are several concurrent processes, all running the same code, consuming rows from a table. The program does a select for update skip locked, does some processing, and then finally deletes the locked rows. The SQL is similar in form to:

select ...
from [table_name]
where [some predicates]
and rownum < 101
for update skip locked;

So, find the first 100 rows in the table that satisfy the predicates, skipping over any rows that are already locked. Now, one would expect, given that the number of rows in the table is far greater than 100 * (number of concurrent consumers), that each consumer process should have no problem identifying 100 rows for processing. I was a bit puzzled, then, when it was reported to me that Oracle was actually returning significantly fewer than 100 rows. The code is simple, the logic is simple, so what’s going on? How could Oracle return fewer than 100 rows when there’s plenty of data in the table? Note that no error was being raised; Oracle was simply going through the table, locking what it could, skipping what it couldn’t, and coming up with fewer than 100 rows. This in spite of the fact that there were 50 or fewer consumer processes, and that there were tens of thousands (sometimes hundreds of thousands) of rows in the table. The most rows that could be locked at any one time would be 50 consumer processes * 100 rows per process = 5,000 rows.

So, what was going on? After reviewing the application-level log files, and after several discussions and walk-throughs of the code, we were stumped. I tried enabling 10046 trace, and sure enough, significantly fewer than 100 rows were being locked. Simple queries proved that there were more than enough eligible rows in the table. So, again, we were stumped. Another test, running only one consumer process, showed that the code behaves as expected when run single-threaded. So, there was a concurrency problem… but what? Was there some bizarre bug in the skip locked functionality? It seemed premature to consider that, so I decided to rule out that possibility as extremely unlikely unless I could find more direct evidence of it. So, what next? Well, I wondered, perhaps there is a diagnostic event that would reveal what’s happening under the covers with skip locked?
Well, I tried poking around looking for such an event, but, to date, I’m not aware of any such functionality. So, how else to continue the investigation? Remember, as I said before, no error was being raised. The only symptom was that the number of rows returned was much smaller than expected. After another brainstorming session with the architect who designed and wrote the code, we came up with the idea of changing the code in our test environment to remove the skip locked clause, causing the code to block and (hopefully) allowing us to see clearly where the locking was happening. Sure enough, when we did that, the problem revealed itself almost immediately, and the solution was clear and obvious. When we started multiple, concurrent consumer programs, one wait event was immediately obvious and overwhelming:

enq: TX - allocate ITL entry

So, clearly, the root cause of all my problems and headaches was a shortage of ITL entries in the header of the data block. The blocking was happening at the block level, not the row level. This explains why locks were not being taken on rows that were not themselves locked, even when the cursor had not yet identified 100 candidate rows. Due to the high level of concurrency, the number of ITL slots (8k block size * 10% PCTFREE = 819 bytes of free space per block / 24 bytes per ITL entry = a maximum of 34 ITL entries per block) was not sufficient. We were running up to 50 concurrent consumer processes, and may want to go as high as 100, and the number of rows per block is well over 100 in many cases. So, now that we’d gotten to the bottom of the problem, the solution was fairly straightforward and simple. A simple alter table ... move nologging initrans 110 and a rebuild of the indexes, and we were back in business.
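Sketched with hypothetical object names, the fix looked something like the following. Note that an alter table ... move leaves the table’s indexes in an UNUSABLE state, which is why the rebuild is required:

```sql
-- Rebuild the table so each block pre-allocates enough ITL
-- slots for ~100 concurrent transactions.
-- (orders and orders_pk are hypothetical object names.)
alter table orders move nologging initrans 110;

-- The move invalidates the indexes; rebuild them with a
-- matching INITRANS.
alter index orders_pk rebuild nologging initrans 110;
```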

Conclusion

In my opinion, the interesting part of this little puzzle is the effect of the skip locked functionality. The whole point is to allow the code to skip rows that have been locked by another session and continue running. That’s great, and it allows for coding applications that are highly scalable and performant. However, it’s important to understand that the skip locked clause means Oracle will skip any time a TX lock is encountered, including TX locks taken at the block level, such as waits for an ITL entry. The problem is, if you’re encountering locking that is not at the row level, skip locked is still going to skip over the locked resource and move on. No error will be reported! This can lead to unpredictable results. The moral of the story is that skip locked doesn’t necessarily mean skip locked rows; it means skip any locked resource.

It seems there is always a difference between how we think a feature works and how it actually works.

So, this concludes my investigation. I hope it helps someone in the future avoid a similar problem.

I’m going to make an attempt to blog more regularly, but, I know better than to make any promises! Until next time….


16 comments on “Unintended Consequences”

  1. [...] after you’ve worked out what’s going on.” If you want to read more, here’s a link to the blog item he wrote about [...]

  2. Jeffrey Kemp says:

    Thanks for this! Certainly, it seems that the number of ITL slots is a blocking factor here.

    However, see Gwen’s comments here: http://jonathanlewis.wordpress.com/2010/05/31/skip-locked/

    It would seem that the (or at least, another) reason you’d get less than 100 rows is that the predicate (WHERE ROWNUM < 101) is applied *before* the lock checking is done. You said it yourself: "Oracle will do a read consistent view of the data, to identify the set of rows that will be locked, and then do current mode gets to impose row-level locks on that set of rows."

    It might be a better strategy to remove the ROWNUM predicate from the query, or at least increase it to some higher value (e.g. 1000), then fetch only the first 100 rows (in a bulk collect) from the cursor, then close the cursor.

    • mbobak says:

      Hi Jeffrey,

      Thanks for the reply. Yes, I saw Gwen’s comments on Jonathan’s blog. I replied to her there, but, the short answer is, yes, she’s correct, but I also believe I’m correct. Bottom line, I really was seeing ITL waits, as was evidenced by running a version of the code without the ‘skip locked’ clause. My real world scenario was slightly different than her simplified demo. See my reply on Jonathan’s blog for the full details.

      Thanks!

      -Mark

  3. [...] a bad idea. See: Select for update – unintended consequences – this is Mark Bobak’s Unintended Consequences Timing improvements in Oracle 11GR2 trace – this one is Alex Fatkulin’s Timing [...]

  4. lukaseder says:

    Thanks for this great post. I’m sure having read this will save me a lot of time when going live with my intended adding of a SKIP LOCKED clause

  5. rogflies says:

    This is a great post because it matches our application very closely. Unfortunately, we seem to be running into a different problem. We fetch the rows using skip locked from within a PL/SQL procedure using a bulk collect with a limit. Our limit needs to be on the small size, i.e., 20 rows. However, PL/SQL seems to be “helping” us by pre-fetching 100 rows, causing them to be locked and unavailable to other threads calling this same procedure. But, since our limit is 20, we don’t see the other 80 rows, either. They are eventually unlocked at the commit, but they are still unprocessed, lowering our overall throughput.

    Is there any way to lower the amount PL/SQL prefetches for us or turn it off altogether?

    Thanks,
    -Roger

    • Jeffrey Kemp says:

      Roger, do the locks on the 100 rows still occur if you add “WHERE ROWNUM <= 20" to your query?

      • rogflies says:

        Hi Jeffrey –
        You actually have to be very careful about using ROWNUM with the SKIP LOCKED clause. The problem is that ROWNUM counts locked and unlocked rows. So, if you want 20 rows, and the first 20 are locked by someone else, using ROWNUM <= 20 will return 0 rows.

  6. Jeffrey Kemp says:

    That’s right, the ROWNUM predicate will conflict with the purpose of skipping locked rows – but I was interested to see if you got the same results if you did add it. Certainly the ROWNUM predicate is not a solution, but I expect that if the underlying query cannot retrieve 100 rows, Oracle will not lock 100 rows.
    A possible solution would be to use an ordinary LOOP + FETCH, which would only fetch 1 row at a time. This is one way to stop the 100-row pre-fetch.

  7. [...] Note, two interesting articles here and here. [...]

  8. Just a note:
    “Ordinary LOOP + FETCH” would not be a solution, at least according to the docs:
    From http://docs.oracle.com/cd/E11882_01/appdev.112/e25518/adfns_sqlproc.htm#i1025003

    “If a SELECT FOR UPDATE statement is used when defining a cursor, the rows in the return set are locked when the cursor is opened (before the first fetch) rather than being locked as they are fetched from the cursor. Locks are only released when the transaction that opened the cursor is committed or rolled back, not when the cursor is closed.”

    • djstricker says:

      Actually, it appears that when SKIP LOCKED is specified, it does an inconsistent read, and hence does not need to lock on open:

      “Hi Tom, when we issue “select … for update skip locked”, the runtime seems to lock one row when it fetches that specified row. The underlying logic looks like engine fetches one row, checks if it has been locked. If it’s locked, skip that row; otherwise, lock that row. This is different with “select … for update nowait”. When it’s issued, instead of at fetching phase, at the cursor open phase , all the rows that hit the where clause are supposed to be locked. Is my understanding correct? Thanks.

      Followup December 9, 2010 – 10am Central time zone:

      yeah, skip locked is funny that way – you just made me realize something – that skip locked is in fact an inconsistent read (that is why select for update locks during the open – to ensure the consistent read).”

      From:
      http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2060739900346201280#2790664300346747150

  9. fewer rows, not less rows!
