Thursday, February 14, 2013

Postgres Deadlock

Hi all,

So I ran a few days ago into a pretty weird deadlock error in my logs.
It took me some time to figure out the exact situation that was causing this deadlock, and so I decided to share what I've learned.
What better way to do that than through a blog.

Without further ado, let's dive in.

The general setting we are talking about is Java, Spring-Data, Hibernate, Postgres, though the main issue is the Postgres DB.

Let's start with the error. This would look something like that:

ERROR:  deadlock detected
DETAIL:  Process 552 waits for ExclusiveLock on tuple (0,3) of relation 16393 
of database 12002; blocked by process 4924.
Process 4924 waits for ShareLock on transaction 689; blocked by process 552.
HINT:  See server log for query details.

Usually for me a deadlock error is a bad way to start the day.
These are usually hard to reproduce, and not simple to solve.

After reading a little about the different locks in Postgres and when they are used I came up with the following simple scenario which reproduces this error:

Given 2 tables, Parent and Child, we will try to add two children to the same parent from two different processes. After each insert we will also try to update the name of the parent.
(Assume the Parent has columns: ID, Name, and the Child has columns: ID, Parent_ID, Name)

Process 1:
BEGIN;
INSERT INTO "Child" VALUES (1, 1, 'CHILD_A');
SELECT * from pg_sleep(5);
UPDATE "Parent" SET "Name"='Parent_B' WHERE "ID"=1;
END;

Process 2:
BEGIN;
INSERT INTO "Child" VALUES (2, 1, 'CHILD_B');
SELECT * from pg_sleep(5);
UPDATE "Parent" SET "Name"='Parent_C' WHERE "ID"=1;
END;

Running these two simultaneously would result in the error above.

At first it seems a little weird that this scenario gives a deadlock.
I would just expect one of the processes to fail when trying to update the Parent row.

The deadlock comes from the fact that when inserting a new row to the Child table, Postres acquires a RowShareLock on the Parent row (because of the foreign key).
The insert from the second process also acquires a RowShareLock on the same row (acquiring two RowShareLocks is fine).
When both processes come to the update statement, things become interesting. An update required a RowExclusiveLock. This lock conflicts with the RowShareLock, and so each process waits for the second one to release the RowShareLock - thus the deadlock.
(You can read more about the different locks Postgres uses here)

The fact we are actually seeing a ShareLock on a transaction has to do with Postgres' deadlock detector, and the issue is explain here.

Now, in our case the solution was just synchronizing the code block which updates the entities and causes the deadlock.
The solution for your cases might be different, and is rather application specific.
Understanding the problem, though, will help you in choosing the right solution for your case.

That's it for today.
I hope this post will help someone debugging his Postgres deadlock - if it does, I did my good deed for the day.

25 comments:

  1. Can you pleas write solution for this situation?

    ReplyDelete
    Replies
    1. I'm afraid there is no general solution.
      For our case, we just added a `synchronized` on the object that we were updating, thus preventing this deadlock.

      The solution is application specific.

      Delete
    2. Can you please explain us how did you achieve that?

      Regards

      Delete
    3. Can you please explain us how did you achieve that?

      Regards

      Delete
    4. Sorry Sabri, but it was some time ago, and I don't have that code in front of me any more.
      The basic idea is just synchronizing the updates applicatively (lock on some object and prevent two threads from updating in parallel).

      Delete
  2. Synchronizing the code block?

    What happens if you cluster your application server into two nodes or more?

    ReplyDelete
    Replies
    1. You are of course right.
      I am not sure how I would've handled it in a cluster situation.
      If anyone else solved this, please feel free to comment.

      Delete
  3. This blog was very helpful. Thanks for sharing!

    ReplyDelete
  4. I guess it is pretty simple to solve: just SELECT FOR UPDATE parent with ID=1 in the very beginning of the transaction. That helps in case of distributed application.

    ReplyDelete
  5. Nice Article !
    This is my pleasure to read your article.
    Really this will help to people of PostgreSQL Community.

    I have also prepared one article about, How to generate Deadlock situation in PostgreSQL.
    You can also visit my article, your comments and reviews are most welcome.

    http://www.dbrnd.com/2016/11/postgresql-how-to-generate-a-deadlock-in-a-database-error-deadlock-detected/

    ReplyDelete