Sunday, January 29, 2012

How to Avoid Deadlocks

Deadlock can be illustrated by the classic dining philosophers problem[1].

In an operating system, a deadlock[2] is a situation which occurs when a process enters a waiting state because a resource requested by it is being held by another waiting process, which in turn is waiting for another resource. If a process is unable to change its state indefinitely because the resources requested by it are being used by other waiting process, then the system is said to be in a deadlock.

In Java programming, deadlocks may manifest themselves where multiple threads wait forever due to a cyclic locking dependency.  For example, when thread A holds lock L and tries to acquire lock M, but at the same time thread B holds M and tries to acquire L, both threads will wait forever.  Just as threads can deadlock when they are each waiting for a lock that the other holds and will not release, they can also deadlock when waiting for resources. Comparing to deadlock, starvation and livelock are much less common a problem, but are still problems that every designer of concurrent software is likely to encounter. Java applications do not recover from deadlock, so it is worthwhile to ensure that your design precludes the conditions that could cause it[6].

In this article, we will discuss some threading best practices to avoid deadlocks.

Necessary Conditions

A deadlock situation can arise only if all of the following conditions hold simultaneously in a system:[3]
  1. Mutual Exclusion: At least, one resource must be non-shareable.  Only one process can use the resource at any given instant of time.
  2. Hold and Wait: A process is currently holding at least one resource and requesting additional resources which are being held by other processes.
  3. No Preemption: The operation system can de-allocate resources once they have been allocated. They must released by the holding process voluntarily.
  4. Circular Wait: A process waiting for a resource which is being held by another process, which in turn is waiting for another process to release a resource.
Unfulfillment of any of these conditions is enough to preclude a deadlock from occurring.

How to Avoid Deadlocks

Never cede control to the client within a synchronized method or block[4]

Invoking an alien method with a lock held is asking for liveness trouble. The alien method might acquire other locks (risking deadlock) or block for an unexpectedly long time, stalling other threads that need the lock you hold.

In other words, inside a synchronized region, do not invoke a method that is designed to be overridden, or one provided by a client in the form of a function object (i.e., object references used to implement the Strategy pattern).  The class has no knowledge of what the alien method does and has no control over it. If you do call an alien method from within a synchronized region, you open the opportunities of deadlocks by allowing the following conditions to hold:
  • Hold and Wait
  • Circular Wait
Because you call the alien method from a shynchronized region, it holds a lock. If this alien method is overriden and it engages the services of another thread to do the deed,  Hold and Wait condition will be established.  Given another counterpart of Hold-and-Wait, it can possibly form a Circular Wait condition.

Calling a method with no locks held is called an open call[6], and classes that rely on open calls are more well-behaved and composable than classes that make calls with locks held.

More generally, try to limit the amount of work that you do from within synchronized regions. When you are designing a mutable class, think about whether it should do its own synchronization. Synchronize your class internally only if there is a good reason to do so, and document your decision clearly.

Ensure that resources are always acquired in some well-defined order[7]

Deadlocks can be avoided by assigning a partial order to the resources, and establishing the convention that all resources will be requested in order, and released in reverse order, and that no two resources unrelated by order will ever be used by a single unit of work at the same time.  This solution to the dining philosophers problem was originally proposed by Dijkstra.

There are different approaches to enforce a partial order to the resources:
  1. If resources are static and known in advanced
    • Just stick to a programming policy (i.e., all programmers apply the policy of acquiring the locks in some well-defined order)
    • Provide a method that combines acquisition and release of the multiple locks in the correct way (see here for an example)
  2. If resources are not known in advanced or acquired dynamically
    • We have to explicitly define some way of ordering them. For example,
      • If locking will only exist for the lifetime of the application
        • We can use the identity hash code (System.identityHashCode(r)) of the two objects.
        • In the rare case that two objects have the same hash code, we must use an arbitrary means of ordering the lock acquisitions, and this reintroduces the possibility of deadlock. To prevent inconsistent lock ordering in this case, a third tie breaking lock can be used[6].
Removing the mutual exclusion condition

Removing the mutual exclusion condition means that no process will have exclusive access to a resource. This proves impossible for resources that cannot be spooled. But even with spooled resources, deadlock could still occur. Algorithms that avoid mutual exclusion are called non-blocking synchronization algorithms.

For example, optimistic concurrency control[9] uses a pair of consistency markers in the data structure. Processes reading the data structure first read one consistency marker, then read the relevant data into an internal buffer, then read the other marker, and then compare the markers. The data is consistent if the two markers are identical. Markers may be non-identical when the read is interrupted by another process updating the data structure. In such a case, the process discards the data in the internal buffer and tries again.

Finally, immutability is great for multi-threading. Instances of immutable class appear constant. Therefore, no external synchronization is necessary. Examples include String, Long, and BigInteger.

Using timed lock acquisition to acquire multiple locks[6]

Another technique for detecting and recovering from deadlocks is to use the timed tryLock feature of the explicit Lock classes instead of intrinsic locking.

If a lock acquisition times out, you can release the locks, back off and wait for a while, and try again, possibly clearing the deadlock condition and allowing the program to recover. However, this technique works only when the two locks are acquired together; if multiple locks are acquired due to the nesting of method calls, you cannot just release the outer lock, even if you know you hold it.

Conclusion

We conclude this article by quoting Goetz et al.[6]:
Like many other concurrency hazards, deadlocks rarely manifest themselves immediately. The fact that a class has a potential deadlock doesn’t mean that it ever will deadlock, just that it can. When deadlocks do manifest themselves, it is often at the worst possible time—under heavy production load.
References
  1. Dining philosophers problem
  2. Deadlock
  3. Advanced Synchronization in Java Threads by Scott Oaks and Henry Wong
  4. Effective Java by Joshua Block
  5. Threading Best Practices
  6. Java Concurrency in Practice by Brian Goetz, Tim Peierls, Joshua Bloch, Joseph Bowbeer, David Holmes and Doug Lea
  7. Deadlock Tutorial
  8. The Java Tutorial (Concurrency)
  9. Optimistic concurrency control
  10. Concurrency: State Models & Java Programs (2nd Edition), by Jeff Magee and Jeff Kramer.
  11. Java Concurrent Animated
  12. Analyze Hanging Programs Using Java Thread Traces (Xml and More)

No comments: