Java Deadlock: Causes and Prevention

A deadlock is the failure mode of locking. Two or more threads each hold a lock the other one needs; neither can proceed; no exception is thrown; nothing in the log says "we're stuck." From the outside, the program appears to be doing nothing — exactly the same external symptom as a busy loop or a long network call.

Deadlocks can happen in any program whose threads hold more than one lock at a time. They're frighteningly easy to write and frighteningly hard to reproduce: the schedule that triggers one may show up once a week in production and never in tests. The right strategy is not "debug them when they happen" but "structure the code so they can't happen."

The four conditions (Coffman's conditions)

A deadlock requires all four of these to be true at once:

Mutual exclusion. Some resource (a lock) can be held by only one thread at a time.
Hold and wait. A thread holds at least one resource while waiting to acquire another.
No preemption. Resources can't be taken away from the thread holding them; the thread must release voluntarily.
Circular wait. There's a cycle in the wait graph — A waits for B's lock, B waits for C's lock, ..., Z waits for A's lock.

Break any one and deadlocks become impossible. The standard prevention techniques each break one of the four:

Lock ordering (most common): break circular wait by always acquiring locks in a globally agreed order.
tryLock with timeout: break hold-and-wait by giving up if you can't get the second lock fast enough.
Single big lock: with only one lock in play, hold-and-wait and circular wait can't arise. Crude, but it works when contention is low.
Lock-free / immutable data: break mutual exclusion by removing the resource. The atomics and concurrent collections later in this part of the book are this approach.
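
As a rough illustration of the "single big lock" option in the list above, here is a minimal sketch. The class CoarseBank and its method names are hypothetical, invented for this example. Every balance is guarded by the same monitor, so no thread can hold one lock while waiting for another.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class: one coarse lock guards every balance.
final class CoarseBank {
  private final Object bankLock = new Object();
  private final Map<String, Integer> balances = new HashMap<>();

  void open(String account, int initial) {
    synchronized (bankLock) { balances.put(account, initial); }
  }

  void transfer(String from, String to, int amount) {
    synchronized (bankLock) {                       // the only lock in the class
      balances.merge(from, -amount, Integer::sum);
      balances.merge(to, amount, Integer::sum);
    }
  }

  int balance(String account) {
    synchronized (bankLock) { return balances.getOrDefault(account, 0); }
  }
}
```

The trade-off is that unrelated transfers also queue behind each other, which is why this only suits code with little contention.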

The two-account example

The canonical demonstration:

void transfer(Account from, Account to, int amount) {
  synchronized (from) {
    synchronized (to) {
      from.debit(amount);
      to.credit(amount);
    }
  }
}

// Thread A: transfer(accountX, accountY, 100)
// Thread B: transfer(accountY, accountX, 100)

Schedule:

Thread A acquires accountX's monitor.
Thread B acquires accountY's monitor.
Thread A tries to acquire accountY — blocked, held by B.
Thread B tries to acquire accountX — blocked, held by A.

Neither thread will ever release. Both are BLOCKED forever. The fix:

void transfer(Account from, Account to, int amount) {
  Account first  = from.id() < to.id() ? from : to;
  Account second = from.id() < to.id() ? to   : from;
  synchronized (first) {
    synchronized (second) {
      from.debit(amount);
      to.credit(amount);
    }
  }
}

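Both threads now acquire the lower-id account first and the higher-id account second, regardless of which direction the transfer goes, so the circular wait can't form. This assumes account ids are unique; two distinct accounts with the same id would bring the problem back.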

The ordering key doesn't have to be an id. System.identityHashCode(obj) gives an ordering key for arbitrary objects that stays the same for the object's lifetime, but two distinct objects can share a hash, so production code typically uses a real key (the database ID, the user ID, etc.) and falls back to a tie-breaker lock when keys match.
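
One way the identity-hash ordering with a tie-breaker lock might look is sketched below. The class IdentityOrdered, the method withBoth and the TIE_LOCK field are hypothetical names for this example, not part of any library.

```java
// Hypothetical helper: orders two monitors by identity hash,
// with a global tie-breaker lock for the rare equal-hash case.
final class IdentityOrdered {
  private static final Object TIE_LOCK = new Object();

  static void withBoth(Object a, Object b, Runnable action) {
    int ha = System.identityHashCode(a);
    int hb = System.identityHashCode(b);
    if (ha < hb) {
      synchronized (a) { synchronized (b) { action.run(); } }
    } else if (ha > hb) {
      synchronized (b) { synchronized (a) { action.run(); } }
    } else {
      // Equal hashes: no order can be derived, so serialize these callers
      // on one extra lock before taking the two monitors.
      synchronized (TIE_LOCK) {
        synchronized (a) { synchronized (b) { action.run(); } }
      }
    }
  }
}
```

A call such as IdentityOrdered.withBoth(from, to, () -> { from.debit(amount); to.credit(amount); }) would then take the same lock order whichever way the transfer runs.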

Lock ordering across the program

Lock ordering only works if every code path that takes two locks of the same kind takes them in the same order. One renegade method that does synchronized (b) { synchronized (a) { ... } } is enough to bring back the deadlock.

The way to enforce that consistently in a larger codebase:

Document the order. "Always acquire parent before child." Comment it on the class.
Funnel through a single helper. All "transfer" calls go through one method that does the ordering — so an individual call site can't get it wrong.
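Inspect what is actually held. jstack -l <pid>, or starting the JVM with -XX:+PrintConcurrentLocks, adds the java.util.concurrent locks each thread owns to a thread dump; intrinsic monitors appear there by default. That shows which locks are held at the moment of the dump, not a full acquisition graph.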

The discipline matters as much as the rule.
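
The "funnel through a single helper" advice could be sketched as follows. The Transfers class is a hypothetical wrapper written for this example, and its Account type is a simplified stand-in that assumes unique ids. The point is that the nested synchronized blocks exist in exactly one method.

```java
// Hypothetical sketch: the only place in the codebase where two
// account monitors are nested. Assumes account ids are unique.
final class Transfers {
  static final class Account {
    private final long id;
    private long balance;
    Account(long id, long balance) { this.id = id; this.balance = balance; }
    long id() { return id; }
    long balance() { synchronized (this) { return balance; } }
  }

  static void transfer(Account from, Account to, long amount) {
    if (from == to) return;                        // nothing to move
    Account first  = from.id() < to.id() ? from : to;
    Account second = first == from ? to : from;
    synchronized (first) {                         // always the lower id first
      synchronized (second) {
        from.balance -= amount;
        to.balance += amount;
      }
    }
  }
}
```

Call sites only ever write Transfers.transfer(a, b, 10), so none of them can pick the wrong order.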

`tryLock` with timeout

When you can't guarantee ordering — different libraries, different teams, complex object graphs — ReentrantLock.tryLock(timeout, unit) gives you an out:

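// Requires java.util.concurrent.ThreadLocalRandom and java.util.concurrent.TimeUnit.
// tryLock(timeout, unit) and Thread.sleep both throw InterruptedException.
boolean done = false;
while (!done) {
  if (firstLock.tryLock(100, TimeUnit.MILLISECONDS)) {
    try {
      if (secondLock.tryLock(100, TimeUnit.MILLISECONDS)) {
        try {
          doWork();
          done = true;
        } finally { secondLock.unlock(); }
      }
    } finally { firstLock.unlock(); }
  }
  // Back off for a random interval before retrying, so two threads
  // don't keep colliding in lockstep (the livelock case).
  if (!done) Thread.sleep(ThreadLocalRandom.current().nextInt(1, 50));
}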

If the second lock can't be grabbed in 100 ms, the thread releases the first lock and tries again later. The hold-and-wait condition is broken — neither thread blocks forever, even if both try the same locks in opposite orders.

The cost is busy retries and the surrounding back-off code. Use lock ordering when you can; reach for tryLock when you can't.
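
If retrying forever is not acceptable, the same idea can be wrapped in a helper that gives up after a fixed number of attempts. This is only a sketch: TryBoth, runWithBoth and the maxAttempts parameter are hypothetical names, and the 100 ms timeout and 1 to 19 ms pause are arbitrary values.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

// Hypothetical helper: bounded tryLock retries with a random pause.
final class TryBoth {
  // Returns true if action ran with both locks held,
  // false if every attempt timed out.
  static boolean runWithBoth(Lock first, Lock second, Runnable action, int maxAttempts)
      throws InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (first.tryLock(100, TimeUnit.MILLISECONDS)) {
        try {
          if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
              action.run();
              return true;
            } finally {
              second.unlock();
            }
          }
        } finally {
          first.unlock();
        }
      }
      // Random pause so two contending threads do not retry in lockstep.
      Thread.sleep(ThreadLocalRandom.current().nextInt(1, 20));
    }
    return false;
  }
}
```

Returning false hands the decision back to the caller, which can fail the request, queue it, or log it instead of spinning indefinitely.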

How to detect a deadlock at runtime

Two main tools.

Thread dump. jstack <pid> or kill -3 <pid> prints every thread's state and stack. A deadlock shows up clearly: two threads with state BLOCKED, each - waiting to lock <0x...> on an object the other one shows - locked <0x...>. The JVM also flags the cycles it detects at the bottom of the dump:

Found one Java-level deadlock:
=============================
"thread-2":
  waiting to lock monitor 0x00007fcd0e..., which is held by "thread-1"
"thread-1":
  waiting to lock monitor 0x00007fcd0e..., which is held by "thread-2"

ThreadMXBean.findDeadlockedThreads(). A programmatic version — useful for embedding in a health-check endpoint:

ThreadMXBean mx = ManagementFactory.getThreadMXBean();
long[] deadlocked = mx.findDeadlockedThreads();
if (deadlocked != null) log.error("deadlock detected: {} threads", deadlocked.length);

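This detects cycles on intrinsic monitors and on ownable synchronizers such as ReentrantLock and the write lock of ReentrantReadWriteLock. It doesn't detect threads stuck on other primitives (a Semaphore, a latch, a blocking queue), and it doesn't find livelocks or general "thread is just slow" cases.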
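
A health check usually wants more than a count. The sketch below uses ThreadMXBean.getThreadInfo to turn the ids into one readable line per thread. DeadlockProbe and describeDeadlocks are hypothetical names for this example; the management calls themselves are standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a health-check endpoint.
final class DeadlockProbe {
  // One line per deadlocked thread; an empty list means no cycle was found.
  static List<String> describeDeadlocks() {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long[] ids = mx.findDeadlockedThreads();       // null when nothing is deadlocked
    List<String> report = new ArrayList<>();
    if (ids == null) return report;
    for (ThreadInfo info : mx.getThreadInfo(ids)) {
      if (info == null) continue;                  // thread no longer alive
      report.add(info.getThreadName() + " waits for " + info.getLockName()
          + " held by " + info.getLockOwnerName());
    }
    return report;
  }
}
```

An endpoint could report healthy while the list is empty and log the lines otherwise.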

Livelock and starvation — deadlock's cousins

Two failure modes that look like deadlocks but aren't:

Livelock. Threads keep changing state but make no progress. The classic case: two tryLock callers each retry forever because neither will yield first. The CPU is busy; the work isn't getting done.
Starvation. A thread is technically RUNNABLE or wakeable but the scheduler / lock policy never lets it actually run. Unfair locks under heavy contention can starve a writer while readers stream through.

Both have the same surface symptom as deadlock ("nothing seems to be making progress") but the diagnosis is different — the thread dump doesn't show BLOCKED on a mutual cycle; it shows threads churning or just one perpetually waiting.
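
For the writer-starvation case, one commonly used mitigation is a fair lock. The sketch below uses the fair-mode constructor of ReentrantReadWriteLock; the FairCounter class is a hypothetical example. Fair mode generally lowers throughput, so treat it as an option to measure rather than a default.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical example class guarded by a fair read/write lock.
final class FairCounter {
  // true = fair mode: waiting threads are served roughly in arrival order,
  // so new readers queue behind a writer that is already waiting.
  private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock(true);
  private long value;

  long read() {
    rw.readLock().lock();
    try { return value; } finally { rw.readLock().unlock(); }
  }

  void increment() {
    rw.writeLock().lock();
    try { value++; } finally { rw.writeLock().unlock(); }
  }

  boolean isFair() { return rw.isFair(); }
}
```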

A worked example: deadlock created and then prevented

The program below runs the transfer pattern both ways — first with the broken nested-lock version (which will deadlock under contention), and then with the lock-ordering fix that prevents it. The broken version is wrapped in a watchdog timeout so the demo doesn't hang forever.


import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.atomic.AtomicInteger;

public class DeadlockDemo {
  static class Account {
    final int id;
    int balance;
    Account(int id, int balance) { this.id = id; this.balance = balance; }
    int id() { return id; }
    void debit(int x) { balance -= x; }
    void credit(int x) { balance += x; }
  }

  static void transferBroken(Account from, Account to, int x) {
    synchronized (from) {
      try { Thread.sleep(1); }                            // increase deadlock probability
      catch (InterruptedException e) { Thread.currentThread().interrupt(); }
      synchronized (to) {
        from.debit(x);
        to.credit(x);
      }
    }
  }

  static void transferSafe(Account from, Account to, int x) {
    Account first  = from.id() < to.id() ? from : to;
    Account second = first == from ? to : from;
    synchronized (first) {
      synchronized (second) {
        from.debit(x);
        to.credit(x);
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    runVariant("BROKEN  (nested locks, no ordering)", true);
    System.out.println();
    runVariant("FIXED   (locks in stable id order)", false);
  }

  static void runVariant(String label, boolean broken) throws InterruptedException {
    Account a = new Account(1, 1000);
    Account b = new Account(2, 1000);
    AtomicInteger completed = new AtomicInteger();

    Thread t1 = new Thread(() -> {
      for (int i = 0; i < 50; i++) {
        if (broken) transferBroken(a, b, 10); else transferSafe(a, b, 10);
        completed.incrementAndGet();
      }
    }, "t1");
    Thread t2 = new Thread(() -> {
      for (int i = 0; i < 50; i++) {
        if (broken) transferBroken(b, a, 10); else transferSafe(b, a, 10);
        completed.incrementAndGet();
      }
    }, "t2");

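    // Daemon threads: a deadlocked pair must not keep the JVM alive after main returns.
    t1.setDaemon(true); t2.setDaemon(true);
    long t0 = System.nanoTime();
    t1.start(); t2.start();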

    // watchdog: if work isn't done in 3 seconds, dump deadlock info and bail
    boolean done = false;
    long deadline = t0 + 3_000_000_000L;
    while (System.nanoTime() < deadline) {
      if (completed.get() == 100) { done = true; break; }
      Thread.sleep(50);
    }

    long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
    System.out.println(label + ":");
    System.out.println("  completed: " + completed.get() + "/100  in " + elapsedMs + " ms");

    if (!done) {
      ThreadMXBean mx = ManagementFactory.getThreadMXBean();
      long[] dl = mx.findDeadlockedThreads();
      if (dl != null) {
        System.out.println("  ThreadMXBean: deadlock detected, threads=" + dl.length);
      } else {
        System.out.println("  ThreadMXBean: no deadlock (threads may just be slow)");
      }
      // Don't actually wait forever — interrupt and move on
      t1.interrupt(); t2.interrupt();
    }
    t1.join(500); t2.join(500);
  }
}

What to take from the run:

The BROKEN variant did not complete all 100 transfers. Under contention, t1 ended up holding a and waiting for b while t2 held b and waited for a. The watchdog hit its 3-second deadline; findDeadlockedThreads() confirmed the cycle. That's deadlock — no exception, no log, nothing wrong with any individual line of code.
The FIXED variant finished cleanly. The ordering rule (first = id-min, second = id-max) means both threads acquire a first and b second, regardless of the direction of the transfer. The cycle can't form because both threads walk the lock graph the same direction.
Thread.sleep(1) inside the broken version's first synchronized makes the deadlock highly reproducible. In real code, you almost never see this kind of explicit sleep — but I/O, GC, or a context switch can produce the same window. That's why deadlocks reproduce intermittently in production and never in tests.
ThreadMXBean.findDeadlockedThreads() returned a non-null array for the broken variant and confirmed the count of cycling threads. That call is your safety-net for in-process detection — wire it into a health endpoint and you'll be told about the deadlock before the user is.
After the watchdog declared the broken variant stuck, the program interrupted both threads. interrupt() does not wake a thread that's blocked on a synchronized monitor — it only wakes threads in sleep, wait, join, or LockSupport.park. That's why interrupting a deadlock doesn't unstick it; you'd have to kill the JVM (or use ReentrantLock.lockInterruptibly).
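
To make the last point concrete, here is a small sketch of a thread waiting on ReentrantLock.lockInterruptibly and being released by interrupt(). The class InterruptibleWait and its demo method are hypothetical names used only for this example.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: a lock wait that interrupt() can cancel.
final class InterruptibleWait {
  // Returns true if the waiter was released by interrupt()
  // rather than by acquiring the lock.
  static boolean demo() throws InterruptedException {
    ReentrantLock lock = new ReentrantLock();
    boolean[] interrupted = {false};
    lock.lock();                                   // caller holds the lock throughout
    try {
      Thread waiter = new Thread(() -> {
        try {
          lock.lockInterruptibly();                // blocks: the lock is held by the caller
          lock.unlock();
        } catch (InterruptedException e) {
          interrupted[0] = true;                   // woken by interrupt, never got the lock
        }
      }, "waiter");
      waiter.start();
      while (!lock.hasQueuedThread(waiter)) Thread.sleep(1);  // wait until it is queued
      waiter.interrupt();
      waiter.join();
      return interrupted[0];
    } finally {
      lock.unlock();
    }
  }
}
```

With a synchronized block in place of lockInterruptibly, the same interrupt would only set the thread's interrupt flag and the waiter would stay blocked.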

What's next

The next chapter, Java volatile, turns to the visibility half of the safety story — the keyword that fixes "one thread writes, another thread reads the old value forever" without involving locks.

Practice

Which strategy directly breaks the 'circular wait' Coffman condition and is the most common deadlock-prevention technique in Java?

Always acquire multiple locks in a globally agreed order (e.g. by stable id) so the wait graph can't form a cycle
Set every contending thread to the same priority so the OS scheduler can't pick the wrong one
Wrap each `synchronized` block in a `try`/`finally` so locks are released on exception
Use `notifyAll` instead of `notify` so all blocked threads have a chance to make progress