Java Virtual Threads
Lightweight, JVM-scheduled threads (Java 21+) for high-throughput concurrent applications — what they fix and how they change pool sizing.
Every chapter in this part of the book until now described a platform thread — a Java Thread that maps one-to-one onto an OS thread. Platform threads are powerful but expensive: each one takes about 1 MB of native stack, and the OS limits a process to roughly tens of thousands of them. For CPU-bound work that's plenty. For I/O-bound work — a web server with one thread per request that mostly waits on a database — it's a hard ceiling that's been the central tension in Java server design for two decades.
Java 21 introduced virtual threads to fix exactly this case. A virtual thread is a Java Thread scheduled by the JVM (not the OS) onto a small pool of OS-level carrier threads. They're cheap — millions per JVM are routine — and blocking on I/O parks the virtual thread without parking the carrier. The code looks the same as before; the cost model is different.
What changes (and what doesn't)
Virtual threads are java.lang.Threads. The class is the same; the methods are the same; Thread.currentThread() still works. What's different is how they're scheduled and how they cost:
- A platform thread costs about 1 MB of native stack and is scheduled by the OS.
- A virtual thread costs about 1 KB initially (grows as needed) and is scheduled by the JVM.
- Blocking a platform thread blocks the underlying OS thread.
- Blocking a virtual thread parks the virtual thread; the carrier OS thread goes off to run a different virtual thread.
That fourth point is the headline. When a virtual thread calls SocketChannel.read(), Thread.sleep(), BlockingQueue.take(), Lock.lock(), or almost any other blocking JDK API, the JVM unmounts it from its carrier and the carrier picks up another virtual thread to run. The blocked virtual thread costs almost nothing while it waits.
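To make the "same class, same methods" point concrete, here is a minimal sketch, assuming Java 21 or later. The class name VirtualThreadBasics and the helper describe() are illustrative names, not part of the chapter. It runs the same Runnable on one virtual and one platform thread and records what Thread.currentThread() reports in each.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class VirtualThreadBasics {

    // Hypothetical helper: runs the same task on one virtual and one platform
    // thread and returns what each thread reported about itself.
    static List<String> describe() throws InterruptedException {
        List<String> report = new CopyOnWriteArrayList<>();
        Runnable task = () -> {
            Thread self = Thread.currentThread();   // works on both kinds of thread
            report.add(self.getName() + " virtual=" + self.isVirtual());
        };

        Thread virtual = Thread.ofVirtual().name("vt-demo").start(task);
        virtual.join();                             // join() behaves as it always has

        Thread platform = Thread.ofPlatform().name("pt-demo").start(task);
        platform.join();

        return report;
    }

    public static void main(String[] args) throws InterruptedException {
        describe().forEach(System.out::println);
    }
}
```

Because each thread is joined before the next one starts, the report lists the virtual thread first and the platform thread second. The only observable difference is the value of isVirtual().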
Creating virtual threads
Three ways:
// 1. Direct
Thread t = Thread.ofVirtual().start(() -> doWork());
// 2. Builder
Thread t2 = Thread.ofVirtual().name("vt-", 0).start(this::work); // names "vt-0", "vt-1", ...
// 3. Executor — the production form
try (ExecutorService es = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 10_000; i++) {
        es.submit(() -> handleRequest());
    }
}
The executor form is what almost every server uses. It hands out one virtual thread per submitted task; there's no pool to size because the carrier pool sizes itself.
You can also get a platform thread when you specifically want one:
Thread t = Thread.ofPlatform().name("compute").start(() -> doCpu());
Useful for genuinely CPU-bound work, where the one-to-one OS mapping is what you want.
When virtual threads win
The shape virtual threads are optimised for:
- Many concurrent tasks (hundreds, thousands, millions).
- Each task spends most of its time blocked on I/O, queues, or locks.
- The work isn't dominated by CPU.
This is exactly the web-server shape: each request is a task that mostly waits for a database, an upstream service, or the client. With platform threads, a server with 1000 concurrent slow requests needs 1000 platform threads — 1 GB of native stack and significant OS scheduler load. With virtual threads, the same workload runs on 8-or-so carriers; the 1000 virtual threads cost a few MB total.
The mental model: stop thinking about thread pools for I/O work. Submit one virtual thread per request and let the runtime handle the rest.
When virtual threads don't win
A few cases where they don't help or actively hurt:
- CPU-bound work. A virtual thread doing pure compute can't be parked — it has to run on a carrier the whole time. You'll be no faster than the carrier count, which is your CPU count. For CPU work, platform threads (and fork/join) remain the right tool.
- Synchronized blocks around I/O. A virtual thread inside synchronized (obj) { blockingIO(); } pins to its carrier: the JVM can't unmount it during the blocking call because the monitor is tied to the OS thread. This is a real pitfall: a server that uses synchronized to protect a database call will not scale with virtual threads. The fix is to use ReentrantLock instead (which the virtual-thread machinery handles correctly).
- ThreadLocal storage with many threads. Virtual threads support ThreadLocal, but the count can explode: millions of virtual threads × N thread-locals × value size = lots of memory. Java 21 includes scoped values (ScopedValue) as a preview API, intended as a structured alternative.
- Code that assumes a thread is rare (e.g. that builds a per-thread connection). One connection per virtual thread is one connection per request, which the database hates. Use a real connection pool.
The summary: virtual threads make I/O-bound concurrency cheap, but they don't transform CPU-bound work and they expose code paths that pin to carriers.
Pinning: the one thing to watch for
A pinned virtual thread can't be unmounted. The two pinning causes:
- synchronized blocks that include a blocking call.
- Native method calls that block in JNI.
You can detect pinning via the system property:
java -Djdk.tracePinnedThreads=full ...
If a virtual thread blocks while pinned, the JVM prints a stack trace. In production, the fix is to replace synchronized with ReentrantLock around the blocking region. JEP 491, delivered in JDK 24, lets a virtual thread unmount while it holds a monitor, and JDK 24 also removed the jdk.tracePinnedThreads property. On JDK 21 through 23, treat any synchronized around an I/O call as a virtual-thread anti-pattern.
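The ReentrantLock replacement can be sketched as follows. This is an illustration under stated assumptions rather than code from the chapter: the class LockInsteadOfSynchronized, its counter, and blockingIo() (a stand-in for a database or network call, simulated with Thread.sleep) are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class LockInsteadOfSynchronized {

    private final ReentrantLock lock = new ReentrantLock();
    private int calls = 0;   // shared state guarded by lock

    // Hypothetical stand-in for a blocking database or network call.
    private void blockingIo() throws InterruptedException {
        Thread.sleep(5);
    }

    // Replaces: synchronized (this) { blockingIo(); calls++; }
    void guardedCall() throws InterruptedException {
        lock.lock();            // a virtual thread waiting here parks without pinning
        try {
            blockingIo();       // blocking while holding a ReentrantLock does not pin
            calls++;
        } finally {
            lock.unlock();      // always release, even if blockingIo() throws
        }
    }

    int calls() {
        lock.lock();
        try {
            return calls;
        } finally {
            lock.unlock();
        }
    }

    // Runs the guarded call from many virtual threads and returns the final count.
    static int runDemo(int tasks) {
        LockInsteadOfSynchronized shared = new LockInsteadOfSynchronized();
        try (ExecutorService es = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                // Returning null makes this a Callable, so checked exceptions are allowed.
                es.submit(() -> { shared.guardedCall(); return null; });
            }
        }   // close() waits for every submitted task
        return shared.calls();
    }

    public static void main(String[] args) {
        System.out.println("calls = " + runDemo(50));
    }
}
```

The lock still serialises the guarded region, so this does not make the calls run in parallel. What changes is that the threads waiting for the lock, and the one blocked inside it, no longer hold a carrier while they wait.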
What about wait, notify, and join?
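All of them work: virtual threads can wait on intrinsic monitors, get notified, and be joined. Thread.join() parks and unmounts like any other blocking JDK call. Object.wait() is the exception on JDK 21 through 23: it is always called inside a synchronised block, so the waiting virtual thread stays pinned, and the scheduler compensates by temporarily adding a carrier. From JDK 24 (JEP 491), wait() releases the carrier as well. Either way, wait() releases the monitor while it waits, so other threads can enter the block; a blocking call made while holding the monitor does not.
synchronized (lock) {
    lock.wait();      // releases the monitor; pins on JDK 21-23, unmounts on JDK 24+
}
synchronized (lock) {
    socket.read(buf); // BAD on JDK 21-23: holds monitor through blocking read; pins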
}
Sizing the pool: there is no pool
The conceptual shift virtual threads enable: stop sizing. Every executor you've configured in this book had a thread count knob. With newVirtualThreadPerTaskExecutor, the count is "however many requests are in flight." The carrier pool (which you don't directly configure) sizes itself based on CPU count; the virtual threads are bookkeeping.
In a server using virtual threads:
- Connection pools still matter. A virtual thread waiting for a connection is fine; spawning 10,000 of them all wanting a 5-connection pool just makes the bottleneck visible.
- Rate limits still matter. Virtual threads remove the thread limit, not the downstream service's limit.
- Memory still matters. Each virtual thread has a stack and any ThreadLocals. Millions of them is millions of stacks.
Virtual threads remove the thread-count ceiling; they don't remove the underlying constraints the ceiling was hiding.
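One common way to respect a downstream limit once the thread limit is gone is a Semaphore around the constrained call. The sketch below is an assumption-laden illustration, not the chapter's code: BoundedDownstream and maxConcurrentUsers are hypothetical names, and the downstream call is simulated with Thread.sleep.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedDownstream {

    // Runs `tasks` virtual threads against a resource that allows only
    // `permits` concurrent users; returns the highest concurrency observed.
    static int maxConcurrentUsers(int tasks, int permits) {
        Semaphore slots = new Semaphore(permits);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (ExecutorService es = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                es.submit(() -> {
                    slots.acquire();               // parks the virtual thread until a slot frees up
                    try {
                        int now = inUse.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(10);          // simulated downstream call
                    } finally {
                        inUse.decrementAndGet();
                        slots.release();
                    }
                    return null;                   // Callable, so checked exceptions are allowed
                });
            }
        }   // close() waits for all tasks
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxConcurrentUsers(1_000, 5));
    }
}
```

All 1,000 virtual threads exist at once, but the observed concurrency against the simulated resource never exceeds the five permits. The waiting threads are parked, which is cheap, and the bottleneck is now explicit instead of being an accident of pool size.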
A worked example: 100,000 virtual threads vs. a platform-thread pool
The program below sleeps 100,000 tasks for 200 ms each, in parallel. On platform threads (capped at a sensible count) this takes a long time and uses a lot of RAM. On virtual threads it finishes in barely longer than the per-task sleep itself.
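A minimal sketch of such a program, assuming Java 21 or later. It is a reconstruction from the description in this section (100,000 tasks on virtual threads, 5,000 tasks on a 100-thread platform pool, 200 ms sleep each), not necessarily the exact program the notes below were written against. The names SleepBenchmark and timeRun are illustrative.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SleepBenchmark {

    // Submits `tasks` sleeping tasks to the executor, closes it (which waits
    // for all of them), and returns the elapsed wall-clock time in milliseconds.
    static long timeRun(ExecutorService executor, int tasks, Duration sleep) {
        long start = System.nanoTime();
        try (executor) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(sleep);   // parks a virtual thread; blocks a platform thread
                    return null;           // Callable, so InterruptedException is allowed
                });
            }
        }   // close() runs shutdown() and waits for every submitted task
        return Duration.ofNanos(System.nanoTime() - start).toMillis();
    }

    public static void main(String[] args) {
        Duration sleep = Duration.ofMillis(200);

        long virtualMs = timeRun(Executors.newVirtualThreadPerTaskExecutor(), 100_000, sleep);
        System.out.println("100,000 tasks on virtual threads:    " + virtualMs + " ms");

        long platformMs = timeRun(Executors.newFixedThreadPool(100), 5_000, sleep);
        System.out.println("5,000 tasks on 100 platform threads: " + platformMs + " ms");
    }
}
```

Exact timings depend on the machine, so the sketch prints measurements instead of asserting them. The platform-pool run is deliberately smaller (5,000 tasks) so it finishes in seconds, not minutes.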
What to take from the run:
- The 100,000 virtual threads finished in barely longer than the 200 ms each task slept: say, ~250 ms total wall-clock. That's the entire point of virtual threads: the concurrency (how many things are in flight) is decoupled from the parallelism (how many cores are running CPU work). A million simulated requests would add thread-creation overhead, but would still finish in a small multiple of the 200 ms sleep, not a million times it.
- The 5,000-task platform-pool run, with 100 worker threads, took roughly 5000 / 100 × 200 ms = ~10 seconds: the tasks queued up because the pool could only run 100 at a time. To run all 100,000 tasks in the same wall-clock time as the virtual-thread version, the platform pool would need 100,000 threads, which is close to or beyond the OS limit on most systems.
- Thread.currentThread().isVirtual() distinguished the two thread types at runtime. The names differ too: a virtual thread's name is empty unless you set one via the builder. Useful for logging when you mix the two kinds.
- The pinning warning is the single most important caveat for virtual threads in production. On JDK 21 through 23, a synchronized block around any blocking call (database I/O, file I/O, network) defeats most of the benefit because the carrier can't be released during the wait. Replacing synchronized with ReentrantLock keeps the virtual thread parkable.
- The try (ExecutorService vexec = ...) form did the right thing on close: it ran shutdown() and waited for every submitted task to finish. With 100,000 in-flight tasks that wait was real (200 ms each, all parked together, all completing nearly at once). Without the try-with-resources, main could have returned while tasks were still sleeping: virtual threads are always daemon threads, so they do not keep the JVM alive. An unclosed platform-thread pool fails the other way, because its non-daemon workers keep the JVM running after main returns.
End of part 15
This is the last chapter of the Multithreading and Concurrency part. We've gone from "a thread is an OS-level thing" through the locks, atomics, and concurrent collections you use to make shared state correct, into the executor framework that hides thread management, then CompletableFuture and ForkJoinPool for composition, and finally virtual threads for the I/O-heavy workload modern servers actually face.
The pattern across all of it: pick the smallest tool that fixes your specific problem. A counter? AtomicInteger. A flag? volatile. A producer/consumer? BlockingQueue. Many parallel I/O calls? Virtual threads. The keyword synchronized is still right when it's right; Lock is for when it isn't; the high-level executors and futures are for when you've outgrown both. Reach down the stack only when the abstraction above isn't doing what you need.
The next part of the book is Annotations — what the @ markers attached to classes, methods, and fields actually do, the built-in ones in java.lang, and the rules for writing your own.
Practice
In a server using virtual threads, you wrap a JDBC call in `synchronized (this) { jdbc.execute(sql); }`. What's the consequence?