Java Deserialization
Deserialize Java objects from bytes with ObjectInputStream and understand the security pitfalls of deserialization.
Deserialization is the mirror of the previous chapter: given a stream of bytes produced by ObjectOutputStream, reconstruct the object graph. The API is ObjectInputStream.readObject(), and the mechanism is — for "trusted bytes" — almost as simple as the write side. The complication is that deserialization is the part of the serialization design with the well-publicised security problem; the second half of this chapter is about that.
try (ObjectInputStream in = new ObjectInputStream(
        new BufferedInputStream(Files.newInputStream(path)))) {
    User u = (User) in.readObject(); // throws ClassNotFoundException, IOException
}

That's the minimal recipe. The reader sees the bytes, looks up each class by name in its own class loader, allocates instances without calling their constructors, fills in the fields by reflection, and returns the root of the graph typed as Object. You cast it to the type you expect.
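The recipe above reads from a file, which makes it awkward to try in isolation. The sketch below does the same round trip in memory so that both sides fit in one program. The `RoundTripDemo` class, its nested `User`, and the helper names `toBytes` and `readUser` are assumptions made for illustration, not part of the chapter's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {

    // Hypothetical stand-in for the chapter's User class.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Write side: serialize the object graph into a byte array.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Read side: rebuild the graph, then check the root type before using it.
    static User readUser(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            Object raw = in.readObject();
            if (raw instanceof User) {
                return (User) raw;
            }
            String actual = (raw == null) ? "null" : raw.getClass().getName();
            throw new IOException("expected User, got " + actual);
        }
    }

    public static void main(String[] args) throws Exception {
        User copy = readUser(toBytes(new User("Ada", 36)));
        System.out.println(copy.name + " " + copy.age); // prints "Ada 36"
    }
}
```

Only bytes the program wrote itself are read back here, which is the one situation in which calling `readObject` is unproblematic.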
What readObject returns
It returns the root object of the graph the writer wrote. The static return type is Object — the reader can't know the type at compile time — so a cast is part of the idiom:
Object raw = in.readObject();
if (raw instanceof User u) { // pattern match (Java 16+), recommended
    process(u);
} else {
    throw new IOException("expected User, got " + raw.getClass());
}

That instanceof check (or an explicit getClass() check) is the only place in normal code where you can verify the stream contained what you thought it would. Skip it and a crafted stream can hand you a different type: the cast fails with a ClassCastException somewhere in your code, with little to tell you that the stream was the cause.
Two checked exceptions
readObject declares two:
- ClassNotFoundException: the stream named a class (com.example.User) that the reader's class loader can't find. You wrote User to disk; the reader's classpath doesn't include User; the deserializer can't reconstruct it.
- IOException: anything else, such as a truncated stream, wrong magic header, schema mismatch (InvalidClassException), or stream corruption (StreamCorruptedException).
The schema-mismatch case is the common one. InvalidClassException is thrown when the reader's version of the class has a different serialVersionUID than the one in the stream. That usually happens because the class changed between write and read and either no explicit serialVersionUID was declared (so the computed default changed along with the class) or the declared one was bumped, deliberately or by accident. The message names the class and both UIDs; that's how you debug it.
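To see the mismatch without keeping two versions of a class around, one option is to patch the UID inside the serialized bytes. The sketch below relies on the stream layout in which a class descriptor stores the class name followed directly by the 8-byte serialVersionUID. The `UidMismatchDemo` class, `Point`, and the helper `pretendWriterHadUid2` are hypothetical names, and byte-patching is only a trick for faking a second class version in a single file, not something production code should do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class UidMismatchDemo {

    // Hypothetical class; the reader's copy declares UID 1.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 7;
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // The 8 bytes after the class name hold the big-endian serialVersionUID.
    // Overwriting the last of them makes the stream claim UID 2.
    static byte[] pretendWriterHadUid2(byte[] original) {
        byte[] bytes = original.clone();
        byte[] name = Point.class.getName().getBytes(StandardCharsets.UTF_8);
        int at = indexOf(bytes, name);
        if (at < 0) {
            throw new IllegalStateException("class name not found in stream");
        }
        bytes[at + name.length + 7] = 2;
        return bytes;
    }

    // First index at which needle occurs in haystack, or -1.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Returns "ok" on success, or the exception message on a UID mismatch.
    static String readAndReport(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.readObject();
            return "ok";
        } catch (InvalidClassException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] good = serialize(new Point());
        System.out.println(readAndReport(good));
        System.out.println(readAndReport(pretendWriterHadUid2(good)));
    }
}
```

The second line printed by `main` is the exception message, which should mention the class name together with the stream's UID and the local class's UID.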
Constructors don't run
This is the bit that surprises everyone: deserialization does not call your class's constructors. The JDK allocates a raw instance of the class, then fills in the fields directly via reflection from the bytes. (The only constructor that does run is the no-arg constructor of the nearest non-Serializable superclass, which for most classes is Object.) Any invariants you established in the constructor, such as required-non-null fields, integer-in-range checks, or idempotent initialisation, are silently bypassed.
class User implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    int age;

    User(String name, int age) {
        if (age < 0) throw new IllegalArgumentException("age >= 0"); // never runs on read
        this.name = name;
        this.age = age;
    }
}

Hand-craft a byte stream where age = -1, run readObject, and you'll get a User with age == -1. The constructor was skipped. If you need a class invariant to survive deserialization, you have to add a readObject hook:
private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
    in.defaultReadObject(); // do the normal field-by-field read
    if (age < 0) throw new InvalidObjectException("age must be >= 0");
}

The shape is fixed: the method must be private, non-static, return void, be named readObject, and take a single ObjectInputStream parameter. It's a private method the JDK looks up by reflection; there's no interface to declare, and a method that doesn't match is silently ignored. If you write it correctly, the JDK calls it in place of its default field read for this class, and you get a clean failure on bad data.
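The difference the hook makes can be checked side by side. In the sketch below, `LenientUser` has only constructor validation while `StrictUser` adds the hook; both names, and the `ValidationHookDemo` wrapper, are hypothetical. Assigning `age = -1` directly after construction stands in for a hand-crafted stream, since it gets an invalid value into the bytes without going through the constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ValidationHookDemo {

    // Constructor validation only: nothing checks the value on read.
    static class LenientUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int age;

        LenientUser(String name, int age) {
            if (age < 0) throw new IllegalArgumentException("age must be >= 0");
            this.name = name;
            this.age = age;
        }
    }

    // Same class plus the readObject hook, which repeats the check on read.
    static class StrictUser implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int age;

        StrictUser(String name, int age) {
            if (age < 0) throw new IllegalArgumentException("age must be >= 0");
            this.name = name;
            this.age = age;
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (age < 0) throw new InvalidObjectException("age must be >= 0");
        }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LenientUser lenient = new LenientUser("Ada", 36);
        lenient.age = -1; // simulate a tampered stream
        LenientUser back = (LenientUser) deserialize(serialize(lenient));
        System.out.println(back.age); // prints -1: the invalid value came through

        StrictUser strict = new StrictUser("Ada", 36);
        strict.age = -1;
        try {
            deserialize(serialize(strict));
        } catch (InvalidObjectException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```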
transient fields after the read
transient and static fields aren't in the stream. Static fields belong to the class, so the read doesn't touch them; transient fields are left at their default values: null for references, 0 for numerics, false for booleans. Field initialisers don't help, because they run as part of construction, which deserialization skips. That's the rule from the serialization chapter, stated from the read side.
For caches, that's fine. For required fields you marked transient to avoid persisting (a Connection, a worker Thread, a derived Map), the deserialized instance is in an "incomplete" state until you finish initialising it. The readObject hook is the place to do that:
private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    this.cache = new ConcurrentHashMap<>(); // rebuild the transient
}

Same hook, different reason: the previous section used it for validation; this one uses it for initialisation.
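A small comparison makes the effect visible. The two classes in the sketch below are identical except for the hook; `PlainLookup`, `RebuildingLookup`, and the `roundTrip` helper are hypothetical names chosen for this illustration. Both declare the transient field with an initialiser, which shows that the initialiser alone does not bring the field back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TransientRebuildDemo {

    // No hook: the transient field is left at its default (null) after a read.
    static class PlainLookup implements Serializable {
        private static final long serialVersionUID = 1L;
        String label = "prices";
        transient Map<String, Integer> cache = new ConcurrentHashMap<>();
    }

    // With the hook: the transient field is rebuilt as part of the read.
    static class RebuildingLookup implements Serializable {
        private static final long serialVersionUID = 1L;
        String label = "prices";
        transient Map<String, Integer> cache = new ConcurrentHashMap<>();

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            cache = new ConcurrentHashMap<>();
        }
    }

    // Serialize to memory and read straight back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        PlainLookup plain = roundTrip(new PlainLookup());
        RebuildingLookup rebuilt = roundTrip(new RebuildingLookup());
        System.out.println(plain.cache == null);     // prints true
        System.out.println(rebuilt.cache.isEmpty()); // prints true
    }
}
```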
The security problem
Here is the warning that drives modern Java's stance on this whole API: deserialization can execute arbitrary code.
The reason: deserialization is "instantiate any class the bytes name, then run its readObject hook." Many classes in the JDK and on a typical classpath have readObject hooks that do consequential things: initialise a thread, open a file, build an object graph that triggers side effects via hashCode/equals. A carefully crafted stream can chain such readObject calls together (a "gadget chain") so that, on the right classpath, they end with Runtime.getRuntime().exec(...).
This isn't theoretical. The 2015 Apache Commons Collections RCE, the WebSphere/JBoss/Jenkins/WebLogic vulnerabilities of 2016–2018, and most of the "Java deserialization" CVEs since are this exact pattern: the attacker gives you bytes; you call readObject on them; their gadget chain runs in your process.
The rule that came out of all of this:
Never call readObject on bytes you do not fully control.
"Fully control" means: you wrote them, on the same machine, into a file or pipe nobody else can touch. The moment the bytes cross any kind of trust boundary — a network socket, a user upload, a queue message — ObjectInputStream is the wrong tool. Use JSON or Protocol Buffers; those formats don't instantiate classes by name.
ObjectInputFilter: the partial mitigation
Java 9 added ObjectInputFilter, a hook that lets you reject classes during deserialization. Set a process-wide filter at startup and any class outside the allowlist raises InvalidClassException before its readObject hook runs:
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
        "com.example.*;java.util.*;!*" // allow these two packages (".*" excludes subpackages); reject everything else
);
ObjectInputFilter.Config.setSerialFilter(filter);

This narrows the attack surface: a gadget that needs a class outside the allowlist can't trigger. It does not make deserialization safe; gadgets exist inside java.util.*, and the allowlist has to include classes you didn't write. Use it as defence in depth, not as a primary control. The primary control is still "don't deserialize untrusted bytes."
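A filter can also be attached to a single stream with `ObjectInputStream.setObjectInputFilter` (Java 9+), which is easier to experiment with than the process-wide setting. The sketch below allows exactly one class by name and rejects everything else. `StreamFilterDemo`, `Allowed`, `Blocked`, and `readFiltered` are hypothetical names, and the example assumes no process-wide filter has been configured for the JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StreamFilterDemo {

    static class Allowed implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 1;
    }

    static class Blocked implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 2;
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Reads one object through an allowlist of exactly one class.
    static Object readFiltered(byte[] bytes) throws IOException, ClassNotFoundException {
        // The pattern is built from the runtime class name; "!*" rejects the rest.
        ObjectInputFilter filter =
                ObjectInputFilter.Config.createFilter(Allowed.class.getName() + ";!*");
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.setObjectInputFilter(filter); // must be set before the first read
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFiltered(serialize(new Allowed())) instanceof Allowed);
        try {
            readFiltered(serialize(new Blocked()));
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The demo classes hold only an `int` on purpose. Once a filter ends in `!*`, every class the graph touches has to be listed, which in practice tends to include JDK types you did not think of as part of your data.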
For new code, the answer remains JSON.
A worked example: round-trip, evolution, and a failure
The program below extends the example from the serialization chapter by reading the bytes back. It deserializes the Department/Employee graph, verifies the back-references reconnected, demonstrates the transient field coming back as null, and finishes with the version-mismatch failure mode: a stream written with one serialVersionUID and read by a class with a different one.
What to take from the run:
- readObject() reconstructed the full Department graph in one call. The list of Employees came back populated, each Employee.department pointer was set correctly, and the back-reference (employee → same department instance) was preserved as object identity, not a copy. That last point is what makes serialization "graph-shaped" rather than "tree-shaped": the JDK tracked which references it had seen and rewired them.
- The instanceof Department d check was the gate that turned a raw Object into a typed Department. Without it, a stream containing a different type would have failed at the (Department) raw cast with ClassCastException, which is uglier and harder to diagnose. The instanceof form is the idiom.
- All three passwordHash fields came back as null. Marking the field transient excluded it from the stream; the reader had no value to assign, so the field stayed at its default. That's the rule from the serialization chapter, confirmed here in the read direction.
- The version-mismatch block produced the InvalidClassException you should expect: the stream said "UID = 1" and the class said "UID = 2," so the JDK refused to instantiate. The error message names both UIDs; that's how you find out which class drifted. Production-grade code declares serialVersionUID explicitly and bumps it only when the change is incompatible.
- Nothing in this example called any Employee or Department constructor. The objects came into existence via reflection, fields filled in directly. Any constructor-time validation (if (salary < 0) throw ...) was bypassed; if you need it to run on the read side, that's what the private readObject hook is for. The Practice question at the bottom drills that point.
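The identity and transient points in the list above can be checked with a much smaller graph. The sketch below is not the chapter's listing: `Dept`, `Emp`, and the `hire` and `roundTrip` helpers are hypothetical, pared-down stand-ins for the Department/Employee classes, and the version-mismatch step is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class GraphIdentityDemo {

    static class Dept implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final List<Emp> staff = new ArrayList<>();

        Dept(String name) {
            this.name = name;
        }

        // Creates an employee that points back at this department.
        Emp hire(String empName, String passwordHash) {
            Emp e = new Emp(empName, this, passwordHash);
            staff.add(e);
            return e;
        }
    }

    static class Emp implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final Dept dept;               // back-reference, part of the stream
        transient String passwordHash; // excluded from the stream

        Emp(String name, Dept dept, String passwordHash) {
            this.name = name;
            this.dept = dept;
            this.passwordHash = passwordHash;
        }
    }

    static Dept roundTrip(Dept dept) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(dept);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            Object raw = in.readObject();
            if (raw instanceof Dept) {
                return (Dept) raw;
            }
            throw new IOException("expected Dept, got " + raw);
        }
    }

    public static void main(String[] args) throws Exception {
        Dept original = new Dept("Engineering");
        original.hire("Ada", "hash-1");
        original.hire("Grace", "hash-2");
        original.hire("Linus", "hash-3");

        Dept copy = roundTrip(original);
        for (Emp e : copy.staff) {
            // Each employee points at the one reconstructed Dept, not at a copy of it.
            System.out.println(e.name + " sameDept=" + (e.dept == copy)
                    + " passwordHash=" + e.passwordHash);
        }
    }
}
```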
What's next
Serialization and deserialization closed out the streaming side of java.io — bytes, characters, and graphs of objects, all written as streams. The next chapter, Java NIO Overview, steps up to a different API family: java.nio and java.nio.file. NIO replaces some of java.io, complements the rest, and is the home of the modern Path and Files classes that the file-related chapters have been quietly using already.
Practice
A class invariant — 'salary must be greater than 0' — is enforced in the constructor of a `Serializable` class. An attacker hands your server a serialized byte stream where the salary field is encoded as -1. What happens when your code calls `readObject()`?