Java LinkedHashSet
Use LinkedHashSet in Java to maintain insertion order while keeping HashSet's near-constant-time operations.
Java LinkedHashSet
LinkedHashSet<E> is HashSet<E> with one extra promise: when you iterate, you get elements in the order you first inserted them. The hash-table machinery is identical — same buckets, same load factor, same near-constant-time add, remove, contains — but each entry carries two extra pointers (before, after) that thread the entries into a doubly-linked list as they're added. Iteration walks that list, not the bucket array.
If you want hash-set performance and deterministic, predictable iteration order, LinkedHashSet is the answer. It's almost a free upgrade for the cases where HashSet's unspecified order has bitten you.
The "first-insertion wins" rule
The order is fixed by the first time an element is inserted. Re-adding an existing element doesn't move it:
Set<String> s = new LinkedHashSet<>();
s.add(\"a\");
s.add(\"b\");
s.add(\"c\");
s.add(\"a\"); // already present — returns false, order unchanged
System.out.println(s); // [a, b, c]That makes it the right tool for "remember the order tags arrived" or "log unique events in time order." If you remove an element and re-add it, it goes to the end of the list — the position was tied to the current insertion, and the new one is the only one left.
The cost: pointers and pointers
The extra ordering machinery has a cost. Each entry stores not just (hash, key, next-in-bucket) like HashSet, but (hash, key, next-in-bucket, before, after). That's two extra references per element — roughly an extra 16 bytes on a 64-bit JVM. For a set of 10 million Longs, that's around 160 MB extra. For most application code it's nothing; for cache-sized data structures, it matters.
In exchange, you get O(1) on every operation (same as HashSet) plus a stable iteration order that doesn't depend on the load factor, the rehash, the hash spread, or the JVM version.
Iteration cost is proportional to size, not capacity
There's a subtle bonus over HashSet: walking a LinkedHashSet follows the linked list, so it visits exactly size entries. Iterating a HashSet walks every bucket, so it visits roughly capacity slots — including empty ones. For a sparsely populated set, that can be a significant difference. If you build a set, expand it well past the elements you'll keep, then iterate often, LinkedHashSet can actually iterate faster.
When to pick it
The decision flow:
- Order doesn't matter, you just need fast membership →
HashSet. Smaller, simpler. - You want insertion order remembered →
LinkedHashSet. Same speed foradd/contains, predictable iteration. - You want sorted order →
TreeSet. Different algorithm, log-time operations.
The most common reason to pick LinkedHashSet is defensive: you're building a public API that returns a Set, and you don't want callers depending on HashSet's arbitrary order. A LinkedHashSet is the kindest thing you can return — it has the same contract as a Set, but iteration is reproducible across runs and JVMs, which makes user-visible output stable and tests easier to write.
A worked example: unique tags in arrival order
The program below builds two sets from the same stream of tag inputs: one with HashSet, one with LinkedHashSet. The HashSet iteration order depends on the JVM (it's stable-but-arbitrary for a given JVM); the LinkedHashSet order is exactly the order the unique elements first appeared. Then it shows the "remove and re-add" rule, and finally builds an order-preserving deduplicator that's two lines long.
What to read out of the run:
- The
LinkedHashSetprinted the unique events in the order they first appeared. TheHashSetprinted them in some other order entirely — whatever the bucket layout dictated. - Re-adding
\"a\"did nothing to the order. Removing it and re-adding it moved it to the end. The first insertion is the one that anchors the position. - The order-preserving deduplicator is a one-liner once you know the trick: collect into a
LinkedHashSet, then back to a list. - The 10-element scan through a 2 000 000-bucket
LinkedHashSetwalked exactly 10 entries; aHashSetof the same shape would have scanned every empty bucket in between.
What's next
The third standard Set implementation gives you something neither HashSet nor LinkedHashSet can: sorted iteration and the ability to ask range questions like "every tag between a and m." TreeSet is up next.
Practice
What does `LinkedHashSet` give you that plain `HashSet` doesn't?