Java HashMap
Use the hash-table-backed HashMap for fast unordered key-value lookups in Java.
Java HashMap
HashMap<K, V> is the default Map implementation in the JDK and the most-used data structure in Java application code. It backs HashSet (which is a HashMap with all values set to a single dummy), it's what Collectors.toMap builds, and it's the structure behind every "lookup table" you write that isn't sorted or concurrent. Operations are expected O(1) — a hash, a bucket index, one or two equals checks — independent of size.
How the table is laid out
A HashMap keeps a power-of-two-sized array of buckets. Inserting an entry does five things:
- Compute
h = hashCode(key). Mix the upper and lower 16 bits together —h ^ (h >>> 16)— so a hash like0x12340000doesn't drop its top bits when masked. - Mask:
i = h & (table.length - 1). That'sh mod lengthfor power-of-two lengths, and it's faster than the modulo operator. - Walk the chain at
table[i]. If a node with an equal key exists, overwrite its value and return the old one. - Otherwise, prepend (or, since Java 8, append) a new node.
- If
size > capacity * loadFactor, resize: double the table and re-bucket every entry.
Up to Java 7 the bucket chain was a singly-linked list, full stop. From Java 8 on, once a chain reaches eight entries, the bucket is converted to a small balanced tree (a red-black tree) keyed by the hash. Lookup in that bucket becomes O(log n) instead of O(n), which caps the damage of a denial-of-service attack that crafts colliding hashes. If the tree shrinks back to six or fewer entries, it reverts to a list. You won't see this in normal code — it only matters when your hashCode is adversarial or pathologically bad.
Capacity, load factor, and pre-sizing
Same dials as HashSet:
- Initial capacity — default 16, rounded up to a power of two.
- Load factor — default 0.75. When
size > capacity * 0.75, the table doubles.
If you know the size up front, pre-size:
Map<String, User> users = new HashMap<>(expectedSize * 4 / 3); // skip the doublingsOr, since Java 19, the explicit factory:
Map<String, User> users = HashMap.newHashMap(expectedSize);That's the cleanest expression of intent — it computes the right initial capacity from a target size so the table doesn't need to grow.
Null keys and null values
HashMap allows one null key (it's stored in bucket 0 with hash 0) and any number of null values. That's a convenience over Hashtable (which rejects both) but it muddies the meaning of get(k) == null:
m.put(\"key\", null);
m.get(\"key\"); // returns null
m.containsKey(\"key\"); // returns trueThe disambiguation cost is real. Prefer to not store null values; use Optional, a sentinel, or just leave the key out. The Java 9+ factory Map.of(...) enforces this for you.
hashCode and equals are your contract
Putting your own class into a HashMap only works if hashCode and equals are consistent. The same rules as HashSet:
- Equal objects must have equal hash codes.
- Unequal objects may collide (it's fine, that's why buckets are chains).
- Mutating a key after insertion is undefined behaviour.
Use a record if you can — both methods are generated correctly. Or let the IDE generate them. Never hand-write hashCode if you can help it.
record UserId(String tenant, String localPart) {}
Map<UserId, User> directory = new HashMap<>();
directory.put(new UserId(\"acme\", \"alice\"), new User(/*...*/));
directory.get(new UserId(\"acme\", \"alice\")); // hitIteration order — explicitly undefined
HashMap makes no guarantee about iteration order. The order depends on the bucket layout, which depends on the hash, the capacity, and the resize history — it can change between runs and between JVM versions. If you rely on the order, your code is broken; if your tests rely on the order, they're flaky.
If iteration order matters, use LinkedHashMap for insertion order or TreeMap for sorted order. Both are drop-in replacements.
Not thread-safe
HashMap will corrupt itself under concurrent mutation — and historically a famously bad failure mode was an infinite loop during a concurrent resize. Don't share a HashMap between threads. The right structure for multi-threaded code is ConcurrentHashMap (covered later in the concurrency part). Collections.synchronizedMap(new HashMap<>()) exists but uses a single lock around every operation, which is slower and rarely the right answer.
A worked example: counter, lookup table, and the modern idioms
The program below uses a HashMap four ways: a word-count counter via merge, a memoization cache via computeIfAbsent, a value-null ambiguity demonstration, and the Java-19 newHashMap factory.
What to take from the run:
mergecollapses the three-step "get, default-or-add-one, put" into one call. Use it whenever you're maintaining a per-key counter or sum.- The Fibonacci cache is the cleanest possible memoizer: a recursive function plus
computeIfAbsentand you've turned exponential into linear. - The null ambiguity is real.
getreturnednullfor a present key and an absent key the same way. The only way to tell them apart iscontainsKey— or by deciding you don't store nulls in the first place. - Pre-sizing with
HashMap.newHashMap(1_000_000)lets a million inserts finish without any rehashes — the table starts at the right capacity. - The
UserIdrecord gives correctequals/hashCodefor free. That's the modern way to compose hash-map keys from multiple fields.
What's next
HashMap doesn't promise iteration order. If you need insertion order remembered — say you're serializing the map to JSON and want stable output — the right tool is LinkedHashMap. It's also the basis of a textbook LRU cache, which we cover in the same chapter.
Practice
You see `m.merge(key, 1, Integer::sum)` in code where `m` is a `Map<String, Integer>`. What does it do?