W3docs

Java HashMap

Use the hash-table-backed HashMap for fast unordered key-value lookups in Java.

Java HashMap

HashMap<K, V> is the default Map implementation in the JDK and the most-used data structure in Java application code. It backs HashSet (which is a HashMap with all values set to a single dummy), it's what Collectors.toMap builds, and it's the structure behind every "lookup table" you write that isn't sorted or concurrent. Operations are expected O(1) — a hash, a bucket index, one or two equals checks — independent of size.

How the table is laid out

A HashMap keeps a power-of-two-sized array of buckets. Inserting an entry does five things:

  1. Compute h = hashCode(key). Mix the upper and lower 16 bits together — h ^ (h >>> 16) — so a hash like 0x12340000 doesn't drop its top bits when masked.
  2. Mask: i = h & (table.length - 1). That's h mod length for power-of-two lengths, and it's faster than the modulo operator.
  3. Walk the chain at table[i]. If a node with an equal key exists, overwrite its value and return the old one.
  4. Otherwise, prepend (or, since Java 8, append) a new node.
  5. If size > capacity * loadFactor, resize: double the table and re-bucket every entry.

Up to Java 7 the bucket chain was a singly-linked list, full stop. From Java 8 on, once a chain reaches eight entries, the bucket is converted to a small balanced tree (a red-black tree) keyed by the hash. Lookup in that bucket becomes O(log n) instead of O(n), which caps the damage of a denial-of-service attack that crafts colliding hashes. If the tree shrinks back to six or fewer entries, it reverts to a list. You won't see this in normal code — it only matters when your hashCode is adversarial or pathologically bad.

Capacity, load factor, and pre-sizing

Same dials as HashSet:

  • Initial capacity — default 16, rounded up to a power of two.
  • Load factor — default 0.75. When size > capacity * 0.75, the table doubles.

If you know the size up front, pre-size:

Map<String, User> users = new HashMap<>(expectedSize * 4 / 3); // skip the doublings

Or, since Java 19, the explicit factory:

Map<String, User> users = HashMap.newHashMap(expectedSize);

That's the cleanest expression of intent — it computes the right initial capacity from a target size so the table doesn't need to grow.

Null keys and null values

HashMap allows one null key (it's stored in bucket 0 with hash 0) and any number of null values. That's a convenience over Hashtable (which rejects both) but it muddies the meaning of get(k) == null:

m.put(\"key\", null);
m.get(\"key\");          // returns null
m.containsKey(\"key\"); // returns true

The disambiguation cost is real. Prefer to not store null values; use Optional, a sentinel, or just leave the key out. The Java 9+ factory Map.of(...) enforces this for you.

hashCode and equals are your contract

Putting your own class into a HashMap only works if hashCode and equals are consistent. The same rules as HashSet:

  • Equal objects must have equal hash codes.
  • Unequal objects may collide (it's fine, that's why buckets are chains).
  • Mutating a key after insertion is undefined behaviour.

Use a record if you can — both methods are generated correctly. Or let the IDE generate them. Never hand-write hashCode if you can help it.

record UserId(String tenant, String localPart) {}
Map<UserId, User> directory = new HashMap<>();
directory.put(new UserId(\"acme\", \"alice\"), new User(/*...*/));
directory.get(new UserId(\"acme\", \"alice\")); // hit

Iteration order — explicitly undefined

HashMap makes no guarantee about iteration order. The order depends on the bucket layout, which depends on the hash, the capacity, and the resize history — it can change between runs and between JVM versions. If you rely on the order, your code is broken; if your tests rely on the order, they're flaky.

If iteration order matters, use LinkedHashMap for insertion order or TreeMap for sorted order. Both are drop-in replacements.

Not thread-safe

HashMap will corrupt itself under concurrent mutation — and historically a famously bad failure mode was an infinite loop during a concurrent resize. Don't share a HashMap between threads. The right structure for multi-threaded code is ConcurrentHashMap (covered later in the concurrency part). Collections.synchronizedMap(new HashMap<>()) exists but uses a single lock around every operation, which is slower and rarely the right answer.

A worked example: counter, lookup table, and the modern idioms

The program below uses a HashMap four ways: a word-count counter via merge, a memoization cache via computeIfAbsent, a value-null ambiguity demonstration, and the Java-19 newHashMap factory.

java— editable, runs on the server

What to take from the run:

  • merge collapses the three-step "get, default-or-add-one, put" into one call. Use it whenever you're maintaining a per-key counter or sum.
  • The Fibonacci cache is the cleanest possible memoizer: a recursive function plus computeIfAbsent and you've turned exponential into linear.
  • The null ambiguity is real. get returned null for a present key and an absent key the same way. The only way to tell them apart is containsKey — or by deciding you don't store nulls in the first place.
  • Pre-sizing with HashMap.newHashMap(1_000_000) lets a million inserts finish without any rehashes — the table starts at the right capacity.
  • The UserId record gives correct equals/hashCode for free. That's the modern way to compose hash-map keys from multiple fields.

What's next

HashMap doesn't promise iteration order. If you need insertion order remembered — say you're serializing the map to JSON and want stable output — the right tool is LinkedHashMap. It's also the basis of a textbook LRU cache, which we cover in the same chapter.

Practice

Practice

You see `m.merge(key, 1, Integer::sum)` in code where `m` is a `Map<String, Integer>`. What does it do?