How to Remove Duplicates from a List in Java
Remove duplicates from a Java list with a HashSet, LinkedHashSet, or distinct stream.
A List in Java allows duplicate elements by design, so when you need each value to appear only once you have to remove the repeats yourself. This chapter shows the idiomatic ways to do that, with attention to whether the original insertion order is preserved.
Using a LinkedHashSet (order preserved)
The cleanest approach is to copy the list into a set, because a Set rejects duplicates automatically. Use LinkedHashSet rather than a plain HashSet so that the first-seen order of the elements is kept:
List<String> unique = new ArrayList<>(new LinkedHashSet<>(list));
Wrapping the set back in an ArrayList gives you a List again, ready for indexing or further work. The LinkedHashSet does all the heavy lifting: as it is filled from the original list it silently drops any element it has already seen, while its linked structure remembers the order in which elements first arrived.
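As a self-contained illustration of the one-liner above, the sketch below wraps it in a small helper. The class name LinkedHashSetDedup and the method name dedupe are hypothetical, chosen only for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class LinkedHashSetDedup {

    // Hypothetical helper: returns a NEW list holding each element once,
    // in the order the elements first appeared in the input.
    static <T> List<T> dedupe(List<T> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        List<String> unique = dedupe(list);
        System.out.println(unique); // prints [b, a, c]
    }
}
```

Because the result is an ordinary ArrayList, calls such as unique.get(0) work as usual.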
If you do not care about order, a plain HashSet is marginally faster and uses a little less memory. But HashSet makes no guarantee about iteration order, so the elements can come back in an arbitrary sequence. That is rarely what you want when displaying a list, so LinkedHashSet is the safe default.
Using the Stream API
From Java 8 onward, Stream.distinct() removes duplicates in a single, readable pipeline. Like LinkedHashSet, it keeps the encounter order of the elements:
List<String> unique = list.stream()
        .distinct()
        .collect(Collectors.toList());
distinct() compares elements with equals() and hashCode(), exactly like a set does, so custom types must implement both methods correctly. This form shines when deduplication is one step in a larger pipeline: you can chain filter, map, or sorted around it without introducing a temporary collection.
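To make the equals()/hashCode() requirement and the chaining concrete, here is a minimal sketch. The Tag class, the uniqueTags method, and the sample data are all hypothetical and exist only for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import java.util.stream.Collectors;

class StreamDistinctDemo {

    // Hypothetical value type. equals() and hashCode() are both based on
    // name, so distinct() treats two Tags with the same name as duplicates.
    static final class Tag {
        final String name;

        Tag(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Tag && ((Tag) o).name.equals(name);
        }

        @Override
        public int hashCode() { return Objects.hash(name); }

        @Override
        public String toString() { return name; }
    }

    // filter, map, and distinct chained in one pipeline, with no
    // temporary collection between the steps.
    static List<Tag> uniqueTags(List<String> raw) {
        return raw.stream()
                .filter(s -> !s.isEmpty())               // drop blank entries
                .map(s -> s.toLowerCase(Locale.ROOT))    // normalise case
                .map(Tag::new)
                .distinct()                              // uses Tag.equals/hashCode
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("Java", "sql", "", "JAVA", "api", "SQL");
        System.out.println(uniqueTags(raw)); // prints [java, sql, api]
    }
}
```

If Tag did not override equals() and hashCode(), every instance would count as unique and distinct() would remove nothing.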
Comparing the approaches
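All three techniques rely on equals() and hashCode(). LinkedHashSet and stream().distinct() both preserve insertion order, so the choice between them is mostly style and context; HashSet gives up order for a small gain in speed.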
| Approach | Order kept? | Best when |
|---|---|---|
| LinkedHashSet | Yes | A quick, dependency-free one-liner |
| HashSet | No | Order does not matter and speed is critical |
| stream().distinct() | Yes | Deduplication is part of a larger stream pipeline |
A key point for all of them: they build a new collection rather than mutating the source. If you need to deduplicate in place, clear the list and re-add the unique elements; the list must be modifiable for this to work. If no other code holds a reference to the original list, assigning the result back to the same variable is usually enough.
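The clear-and-re-add idea can be sketched as follows. The class and method names are hypothetical, and the sketch assumes the list supports clear() and addAll(), as an ArrayList does:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InPlaceDedup {

    // Hypothetical helper: mutates the given list so that it holds each
    // element once, in first-seen order. The list must be modifiable;
    // an immutable list would throw UnsupportedOperationException.
    static <T> void dedupeInPlace(List<T> list) {
        Set<T> unique = new LinkedHashSet<>(list); // copy BEFORE clearing
        list.clear();
        list.addAll(unique);
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>(Arrays.asList("x", "y", "x", "z", "y"));
        List<String> alias = items;   // a second reference to the same list

        dedupeInPlace(items);

        System.out.println(items);          // prints [x, y, z]
        System.out.println(alias == items); // prints true: same object, now deduplicated
    }
}
```

The alias variable shows the difference from plain reassignment: every reference to the list sees the deduplicated contents, because the list object itself was changed.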
Worked example
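A minimal sketch of a run that matches the notes that follow. The variable names tags, deduped, and viaStream are the ones those notes refer to. The exact contents and order of tags, the class name, and the two helper methods are assumptions chosen to fit the described output (seven elements, with java and sql repeated):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

class DedupExample {

    // Hypothetical helper: dedupe by copying through a LinkedHashSet.
    static List<String> dedupeWithSet(List<String> source) {
        return new ArrayList<>(new LinkedHashSet<>(source));
    }

    // Hypothetical helper: dedupe with Stream.distinct().
    static List<String> dedupeWithStream(List<String> source) {
        return source.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed sample data: 7 elements, 4 distinct values.
        List<String> tags = Arrays.asList(
                "java", "sql", "java", "api", "sql", "rest", "java");

        List<String> deduped = dedupeWithSet(tags);
        List<String> viaStream = dedupeWithStream(tags);

        System.out.println(tags.size() + " " + tags);
        // prints 7 [java, sql, java, api, sql, rest, java]
        System.out.println(deduped.size() + " " + deduped);
        // prints 4 [java, sql, api, rest]
        System.out.println(viaStream.size() + " " + viaStream);
        // prints 4 [java, sql, api, rest]
        System.out.println(deduped.equals(viaStream));
        // prints true
    }
}
```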
What to take from the run:
- The original list keeps all 7 elements, including the repeated java and sql, because a List permits duplicates.
- The LinkedHashSet result has only 4 elements, [java, sql, api, rest], and they appear in first-seen order, not sorted or shuffled.
- The stream().distinct() result is identical in both size and order, confirming the two techniques are interchangeable here.
- deduped.equals(viaStream) prints true, since two lists are equal when they hold the same elements in the same order.
- The original tags list is unchanged, so the dedup operations produced new lists rather than mutating the source.
Practice
Which collection type removes duplicates while preserving the original insertion order of the elements?