How to Remove Duplicates from a List in Java

A List in Java allows duplicate elements by design, so when you need each value to appear only once you have to remove the repeats yourself. This chapter shows the idiomatic ways to do that, with attention to whether the original insertion order is preserved.

Using a LinkedHashSet (order preserved)

The cleanest approach is to copy the list into a set, because a Set rejects duplicates automatically. Use LinkedHashSet rather than a plain HashSet so that the first-seen order of the elements is kept:

List<String> unique = new ArrayList<>(new LinkedHashSet<>(list));

Wrapping the set back in an ArrayList gives you a List again, ready for indexing or further work. The LinkedHashSet does all the heavy lifting: as it is filled from the original list it silently drops any element it has already seen, while its linked structure remembers the order in which elements first arrived.
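To make that behaviour concrete, here is a minimal sketch of roughly what the one-liner does, written as an explicit loop. The class name ManualDedup and the method dedupe are hypothetical names chosen for illustration. The only library behaviour it relies on is that Set.add leaves the set unchanged and returns false when the element is already present.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ManualDedup {
    // Hypothetical helper: returns the distinct elements of input in first-seen order.
    public static <T> List<T> dedupe(List<T> input) {
        Set<T> seen = new LinkedHashSet<>();
        for (T item : input) {
            // add() returns false and changes nothing if item was already added,
            // so repeats are dropped without any explicit check.
            seen.add(item);
        }
        // Copy back into a List so callers can index into the result.
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> list = List.of("b", "a", "b", "c", "a");
        System.out.println(dedupe(list)); // [b, a, c]
    }
}
```

In everyday code the one-line constructor form is preferable; the loop is only meant to show where the duplicates go.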

If you do not care about order, a plain HashSet is marginally faster and uses a little less memory. However, it makes no guarantee about iteration order, so the elements may come back in a seemingly arbitrary sequence. That is rarely what you want when displaying a list, so LinkedHashSet is the safe default.
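When only membership matters, the HashSet variant might look like the sketch below. The class name UnorderedDedup and the method uniqueValues are hypothetical. The example deliberately prints the size and a membership check rather than the set itself, because the iteration order of a HashSet is unspecified.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnorderedDedup {
    // Hypothetical helper: collects the distinct values with no ordering guarantee.
    public static Set<String> uniqueValues(List<String> input) {
        return new HashSet<>(input);
    }

    public static void main(String[] args) {
        List<String> tags = List.of("java", "sql", "java", "api");
        Set<String> unique = uniqueValues(tags);
        System.out.println(unique.size());          // 3
        System.out.println(unique.contains("sql")); // true
    }
}
```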

Using the Stream API

From Java 8 onward, Stream.distinct() removes duplicates in a single, readable pipeline. Like LinkedHashSet, it keeps the encounter order of the elements:

List<String> unique = list.stream()
        .distinct()
        .collect(Collectors.toList());

distinct() decides equality with equals() and, like a hash-based set, relies on a consistent hashCode(), so custom types must override both methods. This form shines when deduplication is one step in a larger pipeline: you can chain filter, map, or sorted around it without introducing a temporary collection.
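As an illustration of both points, the sketch below runs distinct() on a custom type in the middle of a longer pipeline. The Tag class, its name field, and the cleanNames method are all hypothetical, invented for this example; Tag overrides equals() and hashCode() together so that two tags with the same name count as duplicates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class DistinctPipeline {
    // Hypothetical value type: two tags are equal when their names are equal.
    public static final class Tag {
        final String name;

        public Tag(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Tag && ((Tag) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            // Must agree with equals(): equal names give equal hash codes.
            return name.hashCode();
        }
    }

    // Drops empty names, removes duplicate tags, then upper-cases and sorts the names.
    public static List<String> cleanNames(List<Tag> tags) {
        return tags.stream()
                .filter(t -> !t.name.isEmpty())
                .distinct()
                .map(t -> t.name.toUpperCase(Locale.ROOT))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Tag> tags = Arrays.asList(
                new Tag("sql"), new Tag("java"), new Tag(""), new Tag("sql"));
        System.out.println(cleanNames(tags)); // [JAVA, SQL]
    }
}
```

If Tag did not override equals() and hashCode(), the two "sql" tags would be distinct objects and both would survive distinct().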

Comparing the approaches

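LinkedHashSet and stream().distinct() both rely on equals()/hashCode() and both preserve first-seen order; the difference between them is mostly style and context. HashSet gives up ordering in exchange for slightly lower overhead.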

Approach	Order kept?	Best when
`LinkedHashSet`	Yes	A quick, dependency-free one-liner
`HashSet`	No	Order does not matter and speed is critical
`stream().distinct()`	Yes	Deduplication is part of a larger stream pipeline

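A key point for all of them: each builds a new collection and leaves the source list untouched. To deduplicate in place, clear the list and re-add the unique elements. Assigning the result back to the same variable also works when no other code holds a reference to the original list, although strictly speaking that replaces the list rather than modifying it.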
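The clear-and-re-add idea can be sketched as follows. The class name InPlaceDedup and the method dedupeInPlace are hypothetical, and the sketch assumes the list is modifiable: an immutable list such as one returned by List.of would throw UnsupportedOperationException on clear().

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class InPlaceDedup {
    // Hypothetical helper: removes duplicates from the given list itself,
    // keeping the first occurrence of each element.
    public static <T> void dedupeInPlace(List<T> list) {
        // Take the ordered snapshot of unique elements before emptying the list.
        Set<T> unique = new LinkedHashSet<>(list);
        list.clear();
        list.addAll(unique);
    }

    public static void main(String[] args) {
        List<String> tags = new ArrayList<>(List.of("java", "sql", "java", "api", "sql"));
        dedupeInPlace(tags);
        System.out.println(tags); // [java, sql, api]
    }
}
```

Because the same list object is modified, every other reference to it sees the deduplicated contents.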

Worked example

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

public class RemoveDuplicates {
    public static void main(String[] args) {
        List<String> tags = new ArrayList<>(List.of(
                "java", "sql", "java", "api", "sql", "java", "rest"));
        System.out.println("Original (" + tags.size() + "): " + tags);

        // 1. LinkedHashSet keeps first-seen order, drops duplicates.
        List<String> deduped = new ArrayList<>(new LinkedHashSet<>(tags));
        System.out.println("Deduped  (" + deduped.size() + "): " + deduped);

        // 2. Streams with distinct() do the same thing, order preserved.
        List<String> viaStream = tags.stream().distinct().collect(Collectors.toList());
        System.out.println("Stream   (" + viaStream.size() + "): " + viaStream);

        // 3. Both approaches produce equal results.
        System.out.println("Same result? " + deduped.equals(viaStream));

        // 4. The original list is untouched; we built new lists.
        System.out.println("Original still has duplicates? "
                + (tags.size() != new LinkedHashSet<>(tags).size()));
    }
}

What to take from the run:

The original list keeps all 7 elements, including the repeated java and sql, because a List permits duplicates.
The LinkedHashSet result has only 4 elements — [java, sql, api, rest] — and they appear in first-seen order, not sorted or shuffled.
The stream().distinct() result is identical in both size and order, confirming the two techniques are interchangeable here.
deduped.equals(viaStream) prints true, since two lists are equal when they hold the same elements in the same order.
The original tags list is unchanged, so the dedup operations produced new lists rather than mutating the source.

Practice

Which collection type removes duplicates while preserving the original insertion order of the elements?

LinkedHashSet
HashSet
ArrayList
PriorityQueue