How to Remove Duplicates from a Python List

Duplicate values in a Python list are common when collecting user input, merging datasets, or reading files. This guide covers five techniques for plain lists, plus a pandas method for tabular data. Each has a different trade-off between speed, order preservation, and readability.

Method	Preserves order	Works with unhashable items	Readable
`set()`	No	No	Yes
`dict.fromkeys()`	Yes (Python 3.7+)	No	Yes
For-loop with seen set	Yes	No	Medium
List comprehension	Yes	No	Medium
`Counter` (find dups)	Yes	No	Yes

Using `set()` to Remove Duplicates

Converting a list to a set is the fastest and most concise way to deduplicate. A set stores only unique, hashable values, so any duplicates are automatically discarded.
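
The interactive example for this section is not reproduced here, so the following is a minimal sketch rather than the original code. The input list `my_list` is an assumption, chosen so that the result contains the values 1 through 5.

```python
# Assumed sample input; the original example's list is not shown in the source.
my_list = [1, 2, 2, 3, 4, 4, 5]

# set() discards repeated values; list() converts the result back to a list.
unique = list(set(my_list))
print(unique)
```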

Output (order may vary):

[1, 2, 3, 4, 5]

When to use it: order does not matter and the list contains only hashable elements (numbers, strings, tuples).

Gotcha: sets are unordered. Even though small integer sets often print in sorted order, you cannot rely on it. If order matters, use one of the methods below.
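
The hashability requirement can be seen with a list of lists. The sketch below uses assumed sample data (`rows`) and one possible workaround, converting each inner list to a tuple. It is an illustration, not the only way to handle unhashable items.

```python
rows = [[1, 2], [3, 4], [1, 2]]  # assumed sample data: a list of lists

# Lists are unhashable, so set() raises TypeError.
try:
    set(rows)
except TypeError as exc:
    print("set() failed:", exc)

# Workaround sketch: tuples are hashable, so deduplicate tuples and convert back.
unique_rows = [list(t) for t in set(tuple(r) for r in rows)]
print(sorted(unique_rows))  # sorted only to make the printed order predictable
```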

Using `dict.fromkeys()` to Preserve Order

dict.fromkeys() creates a dictionary whose keys are the list elements. Because dictionary keys are unique and, since Python 3.7, insertion-ordered, this removes duplicates while keeping the original order.
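
The interactive example for this section is not reproduced here. A minimal sketch of the technique, using an assumed input list `my_list`, might look like this:

```python
# Assumed sample input; the original example's list is not shown in the source.
my_list = [1, 2, 2, 3, 4, 4, 5]

# Each element becomes a dictionary key (with value None); repeated keys
# collapse into one, and list() returns the keys in first-seen order.
unique = list(dict.fromkeys(my_list))
print(unique)
```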

Output:

[1, 2, 3, 4, 5]

This is the idiomatic one-liner for order-preserving deduplication in modern Python (3.7+). In CPython it is also typically a little faster than an explicit Python loop, because the dictionary is built in C.

Using a For-Loop with a Seen Set

When you want full control — for example, to log skipped duplicates or apply custom equality logic — an explicit loop is the clearest approach.

def remove_duplicates(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

my_list = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
print(remove_duplicates(my_list))

Output:

[3, 1, 4, 5, 9, 2, 6]

This preserves insertion order and runs in O(n) time on average, because the `in` check on a set is O(1). Compare this with checking `if item not in result` against the result list, which is O(n) per element and makes the whole function O(n²).
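
The custom-logic case mentioned at the start of this section can be sketched by extending the same loop. The helper name `unique_by`, its `key` parameter, and the sample names are hypothetical and chosen for illustration. The function treats two items as equal when their keys match, and it collects the skipped duplicates so they can be logged.

```python
def unique_by(items, key=lambda item: item):
    """Return (kept, skipped): first occurrences and the duplicates dropped."""
    seen = set()
    kept, skipped = [], []
    for item in items:
        marker = key(item)  # value used to decide equality; must be hashable
        if marker in seen:
            skipped.append(item)
        else:
            seen.add(marker)
            kept.append(item)
    return kept, skipped

# Case-insensitive deduplication of assumed sample data.
names = ["Alice", "bob", "ALICE", "Bob", "Carol"]
kept, skipped = unique_by(names, key=str.lower)
print(kept)     # ['Alice', 'bob', 'Carol']
print(skipped)  # ['ALICE', 'Bob']
```

Returning the skipped items instead of printing them inside the loop leaves the logging decision to the caller.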

Using List Comprehension

You can write the same seen-set pattern as a one-liner using a list comprehension. The trick is that `set.add()` always returns `None` (falsy), so `not (x in seen or seen.add(x))` is `True` only the first time each value appears.

my_list = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
seen = set()
unique = [x for x in my_list if not (x in seen or seen.add(x))]
print(unique)

Output:

[3, 1, 4, 5, 9, 2, 6]

This is compact but relies on a side effect inside the comprehension, which can be surprising to readers. The explicit for-loop above is often preferred in team code.

Using `Counter` to Find Which Values Are Duplicated

Sometimes you need to know which values appear more than once rather than simply removing them. collections.Counter counts occurrences and makes this easy.

from collections import Counter

my_list = [1, 2, 2, 3, 4, 4, 5, 5, 5]
counts = Counter(my_list)
print(counts)

duplicates = [item for item, count in counts.items() if count > 1]
print("Duplicated values:", duplicates)

Output:

Counter({5: 3, 2: 2, 4: 2, 1: 1, 3: 1})
Duplicated values: [2, 4, 5]

To get a deduplicated list from a Counter, use list(counts.keys()) — keys preserve insertion order in Python 3.7+.
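
As a small sketch of that last point, with an assumed sample list, the keys of a `Counter` give the unique values in first-seen order. The same counts can also pick out the values that occur exactly once.

```python
from collections import Counter

my_list = [3, 1, 3, 2, 1]  # assumed sample input
counts = Counter(my_list)

# Keys come back in the order each value was first seen.
unique = list(counts.keys())
print(unique)   # [3, 1, 2]

# Values that occur exactly once.
singles = [item for item, n in counts.items() if n == 1]
print(singles)  # [2]
```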

Removing Duplicates from a DataFrame with Pandas

If you are working with tabular data, the Pandas library provides DataFrame.drop_duplicates(). It supports fine-grained control through its parameters.

import pandas as pd

data = {
    'name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob'],
    'score': [90, 85, 90, 78, 85],
}
df = pd.DataFrame(data)
df_unique = df.drop_duplicates()
print(df_unique)

Output:

      name  score
0    Alice     90
1      Bob     85
3  Charlie     78

Key parameters:

subset — a column name or list of column names to consider. Duplicates are detected only within those columns.
keep — 'first' (default) keeps the first occurrence; 'last' keeps the last; False drops all rows that have duplicates.
inplace=True — modifies the DataFrame in place instead of returning a new one.

# Keep only the last occurrence of each name
df_last = df.drop_duplicates(subset='name', keep='last')
print(df_last)

Output:

      name  score
2    Alice     90
3  Charlie     78
4      Bob     85

Choosing the Right Method

Fastest, order does not matter — set().
Order matters, one-liner — dict.fromkeys().
Custom logic or logging — explicit for-loop with a seen set.
Tabular data — pandas.DataFrame.drop_duplicates().
Need to inspect which values are duplicated — collections.Counter.

For more on working with lists see the Python Lists chapter and the full list methods reference. To learn about sets and their operations, see Python Sets and Set Methods.
