W3docs

Python Dictionary and Set Comprehensions

Master Python dictionary comprehensions, set comprehensions, and generator expressions with clear syntax, real examples, and common gotchas.

Dictionary comprehensions, set comprehensions, and generator expressions extend the compact syntax of list comprehensions to dictionaries, sets, and lazily evaluated sequences. This chapter explains each form in depth: syntax, filtering, nesting, real-world patterns, and when to avoid them.

Dictionary Comprehensions

A dictionary comprehension builds a new dict from any iterable in a single expression. Instead of writing a for loop that calls d[key] = value on every iteration, you describe the mapping concisely inside curly braces.

Syntax

new_dict = {key_expr: value_expr for item in iterable}

Add an optional if filter after the iterable:

new_dict = {key_expr: value_expr for item in iterable if condition}

| Part | Role |
| --- | --- |
| key_expr | Expression that produces each key |
| value_expr | Expression that produces each value |
| item | Loop variable; takes each value from iterable in turn |
| if condition | Optional filter; skips items for which condition is falsy |

Basic Example: Building a Square Map

squares = {x: x ** 2 for x in range(1, 6)}
print(squares)
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

The equivalent for loop:

squares = {}
for x in range(1, 6):
    squares[x] = x ** 2

Both produce the same result; the comprehension form is more concise and expresses intent at a glance.

Filtering Entries

Keep only the pairs that satisfy a condition:

prices = {"apple": 1.20, "banana": 0.50, "orange": 0.80, "grape": 2.50}

expensive = {item: price for item, price in prices.items() if price >= 1.00}
print(expensive)
# {'apple': 1.2, 'grape': 2.5}

Transforming Values

Apply a function or arithmetic expression to every value:

prices = {"apple": 1.20, "banana": 0.50, "orange": 0.80}

# Apply a 10% discount and round to 2 decimal places
discounted = {item: round(price * 0.9, 2) for item, price in prices.items()}
print(discounted)
# {'apple': 1.08, 'banana': 0.45, 'orange': 0.72}

Swapping Keys and Values

Invert a dictionary (works correctly when all values are unique):

capitals = {"France": "Paris", "Germany": "Berlin", "Japan": "Tokyo"}

inverted = {city: country for country, city in capitals.items()}
print(inverted)
# {'Paris': 'France', 'Berlin': 'Germany', 'Tokyo': 'Japan'}

If values are not unique, later entries overwrite earlier ones, and each value must also be hashable to act as a key. Use this only when you are certain the values form a one-to-one mapping.
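
When values repeat, one possible alternative is to collect every key that shares a value rather than keeping only the last one. The sketch below uses hypothetical sample data (currencies is not part of the example above) and is meant as an illustration, not the only way to do it:

```python
# Hypothetical data in which two countries share the same currency
currencies = {"France": "euro", "Germany": "euro", "Japan": "yen"}

# Plain inversion silently keeps only the last country per currency
lossy = {currency: country for country, currency in currencies.items()}
print(lossy)
# {'euro': 'Germany', 'yen': 'Japan'}

# Grouped inversion: map each distinct currency to all of its countries
grouped = {
    currency: [c for c, cur in currencies.items() if cur == currency]
    for currency in set(currencies.values())
}
print(grouped["euro"])
# ['France', 'Germany']
```

The grouped form rescans the dictionary once per distinct value, so for large inputs a plain for loop that appends to lists is likely the better choice.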

Building from Two Iterables with zip()

zip() pairs items from two sequences, making it easy to build a dictionary from separate key and value lists:

keys = ["name", "age", "city"]
values = ["Alice", 30, "Berlin"]

profile = {k: v for k, v in zip(keys, values)}
print(profile)
# {'name': 'Alice', 'age': 30, 'city': 'Berlin'}

This is equivalent to dict(zip(keys, values)), but the comprehension form makes it easy to add a filter or transform values at the same time.
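
As a small illustration of that flexibility, the sketch below filters and transforms while pairing. The extra "nickname" field and the choice to store values as strings are assumptions made only for this example:

```python
keys = ["name", "age", "city", "nickname"]
values = ["Alice", 30, "Berlin", None]

# Skip pairs whose value is None and store every remaining value as a string
profile = {k: str(v) for k, v in zip(keys, values) if v is not None}
print(profile)
# {'name': 'Alice', 'age': '30', 'city': 'Berlin'}
```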

Normalising Keys

A common real-world task is cleaning up dictionary keys — for example, lowercasing all keys when merging data from different sources:

raw = {"Name": "Alice", "AGE": 30, "City": "Berlin"}

normalised = {k.lower(): v for k, v in raw.items()}
print(normalised)
# {'name': 'Alice', 'age': 30, 'city': 'Berlin'}
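
One thing to watch for: keys that differ only by case collapse into a single key after normalisation. The sketch below uses hypothetical input to show the effect, together with a simple length check that may help to detect it:

```python
# Hypothetical input where two keys differ only by case
raw = {"Name": "Alice", "NAME": "Bob", "City": "Berlin"}

normalised = {k.lower(): v for k, v in raw.items()}
print(normalised)
# {'name': 'Bob', 'city': 'Berlin'}

# Fewer keys than the input means at least two keys were merged
keys_were_merged = len(normalised) < len(raw)
print(keys_were_merged)
# True
```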

Conditional Value Expression (Ternary in Value)

You can use a ternary expression in the value position to choose between two values per item:

scores = {"Alice": 95, "Bob": 62, "Carol": 78, "Dave": 45}

grades = {name: "pass" if score >= 70 else "fail" for name, score in scores.items()}
print(grades)
# {'Alice': 'pass', 'Bob': 'fail', 'Carol': 'pass', 'Dave': 'fail'}

Nested Dictionary Comprehensions

Comprehensions also work on multi-level data. The example below is a single comprehension that pulls one field out of each dictionary in a list; the data is nested, while the comprehension itself is flat:

students = [
    {"name": "Alice", "score": 95, "grade": "A"},
    {"name": "Bob",   "score": 82, "grade": "B"},
    {"name": "Carol", "score": 91, "grade": "A"},
]

# Build a {name: score} mapping from the list
name_to_score = {s["name"]: s["score"] for s in students}
print(name_to_score)
# {'Alice': 95, 'Bob': 82, 'Carol': 91}

Keep nesting shallow. If the logic becomes hard to follow, extract a helper function or use a plain for loop.
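
For comparison, here is a sketch of a comprehension that is itself nested: a dict comprehension whose value expression is another dict comprehension. The scores data and the pass mark of 70 are hypothetical:

```python
# Hypothetical nested data: each student maps subjects to raw scores
scores = {
    "Alice": {"math": 95, "physics": 88},
    "Bob": {"math": 72, "physics": 64},
}

# Outer comprehension walks students; inner one keeps subjects scored 70 or above
passed = {
    name: {subject: score for subject, score in subjects.items() if score >= 70}
    for name, subjects in scores.items()
}
print(passed)
# {'Alice': {'math': 95, 'physics': 88}, 'Bob': {'math': 72}}
```

Two levels is about as deep as this stays readable.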

Set Comprehensions

A set comprehension builds a set in a single expression. Because sets contain only unique elements, duplicates are removed automatically.

Syntax

new_set = {expression for item in iterable}
new_set = {expression for item in iterable if condition}

The only visual difference from a dict comprehension is the absence of a colon — there is one expression, not a key: value pair.

Basic Example: Unique Squares

numbers = [1, 2, 2, 3, 3, 3, 4]
unique_squares = {x ** 2 for x in numbers}
print(unique_squares)
# {1, 4, 9, 16}

The output order is not guaranteed — sets are unordered, so do not rely on a specific display order.
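
When a predictable order matters, for display or for tests, one common approach is to sort the set on the way out. A minimal sketch:

```python
numbers = [1, 2, 2, 3, 3, 3, 4]
unique_squares = {x ** 2 for x in numbers}

# sorted() returns a new list, so the display order is predictable
ordered = sorted(unique_squares)
print(ordered)
# [1, 4, 9, 16]
```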

Deduplicating While Transforming

A common pattern is to normalise strings and deduplicate in one step:

tags = ["Python", "python", "PYTHON", "Data", "data", "DATA"]

unique_tags = {tag.lower() for tag in tags}
print(unique_tags)
# {'python', 'data'}

Filtering with a Condition

words = ["cat", "elephant", "dog", "rhinoceros", "ant"]

long_words = {w for w in words if len(w) > 4}
print(long_words)
# {'elephant', 'rhinoceros'}

Finding Common Elements

Set comprehensions combine naturally with set operations:

list_a = [1, 2, 3, 4, 5, 5, 6]
list_b = [4, 5, 6, 7, 8]

set_a = {x for x in list_a}
set_b = {x for x in list_b}

common = set_a & set_b
print(common)
# {4, 5, 6}

For a straightforward conversion without transformation, set(list_a) is simpler. Use a set comprehension when you need to filter or transform during the conversion.
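
A case where the comprehension earns its place is a comparison that should ignore capitalisation. The tag lists below are hypothetical:

```python
# Hypothetical tag lists with inconsistent capitalisation
tags_a = ["Python", "Data", "Web"]
tags_b = ["python", "DATA", "cloud"]

# Lowercase while converting, so the intersection ignores case
common_tags = {t.lower() for t in tags_a} & {t.lower() for t in tags_b}
print(sorted(common_tags))
# ['data', 'python']
```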

Generator Expressions

A generator expression looks like a list comprehension but uses parentheses instead of square brackets. Instead of building the entire collection in memory at once, it produces one value at a time, on demand (lazily).

Syntax

gen = (expression for item in iterable)
gen = (expression for item in iterable if condition)

Why Use a Generator Expression?

| Scenario | List comprehension | Generator expression |
| --- | --- | --- |
| Need to iterate the result multiple times | Yes | No; generators are exhausted after one pass |
| Need random access by index (result[3]) | Yes | No |
| Result is passed to sum(), max(), any(), etc. | Works but allocates a list | Preferred; streams values without building a list |
| Very large or infinite sequence | Can run out of memory | Efficient; one value at a time |

Basic Example

# List comprehension — builds the entire list in memory
squares_list = [x ** 2 for x in range(1, 6)]
print(squares_list)   # [1, 4, 9, 16, 25]

# Generator expression — yields values one at a time
squares_gen = (x ** 2 for x in range(1, 6))
print(squares_gen)    # <generator object <genexpr> at 0x...>
print(list(squares_gen))  # [1, 4, 9, 16, 25]

The generator object itself is not the list — you consume it by iterating or passing it to a function.
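
To make the on-demand behaviour visible, the sketch below pulls values one at a time with the built-in next() before consuming the rest:

```python
squares_gen = (x ** 2 for x in range(1, 6))

# Each next() call computes exactly one more value
first = next(squares_gen)
second = next(squares_gen)
print(first, second)
# 1 4

# The remaining values are produced only when something asks for them
remaining = list(squares_gen)
print(remaining)
# [9, 16, 25]
```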

Passing Directly to Built-in Functions

When you pass a generator expression as the only argument to a function, you can omit the extra set of parentheses:

total = sum(x ** 2 for x in range(1, 1001))
print(total)  # 333833500

maximum = max(len(word) for word in ["apple", "banana", "kiwi"])
print(maximum)  # 6

any_negative = any(x < 0 for x in [1, -2, 3])
print(any_negative)  # True
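
The shortcut applies only when the generator expression is the sole argument. As soon as the call takes another positional or keyword argument, the generator needs its own parentheses, otherwise Python raises a SyntaxError. A short sketch with made-up inputs:

```python
# A second positional argument (the start value) requires explicit parentheses
total = sum((x ** 2 for x in range(1, 4)), 100)
print(total)
# 114

# Keyword arguments require them too
longest_first = sorted((w.upper() for w in ["kiwi", "banana", "fig"]), key=len, reverse=True)
print(longest_first)
# ['BANANA', 'KIWI', 'FIG']
```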

Memory Advantage

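For large data, a generator expression avoids building an extra collection in memory. In the example below the source list is already in RAM; the saving comes from counting the matches without materialising a second, filtered list: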

# Simulate a large log file as a list of strings
log_lines = [f"ERROR line {i}" if i % 100 == 0 else f"INFO line {i}" for i in range(1_000_000)]

# Generator scans lines without building an intermediate list
error_count = sum(1 for line in log_lines if line.startswith("ERROR"))
print(error_count)  # 10000
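
A rough way to see the difference is to compare the size of the two objects with sys.getsizeof. The size n below is an arbitrary choice for illustration, and getsizeof measures only the container, not the integers it refers to, so treat the comparison as indicative rather than exact:

```python
import sys

n = 100_000  # hypothetical size, chosen only for illustration

as_list = [x * 2 for x in range(n)]
as_gen = (x * 2 for x in range(n))

# The list stores a reference per element; the generator stores only its state
list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)
print(gen_bytes < list_bytes)
# True
```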

Generators Are Exhausted After One Pass

A generator can be consumed only once, and a second pass silently yields nothing:

gen = (x * 2 for x in range(5))

print(list(gen))  # [0, 2, 4, 6, 8]
print(list(gen))  # [] — the generator is now empty

If you need to iterate the result more than once, either use a list comprehension or recreate the generator.
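
One way to recreate the generator on demand is to wrap the expression in a small function. The name doubled is hypothetical and used only for this sketch:

```python
def doubled():
    # Every call returns a fresh, unconsumed generator
    return (x * 2 for x in range(5))

total = sum(doubled())
largest = max(doubled())
print(total, largest)
# 20 8

# Alternatively, materialise once and reuse the list as often as needed
values = list(doubled())
print(len(values), values[-1])
# 5 8
```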

Chaining Generator Expressions

Generator expressions can be composed without creating intermediate lists. Each stage pulls values from the previous one:

numbers = range(1, 11)

# Stage 1: filter even numbers
evens = (x for x in numbers if x % 2 == 0)

# Stage 2: square them
even_squares = (x ** 2 for x in evens)

print(list(even_squares))  # [4, 16, 36, 64, 100]

No intermediate list is created — values flow through the pipeline one at a time.
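
Because each stage is lazy, the same pipeline can sit on top of an unbounded source. The sketch below swaps range for itertools.count and uses itertools.islice to stop after five results; this substitution is an illustration and not part of the example above:

```python
from itertools import count, islice

# count(1) yields 1, 2, 3, ... without end
evens = (x for x in count(1) if x % 2 == 0)
even_squares = (x ** 2 for x in evens)

# islice stops pulling from the pipeline after five values
first_five = list(islice(even_squares, 5))
print(first_five)
# [4, 16, 36, 64, 100]
```

Calling list(even_squares) directly on this version would never return, since the source has no end.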

Choosing the Right Form

| You need… | Use |
| --- | --- |
| A list of transformed/filtered values | [expr for item in it] |
| A dictionary from pairs of values | {k: v for item in it} |
| A collection with no duplicates | {expr for item in it} |
| Memory-efficient single-pass iteration | (expr for item in it) |
| Complex, multi-statement logic per item | Plain for loop |

Common Gotchas

Dict comprehension vs. set comprehension. Both use curly braces {}. The difference is whether you write key: value (dict) or a single expression (set). An empty {} always creates an empty dict, not an empty set — use set() for an empty set.

d = {}     # empty dict
s = set()  # empty set — NOT {}

Overwritten keys. If the key expression produces duplicates, later values silently overwrite earlier ones:

data = [("a", 1), ("b", 2), ("a", 99)]
d = {k: v for k, v in data}
print(d)  # {'a': 99, 'b': 2} — first 'a' is gone
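
If the first value should win instead, one possible trick is to walk the pairs in reverse so that the earliest pair is written last. This sketch assumes the input is a sequence that supports reversed(); note that the key order of the result follows the reversed input:

```python
data = [("a", 1), ("b", 2), ("a", 99)]

# Walking the pairs backwards makes the earliest value the last one written
first_wins = {k: v for k, v in reversed(data)}
print(first_wins)
# {'a': 1, 'b': 2}
```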

Variable scope. In Python 3, the loop variable in any comprehension is local to the comprehension and does not leak into the surrounding scope:

x = "original"
result = {x: x.upper() for x in ["a", "b", "c"]}
print(x)  # 'original' — comprehension's x did not overwrite this
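
The same rule means a loop variable that did not exist beforehand still does not exist afterwards. The sketch below uses a hypothetical name n and catches the resulting NameError:

```python
result = {n: n * n for n in range(3)}
print(result)
# {0: 0, 1: 1, 2: 4}

try:
    n  # never defined outside the comprehension
except NameError:
    leaked = False
else:
    leaked = True
print(leaked)
# False
```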

Readability limit. If you need more than one if condition or the expression is long, a plain for loop with descriptive variable names is usually clearer:

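# Sample input (hypothetical) so the snippet runs on its own
data = {"user_id": 7, "user_name": None, "session": "abc"}

# Hard to read
result = {k: v for k, v in data.items() if k.startswith("user_") if v is not None}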

# Easier to read
result = {}
for k, v in data.items():
    if k.startswith("user_") and v is not None:
        result[k] = v
