Python Dictionary and Set Comprehensions Explained

Dictionary comprehensions, set comprehensions, and generator expressions extend the compact syntax of list comprehensions to other data structures. This chapter explains each form in depth: syntax, filtering, nesting, real-world patterns, and when to avoid them.

Dictionary Comprehensions

A dictionary comprehension builds a new dict from any iterable in a single expression. Instead of writing a for loop that calls d[key] = value on every iteration, you describe the mapping concisely inside curly braces.

Syntax

new_dict = {key_expr: value_expr for item in iterable}

Add an optional if filter after the iterable:

new_dict = {key_expr: value_expr for item in iterable if condition}

Part	Role
`key_expr`	Expression that produces each key
`value_expr`	Expression that produces each value
`item`	Loop variable — takes each value from `iterable` in turn
`if condition`	Optional filter that skips items for which `condition` is falsy

Basic Example: Building a Square Map

squares = {x: x ** 2 for x in range(1, 6)}
print(squares)
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

The equivalent for loop:

squares = {}
for x in range(1, 6):
    squares[x] = x ** 2

Both produce the same result; the comprehension form is more concise and expresses intent at a glance.

Filtering Entries

Keep only the pairs that satisfy a condition:

prices = {"apple": 1.20, "banana": 0.50, "orange": 0.80, "grape": 2.50}

expensive = {item: price for item, price in prices.items() if price >= 1.00}
print(expensive)
# {'apple': 1.2, 'grape': 2.5}

Transforming Values

Apply a function or arithmetic expression to every value:

prices = {"apple": 1.20, "banana": 0.50, "orange": 0.80}

# Apply a 10% discount and round to 2 decimal places
discounted = {item: round(price * 0.9, 2) for item, price in prices.items()}
print(discounted)
# {'apple': 1.08, 'banana': 0.45, 'orange': 0.72}

Swapping Keys and Values

Invert a dictionary (works correctly when all values are unique):

capitals = {"France": "Paris", "Germany": "Berlin", "Japan": "Tokyo"}

inverted = {city: country for country, city in capitals.items()}
print(inverted)
# {'Paris': 'France', 'Berlin': 'Germany', 'Tokyo': 'Japan'}

If values are not unique, later entries overwrite earlier ones. Use this only when you are certain the values form a one-to-one mapping.
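
When values do repeat, one way to avoid losing entries is to group the original keys into lists. The following is a minimal sketch, assuming a hypothetical `country_to_language` mapping that is not part of the example above:

```python
# Hypothetical data: two countries share the value "German".
country_to_language = {"Germany": "German", "Austria": "German", "Japan": "Japanese"}

# A plain inversion keeps only the last country seen for each language.
lossy = {lang: country for country, lang in country_to_language.items()}
print(lossy)  # {'German': 'Austria', 'Japanese': 'Japan'}

# Grouping keeps every country; setdefault creates the list on first use.
grouped = {}
for country, lang in country_to_language.items():
    grouped.setdefault(lang, []).append(country)
print(grouped)  # {'German': ['Germany', 'Austria'], 'Japanese': ['Japan']}
```

The grouping step needs a statement per item, so a plain for loop fits better here than a comprehension.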

Building from Two Iterables with `zip()`

zip() pairs items from two sequences, making it easy to build a dictionary from separate key and value lists:

keys = ["name", "age", "city"]
values = ["Alice", 30, "Berlin"]

profile = {k: v for k, v in zip(keys, values)}
print(profile)
# {'name': 'Alice', 'age': 30, 'city': 'Berlin'}

This is equivalent to dict(zip(keys, values)), but the comprehension form makes it easy to add a filter or transform values at the same time.
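
As an illustration of that last point, the sketch below filters and transforms while zipping. The extra `email` field and the `None` placeholder are assumptions made for the example:

```python
keys = ["name", "age", "city", "email"]
values = ["Alice", 30, None, "alice@example.com"]

# Skip pairs whose value is None and upper-case the keys in the same pass.
profile = {k.upper(): v for k, v in zip(keys, values) if v is not None}
print(profile)  # {'NAME': 'Alice', 'AGE': 30, 'EMAIL': 'alice@example.com'}

# zip() stops at the shorter input, so surplus keys are dropped silently.
short = {k: v for k, v in zip(["a", "b", "c"], [1, 2])}
print(short)  # {'a': 1, 'b': 2}
```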

Normalising Keys

A common real-world task is cleaning up dictionary keys — for example, lowercasing all keys when merging data from different sources:

raw = {"Name": "Alice", "AGE": 30, "City": "Berlin"}

normalised = {k.lower(): v for k, v in raw.items()}
print(normalised)
# {'name': 'Alice', 'age': 30, 'city': 'Berlin'}

Conditional Value Expression (Ternary in Value)

You can use a ternary expression in the value position to choose between two values per item:

scores = {"Alice": 95, "Bob": 62, "Carol": 78, "Dave": 45}

grades = {name: "pass" if score >= 70 else "fail" for name, score in scores.items()}
print(grades)
# {'Alice': 'pass', 'Bob': 'fail', 'Carol': 'pass', 'Dave': 'fail'}

Nested Dictionary Comprehensions

Comprehensions also work with multi-level data. The simplest case needs no nesting at all: extracting one field from each dictionary in a list:

students = [
    {"name": "Alice", "score": 95, "grade": "A"},
    {"name": "Bob",   "score": 82, "grade": "B"},
    {"name": "Carol", "score": 91, "grade": "A"},
]

# Build a {name: score} mapping from the list
name_to_score = {s["name"]: s["score"] for s in students}
print(name_to_score)
# {'Alice': 95, 'Bob': 82, 'Carol': 91}
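
The example above uses a single level. A comprehension that is actually nested places one comprehension inside the value expression of another. The sketch below assumes a hypothetical `marks` dictionary and a pass threshold of 70:

```python
# Hypothetical data: marks per subject for each student.
marks = {
    "Alice": {"math": 95, "physics": 88},
    "Bob": {"math": 62, "physics": 74},
}

# The outer comprehension walks the students; the inner one rebuilds each
# subject dict with the mark replaced by a pass/fail label.
results = {
    name: {subject: ("pass" if mark >= 70 else "fail") for subject, mark in subjects.items()}
    for name, subjects in marks.items()
}
print(results)
# {'Alice': {'math': 'pass', 'physics': 'pass'}, 'Bob': {'math': 'fail', 'physics': 'pass'}}
```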

Keep nesting shallow. If the logic becomes hard to follow, extract a helper function or use a plain for loop.

Set Comprehensions

A set comprehension builds a set in a single expression. Because sets contain only unique elements, duplicates are removed automatically.

Syntax

new_set = {expression for item in iterable}
new_set = {expression for item in iterable if condition}

The only visual difference from a dict comprehension is the absence of a colon — there is one expression, not a key: value pair.

Basic Example: Unique Squares

numbers = [1, 2, 2, 3, 3, 3, 4]
unique_squares = {x ** 2 for x in numbers}
print(unique_squares)
# {1, 4, 9, 16}

The output order is not guaranteed — sets are unordered, so do not rely on a specific display order.
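
When a stable order matters, for display or for tests, one option is to pass the set through `sorted()`, which returns a list:

```python
numbers = [1, 2, 2, 3, 3, 3, 4]
unique_squares = {x ** 2 for x in numbers}

# sorted() returns a new list, so the display order is deterministic.
ordered = sorted(unique_squares)
print(ordered)  # [1, 4, 9, 16]
```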

Deduplicating While Transforming

A common pattern is to normalise strings and deduplicate in one step:

tags = ["Python", "python", "PYTHON", "Data", "data", "DATA"]

unique_tags = {tag.lower() for tag in tags}
print(unique_tags)
# {'python', 'data'}

Filtering with a Condition

words = ["cat", "elephant", "dog", "rhinoceros", "ant"]

long_words = {w for w in words if len(w) > 4}
print(long_words)
# {'elephant', 'rhinoceros'}

Finding Common Elements

Set comprehensions combine naturally with set operations:

list_a = [1, 2, 3, 4, 5, 5, 6]
list_b = [4, 5, 6, 7, 8]

set_a = {x for x in list_a}
set_b = {x for x in list_b}

common = set_a & set_b
print(common)
# {4, 5, 6}

For a straightforward conversion without transformation, set(list_a) is simpler. Use a set comprehension when you need to filter or transform during the conversion.
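
A case where the comprehension earns its place is when both inputs need cleaning before they can be compared. The sketch below assumes two hypothetical address lists and a small `normalise` helper:

```python
# Hypothetical sign-up lists with inconsistent casing and whitespace.
newsletter = ["Ann@Example.com", "bob@example.com ", "cy@example.com"]
customers = ["ann@example.com", "BOB@example.com", "dee@example.com"]

def normalise(address):
    """Strip surrounding whitespace and lower-case the address."""
    return address.strip().lower()

# Normalise during conversion, then intersect the two sets.
both = {normalise(a) for a in newsletter} & {normalise(a) for a in customers}
print(sorted(both))  # ['ann@example.com', 'bob@example.com']
```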

Generator Expressions

A generator expression looks like a list comprehension but uses parentheses instead of square brackets. Instead of building the entire collection in memory at once, it produces one value at a time, on demand (lazily).

Syntax

gen = (expression for item in iterable)
gen = (expression for item in iterable if condition)

Why Use a Generator Expression?

Scenario	List comprehension	Generator expression
Need to iterate the result multiple times	Yes	No — generators are exhausted after one pass
Need random access by index (`result[3]`)	Yes	No
Result is passed to `sum()`, `max()`, `any()`, etc.	Works but allocates a list	Preferred — streams values without building a list
Very large or infinite sequence	Can run out of memory	Efficient — one value at a time

Basic Example

# List comprehension — builds the entire list in memory
squares_list = [x ** 2 for x in range(1, 6)]
print(squares_list)   # [1, 4, 9, 16, 25]

# Generator expression — yields values one at a time
squares_gen = (x ** 2 for x in range(1, 6))
print(squares_gen)    # <generator object <genexpr> at 0x...>
print(list(squares_gen))  # [1, 4, 9, 16, 25]

The generator object itself is not the list — you consume it by iterating or passing it to a function.
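
Calling the built-in `next()` makes the on-demand behaviour visible: each call computes exactly one value and the generator pauses until the next request.

```python
squares_gen = (x ** 2 for x in range(1, 6))

first = next(squares_gen)   # computes only the first value
second = next(squares_gen)  # resumes where the generator paused
print(first, second)        # 1 4

# The remaining values are produced only now, when list() asks for them.
rest = list(squares_gen)
print(rest)                 # [9, 16, 25]
```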

Passing Directly to Built-in Functions

When you pass a generator expression as the only argument to a function, you can omit the extra set of parentheses:

total = sum(x ** 2 for x in range(1, 1001))
print(total)  # 333833500

maximum = max(len(word) for word in ["apple", "banana", "kiwi"])
print(maximum)  # 6

any_negative = any(x < 0 for x in [1, -2, 3])
print(any_negative)  # True
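
Because `any()` stops at the first truthy value, a generator argument means the remaining items are never evaluated. The sketch below records which values were inspected; the `is_negative` helper and `checked` list exist only for this demonstration:

```python
checked = []

def is_negative(x):
    checked.append(x)  # record which values were actually inspected
    return x < 0

found = any(is_negative(x) for x in [1, -2, 3, 4, 5])
print(found)    # True
print(checked)  # [1, -2]
```

With a list comprehension in the same position, all five values would be tested before `any()` saw the first one.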

Memory Advantage

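For large data, a generator expression avoids building an intermediate collection in memory. In the example below the source lines already sit in a list, but the count is computed without creating a second list of matches: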

# Simulate a large log file as a list of strings
log_lines = [f"ERROR line {i}" if i % 100 == 0 else f"INFO line {i}" for i in range(1_000_000)]

# Generator scans lines without building an intermediate list
error_count = sum(1 for line in log_lines if line.startswith("ERROR"))
print(error_count)  # 10000
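
The difference in container size can be observed with `sys.getsizeof`. This is a rough sketch: the function reports the size of the container object only, and the exact figures vary by Python version and platform, so no output is shown.

```python
import sys

n = 1_000_000
as_list = [x * 2 for x in range(n)]   # all values stored at once
as_gen = (x * 2 for x in range(n))    # nothing computed yet

# The list grows with n; the generator object stays small regardless of n.
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size, gen_size)
```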

Generators Are Exhausted After One Pass

This is the most common gotcha:

gen = (x * 2 for x in range(5))

print(list(gen))  # [0, 2, 4, 6, 8]
print(list(gen))  # [] — the generator is now empty

If you need to iterate the result more than once, either use a list comprehension or recreate the generator.
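
One way to recreate a generator conveniently is to wrap the expression in a small function, so that each call returns a fresh one. The name `doubled` is hypothetical:

```python
def doubled(n):
    # Each call returns a new, unconsumed generator.
    return (x * 2 for x in range(n))

total = sum(doubled(5))
largest = max(doubled(5))
print(total, largest)  # 20 8

# Alternatively, materialise once and reuse the list.
values = list(doubled(5))
print(sum(values), max(values))  # 20 8
```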

Chaining Generator Expressions

Generator expressions can be composed without creating intermediate lists. Each stage pulls values from the previous one:

numbers = range(1, 11)

# Stage 1: filter even numbers
evens = (x for x in numbers if x % 2 == 0)

# Stage 2: square them
even_squares = (x ** 2 for x in evens)

print(list(even_squares))  # [4, 16, 36, 64, 100]

No intermediate list is created — values flow through the pipeline one at a time.
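
The same pipeline also works when the source never ends, which a list comprehension could not handle. The sketch below swaps `range` for `itertools.count` and uses `itertools.islice` to stop after five results:

```python
from itertools import count, islice

naturals = count(1)                          # 1, 2, 3, ... without end
evens = (x for x in naturals if x % 2 == 0)
even_squares = (x ** 2 for x in evens)

# islice pulls only the first five results through the pipeline.
first_five = list(islice(even_squares, 5))
print(first_five)  # [4, 16, 36, 64, 100]
```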

Choosing the Right Form

You need…	Use
A list of transformed/filtered values	`[expr for item in it]`
A dictionary from pairs of values	`{k: v for item in it}`
A collection with no duplicates	`{expr for item in it}`
Memory-efficient single-pass iteration	`(expr for item in it)`
Complex, multi-statement logic per item	Plain `for` loop

Common Gotchas

Dict comprehension vs. set comprehension. Both use curly braces {}. The difference is whether you write key: value (dict) or a single expression (set). An empty {} always creates an empty dict, not an empty set — use set() for an empty set.

d = {}     # empty dict
s = set()  # empty set — NOT {}
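
A quick check with `type()` confirms which syntax produces which type:

```python
print(type({}))        # <class 'dict'>
print(type(set()))     # <class 'set'>
print(type({1, 2}))    # <class 'set'>
print(type({1: "a"}))  # <class 'dict'>

# A comprehension over an empty iterable still yields the type its syntax implies.
empty_set = {x for x in []}
empty_dict = {x: x for x in []}
print(empty_set)   # set()
print(empty_dict)  # {}
```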

Overwritten keys. If the key expression produces duplicates, later values silently overwrite earlier ones:

data = [("a", 1), ("b", 2), ("a", 99)]
d = {k: v for k, v in data}
print(d)  # {'a': 99, 'b': 2} — first 'a' is gone
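
If silent overwrites would be a bug, one option is to look for repeated keys before building the dictionary. A minimal sketch using `collections.Counter`:

```python
from collections import Counter

data = [("a", 1), ("b", 2), ("a", 99)]

# Count how often each key appears, then keep the keys seen more than once.
key_counts = Counter(k for k, _ in data)
duplicates = {k for k, n in key_counts.items() if n > 1}
print(duplicates)  # {'a'}
```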

Variable scope. In Python 3, the loop variable in any comprehension is local to the comprehension and does not leak into the surrounding scope:

x = "original"
result = {x: x.upper() for x in ["a", "b", "c"]}
print(x)  # 'original' — comprehension's x did not overwrite this
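
For contrast, the loop variable of a plain for loop does stay bound after the loop ends:

```python
x = "original"
for x in ["a", "b", "c"]:
    pass
print(x)  # c

y = "original"
doubled = [y * 2 for y in ["a", "b", "c"]]
print(y)  # original
```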

Readability limit. If you need more than one if condition or the expression is long, a plain for loop with descriptive variable names is usually clearer:

# Hard to read (data is assumed to be a dict here, not the list of pairs used earlier)
result = {k: v for k, v in data.items() if k.startswith("user_") if v is not None}

# Easier to read
result = {}
for k, v in data.items():
    if k.startswith("user_") and v is not None:
        result[k] = v
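
A third option keeps the comprehension but moves the conditions into a named helper. The sketch below assumes a hypothetical `is_valid_user_entry` function and sample `data`:

```python
def is_valid_user_entry(key, value):
    """Hypothetical helper: keep user_* keys that have a value."""
    return key.startswith("user_") and value is not None

# Sample data invented for this illustration.
data = {"user_id": 7, "user_name": None, "session": "abc", "user_role": "admin"}

result = {k: v for k, v in data.items() if is_valid_user_entry(k, v)}
print(result)  # {'user_id': 7, 'user_role': 'admin'}
```

The helper's name documents the intent, and the function can be tested on its own.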

Related Topics

List Comprehension: the foundation, covering syntax, filtering, nested loops, and when to use a plain for loop
Python Dictionaries: dictionary basics, creation, and access
Python Sets: set creation, operations, and use cases
Loop Dictionaries: every technique for iterating over dictionaries
Python Iterators: how Python's iterator protocol works under the hood

Practice

Which of the following statements about Python comprehensions are correct? Select all that apply.

A set comprehension uses {} with a single expression and automatically removes duplicates.
A dictionary comprehension uses [] with a key: value expression.
A generator expression builds the entire result in memory before returning it.
Generator expressions are exhausted after one iteration and cannot be reused.
An empty {} literal creates an empty set in Python.
Dictionary comprehensions can include an optional if condition to filter entries.
