Python Generators and yield
Learn Python generators and the yield keyword with clear examples covering generator functions, expressions, send(), and real-world use cases.
A generator is a special kind of iterator that produces values one at a time, on demand, instead of computing them all upfront. Generators are defined using ordinary function syntax with yield in place of return. They are the idiomatic Python solution for large or infinite sequences where building a full list would waste memory or time.
This chapter covers the yield keyword, generator functions versus lists, generator expressions, sending values into a generator, chaining generators, and real-world patterns.
What Is a Generator?
When Python calls a regular function it runs the body to completion and returns one value. When Python calls a generator function it does not run the body at all — it returns a generator object. Each time you call next() on that object, execution resumes from where it last paused (the yield statement), runs until the next yield, and suspends again.
def count_up(start, stop):
    while start <= stop:
        yield start  # pause here, emit the value
        start += 1

gen = count_up(1, 3)
print(next(gen))  # 1
print(next(gen))  # 2
print(next(gen))  # 3
# next(gen) would now raise StopIteration

Key mechanics:
- The function body does not run until the first next() call.
- Local variables and the instruction pointer are preserved between calls.
- When the function body ends (or hits a bare return), Python raises StopIteration automatically.
- A for loop calls next() for you and stops cleanly on StopIteration.
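The mechanics above can be observed directly. The following is a minimal sketch, assuming a hypothetical helper named noisy_count and a module-level events list (neither is part of the original chapter). It records when the body starts running and spells out roughly what a for loop does with next() and StopIteration:

```python
events = []

def noisy_count(start, stop):
    # Hypothetical helper: records the moment the body actually starts.
    events.append("body started")
    while start <= stop:
        yield start
        start += 1

gen = noisy_count(1, 2)
events.append("generator created")  # the body has not run yet

# Roughly what a for loop does behind the scenes.
values = []
iterator = iter(gen)  # for a generator, iter() returns the generator itself
while True:
    try:
        values.append(next(iterator))
    except StopIteration:
        break  # the loop ends cleanly when the generator is exhausted

print(events)  # ['generator created', 'body started']
print(values)  # [1, 2]
```

The order of the recorded events shows that calling the generator function only creates the object; the body starts on the first next().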
The yield Keyword
yield is the only syntax that distinguishes a generator function from a regular one. You can use yield anywhere a return could appear, including inside loops, conditionals, and try/except blocks.
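As an illustration of yield inside a loop, a conditional, and a try/except block, here is a small sketch. The function name parse_ints and its skip_invalid parameter are assumptions made up for this example:

```python
def parse_ints(tokens, skip_invalid=True):
    # Hypothetical helper: yields each token converted to int.
    for token in tokens:
        try:
            yield int(token)  # yield inside a loop and a try block
        except ValueError:
            if not skip_invalid:
                yield None  # yield inside an except block and a conditional

print(list(parse_ints(["1", "x", "3"])))                      # [1, 3]
print(list(parse_ints(["1", "x", "3"], skip_invalid=False)))  # [1, None, 3]
```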
yield vs return
| | return | yield |
|---|---|---|
| Function type | Regular | Generator |
| Execution after call | Runs to completion | Pauses at yield |
| State between calls | Discarded | Preserved |
| Multiple values | One (or a tuple) | One per yield, sequentially |
| Memory for large data | Holds all values | Holds one value at a time |
yield Suspends, Not Terminates
def three_things():
    print("about to yield first")
    yield "first"
    print("about to yield second")
    yield "second"
    print("about to yield third")
    yield "third"
    print("generator exhausted")

for item in three_things():
    print("got:", item)

Output:
about to yield first
got: first
about to yield second
got: second
about to yield third
got: third
generator exhausted

Notice the print statements between yields: normal code runs between each suspension.
Generator Functions vs Lists
Consider generating the first n square numbers. Using a list:
def squares_list(n):
    result = []
    for i in range(1, n + 1):
        result.append(i * i)
    return result

print(squares_list(5))  # [1, 4, 9, 16, 25]

Using a generator:
def squares_gen(n):
    for i in range(1, n + 1):
        yield i * i

gen = squares_gen(5)
print(list(gen))  # [1, 4, 9, 16, 25]

Both produce the same values, but the generator version:
- Uses O(1) memory regardless of n (the list version uses O(n))
- Starts producing values immediately, without waiting to build the whole collection
- Can represent infinite sequences (a list cannot)
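The memory difference can be checked roughly with sys.getsizeof. This is only a sketch: exact byte counts vary by interpreter and version, and sys.getsizeof on a list counts the list's own storage, not the integers it refers to, so the real gap is larger than the numbers suggest.

```python
import sys

def squares_gen(n):
    for i in range(1, n + 1):
        yield i * i

as_list = [i * i for i in range(1, 100_001)]
as_gen = squares_gen(100_000)

list_bytes = sys.getsizeof(as_list)  # grows with the number of elements
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n

print(list_bytes > gen_bytes)       # True
print(sum(as_gen) == sum(as_list))  # True, both produce the same values
```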
When to Choose a Generator
Use a generator when:
- You only need to iterate once over the values.
- The sequence is large enough that holding it all in memory matters.
- You are building a data pipeline (one generator feeds into another).
- The sequence is potentially infinite (e.g., reading log lines from a live file).
Use a list when:
- You need random access by index.
- You need to iterate the same sequence multiple times.
- You need len(), slicing, or in-place sorting.
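A short sketch of the trade-off: a generator object supports neither len() nor indexing, so those operations require materializing it into a list first. The exact error messages depend on the Python version, so they are printed rather than assumed.

```python
def squares_gen(n):
    for i in range(1, n + 1):
        yield i * i

gen = squares_gen(5)

try:
    len(gen)
except TypeError as exc:
    print("len() failed:", exc)

try:
    gen[0]
except TypeError as exc:
    print("indexing failed:", exc)

# Neither failed call consumed a value, so the list is still complete.
values = list(gen)
print(len(values), values[1:3], values[-1])  # 5 [4, 9] 25
```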
Generator Expressions
A generator expression is to generators what a list comprehension is to lists. The syntax is identical except you use parentheses instead of square brackets:
# List comprehension: builds the full list immediately
squares_list = [x * x for x in range(1, 6)]

# Generator expression: lazy, produces one value at a time
squares_gen = (x * x for x in range(1, 6))

print(type(squares_list))  # <class 'list'>
print(type(squares_gen))   # <class 'generator'>
print(list(squares_gen))   # [1, 4, 9, 16, 25]

Generator expressions are most useful when passed directly to a function that consumes an iterable:
total = sum(x * x for x in range(1, 101))  # sum of squares 1..100
print(total)  # 338350

No extra parentheses are needed when the generator expression is the only argument to a function call.
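When the call takes any other argument, the generator expression does need its own parentheses. The sketch below uses compile() to show that CPython rejects the unparenthesized form at compile time; the words list is made-up sample data.

```python
# With a second argument, the generator expression must be parenthesized.
total_with_start = sum((x * x for x in range(1, 4)), 100)
print(total_with_start)  # 114

words = ["yield", "send", "next"]  # hypothetical sample data
longest = max((len(w) for w in words), default=0)
print(longest)  # 5

# The unparenthesized form is rejected before the code ever runs.
try:
    compile("sum(x * x for x in range(1, 4), 100)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)  # True
```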
Filtering with Generator Expressions
evens = (x for x in range(20) if x % 2 == 0)
print(list(evens))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Infinite Generators
Because a generator produces values lazily, it can represent a sequence with no end. The classic example is an infinite counter:
def counter(start=0):
    n = start
    while True:
        yield n
        n += 1

gen = counter(10)
print(next(gen))  # 10
print(next(gen))  # 11
print(next(gen))  # 12

To consume only part of an infinite generator, use itertools.islice or break out of a loop:
import itertools

gen = counter(1)
first_five = list(itertools.islice(gen, 5))
print(first_five)  # [1, 2, 3, 4, 5]

A practical infinite generator, the Fibonacci sequence:
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
print([next(fib) for _ in range(10)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

yield from: Delegating to a Sub-Generator
yield from lets a generator delegate to another iterable, forwarding each value transparently:
def first_part():
    yield 1
    yield 2

def second_part():
    yield 3
    yield 4

def combined():
    yield from first_part()
    yield from second_part()

print(list(combined()))  # [1, 2, 3, 4]

yield from also works with any iterable, not just generators:
def flatten(nested):
    for sublist in nested:
        yield from sublist

data = [[1, 2], [3, 4], [5, 6]]
print(list(flatten(data)))  # [1, 2, 3, 4, 5, 6]

yield from is cleaner than a nested for loop over the sub-iterable, and it correctly forwards send() and throw() calls to the delegated generator (important for coroutine patterns).
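The forwarding behavior can be sketched as follows. The names running_total and delegator, and the results list, are hypothetical. Values passed to send() on the outer generator reach the sub-generator, and the sub-generator's return value becomes the value of the yield from expression:

```python
def running_total():
    # Sub-generator: receives numbers via send(), returns the sum on None.
    total = 0
    while True:
        value = yield total
        if value is None:
            return total
        total += value

def delegator(results):
    # send() calls on this generator are forwarded to running_total();
    # its return value becomes the value of the yield from expression.
    final = yield from running_total()
    results.append(final)

results = []
gen = delegator(results)
next(gen)             # prime: advances to the first yield in running_total()
first = gen.send(10)  # forwarded to the sub-generator
second = gen.send(5)
try:
    gen.send(None)    # sub-generator returns, then delegator finishes
except StopIteration:
    pass

print(first, second)  # 10 15
print(results)        # [15]
```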
Sending Values into a Generator
Generators are two-way channels. The .send(value) method resumes the generator and passes a value back in as the result of the yield expression:
def accumulator():
    total = 0
    while True:
        value = yield total  # yield sends total out; receives value in
        if value is None:
            break
        total += value

gen = accumulator()
next(gen)            # prime the generator (advance to first yield)
print(gen.send(10))  # 10
print(gen.send(20))  # 30
print(gen.send(5))   # 35

Rules for .send():
- You must call next(gen) (or gen.send(None)) once to advance the generator to the first yield before you can send a non-None value.
- send(None) is equivalent to next().
- The value sent becomes the result of the yield expression, so in value = yield total it is assigned to the variable on the left-hand side.
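The priming rule can be demonstrated, and optionally automated, with a small decorator. This is a sketch only: the decorator name primed is an assumption for illustration, not a standard library function.

```python
import functools

def primed(generator_function):
    # Hypothetical decorator: advances each new generator to its first yield.
    @functools.wraps(generator_function)
    def wrapper(*args, **kwargs):
        gen = generator_function(*args, **kwargs)
        next(gen)  # discard the first yielded value
        return gen
    return wrapper

def accumulator():
    total = 0
    while True:
        value = yield total
        if value is None:
            break
        total += value

fresh = accumulator()
try:
    fresh.send(10)  # not primed yet, so a non-None send is rejected
except TypeError as exc:
    print("not primed:", exc)

ready = primed(accumulator)()
print(ready.send(10))  # 10
print(ready.send(20))  # 30
```

Note that priming discards the first yielded value (here the initial total of 0), which is acceptable for this accumulator but may not be for other generators.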
Generator State and Exhaustion
A generator object has a lifecycle with four states:
| State | Description |
|---|---|
| Created | Generator function called, body not yet started |
| Running | Currently executing (inside a next() or send() call) |
| Suspended | Paused at a yield; resumes on the next next() or send() call |
| Closed | Body finished or .close() called; further next() calls raise StopIteration |
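These states can be inspected with inspect.getgeneratorstate from the standard library. In the sketch below, report_state is a hypothetical generator that looks up its own state through the module-level name gen, purely so the running state can be observed from inside the body:

```python
import inspect

def report_state(log):
    # Hypothetical helper: reads the global name `gen` while it is executing.
    log.append(inspect.getgeneratorstate(gen))
    yield 1

log = []
gen = report_state(log)
log.append(inspect.getgeneratorstate(gen))  # before the first next()
next(gen)                                   # the body appends its own state
log.append(inspect.getgeneratorstate(gen))  # paused at the yield
gen.close()
log.append(inspect.getgeneratorstate(gen))  # after close()

print(log)
# ['GEN_CREATED', 'GEN_RUNNING', 'GEN_SUSPENDED', 'GEN_CLOSED']
```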
Once exhausted, re-iterating a generator produces nothing:
gen = (x for x in range(3))
print(list(gen))  # [0, 1, 2]
print(list(gen))  # [] (already exhausted)

If you need to iterate a generator's output more than once, either convert it to a list first or recreate the generator.
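Both options look roughly like this. The function small_numbers is a made-up example:

```python
def small_numbers():
    yield from range(3)

# Option 1: recreate the generator by calling the function again.
first_pass = list(small_numbers())
second_pass = list(small_numbers())
print(first_pass == second_pass)  # True

# Option 2: materialize once, then reuse the list as often as needed.
cached = list(small_numbers())
print(sum(cached), max(cached))  # 3 2
```

Recreating keeps memory usage low but repeats the work; caching does the work once but holds every value in memory.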
return Inside a Generator
A return statement inside a generator ends iteration cleanly. The value passed to return becomes the value attribute of the StopIteration exception (rarely used directly, but important for yield from delegation):
def limited():
    yield 1
    yield 2
    return "done"  # StopIteration.value = "done"

gen = limited()
print(next(gen))  # 1
print(next(gen))  # 2
try:
    next(gen)
except StopIteration as e:
    print(e.value)  # done

Real-World Patterns
Reading a Large File Line by Line
def read_lines(filepath):
    with open(filepath) as f:
        for line in f:
            yield line.rstrip("\n")

# Memory usage stays constant regardless of file size
for line in read_lines("/etc/hosts"):
    if line.startswith("#"):
        continue
    print(line)

Building a Data Pipeline
Generators compose naturally into pipelines where each stage transforms the stream:
def integers(n):
    for i in range(1, n + 1):
        yield i

def only_even(nums):
    for n in nums:
        if n % 2 == 0:
            yield n

def squared(nums):
    for n in nums:
        yield n * n

# Compose: even squares from 1..20
pipeline = squared(only_even(integers(20)))
print(list(pipeline))
# [4, 16, 36, 64, 100, 144, 196, 256, 324, 400]

Each stage is lazy: values flow through the pipeline one at a time without building intermediate lists.
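The laziness can be made visible by recording which source values are actually pulled. In this sketch, traced_integers and the pulled list are hypothetical additions; the source is infinite, yet only as many values are requested as the consumer needs:

```python
import itertools

pulled = []

def traced_integers():
    # Hypothetical infinite source that records every value it hands out.
    for i in itertools.count(1):
        pulled.append(i)
        yield i

def only_even(nums):
    for n in nums:
        if n % 2 == 0:
            yield n

def squared(nums):
    for n in nums:
        yield n * n

pipeline = squared(only_even(traced_integers()))
first_three = list(itertools.islice(pipeline, 3))

print(first_three)  # [4, 16, 36]
print(pulled)       # [1, 2, 3, 4, 5, 6]
```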
Chunking an Iterable
def chunks(iterable, size):
    chunk = []
    for item in iterable:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk

data = list(range(10))
for batch in chunks(data, 3):
    print(batch)
# [0, 1, 2]
# [3, 4, 5]
# [6, 7, 8]
# [9]

Generators vs Iterators vs Comprehensions
| Feature | Iterator class | Generator function | Generator expression |
|---|---|---|---|
| Syntax | Class with __iter__/__next__ | def + yield | (expr for x in ...) |
| Verbosity | High | Low | Very low |
| State management | Manual | Automatic | Automatic |
| Multi-statement logic | Yes | Yes | No (single expression) |
| Infinite sequences | Yes | Yes | Yes |
| Readable for complex logic | Yes | Yes | No |
For anything more than a simple transformation or filter, a generator function is more readable than a generator expression. For complex stateful iteration, a generator function is almost always preferable to writing a full iterator class — see Python Iterators for the class-based approach.
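To make the verbosity comparison concrete, here is a sketch of the same counting iterator written both ways. The class name CountUp is an assumption for illustration; count_up mirrors the generator function from the start of the chapter.

```python
class CountUp:
    """Class-based iterator: state is managed by hand."""

    def __init__(self, start, stop):
        self.current = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

def count_up(start, stop):
    # Generator function: local variables hold the state automatically.
    while start <= stop:
        yield start
        start += 1

print(list(CountUp(1, 3)))   # [1, 2, 3]
print(list(count_up(1, 3)))  # [1, 2, 3]
```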
Generator expressions pair naturally with list comprehensions and dictionary/set comprehensions. Decorators can also wrap generator functions to add caching or tracing behavior.