Python by Example

Generators and Iterators

Generators produce values lazily with yield, iterate once, and pair with itertools for constant-memory pipelines.

A generator is a function whose body contains yield. Calling it does not run the body; it returns a generator object, which is an iterator that produces values one at a time when asked. Because each value is produced on demand, Python never needs to hold the entire sequence in memory, which makes generators the language-level primitive for streaming and pipelines.

A generator function looks like a regular function but contains at least one yield. Each time the caller calls next() on the generator object (or a for loop does it automatically), the function body runs up to the next yield, pauses, and hands the value back. The function's local variables and its position in the body are preserved between yields.

def squares(n):
    for i in range(n):
        yield i * i
 
gen = squares(5)
 
# Pull one value at a time
print(next(gen))  # 0
print(next(gen))  # 1
print(next(gen))  # 4
 
# Or iterate with a for loop (calls next() automatically)
for value in squares(5):
    print(value)
# 0
# 1
# 4
# 9
# 16
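
To make the "calling it does not run the body" point concrete, here is a minimal sketch that records when each part of the body executes. The function name traced_squares and the log list are hypothetical, introduced only for this illustration.

```python
def traced_squares(n, log):
    # `log` is a hypothetical list used only to record when the body runs
    log.append("started")
    for i in range(n):
        log.append(f"computing {i}")
        yield i * i
    log.append("finished")

log = []
gen = traced_squares(2, log)
log_after_call = list(log)    # still empty: the body has not started

first = next(gen)             # runs the body up to the first yield
log_after_first = list(log)   # ['started', 'computing 0']

rest = list(gen)              # resumes after the yield and drains the generator

print(log_after_call)   # []
print(first)            # 0
print(log_after_first)  # ['started', 'computing 0']
print(rest)             # [1]
print(log)              # ['started', 'computing 0', 'computing 1', 'finished']
```

The "finished" entry only appears once the generator is driven past its last yield, which is the same moment StopIteration is raised internally.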

A generator can only be iterated once. After it has yielded all its values (or a return statement is reached), it is exhausted: any further call to next() raises StopIteration, and converting it to a list returns an empty list. If multiple callers each need a fresh sequence, give them the generator function (or another callable that creates a new generator), not a generator object.

def count_up(n):
    for i in range(n):
        yield i
 
gen = count_up(3)
print(list(gen))  # [0, 1, 2]
print(list(gen))  # []  -- generator is exhausted
 
# Fix: pass the function (or a lambda/partial) so each caller
# creates a fresh generator
def make_gen():
    return count_up(3)
 
print(list(make_gen()))  # [0, 1, 2]
print(list(make_gen()))  # [0, 1, 2]
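
The comment above mentions lambda and partial as alternatives. As a sketch of that idea, the consumer below needs two passes over the data, so it accepts a zero-argument callable and calls it once per pass. The names total_and_max and make_iter are assumptions made for this example; functools.partial is the standard-library helper.

```python
from functools import partial

def count_up(n):
    for i in range(n):
        yield i

def total_and_max(make_iter):
    # make_iter is a zero-argument callable that returns a fresh iterator
    total = sum(make_iter())      # first pass
    largest = max(make_iter())    # second pass, over a new generator
    return total, largest

result = total_and_max(partial(count_up, 4))
print(result)  # (6, 3)
```

A lambda such as lambda: count_up(4) would work the same way; partial just avoids writing the wrapper by hand.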

The itertools module in the standard library provides building blocks for working with iterators. The three shown here are lazy: they produce values on demand instead of building a full result list up front. islice limits how many values you take, chain concatenates multiple iterables, and groupby groups consecutive elements that share a key, so the input usually needs to be sorted by that key first.

import itertools
 
# islice: take the first 3 values from an infinite counter
counter = itertools.count(start=10, step=2)  # 10, 12, 14, 16, ...
first_three = list(itertools.islice(counter, 3))
print(first_three)  # [10, 12, 14]
 
# chain: iterate multiple iterables as one
a = [1, 2, 3]
b = [4, 5, 6]
together = list(itertools.chain(a, b))
print(together)  # [1, 2, 3, 4, 5, 6]
 
# groupby: group consecutive items by a key (sort first if needed)
data = [
    {"dept": "eng", "name": "Alice"},
    {"dept": "eng", "name": "Bob"},
    {"dept": "sales", "name": "Carol"},
]
for dept, members in itertools.groupby(data, key=lambda x: x["dept"]):
    names = [m["name"] for m in members]
    print(dept, names)
# eng ['Alice', 'Bob']
# sales ['Carol']
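
The "sort first if needed" note matters because groupby only merges neighbouring items. The sketch below uses a reordered copy of the sample data (the rows list and the by_dept helper are made up for this illustration) to show what happens without and with a sort.

```python
import itertools
from operator import itemgetter

rows = [
    {"dept": "eng", "name": "Alice"},
    {"dept": "sales", "name": "Carol"},
    {"dept": "eng", "name": "Bob"},
]
by_dept = itemgetter("dept")

# Without sorting, "eng" appears as two separate groups
unsorted_groups = [
    (dept, [m["name"] for m in members])
    for dept, members in itertools.groupby(rows, key=by_dept)
]
print(unsorted_groups)
# [('eng', ['Alice']), ('sales', ['Carol']), ('eng', ['Bob'])]

# Sorting by the same key first brings equal keys together
sorted_groups = [
    (dept, [m["name"] for m in members])
    for dept, members in itertools.groupby(sorted(rows, key=by_dept), key=by_dept)
]
print(sorted_groups)
# [('eng', ['Alice', 'Bob']), ('sales', ['Carol'])]
```

Note that sorted() builds a full list, so this step gives up the constant-memory property; for a true stream, the data has to arrive already ordered by the grouping key.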

In production

A generator can only be iterated once: after g = (x for x in xs), the first list(g) returns the values and the second returns an empty list. When consumers need a fresh iterator, pass the producing function (make_gen, not the result of calling make_gen()) rather than a generator object. For pipelines over large or infinite streams, itertools.chain, islice, and groupby plus generator expressions stay constant-memory where the equivalent list-comprehension chain materialises every intermediate sequence.
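
As a small sketch of such a pipeline, the example below filters and transforms an unbounded source and takes only the first five results. read_numbers is a hypothetical stand-in for a real stream such as a file or socket; everything else is standard library.

```python
import itertools

def read_numbers():
    # Hypothetical stand-in for an unbounded source: 1, 2, 3, ...
    return itertools.count(1)

# Each stage is a generator expression, so nothing is computed yet
evens = (n for n in read_numbers() if n % 2 == 0)
squared = (n * n for n in evens)

# islice pulls exactly five values through the whole pipeline
first_five = list(itertools.islice(squared, 5))
print(first_five)  # [4, 16, 36, 64, 100]
```

Writing the same stages as list comprehensions would never finish here, because the first stage would try to materialise an infinite list before the second stage could start.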
