Sets
Python set basics: literals, the empty-set {} trap, set algebra, deduplication, and frozenset for immutable membership.
A Python set is an unordered collection of unique, hashable values. Two elements count as "the same" if they compare equal and share the same hash. Sets are useful for membership testing (x in s is O(1) on average), deduplication, and set-algebra operations. frozenset is the immutable variant - it can be used as a dict key or as a member of another set.
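To make the "equal and same hash" rule concrete, here is a small sketch. The variable names are illustrative only; the behavior shown relies on Python's built-in numeric equality (1 == 1.0 == True) and on lists being unhashable.

```python
# Elements that compare equal and hash equal collapse into one entry.
mixed = {1, 1.0, True}
print(len(mixed))  # 1

# Unhashable values such as lists are rejected as set members.
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error = str(exc)  # the message mentions that list is unhashable

# A frozenset is hashable, so it can be nested inside another set.
nested = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(nested))  # 2, because the first two frozensets are equal
```

Which of the equal values survives in mixed is an implementation detail, so code should not depend on it.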
Set literals use curly braces with values inside: {1, 2, 3}. There is one important trap: {} is an empty dict, not an empty set. To create an empty set you must write set(). This silent type confusion is a real source of bugs because {} reads like an empty set literal and is easy to miss in code review.
s = {1, 2, 3}
type(s) # <class 'set'>
# The empty-set trap
empty_dict = {} # dict, NOT a set
type(empty_dict) # <class 'dict'>
empty_set = set() # correct empty set
type(empty_set) # <class 'set'>
# Basic operations
s.add(4) # {1, 2, 3, 4}
s.remove(2) # {1, 3, 4} -- raises KeyError if missing
s.discard(99) # no-op if missing, no error
3 in s # True
len(s) # 3
Set algebra uses operators that mirror the mathematical notation. Union (|) gives all elements from both sets. Intersection (&) gives only elements present in both. Difference (-) removes elements of the right set from the left. Symmetric difference (^) gives elements that are in exactly one of the two sets.
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
a | b # {1, 2, 3, 4, 5, 6} -- union
a & b # {3, 4} -- intersection
a - b # {1, 2} -- difference (in a but not b)
b - a # {5, 6} -- difference (in b but not a)
a ^ b # {1, 2, 5, 6} -- symmetric difference
# Subset and superset checks
{1, 2} <= a # True (subset)
a >= {1, 2} # True (superset)
{1, 2} < a # True (proper subset, not equal)
Converting a list to a set and back is a quick way to deduplicate hashable items, but it does not preserve order. For an "all needles present?" check, set(needles) <= set(haystack) is cleaner than a loop and runs in O(n+m) time on average. frozenset is the immutable variant and can be used wherever a hashable value is needed, such as a dict key.
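When the original order matters, one common alternative is dict.fromkeys, which keeps the first occurrence of each item. The sketch below assumes Python 3.7 or later, where dicts preserve insertion order; needles and haystack are hypothetical sample lists.

```python
xs = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

# Dict keys are unique and keep insertion order (Python 3.7+),
# so this deduplicates while keeping each first occurrence in place.
unique_ordered = list(dict.fromkeys(xs))
print(unique_ordered)  # [3, 1, 4, 5, 9, 2, 6]

# "All needles present?" on plain lists.
needles = ["name", "email"]
haystack = ["email", "age", "name", "country"]
all_present = set(needles) <= set(haystack)
print(all_present)  # True

# issubset accepts any iterable, so only one side needs to be a set.
also_present = set(needles).issubset(haystack)
print(also_present)  # True
```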
# Deduplication (order not preserved)
xs = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
unique = list(set(xs))
# e.g. [1, 2, 3, 4, 5, 6, 9] -- order is arbitrary
# "All needles present?" check
required = {"name", "email", "age"}
provided = {"name", "email", "age", "country"}
required <= provided # True (both are already sets, so no conversion is needed)
# frozenset as a dict key
roles = frozenset({"admin", "editor"})
permissions = {
    frozenset({"admin"}): ["read", "write", "delete"],
    frozenset({"admin", "editor"}): ["read", "write"],
}
permissions[roles] # ["read", "write"]
In production
{} is an empty dict, not an empty set - use set() for an empty set; only a brace literal with elements, such as {1, 2, 3}, is a set, and the mix-up is easy to miss in code review. Sets require elements to be hashable, so lists and dicts cannot be set members; for a "set of records", convert each record to something hashable, such as a tuple of its items or frozenset(d.items()), which works as long as the values themselves are hashable.
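A minimal sketch of the "set of records" idea, assuming flat dicts whose values are all hashable. The records list is made-up sample data, not taken from any real system.

```python
# Hypothetical sample data: the first two dicts hold the same content.
records = [
    {"name": "Ada", "email": "ada@example.com"},
    {"email": "ada@example.com", "name": "Ada"},  # same items, different key order
    {"name": "Grace", "email": "grace@example.com"},
]

# frozenset(d.items()) ignores key order, so equal records collapse.
seen = {frozenset(r.items()) for r in records}
print(len(seen))  # 2

# Convert back to dicts when needed; the resulting order is arbitrary.
unique_records = [dict(items) for items in seen]
```

A tuple of sorted items would work too and gives a stable ordering, at the cost of requiring the keys to be sortable. Records with nested lists or dicts as values need a deeper conversion before either approach applies.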