Classes and Dataclasses
Python classes group state and behaviour; dataclasses generate the boilerplate; frozen=True and slots=True give you hashable, memory-efficient value objects.
A class is a blueprint for objects. The class block defines attributes (data) and methods (behaviour). Every instance gets its own copy of the attributes set in __init__. The @dataclass decorator (available since Python 3.7) generates __init__, __repr__, and __eq__ for you, cutting out a lot of boilerplate for classes whose job is mainly to hold data.
A class is defined with class Name:. The special method __init__ runs when you create an instance and receives the new object as its first argument, conventionally called self. You attach per-instance data with self.attribute = value. Other methods access instance data the same way.

class Counter:
    def __init__(self, start: int = 0) -> None:
        self._count = start

    def increment(self, by: int = 1) -> None:
        self._count += by

    def value(self) -> int:
        return self._count


c = Counter()
c.increment()
c.increment(by=5)
print(c.value())  # 6

c2 = Counter(start=10)
c2.increment()
print(c2.value())  # 11 -- c and c2 are independent instances

For classes that mainly hold data - configuration, domain values, events - @dataclass generates __init__, __repr__, and __eq__ from the type-annotated class body. Adding frozen=True prevents attribute reassignment after construction and, together with the generated __eq__, produces a matching __hash__, making instances usable as dict keys or set members. Adding slots=True (Python 3.10+) removes the per-instance __dict__, which reduces memory usage by roughly 40%, though the exact saving depends on the number of fields and the Python version.

from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Point:
    x: float
    y: float


p1 = Point(1.0, 2.0)
p2 = Point(1.0, 2.0)
p3 = Point(3.0, 4.0)

print(p1)  # Point(x=1.0, y=2.0)
print(p1 == p2)  # True (__eq__ generated)
print(p1 == p3)  # False

# frozen=True makes Point hashable
locations = {p1, p2, p3}
print(len(locations))  # 2 (p1 and p2 are equal)

# Mutation is blocked
# p1.x = 99.0  # raises dataclasses.FrozenInstanceError

Python requires __eq__ and __hash__ to stay consistent: if two objects compare equal, they must produce the same hash. If a class defines __eq__ without __hash__, Python sets its __hash__ to None, making instances unhashable. That rules out set membership and dict keys, and because the TypeError only appears the first time an instance is hashed, it is easy to miss in tests. Dataclasses apply this rule automatically: with the default eq=True, frozen=True generates a __hash__ that matches the generated __eq__, while mutable dataclasses get __hash__ set to None.
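The dataclass half of that rule can be checked directly. The following is a minimal sketch rather than anything definitive; MutablePoint and FrozenPoint are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass  # eq=True by default, not frozen
class MutablePoint:  # hypothetical name, for illustration only
    x: float
    y: float


@dataclass(frozen=True)
class FrozenPoint:  # hypothetical name, for illustration only
    x: float
    y: float


# A mutable dataclass with a generated __eq__ has __hash__ set to None.
print(MutablePoint.__hash__ is None)  # True

# The frozen variant gets a __hash__ derived from its fields,
# so equal instances hash equally and collapse into one set member.
a = FrozenPoint(1.0, 2.0)
b = FrozenPoint(1.0, 2.0)
print(a == b, hash(a) == hash(b))  # True True
print(len({a, b}))  # 1
```

Hand-written classes follow the same rule, as the next example shows.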

from dataclasses import dataclass


class BadPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, BadPoint):
            return NotImplemented  # let Python try the other operand
        return self.x == other.x and self.y == other.y

    # No __hash__ defined -- Python sets __hash__ = None


p = BadPoint(1, 2)
try:
    {p}  # TypeError: unhashable type: 'BadPoint'
except TypeError as e:
    print(e)


# Fix option 1: add a __hash__ that is consistent with __eq__
class GoodPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, GoodPoint):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __hash__(self):
        # Hash the same fields that __eq__ compares
        return hash((self.x, self.y))


# Fix option 2 (recommended): use @dataclass(frozen=True)
@dataclass(frozen=True, slots=True)
class BestPoint:
    x: float
    y: float

In production
Reach for @dataclass(frozen=True, slots=True) for value-shaped objects (data transfer objects, events, config): slots=True drops the per-instance __dict__, which cuts memory by roughly 40% (the exact saving depends on the number of fields and the Python version), and frozen=True, together with the generated __eq__, makes instances hashable so they work as dict keys or set members. Class-level attributes are shared across all instances: class Foo: items = [] is a single list shared by every Foo; assign in __init__ for per-instance state. Overriding __eq__ without __hash__ makes the type unhashable, a failure that only surfaces when an instance is first put in a set or used as a dict key; dataclasses apply this rule consistently for you.
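The shared class-level attribute is the easiest of these to trip over, so here is a small sketch of the pitfall and two ways around it. The class names (SharedBasket, OwnBasket, DataBasket) are hypothetical and exist only for this illustration; the list[str] annotations assume Python 3.9 or later.

```python
from dataclasses import dataclass, field


class SharedBasket:  # hypothetical name: shows the pitfall
    items = []  # class-level: one list shared by every instance

    def add(self, item: str) -> None:
        self.items.append(item)  # mutates the shared class attribute


class OwnBasket:  # hypothetical name: per-instance state
    def __init__(self) -> None:
        self.items: list[str] = []  # a fresh list for each instance

    def add(self, item: str) -> None:
        self.items.append(item)


@dataclass
class DataBasket:  # hypothetical name: the dataclass equivalent
    # default_factory builds a new list for each instance
    items: list[str] = field(default_factory=list)


s1, s2 = SharedBasket(), SharedBasket()
s1.add("apple")
print(s2.items)  # ['apple'] -- s2 sees the item added through s1

o1, o2 = OwnBasket(), OwnBasket()
o1.add("apple")
print(o2.items)  # []

d1, d2 = DataBasket(), DataBasket()
d1.items.append("apple")
print(d2.items)  # []
```

In a dataclass, writing items: list[str] = [] directly is rejected with a ValueError at class-definition time, which is why field(default_factory=list) is used here.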