W3docs

Python Dataclasses

Learn Python dataclasses: the @dataclass decorator, field() defaults, ordering, immutability, and inheritance with clear, runnable examples.

A dataclass is a regular Python class whose boilerplate — __init__, __repr__, and __eq__ — is generated automatically by the @dataclass decorator. The result is less code, fewer typos, and classes that are immediately readable.

This chapter covers:

  • Why dataclasses exist and when to use them
  • The @dataclass decorator
  • Field defaults and the field() helper
  • Controlling equality and ordering
  • Immutable dataclasses with frozen=True
  • Post-initialisation logic with __post_init__
  • Inheritance with dataclasses
  • Dataclasses vs. NamedTuple vs. plain classes

Before reading this chapter, make sure you are comfortable with Python classes and objects and Python inheritance.

Why Dataclasses?

Consider a class that stores a product in an online shop. Without dataclasses, you spell out the same three attributes three times: once in __init__, once in __repr__, and once in __eq__:

class Product:
    def __init__(self, name, price, stock):
        self.name = name
        self.price = price
        self.stock = stock

    def __repr__(self):
        return f"Product(name={self.name!r}, price={self.price}, stock={self.stock})"

    def __eq__(self, other):
        if not isinstance(other, Product):
            return NotImplemented
        return (self.name, self.price, self.stock) == (other.name, other.price, other.stock)

The @dataclass decorator generates all of the above from a single annotated list of fields:

from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float
    stock: int

For everyday use both versions behave the same: instances are constructed, printed, and compared in the same way. The dataclass version is shorter, harder to get wrong, and immediately signals that the class is primarily a data container.
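
As a quick check of the generated methods, the sketch below builds two products and compares them. The product name and numbers are made-up values used only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float
    stock: int

# Illustrative values only.
a = Product("Mug", 9.99, 12)
b = Product("Mug", 9.99, 12)

print(a)        # Product(name='Mug', price=9.99, stock=12)
print(a == b)   # True: the generated __eq__ compares field values
print(a is b)   # False: they are still two separate objects
```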

The @dataclass Decorator

Import dataclass from the standard-library dataclasses module and apply it to your class. Each field is declared as a type-annotated class variable:

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

p = Point(1.5, 2.0)
print(p)          # Point(x=1.5, y=2.0)
print(p.x)        # 1.5

p2 = Point(1.5, 2.0)
print(p == p2)    # True  — __eq__ compares field by field

The decorator generates:

Method   | What it does
__init__ | Accepts each field as a parameter and assigns it to self
__repr__ | Returns a readable string like Point(x=1.5, y=2.0)
__eq__   | Compares two instances field by field

Type annotations are required but not enforced at runtime

Field declarations require a type annotation (x: float). Python does not check the type at runtime — you can still pass a string where a float is expected. The annotation is metadata used by type-checkers such as mypy and by the dataclasses machinery itself. For runtime type validation, see Python Type Hints.
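
To make the point concrete, here is a small sketch in which deliberately wrong values are passed to the annotated fields. Nothing in the dataclass machinery rejects them.

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

# No error is raised here: the annotations are not checked at runtime.
p = Point("not a number", None)
print(p)   # Point(x='not a number', y=None)
```

A static type-checker would flag the call above, but the program itself runs without complaint.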

Default Values

Assign a default value directly on the field to make it optional in __init__:

from dataclasses import dataclass

@dataclass
class Config:
    host: str = "localhost"
    port: int = 8080
    debug: bool = False

c1 = Config()
print(c1)   # Config(host='localhost', port=8080, debug=False)

c2 = Config(host="example.com", port=443)
print(c2)   # Config(host='example.com', port=443, debug=False)

Fields with defaults must appear after fields without defaults — exactly the same rule as for regular function parameters.
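
The sketch below shows what happens when the rule is broken. The Server class is a hypothetical example; the error is raised when the decorator runs, not when an instance is created. The exact wording of the message can vary between Python versions, so it is captured rather than quoted.

```python
from dataclasses import dataclass

try:
    @dataclass
    class Server:
        host: str = "localhost"
        port: int          # no default, but it follows a field that has one
except TypeError as exc:
    # Raised at class-definition time by the @dataclass decorator.
    error_message = str(exc)
    print(error_message)
```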

Mutable defaults and field()

You cannot use a mutable object (a list, dict, or set) as a plain default value. Python would share one list among all instances, which leads to subtle bugs:

from dataclasses import dataclass

# This raises a ValueError at class definition time:
# @dataclass
# class Bag:
#     items: list = []
# ValueError: mutable default <class 'list'> for field items is not allowed: use default_factory

Instead, use field(default_factory=...) to create a fresh object for every instance:

from dataclasses import dataclass, field

@dataclass
class Bag:
    items: list = field(default_factory=list)

b1 = Bag()
b2 = Bag()
b1.items.append("apple")

print(b1.items)   # ['apple']
print(b2.items)   # []  — b2 has its own separate list

default_factory accepts any zero-argument callable, including lambdas and your own functions.
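
A minimal sketch of the three kinds of factory follows. The Order class and the next_id helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
import itertools

_counter = itertools.count(1)

def next_id() -> int:
    """Hypothetical helper: returns 1, 2, 3, ... on successive calls."""
    return next(_counter)

@dataclass
class Order:
    order_id: int = field(default_factory=next_id)          # your own function
    tags: list = field(default_factory=lambda: ["new"])     # a lambda
    totals: dict = field(default_factory=dict)              # a built-in type

a = Order()
b = Order()
a.tags.append("urgent")

print(a.order_id, b.order_id)   # 1 2
print(a.tags)                   # ['new', 'urgent']
print(b.tags)                   # ['new']
```

Each factory is called once per instance, so every Order gets its own list, its own dict, and the next identifier.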

The field() Helper

field() gives you fine-grained control over individual fields. Its most useful parameters are:

Parameter       | Purpose
default         | A simple default value (immutable values such as numbers, strings, or tuples)
default_factory | A callable that produces the default
repr            | False to exclude this field from __repr__
compare         | False to exclude this field from __eq__ (and ordering)
init            | False to exclude this field from __init__

from dataclasses import dataclass, field
import time

@dataclass
class LogEntry:
    message: str
    level: str = "INFO"
    timestamp: float = field(default_factory=time.time, repr=False, compare=False)

entry = LogEntry("Server started")
print(entry)               # LogEntry(message='Server started', level='INFO')
# timestamp exists but is hidden from repr and ignored in comparisons
print(entry.timestamp > 0) # True
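
To see the effect of compare=False directly, the following sketch creates two entries that differ only in their timestamp. The timestamp values 1.0 and 2.0 are arbitrary and chosen so the result is predictable.

```python
from dataclasses import dataclass, field
import time

@dataclass
class LogEntry:
    message: str
    level: str = "INFO"
    timestamp: float = field(default_factory=time.time, repr=False, compare=False)

# Arbitrary, fixed timestamps so the outcome does not depend on the clock.
first = LogEntry("Server started", timestamp=1.0)
second = LogEntry("Server started", timestamp=2.0)

print(first == second)                       # True: timestamp is not compared
print(first.timestamp == second.timestamp)   # False
```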

Ordering

By default dataclasses support equality (==, !=) but not ordering (<, >, <=, >=). Enable ordering by passing order=True to the decorator:

from dataclasses import dataclass

@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int

v1 = Version(1, 2, 0)
v2 = Version(1, 3, 0)
v3 = Version(1, 2, 0)

print(v1 < v2)    # True
print(v1 == v3)   # True
print(v2 > v1)    # True

versions = [Version(2, 0, 0), Version(1, 9, 1), Version(1, 2, 3)]
print(sorted(versions))
# [Version(major=1, minor=2, patch=3),
#  Version(major=1, minor=9, patch=1),
#  Version(major=2, minor=0, patch=0)]

Python generates the comparison methods by comparing fields in the order they are declared, tuple-style. You can exclude a field from comparisons with field(compare=False).
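
As a sketch of combining order=True with field(compare=False), the hypothetical Task class below sorts by priority only and ignores the title.

```python
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    priority: int
    title: str = field(compare=False)   # ignored by ==, <, >, <=, >=

tasks = [Task(3, "write docs"), Task(1, "fix bug"), Task(2, "review PR")]

print([t.title for t in sorted(tasks)])
# ['fix bug', 'review PR', 'write docs']

print(Task(1, "a") == Task(1, "b"))   # True: only priority is compared
```

Note the side effect shown on the last line: excluding a field from ordering also excludes it from equality.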

Immutable Dataclasses with frozen=True

Pass frozen=True to make all fields read-only after creation. Any attempt to change a field raises a FrozenInstanceError:

from dataclasses import dataclass

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float

london = Coordinate(51.5074, -0.1278)
print(london)        # Coordinate(lat=51.5074, lon=-0.1278)

# london.lat = 0.0  # FrozenInstanceError: cannot assign to field 'lat'
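
The sketch below catches the error instead of commenting it out, and then uses dataclasses.replace() to build a modified copy, which is one common way to "change" a frozen instance. The variable name moved is an illustrative choice.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float

london = Coordinate(51.5074, -0.1278)

blocked = False
try:
    london.lat = 0.0
except FrozenInstanceError as exc:
    blocked = True
    print(f"blocked: {exc}")

# replace() returns a new instance and leaves the original untouched.
moved = replace(london, lat=0.0)
print(moved)    # Coordinate(lat=0.0, lon=-0.1278)
print(london)   # Coordinate(lat=51.5074, lon=-0.1278)
```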

Frozen dataclasses are also hashable (the decorator generates __hash__ from the fields), so you can use them as dictionary keys or set members, provided every field value is itself hashable:

from dataclasses import dataclass

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float

cities = {
    Coordinate(51.5074, -0.1278): "London",
    Coordinate(48.8566,  2.3522): "Paris",
}
print(cities[Coordinate(51.5074, -0.1278)])   # London

Regular (mutable) dataclasses are not hashable by default — Python sets __hash__ to None when __eq__ is defined without frozen=True.
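
The difference can be observed side by side. In this sketch, MutablePoint and FrozenPoint are hypothetical classes that differ only in the frozen flag.

```python
from dataclasses import dataclass

@dataclass
class MutablePoint:
    x: float
    y: float

@dataclass(frozen=True)
class FrozenPoint:
    x: float
    y: float

print(MutablePoint.__hash__)   # None

try:
    hash(MutablePoint(1.0, 2.0))
except TypeError as exc:
    print(exc)   # unhashable type: 'MutablePoint'

# Equal frozen instances hash equally, so a set keeps only one of them.
print(len({FrozenPoint(1.0, 2.0), FrozenPoint(1.0, 2.0)}))   # 1
```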

Post-Initialisation Logic with __post_init__

Sometimes you need to derive a field's value from other fields, or validate the input after __init__ runs. Define a __post_init__ method — it is called automatically at the end of the generated __init__:

from dataclasses import dataclass
import math

@dataclass
class Circle:
    radius: float

    def __post_init__(self):
        if self.radius <= 0:
            raise ValueError(f"radius must be positive, got {self.radius}")

    @property
    def area(self):
        return math.pi * self.radius ** 2

c = Circle(5)
print(round(c.area, 4))   # 78.5398

# Circle(-1)  # ValueError: radius must be positive, got -1

You can also compute a derived field. Mark it with field(init=False) so it does not appear in __init__, then set it inside __post_init__:

from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False, repr=True)

    def __post_init__(self):
        self.area = self.width * self.height

r = Rectangle(4, 6)
print(r)         # Rectangle(width=4, height=6, area=24)
print(r.area)    # 24

Inheritance with Dataclasses

A dataclass can inherit from another dataclass. The child class's __init__ includes fields from both classes — parent fields first, in the order they were declared:

from dataclasses import dataclass

@dataclass
class Animal:
    name: str
    age: int

@dataclass
class Dog(Animal):
    breed: str

rex = Dog(name="Rex", age=3, breed="Labrador")
print(rex)    # Dog(name='Rex', age=3, breed='Labrador')

Gotcha: if a parent class has a field with a default, all child fields must also have defaults. This is the same rule that applies to regular Python function signatures — a parameter without a default cannot follow one with a default.

from dataclasses import dataclass

@dataclass
class Animal:
    name: str
    age: int = 0   # has a default

# @dataclass
# class Dog(Animal):
#     breed: str   # TypeError: non-default argument 'breed' follows default argument

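Work around this by giving the child field a default too, or by restructuring the hierarchy so that fields with defaults come last. On Python 3.10 and later you can also declare the child's fields keyword-only with @dataclass(kw_only=True); keyword-only fields are placed after the positional ones in __init__ and are not subject to this ordering rule.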

Decorator Parameters at a Glance

@dataclass(
    init=True,     # generate __init__       (default True)
    repr=True,     # generate __repr__       (default True)
    eq=True,       # generate __eq__         (default True)
    order=False,   # generate <, >, <=, >=   (default False)
    frozen=False,  # make fields immutable   (default False)
)
class MyClass:
    ...

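You rarely need to change most of these; the ones used most often are order=True and frozen=True. The list is not exhaustive: the decorator also accepts unsafe_hash and, from Python 3.10, parameters such as kw_only and slots.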

Utility Functions

The dataclasses module also provides three handy functions:

fields()

Returns a tuple of Field objects describing every field in the class:

from dataclasses import dataclass, fields

@dataclass
class Point:
    x: float
    y: float

for f in fields(Point):
    print(f.name, f.type)
# x <class 'float'>
# y <class 'float'>
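
Field objects carry more than the name and type. As a sketch, the loop below uses the default and default_factory attributes together with the MISSING sentinel from the dataclasses module to report which fields are required. The Config class and the required_fields list are illustrative.

```python
from dataclasses import dataclass, fields, MISSING

@dataclass
class Config:
    host: str
    port: int = 8080

required_fields = []
for f in fields(Config):
    # A field is required when it has neither a default nor a default_factory.
    if f.default is MISSING and f.default_factory is MISSING:
        required_fields.append(f.name)
        print(f.name, "required")
    else:
        print(f.name, f"default={f.default!r}")
# host required
# port default=8080
```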

asdict()

Converts a dataclass instance to a plain dictionary (recursively):

from dataclasses import dataclass, asdict

@dataclass
class Address:
    street: str
    city: str

@dataclass
class Person:
    name: str
    address: Address

p = Person("Alice", Address("10 Downing St", "London"))
print(asdict(p))
# {'name': 'Alice', 'address': {'street': '10 Downing St', 'city': 'London'}}

This is useful when serialising to JSON or sending data to an API.
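
A minimal round trip through JSON might look like the sketch below. The rebuilding step is written by hand for this specific pair of classes; asdict() has no built-in inverse.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Address:
    street: str
    city: str

@dataclass
class Person:
    name: str
    address: Address

p = Person("Alice", Address("10 Downing St", "London"))

# Serialise: dataclass -> dict -> JSON text
payload = json.dumps(asdict(p))
print(payload)

# Deserialise: JSON text -> dict -> dataclass (nested objects rebuilt manually)
data = json.loads(payload)
restored = Person(name=data["name"], address=Address(**data["address"]))
print(restored == p)   # True
```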

astuple()

Converts to a tuple (recursively):

from dataclasses import dataclass, astuple

@dataclass
class Point:
    x: float
    y: float

p = Point(3.0, 4.0)
print(astuple(p))   # (3.0, 4.0)

Dataclasses vs. NamedTuple vs. Plain Classes

Feature                | Plain class                            | NamedTuple       | dataclass
Auto __init__          | No                                     | Yes              | Yes
Auto __repr__          | No                                     | Yes              | Yes
Auto __eq__            | No                                     | Yes (by value)   | Yes (by value)
Mutable                | Yes                                    | No               | Yes (default)
Hashable               | By identity (lost if __eq__ is defined) | Yes              | Only with frozen=True (or unsafe_hash=True)
Ordering               | Manual                                 | Yes              | order=True
Inheritance            | Yes                                    | Limited          | Yes
isinstance check       | Yes                                    | Yes (also tuple) | Yes
Unpacking (a, b = obj) | No                                     | Yes              | No

Use a dataclass when:

  • You want mutable data with optional immutability.
  • You need inheritance or post-init logic.
  • You want fine-grained field control (field()).

Use NamedTuple when:

  • You want an immutable record that also behaves as a tuple (positional unpacking, CSV rows).
  • You need compatibility with code that expects tuples.

Use a plain class when:

  • The class has significant behaviour and very little plain data.
  • You need a custom __init__ that cannot be expressed through __post_init__.
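
The unpacking and tuple-compatibility rows of the table can be checked with a short sketch. PointNT and PointDC are hypothetical names for the same record written both ways.

```python
from dataclasses import dataclass, astuple
from typing import NamedTuple

class PointNT(NamedTuple):
    x: float
    y: float

@dataclass
class PointDC:
    x: float
    y: float

# A NamedTuple is a tuple: it unpacks and compares equal to a plain tuple.
x, y = PointNT(3.0, 4.0)
print(x, y)                               # 3.0 4.0
print(PointNT(3.0, 4.0) == (3.0, 4.0))    # True

# A dataclass instance is not iterable, so direct unpacking fails.
try:
    a, b = PointDC(3.0, 4.0)
except TypeError as exc:
    print(exc)

# Convert explicitly with astuple() when a tuple is needed.
a, b = astuple(PointDC(3.0, 4.0))
print(a, b)                               # 3.0 4.0
print(PointDC(3.0, 4.0) == (3.0, 4.0))    # False
```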

Common Gotchas

Mutable defaults. Using a list or dict as a plain default raises ValueError at class-definition time. Always use field(default_factory=...).

Hashing. Regular dataclasses are not hashable. If you need them as dict keys or in sets, use frozen=True or pass unsafe_hash=True (rarely recommended).

eq=False. If you disable equality generation (eq=False), Python falls back to identity comparison (is), which is almost never what you want for data objects.

Inherited defaults ordering. If a parent field has a default and a child field does not, Python raises a TypeError. Plan the field ordering in your hierarchy carefully.
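
To illustrate the eq=False gotcha above, the sketch below defines a hypothetical Token class without generated equality. Two instances with identical field values are then treated as different.

```python
from dataclasses import dataclass

@dataclass(eq=False)
class Token:
    value: str

a = Token("abc")
b = Token("abc")

print(a == b)   # False: same field values, but different objects
print(a == a)   # True: an object is always identical to itself
```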

Summary

Concept         | What it does
@dataclass      | Generates __init__, __repr__, __eq__ automatically
field()         | Fine-grained field control: defaults, repr, compare, init
default_factory | Provides a fresh mutable default for each instance
order=True      | Adds <, >, <=, >= based on field order
frozen=True     | Makes fields read-only and the instance hashable
__post_init__   | Runs after __init__ for validation or derived fields
fields()        | Returns metadata about every field
asdict()        | Converts instance to a plain dict (recursively)
astuple()       | Converts instance to a plain tuple (recursively)

For related topics, see Python classes and objects, Python inheritance, and Python abstract base classes.

Practice

Which decorator do you use to create a dataclass in Python?