Pydantic v2 — A Complete Guide
Pydantic is a data validation and parsing library for Python that uses standard type hints to define the shape and constraints of your data. It is one of the most downloaded Python packages in the world, and for good reason: it eliminates an enormous amount of manual validation boilerplate while making your data contracts explicit and self-documenting.
This guide covers Pydantic v2, which was a significant rewrite of the original library. It is faster, more expressive, and more consistent in its design. All examples target Python 3.10 and above.
Phase 1: Foundations
Chapter 1: What is Pydantic?
Before writing any code, it is worth understanding what problem Pydantic actually solves and why it has become such a central tool in modern Python development.
The Core Problem
Every non-trivial Python application deals with data coming from external sources: API requests, form submissions, configuration files, database rows, message queues. This data arrives as raw strings, dictionaries, or JSON blobs. Before you can safely use it, you need to answer three questions:
- Is it valid? Does the data conform to the expected shape and constraints?
- Is it the right type? Should "42" (a string) be treated as the integer 42?
- How do I send it back out? How do I convert my Python objects back into JSON or a dictionary?
Without a library, you write this logic by hand: a tangle of if isinstance(...), int(value), if value is None, and raise ValueError(...) calls. It is verbose, error-prone, and hard to maintain.
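To make the pain concrete, here is a hand-rolled version of that logic for a two-field user record (parse_user is a hypothetical helper, not part of any library):

```python
def parse_user(data: dict) -> dict:
    # Every field needs its own presence check, type check, and conversion.
    if "name" not in data or not isinstance(data["name"], str):
        raise ValueError("name is required and must be a string")
    try:
        age = int(data.get("age", ""))
    except (TypeError, ValueError):
        raise ValueError("age must be an integer")
    if age < 0:
        raise ValueError("age must be non-negative")
    return {"name": data["name"], "age": age}

print(parse_user({"name": "Alice", "age": "30"}))  # {'name': 'Alice', 'age': 30}
```

And this is only two fields with trivial rules; Pydantic replaces all of it with a short class definition.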
What Pydantic Does
Pydantic solves all three problems through a single, declarative interface. You define a class that describes the shape of your data using Python type hints, and Pydantic handles the rest automatically.
Parsing is the act of taking raw input (a dictionary, a JSON string) and transforming it into a structured Python object. Pydantic parses input data when you create a model instance.
Validation is the act of checking that parsed data satisfies your constraints. After parsing "42" into the integer 42, Pydantic can further verify that 42 > 0.
Serialization is the act of converting a Python object back into a portable format — a plain dictionary or a JSON string — for storage or transmission.
Pydantic does all three, and it does them based solely on the type annotations you write. There is no separate schema language to learn.
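As a minimal sketch of all three steps in one round trip (the Item model here is illustrative):

```python
from pydantic import BaseModel, Field

class Item(BaseModel):
    name: str
    price: float = Field(gt=0)  # validation: price must be positive

# Parsing: a raw JSON string becomes a typed Python object
item = Item.model_validate_json('{"name": "Pen", "price": "1.50"}')
print(item.price)  # 1.5 -- the string "1.50" was coerced to a float

# Serialization: the typed object goes back out as a dict or JSON
print(item.model_dump())       # {'name': 'Pen', 'price': 1.5}
print(item.model_dump_json())  # {"name":"Pen","price":1.5}
```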
Why Pydantic v2?
Pydantic v2 (released in mid-2023) was a complete rewrite of the library’s core in Rust, via a sub-package called pydantic-core. The result is:
- 5x to 50x faster than v1 in most benchmarks.
- Stricter, more predictable behavior by default.
- A cleaner, more consistent API.
- Better support for Python’s standard typing constructs.
All code in this guide is written for Pydantic v2. The import path is unchanged from v1 (from pydantic import BaseModel), so check the installed version itself: you are on v2 if your installation is pydantic>=2.0.
Chapter 2: BaseModel Basics
The foundation of Pydantic is the BaseModel class. Every data model you define inherits from it.
Creating Your First Model
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
email: str
This is a complete, working Pydantic model. The three class-level annotations (name: str, age: int, email: str) define the fields of the model. There is no __init__ to write, no __repr__ to define — Pydantic generates all of that automatically.
To create an instance, pass keyword arguments:
user = User(name="Alice", age=30, email="[email protected]")
print(user)
# Output: name='Alice' age=30 email='[email protected]'
You can access fields as attributes:
print(user.name) # Alice
print(user.age) # 30
print(user.email) # [email protected]
How Type Hints Become the Schema
In a regular Python class, type hints are just documentation. Pydantic treats them as a runtime schema. When you instantiate the model, Pydantic reads the annotations, determines what each field should be, and validates the incoming data against them.
This means that if validation fails, Pydantic raises a ValidationError with a detailed description of what went wrong:
from pydantic import BaseModel, ValidationError
class User(BaseModel):
name: str
age: int
try:
user = User(name="Alice", age="not-a-number")
except ValidationError as e:
print(e)
The error output will clearly identify which field failed and why:
1 validation error for User
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, ...]
Type Coercion (Lax Mode by Default)
One of Pydantic’s most useful features — and a source of occasional surprise — is type coercion. By default, Pydantic operates in “lax mode”: it will attempt to convert input values to the expected type rather than rejecting them immediately.
The most common example is numeric strings:
from pydantic import BaseModel
class Product(BaseModel):
name: str
price: float
quantity: int
product = Product(name="Widget", price="9.99", quantity="100")
print(product.price) # 9.99 (float, not "9.99")
print(product.quantity) # 100 (int, not "100")
The strings "9.99" and "100" were automatically converted to float and int respectively. This is extremely useful when processing form data or query parameters, where everything arrives as a string.
However, coercion is not unlimited. A string like "hello" cannot be converted to an integer, so Pydantic will raise a ValidationError. Only conversions that make unambiguous sense are performed.
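A quick sketch of where coercion stops; the error type strings in the comments are those Pydantic v2 reports:

```python
from pydantic import BaseModel, ValidationError

class Counter(BaseModel):
    count: int

print(Counter(count="7").count)  # 7 -- numeric string is coerced
print(Counter(count=7.0).count)  # 7 -- float with no fractional part is accepted

try:
    Counter(count="hello")  # not a number: coercion fails
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_parsing

try:
    Counter(count=7.5)  # fractional part would be lost: rejected
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_from_float
```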
Chapter 3: Required vs Optional Fields
Understanding how Pydantic handles required and optional fields is critical before building real models.
Required Fields
Any field annotated with a type but without a default value is required. Omitting it when creating an instance raises a ValidationError:
from pydantic import BaseModel, ValidationError
class User(BaseModel):
name: str
age: int
try:
user = User(name="Alice") # age is missing
except ValidationError as e:
print(e)
# 1 validation error for User
# age
# Field required [type=missing, ...]
Fields with Default Values
Provide a default value by assigning it directly in the class body:
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
is_active: bool = True
role: str = "viewer"
user = User(name="Alice", age=30)
print(user.is_active) # True
print(user.role) # viewer
is_active and role are optional — if not provided, they take the default value. name and age are still required.
Optional Fields That Can Be None
When a field can legitimately be absent or None, use the | None union type along with a default of None:
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
bio: str | None = None
website: str | None = None
user1 = User(name="Alice", age=30)
user2 = User(name="Bob", age=25, bio="Software developer", website="https://bob.dev")
print(user1.bio) # None
print(user2.bio) # Software developer
The | None tells Pydantic the field accepts either the stated type or None. The = None makes it optional (it defaults to None if not supplied).
Common Mistake — two distinct behaviors:
- bio: str | None (no default) — the field is required, but None is a valid value. You must always pass bio=... explicitly (even as bio=None).
- bio: str | None = None — the field is optional; if not supplied at all, it defaults to None.
Always pair | None with = None unless you genuinely want to force callers to explicitly acknowledge the absence of a value.
A Model Combining All Three
from pydantic import BaseModel, Field
class Article(BaseModel):
title: str # required
content: str # required
author: str = "Anonymous" # optional, has default
tags: list[str] = Field(default_factory=list) # safe mutable default
published: bool = False # optional, defaults to False
subtitle: str | None = None # optional, can be None
article = Article(title="Hello World", content="This is my first post.")
print(article.author) # Anonymous
print(article.tags) # []
print(article.subtitle) # None
Phase 2: Field Customization and Validation
Chapter 4: Field Configuration
Python’s type system can express what type a field is, but it cannot express constraints like “this string must be at least 3 characters” or “this number must be between 0 and 100”. For that, Pydantic provides the Field() function.
Using Field()
Field() replaces the default value in your annotation and lets you attach constraints and metadata:
from pydantic import BaseModel, Field
class User(BaseModel):
name: str = Field(min_length=2, max_length=50)
age: int = Field(ge=0, le=120)
email: str = Field(min_length=5)
Now, if you try to create a user with a name that is too short, an age that is negative, or a missing email, Pydantic will raise a ValidationError with a clear description.
Numeric Constraints
| Constraint | Meaning | Example |
|---|---|---|
| gt | Greater than (exclusive) | Field(gt=0) — must be > 0 |
| ge | Greater than or equal (inclusive) | Field(ge=0) — must be >= 0 |
| lt | Less than (exclusive) | Field(lt=100) — must be < 100 |
| le | Less than or equal (inclusive) | Field(le=100) — must be <= 100 |
| multiple_of | Must be a multiple of N | Field(multiple_of=5) |
from pydantic import BaseModel, Field
class Product(BaseModel):
name: str = Field(min_length=1, max_length=100)
price: float = Field(gt=0)
discount: float = Field(ge=0, le=1) # 0% to 100%
quantity: int = Field(ge=0, multiple_of=5) # must be ordered in multiples of 5
product = Product(name="Widget", price=9.99, discount=0.1, quantity=50)
String Constraints
| Constraint | Meaning |
|---|---|
| min_length | Minimum number of characters |
| max_length | Maximum number of characters |
| pattern | Must match a regular expression (Python regex) |
from pydantic import BaseModel, Field
class Account(BaseModel):
username: str = Field(min_length=3, max_length=20, pattern=r"^[a-zA-Z0-9_]+$")
password: str = Field(min_length=8)
Adding Metadata
Field() also supports informational metadata that does not affect validation but is extremely useful for documentation and OpenAPI schema generation (e.g., in FastAPI):
from pydantic import BaseModel, Field
class User(BaseModel):
name: str = Field(
min_length=2,
max_length=50,
title="Full Name",
description="The user's full display name.",
examples=["Alice Smith", "Bob Jones"]
)
age: int = Field(
ge=0,
le=120,
title="Age",
description="The user's age in years.",
examples=[25, 30]
)
Fields with Both a Default and Constraints
When you need both a default value and constraints, pass default= explicitly:
from pydantic import BaseModel, Field
class Post(BaseModel):
title: str = Field(default="Untitled", min_length=1, max_length=200)
views: int = Field(default=0, ge=0)
Chapter 5: Field Validators
Sometimes constraints expressible through Field() are not enough. You might need to run custom Python logic to validate or transform a field’s value. Pydantic v2 provides the @field_validator decorator for this purpose.
Basic Field Validator
from pydantic import BaseModel, field_validator
class User(BaseModel):
name: str
email: str
@field_validator("email")
@classmethod
def email_must_contain_at(cls, value: str) -> str:
if "@" not in value:
raise ValueError("Email must contain an @ symbol")
return value
Key points about @field_validator:
- It must be a classmethod.
- The first argument is cls (the class); the second is value (the field's value).
- Return the (possibly transformed) value if it is valid.
- Raise ValueError or AssertionError if it is not.
Before vs After vs Wrap Validation
Pydantic v2 supports three modes for @field_validator:
| Mode | When it runs | Receives |
|---|---|---|
"after" (default) | After Pydantic’s type parsing | Coerced, typed value |
"before" | Before type parsing | Raw input value |
"wrap" | Wraps around the full validation pipeline | Raw value + a handler callable |
mode="before" — normalize raw input before Pydantic attempts type conversion:
from pydantic import BaseModel, field_validator
class Tag(BaseModel):
name: str
@field_validator("name", mode="before")
@classmethod
def strip_and_lowercase(cls, value: object) -> str:
if isinstance(value, str):
return value.strip().lower()
return value
mode="wrap" — intercept the validation pipeline; useful for logging, timing, or advanced transformations. The second argument is a handler callable that runs the rest of the validation chain:
from pydantic import BaseModel, field_validator
class User(BaseModel):
email: str
@field_validator("email", mode="wrap")
@classmethod
def log_and_validate(cls, value: object, handler) -> str:
print(f"Validating email: {value!r}")
result = handler(value) # run default Pydantic validation
print(f"Email validated successfully: {result!r}")
return result
Use mode="before" when you need to normalize raw input, mode="after" (default) when you need to validate already-coerced data, and mode="wrap" when you need to observe or intercept the full validation process.
Validating Multiple Fields with One Validator
You can apply a single validator to several fields by listing them all in the decorator:
from pydantic import BaseModel, field_validator
class Address(BaseModel):
city: str
state: str
country: str
@field_validator("city", "state", "country")
@classmethod
def must_not_be_empty(cls, value: str) -> str:
if not value.strip():
raise ValueError("This field cannot be blank")
return value.strip().title()
Practical Example: Normalizing and Validating
from pydantic import BaseModel, field_validator
class User(BaseModel):
username: str
email: str
age: int
@field_validator("username", mode="before")
@classmethod
def normalize_username(cls, v: object) -> object:
if isinstance(v, str):
return v.strip().lower()
return v
@field_validator("email")
@classmethod
def validate_email_format(cls, v: str) -> str:
v = v.strip().lower()
if "@" not in v or "." not in v.split("@")[-1]:
raise ValueError("Invalid email format")
return v
@field_validator("age")
@classmethod
def validate_age_range(cls, v: int) -> int:
if v < 18:
raise ValueError("User must be at least 18 years old")
return v
user = User(username=" Alice ", email="[email protected]", age=25)
print(user.username) # alice
print(user.email) # [email protected]
Chapter 6: Model Validators
While @field_validator handles individual fields, @model_validator lets you inspect and validate the entire model at once. This is essential when the validity of one field depends on the value of another.
Using @model_validator with mode=“after”
The mode="after" variant receives the fully-constructed model instance after all fields have been validated:
from pydantic import BaseModel, model_validator
class Event(BaseModel):
start_date: str
end_date: str
@model_validator(mode="after")
def check_date_order(self) -> "Event":
if self.end_date < self.start_date:
raise ValueError("end_date must not be before start_date")
return self
The validator receives self (the model instance) and must return self after any modifications.
Using @model_validator with mode=“before”
mode="before" runs before any field parsing. It receives the raw input (typically a dict) and must return the data to be used for parsing. This is useful for preprocessing the entire input object:
from pydantic import BaseModel, model_validator
from typing import Any
class Config(BaseModel):
host: str
port: int
debug: bool
@model_validator(mode="before")
@classmethod
def set_defaults_from_env(cls, data: Any) -> Any:
if isinstance(data, dict):
# If debug is not set, default based on host
if "debug" not in data:
data["debug"] = data.get("host") == "localhost"
return data
config = Config(host="localhost", port=8000)
print(config.debug) # True, because host is "localhost"
Cross-Field Validation: A Real Example
from pydantic import BaseModel, model_validator
class PasswordReset(BaseModel):
password: str
confirm_password: str
@model_validator(mode="after")
def passwords_must_match(self) -> "PasswordReset":
if self.password != self.confirm_password:
raise ValueError("Passwords do not match")
return self
try:
reset = PasswordReset(password="secret123", confirm_password="wrongpass")
except Exception as e:
print(e)
# 1 validation error for PasswordReset
# Value error, Passwords do not match [type=value_error, ...]
Phase 3: Advanced Modeling
Chapter 7: Nested Models
Real-world data is rarely flat. You often need to represent structured data with embedded sub-documents. Pydantic handles this naturally by allowing models to be used as field types within other models.
Basic Nesting
from pydantic import BaseModel
class Address(BaseModel):
street: str
city: str
country: str
zip_code: str
class User(BaseModel):
name: str
email: str
address: Address
user = User(
name="Alice",
email="[email protected]",
address={
"street": "123 Main St",
"city": "Springfield",
"country": "USA",
"zip_code": "12345"
}
)
print(user.address.city) # Springfield
print(user.address.country) # USA
Notice that even though we passed a plain dictionary for address, Pydantic automatically parsed it into an Address instance. This is one of the most powerful aspects of nested models: the input format is flexible, but the output is always a proper object.
Lists of Nested Models
It is equally natural to have a field that contains a list of nested model instances:
from pydantic import BaseModel
class Item(BaseModel):
name: str
price: float
quantity: int
class Order(BaseModel):
order_id: str
customer: str
items: list[Item]
notes: str | None = None
order = Order(
order_id="ORD-001",
customer="Alice",
items=[
{"name": "Widget", "price": 9.99, "quantity": 2},
{"name": "Gadget", "price": 24.99, "quantity": 1},
]
)
for item in order.items:
print(f"{item.name}: ${item.price} x {item.quantity}")
# Widget: $9.99 x 2
# Gadget: $24.99 x 1
Deeply Nested Models
Nesting can go as deep as needed:
from pydantic import BaseModel
class GPS(BaseModel):
latitude: float
longitude: float
class Address(BaseModel):
street: str
city: str
gps: GPS | None = None
class Company(BaseModel):
name: str
headquarters: Address
company = Company(
name="TechCorp",
headquarters={
"street": "1 Infinite Loop",
"city": "Cupertino",
"gps": {"latitude": 37.3317, "longitude": -122.0302}
}
)
print(company.headquarters.gps.latitude) # 37.3317
Chapter 8: Complex Types
Pydantic integrates with Python’s full typing system, so you can use all standard collection types and special types in your models.
Lists, Dicts, Sets, and Tuples
Important — Mutable Defaults: In plain Python classes (and in dataclasses, where it is outright disallowed), a mutable default ([], {}, set()) is shared across all instances, so one instance’s data can bleed into another’s. Pydantic is safer here: it copies mutable defaults per instance, so tags: list[str] = [] does not share state between models. Even so, Field(default_factory=...) is the explicit, conventional way to declare mutable defaults, and it is required whenever the default must be computed fresh each time.
from pydantic import BaseModel, Field
class DataStore(BaseModel):
# A list of strings — use default_factory, not = []
tags: list[str] = Field(default_factory=list)
# A dictionary mapping strings to integers — use default_factory, not = {}
scores: dict[str, int] = Field(default_factory=dict)
# A set of unique strings — use default_factory, not = set()
permissions: set[str] = Field(default_factory=set)
# A fixed-size tuple: (x, y) coordinates — tuples are immutable, direct default is fine
coordinates: tuple[float, float] = (0.0, 0.0)
store = DataStore(
tags=["python", "pydantic", "python"], # duplicates allowed in list
scores={"alice": 95, "bob": 87},
permissions={"read", "write", "read"}, # duplicates removed in set
coordinates=(51.5074, -0.1278)
)
print(store.tags) # ['python', 'pydantic', 'python']
print(store.permissions) # {'read', 'write'}
print(store.coordinates) # (51.5074, -0.1278)
Union Types
A field that accepts more than one type:
from pydantic import BaseModel
class Response(BaseModel):
status: int
data: dict | list | str | None = None
r1 = Response(status=200, data={"key": "value"})
r2 = Response(status=200, data=["item1", "item2"])
r3 = Response(status=204) # data is None by default
Literal Types
Literal restricts a field to a specific set of allowed values:
from typing import Literal
from pydantic import BaseModel
class Task(BaseModel):
title: str
status: Literal["pending", "in_progress", "completed", "cancelled"]
priority: Literal[1, 2, 3] = 2 # Literals can be integers too
task = Task(title="Write tests", status="in_progress")
try:
bad_task = Task(title="Write tests", status="done")
except Exception as e:
print(e)
# status: Input should be 'pending', 'in_progress', 'completed' or 'cancelled'
Literal is particularly useful for status flags, modes, and any field where only specific values are semantically valid.
Enum Types
Python’s Enum integrates naturally with Pydantic and is often preferable to Literal when you need re-usable named constants:
from enum import Enum
from pydantic import BaseModel
class UserRole(str, Enum):
ADMIN = "admin"
EDITOR = "editor"
VIEWER = "viewer"
class User(BaseModel):
name: str
role: UserRole = UserRole.VIEWER
user1 = User(name="Alice", role="admin") # Pydantic accepts the string value
user2 = User(name="Bob", role=UserRole.EDITOR)
print(user1.role) # UserRole.ADMIN
print(user1.role.value) # admin
print(user2.role) # UserRole.EDITOR
By inheriting from both str and Enum, the enum values are also valid strings, which makes serialization to JSON seamless.
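For instance, serializing a model with a str-backed enum produces the plain string value directly:

```python
from enum import Enum
from pydantic import BaseModel

class UserRole(str, Enum):
    ADMIN = "admin"
    VIEWER = "viewer"

class User(BaseModel):
    name: str
    role: UserRole = UserRole.VIEWER

u = User(name="Alice", role="admin")
print(u.model_dump_json())  # {"name":"Alice","role":"admin"}

# Because UserRole subclasses str, the enum member compares equal to its value:
assert u.role == "admin"
```

Note that model_dump() (Python mode) still returns the enum member itself; only JSON serialization flattens it to the string.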
Chapter 9: Aliases and ORM Support
Field Aliases
Sometimes the names used in an external data source (a JSON API, a database column) do not match the Python naming conventions you want to use in your code. Field aliases solve this:
from pydantic import BaseModel, Field
class User(BaseModel):
user_id: int = Field(alias="userId")
full_name: str = Field(alias="fullName")
email_address: str = Field(alias="emailAddress")
# Parse using the alias names (from external source)
user = User.model_validate({"userId": 1, "fullName": "Alice", "emailAddress": "[email protected]"})
print(user.user_id) # 1
print(user.full_name) # Alice
The alias parameter tells Pydantic: “When reading input, look for this key name.” Inside your code, you still use the Python attribute name (user_id, not userId).
Validation Alias vs Serialization Alias
Pydantic v2 introduced separate aliases for input parsing and output serialization:
from pydantic import BaseModel, Field
class Product(BaseModel):
# Accept "product_name" when parsing, export as "name" when serializing
name: str = Field(validation_alias="product_name", serialization_alias="name")
price: float = Field(validation_alias="unit_price", serialization_alias="price")
product = Product.model_validate({"product_name": "Widget", "unit_price": 9.99})
print(product.name) # Widget
print(product.model_dump(by_alias=True))
# {'name': 'Widget', 'price': 9.99}
model_config
Model-level behavior is controlled through the model_config class attribute, which accepts a ConfigDict:
from pydantic import BaseModel, ConfigDict, Field
class User(BaseModel):
model_config = ConfigDict(populate_by_name=True)
user_id: int = Field(alias="userId")
full_name: str = Field(alias="fullName")
# With populate_by_name=True, both the alias and the Python name work
user1 = User.model_validate({"userId": 1, "fullName": "Alice"})
user2 = User(user_id=1, full_name="Alice") # Python name also works
print(user1.full_name) # Alice
print(user2.user_id) # 1
from_attributes (ORM Mode)
By default, Pydantic only parses dictionaries and similar mapping types. When working with ORM objects (like SQLAlchemy model instances), the data is stored as object attributes, not dictionary keys. The from_attributes configuration option enables this:
from pydantic import BaseModel, ConfigDict
class UserSchema(BaseModel):
model_config = ConfigDict(from_attributes=True)
id: int
name: str
email: str
# Simulating an ORM row object
class OrmUser:
def __init__(self, id: int, name: str, email: str) -> None:
self.id = id
self.name = name
self.email = email
orm_row = OrmUser(id=42, name="Alice", email="[email protected]")
# Parse from ORM object
user = UserSchema.model_validate(orm_row)
print(user.id) # 42
print(user.name) # Alice
With from_attributes=True, Pydantic reads data from object attributes instead of dictionary keys, which is the bridge between the ORM world and Pydantic schemas.
Phase 4: Serialization and Parsing
Chapter 10: Exporting Data
Once you have a populated Pydantic model, you often need to convert it back into a plain Python dictionary or a JSON string for storage or transmission.
model_dump()
model_dump() converts a model instance to a plain Python dictionary:
from pydantic import BaseModel
class Address(BaseModel):
street: str
city: str
class User(BaseModel):
name: str
age: int
address: Address
user = User(name="Alice", age=30, address=Address(street="123 Main St", city="NYC"))
print(user.model_dump())
# {'name': 'Alice', 'age': 30, 'address': {'street': '123 Main St', 'city': 'NYC'}}
Notice that nested models are also recursively converted to dictionaries.
Common options for model_dump():
# Include only specific fields
user.model_dump(include={"name", "age"})
# {'name': 'Alice', 'age': 30}
# Exclude specific fields
user.model_dump(exclude={"age"})
# {'name': 'Alice', 'address': {'street': '123 Main St', 'city': 'NYC'}}
# Exclude fields that are None
user.model_dump(exclude_none=True)
# Exclude fields that have their default value
user.model_dump(exclude_defaults=True)
# Use alias names as dictionary keys
user.model_dump(by_alias=True)
model_dump_json()
model_dump_json() serializes the model directly to a JSON string:
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
tags: list[str]
user = User(name="Alice", age=30, tags=["admin", "editor"])
json_str = user.model_dump_json()
print(json_str)
# {"name":"Alice","age":30,"tags":["admin","editor"]}
# Pretty-printed JSON
json_str_pretty = user.model_dump_json(indent=2)
print(json_str_pretty)
model_dump_json() is generally faster than calling json.dumps(model.model_dump()) because Pydantic’s Rust core handles the serialization directly.
Chapter 11: Parsing Data
model_validate()
model_validate() is the primary way to parse a dictionary into a model instance. It is equivalent to the constructor but more explicit:
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
data = {"name": "Alice", "age": 30}
user = User.model_validate(data)
print(user.name) # Alice
model_validate_json()
When your input is a JSON string (rather than an already-parsed dictionary), use model_validate_json():
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
tags: list[str]
json_string = '{"name": "Alice", "age": 30, "tags": ["admin", "editor"]}'
user = User.model_validate_json(json_string)
print(user.name) # Alice
print(user.tags) # ['admin', 'editor']
This is more efficient than User.model_validate(json.loads(json_string)) because Pydantic parses the JSON and validates in a single pass at the Rust level.
Parsing Lists of Models
A common pattern when working with APIs is receiving a list of objects. You can parse them using TypeAdapter:
from pydantic import BaseModel, TypeAdapter
class User(BaseModel):
name: str
age: int
adapter = TypeAdapter(list[User])
users = adapter.validate_python([
{"name": "Alice", "age": 30},
{"name": "Bob", "age": 25},
])
for user in users:
print(user.name)
# Alice
# Bob
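TypeAdapter can also parse a JSON array string directly (skipping a manual json.loads) and serialize the result back, via its validate_json and dump_json methods. A self-contained sketch:

```python
from pydantic import BaseModel, TypeAdapter

class User(BaseModel):
    name: str
    age: int

adapter = TypeAdapter(list[User])

# Parse a JSON array string straight into a list of User instances
users = adapter.validate_json('[{"name": "Alice", "age": 30}]')
print(users[0].name)  # Alice

# Serialize the list back to JSON (returned as bytes)
print(adapter.dump_json(users))  # b'[{"name":"Alice","age":30}]'
```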
Phase 5: Advanced Features
Chapter 12: Custom Data Types
When you find yourself writing the same @field_validator logic across many models — for example, validating that a string is a valid URL, or that an integer represents a valid Unix timestamp — you can extract that logic into a custom type that can be reused anywhere.
Creating a Custom Type with Annotated
The most flexible approach in Pydantic v2 is to combine Python’s Annotated with a custom AfterValidator or BeforeValidator:
from pydantic import BaseModel
from pydantic.functional_validators import AfterValidator
from typing import Annotated
def must_be_positive(v: float) -> float:
if v <= 0:
raise ValueError("Value must be positive")
return v
PositiveFloat = Annotated[float, AfterValidator(must_be_positive)]
class Measurement(BaseModel):
length: PositiveFloat
width: PositiveFloat
height: PositiveFloat
m = Measurement(length=10.0, width=5.0, height=3.0)
PositiveFloat is now a reusable type that you can drop anywhere in your models without duplicating the validator.
A More Complete Custom Type Example
from pydantic import BaseModel
from pydantic.functional_validators import AfterValidator, BeforeValidator
from typing import Annotated
def strip_whitespace(v: object) -> object:
if isinstance(v, str):
return v.strip()
return v
def must_not_be_empty(v: str) -> str:
if not v:
raise ValueError("String must not be empty")
return v
NonEmptyStr = Annotated[str, BeforeValidator(strip_whitespace), AfterValidator(must_not_be_empty)]
class Company(BaseModel):
name: NonEmptyStr
description: NonEmptyStr
company = Company(name=" TechCorp ", description=" We build things. ")
print(company.name) # TechCorp
print(company.description) # We build things.
Custom Types with Pydantic’s Built-in Types
Pydantic v2 also ships with a rich set of pre-built types in pydantic and pydantic.networks:
from pydantic import BaseModel, EmailStr, AnyHttpUrl, PositiveInt, NegativeFloat
class UserProfile(BaseModel):
email: EmailStr
website: AnyHttpUrl
followers: PositiveInt
balance: NegativeFloat | None = None
profile = UserProfile(
email="[email protected]",
website="https://alice.dev",
followers=1500
)
These built-in types come with their own validation logic already defined — you do not need to write it yourself. Note that EmailStr requires the optional email-validator dependency, installable with pip install "pydantic[email]".
Chapter 13: Computed Fields
Sometimes a model needs to expose derived data — values that are calculated from other fields rather than stored directly. Pydantic v2 provides @computed_field for this:
from pydantic import BaseModel, computed_field
class Rectangle(BaseModel):
width: float
height: float
@computed_field
@property
def area(self) -> float:
return self.width * self.height
@computed_field
@property
def perimeter(self) -> float:
return 2 * (self.width + self.height)
rect = Rectangle(width=5.0, height=3.0)
print(rect.area) # 15.0
print(rect.perimeter) # 16.0
print(rect.model_dump())
# {'width': 5.0, 'height': 3.0, 'area': 15.0, 'perimeter': 16.0}
Key behaviors of computed fields:
- Read-only: you cannot assign to a @computed_field property after the model is created.
- Included in serialization by default: model_dump() and model_dump_json() include computed fields automatically. You can exclude them selectively with model_dump(exclude={"area"}) or control them per-field with repr and other options.
Practical Use Case
from pydantic import BaseModel, computed_field, Field
class UserProfile(BaseModel):
first_name: str
last_name: str
age: int = Field(ge=0)
@computed_field
@property
def full_name(self) -> str:
return f"{self.first_name} {self.last_name}"
@computed_field
@property
def is_adult(self) -> bool:
return self.age >= 18
profile = UserProfile(first_name="Alice", last_name="Smith", age=25)
print(profile.full_name) # Alice Smith
print(profile.is_adult) # True
print(profile.model_dump())
# {'first_name': 'Alice', 'last_name': 'Smith', 'age': 25,
# 'full_name': 'Alice Smith', 'is_adult': True}
Chapter 14: Configuration Settings
The model_config attribute, populated with a ConfigDict, controls the global behavior of a model. We have seen a few options already. Here is a fuller tour of the most important settings.
strict Mode
By default, Pydantic operates in lax mode and coerces compatible types. With strict=True, automatic type conversion is disabled and exact type matching is enforced — the input must already be the exact expected type:
from pydantic import BaseModel, ConfigDict, ValidationError
class StrictUser(BaseModel):
model_config = ConfigDict(strict=True)
name: str
age: int
try:
user = StrictUser(name="Alice", age="30") # "30" is a string, not int
except ValidationError as e:
print(e)
# age: Input should be a valid integer [type=int_type, ...]
Note: Strict mode disables automatic type conversion (e.g., "30" → 30), but value constraints — such as those carried by PositiveInt or Field(gt=0) — are still enforced on correctly-typed input.
Use strict mode when you are confident about your input format (e.g., you control both the producer and consumer) and want to avoid any implicit type coercions.
frozen Models
With frozen=True, model instances become immutable after creation — like a named tuple. Attempting to modify a field raises a ValidationError:
```python
from pydantic import BaseModel, ConfigDict, ValidationError

class Config(BaseModel):
    model_config = ConfigDict(frozen=True)
    host: str
    port: int

config = Config(host="localhost", port=8080)
try:
    config.port = 9090  # This will raise an error
except ValidationError as e:
    print(e)
    # port: Instance is frozen [type=frozen_instance, ...]
```
Frozen models are also hashable (as long as all of their field values are hashable), which means they can be used as dictionary keys or stored in sets.
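For example, two frozen instances with equal field values hash equally and compare equal, so they collapse in a set. A small illustrative sketch (the `Endpoint` model is hypothetical):

```python
from pydantic import BaseModel, ConfigDict

class Endpoint(BaseModel):
    model_config = ConfigDict(frozen=True)
    host: str
    port: int

a = Endpoint(host="localhost", port=8080)
b = Endpoint(host="localhost", port=8080)

# Equal field values -> equal hashes, so duplicates collapse in a set
assert hash(a) == hash(b)
assert len({a, b}) == 1

# Frozen models can serve as dictionary keys
latency = {a: 12.5}
print(latency[b])  # 12.5
```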
extra Fields
The extra setting controls what happens when you pass fields that are not defined in the model:
```python
from pydantic import BaseModel, ConfigDict

class Strict(BaseModel):
    model_config = ConfigDict(extra="forbid")
    name: str

class Lenient(BaseModel):
    model_config = ConfigDict(extra="ignore")  # the default
    name: str

class Accepting(BaseModel):
    model_config = ConfigDict(extra="allow")
    name: str

# "forbid" raises a validation error for unknown fields
try:
    Strict(name="Alice", unknown_field="value")
except Exception as e:
    print(e)  # unknown_field: Extra inputs are not permitted

# "ignore" silently drops unknown fields
l = Lenient(name="Alice", unknown_field="value")
print(l.model_dump())  # {'name': 'Alice'}

# "allow" stores extra fields
a = Accepting(name="Alice", unknown_field="value")
print(a.model_dump())  # {'name': 'Alice', 'unknown_field': 'value'}
```
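With `extra="allow"`, the extra values are also reachable at runtime: they become regular attributes and are collected on the `model_extra` property. A short sketch:

```python
from pydantic import BaseModel, ConfigDict

class Accepting(BaseModel):
    model_config = ConfigDict(extra="allow")
    name: str

a = Accepting(name="Alice", unknown_field="value")

# Extra fields are accessible as attributes and gathered in model_extra
print(a.unknown_field)  # value
print(a.model_extra)    # {'unknown_field': 'value'}
```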
validate_assignment
By default, field values are only validated when the model is created. With validate_assignment=True, Pydantic also validates values whenever a field is reassigned:
```python
from pydantic import BaseModel, ConfigDict, ValidationError

class User(BaseModel):
    model_config = ConfigDict(validate_assignment=True)
    name: str
    age: int

user = User(name="Alice", age=30)
try:
    user.age = "not-a-number"
except ValidationError as e:
    print(e)
    # age: Input should be a valid integer
```
Phase 6: Performance and Best Practices
Chapter 15: When to Use Pydantic vs Plain Dicts
Pydantic is powerful, but it is not free. Every validation run consumes CPU time. Understanding when to use Pydantic and when plain Python structures are sufficient is important for writing efficient code.
Use Pydantic When:
- You are parsing external input (API requests, form data, configuration files, JSON payloads). This is its primary purpose.
- You need type coercion — converting “42” to 42 automatically.
- You need validation — enforcing constraints on data.
- You need self-documenting schemas — the model serves as code-as-documentation and drives API schema generation (e.g., FastAPI).
- You are modeling domain entities that have invariants (e.g., “start_date must be before end_date”).
Use Plain Dicts or Dataclasses When:
- You are working with internal data that you have already validated and you know its type.
- You are in a hot path where performance is critical and the data is already trusted.
- The data structure is simple and temporary — used for a single computation and discarded.
- You only need data storage without any validation or coercion logic.
Avoid Re-validating Already-Validated Data
```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

def process_user(user: User) -> None:
    ...  # downstream logic that trusts its input

# Raw data from an external source (sample payload, for illustration)
raw_api_response = {"name": "Alice", "age": 30}

# Efficient: validate once at the boundary
user = User.model_validate(raw_api_response)

# Now work with 'user' as a trusted object.
# Do NOT create a new User from user.model_dump() inside the same function.
# That would pay the validation cost twice for no benefit.
process_user(user)
```
Validate at the entry point of your application (or service layer), then pass validated model instances throughout. Avoid re-validating data that has already been validated.
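The same boundary-validation idea applies to non-model shapes such as lists of records. `TypeAdapter` (covered in the quick reference below) validates them once at the edge; the sample payload here is illustrative:

```python
from pydantic import BaseModel, TypeAdapter

class User(BaseModel):
    name: str
    age: int

# Build the adapter once and reuse it; it validates a plain list[User]
users_adapter = TypeAdapter(list[User])

raw = [{"name": "Alice", "age": "30"}, {"name": "Bob", "age": 25}]
users = users_adapter.validate_python(raw)

print(users[0].age)   # 30 (coerced from "30")
print(users[1].name)  # Bob
```

Constructing a `TypeAdapter` has a one-time cost, so keep it at module level rather than rebuilding it per call.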
Chapter 16: Designing Scalable Schemas
As your application grows, keeping your Pydantic models organized and maintainable requires deliberate design choices.
Separate Input and Output Schemas
It is a common and recommended practice to maintain separate schema classes for reading input (from a client or external source) and writing output (returning data to a client):
```python
from datetime import datetime
from pydantic import BaseModel, Field

# Used when creating a user (what the client sends)
class UserCreate(BaseModel):
    name: str = Field(min_length=2, max_length=50)
    email: str
    password: str = Field(min_length=8)

# Used when returning user data (what the API responds with) — no password
class UserRead(BaseModel):
    id: int
    name: str
    email: str
    created_at: datetime

# Used when updating a user (all fields optional)
class UserUpdate(BaseModel):
    name: str | None = None
    email: str | None = None
```
This pattern (sometimes called “schema-per-operation”) prevents accidental exposure of sensitive fields and gives each operation a clear, unambiguous contract.
Use Inheritance to Share Common Fields
```python
from datetime import datetime
from pydantic import BaseModel, Field

class TimestampMixin(BaseModel):
    created_at: datetime | None = None
    updated_at: datetime | None = None

class UserBase(BaseModel):
    name: str
    email: str

class UserCreate(UserBase):
    password: str = Field(min_length=8)

class UserRead(UserBase, TimestampMixin):
    id: int
```
Inheritance is a clean way to avoid duplicating common fields like timestamps, IDs, or audit metadata across many models.
Keep Models Focused
Avoid creating “god models” that contain every possible field for every possible operation. A model should represent one clear concept in one clear context. This makes code easier to reason about, test, and document.
Bonus Section
Chapter 17: Pydantic Settings
Pydantic provides a companion package, pydantic-settings, designed specifically for loading and validating application configuration from environment variables, .env files, and other sources.
Install it separately:
```bash
pip install pydantic-settings
```
Basic Usage
```python
from pydantic_settings import BaseSettings

class AppConfig(BaseSettings):
    database_url: str
    secret_key: str
    debug: bool = False
    port: int = 8000

# Pydantic reads these automatically from environment variables
config = AppConfig()
print(config.port)   # from PORT env var, or 8000 if not set
print(config.debug)  # from DEBUG env var, or False if not set
```
Environment variable names are matched case-insensitively. If your model has a field database_url, it reads the DATABASE_URL environment variable.
Reading from a .env File
```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class AppConfig(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")
    database_url: str
    secret_key: str
    debug: bool = False
    port: int = 8000
```
With env_file=".env", Pydantic automatically reads from the specified file in addition to environment variables. Environment variables take precedence over .env file values.
Chapter 18: Integration with Databases (ORM Mapping)
The most common pattern for integrating Pydantic with a database ORM (like SQLAlchemy) involves a clear separation between the ORM model and the Pydantic schema.
The ORM model defines the database table structure. The Pydantic schema defines the API contract. The bridge is from_attributes=True.
```python
from datetime import datetime
from pydantic import BaseModel, ConfigDict

# --- SQLAlchemy ORM model (simplified) ---
# class UserOrm(Base):
#     __tablename__ = "users"
#     id = Column(Integer, primary_key=True)
#     name = Column(String)
#     email = Column(String, unique=True)
#     created_at = Column(DateTime, default=datetime.utcnow)

# --- Pydantic schema ---
class UserRead(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    name: str
    email: str
    created_at: datetime

# --- In a FastAPI route ---
# @app.get("/users/{user_id}", response_model=UserRead)
# def get_user(user_id: int, db: Session = Depends(get_db)):
#     user_orm = db.query(UserOrm).filter(UserOrm.id == user_id).first()
#     return UserRead.model_validate(user_orm)  # ORM object → Pydantic schema
```
The key line is UserRead.model_validate(user_orm). Because from_attributes=True is set, Pydantic reads the attributes from the SQLAlchemy object rather than expecting a dictionary.
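The mechanism is not tied to SQLAlchemy: `from_attributes=True` reads attributes from any object. A minimal self-contained sketch using a plain class in place of an ORM row (the `UserRecord` class is a stand-in):

```python
from datetime import datetime
from pydantic import BaseModel, ConfigDict

class UserRecord:  # stands in for a SQLAlchemy row object
    def __init__(self) -> None:
        self.id = 1
        self.name = "Alice"
        self.email = "alice@example.com"
        self.created_at = datetime(2024, 1, 1)

class UserRead(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    name: str
    email: str
    created_at: datetime

# Pydantic pulls each field from the object's attributes
user = UserRead.model_validate(UserRecord())
print(user.name)  # Alice
```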
Chapter 19: Testing Pydantic Models
Pydantic models are straightforward to test because validation is deterministic and eager — errors appear immediately when you try to create an instance with invalid data.
Testing Valid Data
```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(min_length=2, max_length=50)
    age: int = Field(ge=0, le=120)

def test_valid_user():
    user = User(name="Alice", age=30)
    assert user.name == "Alice"
    assert user.age == 30

def test_default_values():
    class Config(BaseModel):
        host: str = "localhost"
        port: int = 8080

    config = Config()
    assert config.host == "localhost"
    assert config.port == 8080
```
Testing Invalid Data
```python
import pytest
from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    name: str = Field(min_length=2)
    age: int = Field(ge=0)

def test_name_too_short():
    with pytest.raises(ValidationError) as exc_info:
        User(name="A", age=30)
    errors = exc_info.value.errors()
    assert any(e["loc"] == ("name",) for e in errors)

def test_negative_age():
    with pytest.raises(ValidationError) as exc_info:
        User(name="Alice", age=-1)
    errors = exc_info.value.errors()
    assert any(e["loc"] == ("age",) for e in errors)

def test_missing_required_field():
    with pytest.raises(ValidationError):
        User(name="Alice")  # age is missing
```
Testing Serialization
```python
import json
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str

class User(BaseModel):
    name: str
    address: Address

def test_model_dump():
    user = User(name="Alice", address=Address(street="123 Main St", city="NYC"))
    data = user.model_dump()
    assert data == {
        "name": "Alice",
        "address": {"street": "123 Main St", "city": "NYC"},
    }

def test_model_dump_json():
    user = User(name="Alice", address=Address(street="123 Main St", city="NYC"))
    json_str = user.model_dump_json()
    parsed = json.loads(json_str)
    assert parsed["name"] == "Alice"
    assert parsed["address"]["city"] == "NYC"
```
Quick Reference Table
| Feature | Import | Purpose |
|---|---|---|
| `BaseModel` | `from pydantic import BaseModel` | Define a data model |
| `Field()` | `from pydantic import Field` | Add constraints and metadata to a field |
| `field_validator` | `from pydantic import field_validator` | Custom per-field validation logic |
| `model_validator` | `from pydantic import model_validator` | Cross-field or whole-model validation |
| `computed_field` | `from pydantic import computed_field` | Derived, read-only properties |
| `ConfigDict` | `from pydantic import ConfigDict` | Model-level configuration |
| `ValidationError` | `from pydantic import ValidationError` | Exception raised on validation failure |
| `TypeAdapter` | `from pydantic import TypeAdapter` | Validate non-model types (e.g., `list[User]`) |
| `AfterValidator` | `from pydantic.functional_validators import AfterValidator` | Build custom reusable types |
| `BeforeValidator` | `from pydantic.functional_validators import BeforeValidator` | Pre-parse normalization in custom types |
| `EmailStr` | `from pydantic import EmailStr` | Built-in email validation type |
| `AnyHttpUrl` | `from pydantic import AnyHttpUrl` | Built-in URL validation type |
| `PositiveInt` | `from pydantic import PositiveInt` | Integer > 0 |
| `BaseSettings` | `from pydantic_settings import BaseSettings` | Load config from env vars / `.env` files |
| `model_dump()` | method on model instance | Serialize to dict |
| `model_dump_json()` | method on model instance | Serialize to JSON string |
| `model_validate()` | classmethod on model | Parse dict into model |
| `model_validate_json()` | classmethod on model | Parse JSON string into model |