Pydantic

Last updated: Mar 24, 2026

Rationale

We use Pydantic as our default library for data modeling, validation, and serialization across our Python components.

It is open source.
It defines data models using plain Python classes and type hints, requiring no separate schema language or configuration format.
It validates data at runtime, catching invalid inputs at the boundary of the system before they propagate further.
It supports complex validation scenarios, including field-level constraints, cross-field dependencies, and custom validator functions.
It provides first-class serialization and deserialization to and from JSON, with fine-grained control over field aliasing and output shape.
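Its v2 implementation is backed by a Rust core (pydantic-core), making validation and serialization significantly faster than in pure-Python alternatives.
It has a very large community and is one of the most downloaded Python packages, so it is actively maintained and well-supported.
It integrates naturally with modern Python tooling, including static type checkers such as mypy and Pyright, IDE autocompletion, and frameworks such as FastAPI.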
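The points above about type-hint-driven models, field constraints, cross-field validation, and aliased JSON output can be illustrated with a minimal sketch. It assumes Pydantic v2; the ScanTarget model, its field names, and its aliases are hypothetical and exist only for this example.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError, model_validator


class ScanTarget(BaseModel):
    # Hypothetical model, for illustration only.
    # populate_by_name lets callers use either the field name or its alias.
    model_config = ConfigDict(populate_by_name=True)

    name: str = Field(min_length=1)  # field-level constraint
    min_severity: int = Field(default=0, ge=0, le=10, alias="minSeverity")
    max_severity: int = Field(default=10, ge=0, le=10, alias="maxSeverity")

    @model_validator(mode="after")
    def check_range(self) -> "ScanTarget":
        # Cross-field dependency: runs after the individual fields are validated.
        if self.min_severity > self.max_severity:
            raise ValueError("minSeverity must not exceed maxSeverity")
        return self


# Deserialize from JSON using the aliases, then serialize back with them.
target = ScanTarget.model_validate_json(
    '{"name": "api", "minSeverity": 3, "maxSeverity": 7}'
)
as_json = target.model_dump_json(by_alias=True)

# Invalid input is rejected at the boundary with a structured error.
try:
    ScanTarget.model_validate({"name": "api", "minSeverity": 8, "maxSeverity": 2})
    cross_field_error = None
except ValidationError as exc:
    cross_field_error = exc.errors()[0]["msg"]
```

The model is a plain class with annotations, and the same definition drives validation, parsing, and output, which is the main reason no separate schema is needed.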

Alternatives

dataclasses

dataclasses is a standard library module for defining simple data containers using type annotations.

It is part of the Python standard library, requiring no additional dependency.
It does not perform any runtime validation; fields accept any value regardless of the annotated type.
It has no built-in JSON serialization; dataclasses.asdict only converts an instance to a plain dict.
It does not support field constraints, custom validators, or cross-field validation out of the box.
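The lack of runtime validation can be seen in a short sketch that uses only the standard library. The Endpoint class is a hypothetical example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Endpoint:
    # Hypothetical container, for illustration only.
    host: str
    port: int


# The annotation says int, but no check runs: the string is stored unchanged.
endpoint = Endpoint(host="localhost", port="not-a-port")
stored_type = type(endpoint.port).__name__

# Serialization is a manual step: convert to a dict, then call json.dumps.
payload = json.dumps(asdict(endpoint))
```

Any validation would have to be hand-written, for example in __post_init__, which is the work Pydantic does from the annotations alone.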

attrs

attrs is a mature library for writing concise and correct Python classes.

It is open source.
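It supports validators and converters, but they must be attached to each field explicitly, which requires more boilerplate than Pydantic's type-annotation-driven approach.
It does not provide native JSON serialization or deserialization; this requires manual conversion or the companion cattrs library.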
Its API is less intuitive than Pydantic's for developers already familiar with Python type hints.
It has a smaller community than Pydantic.

marshmallow

marshmallow is an object serialization and deserialization library for Python.

It is open source.
It defines schemas separately from the data classes themselves, which increases verbosity and introduces duplication.
It supports custom validation and field transformations, but requires more manual wiring than Pydantic.
It does not use Python type hints as its primary interface, reducing integration with static analysis tools.
It is slower than Pydantic v2 in most benchmarks.
It has a smaller community than Pydantic.

confuse

confuse is a configuration library for Python applications, focused on loading and validating settings from YAML files, environment variables, and command-line arguments.

It is open source.
It is purpose-built for application configuration rather than general data modeling, making it a natural fit for managing settings loaded from multiple sources with a defined priority order.
It supports YAML-based configuration files out of the box, which is a common format for developer-facing settings.
It provides a layered configuration system where environment variables and command-line overrides can take precedence over file-based defaults.
It does not use Python type hints as its primary interface, reducing integration with static analysis tools.
Its validation API, based on templates, is more limited than Pydantic's: it has no built-in support for cross-field constraints, and custom checks require writing custom template classes.
It does not provide serialization support.
It has a significantly smaller community and lower adoption than Pydantic.

Usage

We use Pydantic across our Python components for:

Defining typed data models for structured data such as configuration files, vulnerability schemas, and API payloads.
Validating user and external inputs at system boundaries, including CLI arguments and data ingested from third-party sources.
Managing typed application settings loaded from environment variables via pydantic-settings.
Serializing internal models to standards-compliant output formats such as JSON and SARIF (which is JSON-based), and to YAML through a separate YAML library.