
new in 0.8.0

Using Pandera Schemas in Pydantic Models

DataFrameModel is fully compatible with pydantic. You can specify a DataFrameModel in a pydantic BaseModel as you would any other field:

import pandas as pd
import pandera as pa
from pandera.typing import DataFrame, Series
import pydantic

class SimpleSchema(pa.DataFrameModel):
    str_col: Series[str] = pa.Field(unique=True)

class PydanticModel(pydantic.BaseModel):
    x: int
    df: DataFrame[SimpleSchema]

valid_df = pd.DataFrame({"str_col": ["hello", "world"]})
PydanticModel(x=1, df=valid_df)

invalid_df = pd.DataFrame({"str_col": ["hello", "hello"]})
PydanticModel(x=1, df=invalid_df)
ValidationError                           Traceback (most recent call last)
Cell In[1], line 20
     17 PydanticModel(x=1, df=valid_df)
     19 invalid_df = pd.DataFrame({"str_col": ["hello", "hello"]})
---> 20 PydanticModel(x=1, df=invalid_df)

File ~/checkouts/, in BaseModel.__init__(self, **data)
    174 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    175 __tracebackhide__ = True
--> 176 self.__pydantic_validator__.validate_python(data, self_instance=self)

ValidationError: 1 validation error for PydanticModel
  Value error, series 'str_col' contains duplicate values:
0    hello
1    hello
Name: str_col, dtype: object [type=value_error, input_value=  str_col
0   hello
1   hello, input_type=DataFrame]
    For further information visit

Other pandera components are also compatible with pydantic, including SeriesSchema, DataFrameSchema and the schema_components types.

Note that these types validate the type of the schema object itself, for example when your pydantic BaseModel contains a schema object as a field rather than a pandas object.

Using Pydantic Models in Pandera Schemas

new in 0.10.0

You can also use a pydantic BaseModel in a pandera schema. Suppose you had a Record model:

from pydantic import BaseModel

import pandera as pa

class Record(BaseModel):
    name: str
    xcoord: int
    ycoord: int

The PydanticModel datatype enables you to specify the Record model as a row-wise type.

import pandas as pd
from pandera.engines.pandas_engine import PydanticModel

class PydanticSchema(pa.DataFrameModel):
    """Pandera schema using the pydantic model."""

    class Config:
        """Config with dataframe-level data type."""

        dtype = PydanticModel(Record)
        coerce = True  # this is required, otherwise a SchemaInitError is raised


When you combine dtype=PydanticModel(...) with coerce=True, pandera applies the pydantic model validation process to each row of the dataframe and converts each validated model back to a dictionary with the BaseModel.dict() method (named model_dump() in pydantic v2).

The equivalent pandera schema would look like this:

class PanderaSchema(pa.DataFrameModel):
    """Pandera schema that's equivalent to PydanticSchema."""

    name: pa.typing.Series[str]
    xcoord: pa.typing.Series[int]
    ycoord: pa.typing.Series[int]


Since the PydanticModel datatype applies the BaseModel constructor to each row of the dataframe, using PydanticModel might not scale well with larger datasets.

If you want to help benchmark this, consider contributing a benchmark script.