pandera.decorators.check_input
- pandera.decorators.check_input(schema, obj_getter=None, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)
Validate a function argument when the function is called.
This decorator validates a dataframe or series argument of the decorated function against the given schema.
- Parameters
schema (Union[DataFrameSchema, SeriesSchema]) – dataframe/series schema object.
obj_getter (Union[str, int, None]) – (Default value = None) if int, obj_getter refers to the index of the pandas dataframe/series to be validated in the args part of the function signature. If str, obj_getter refers to the argument name of the pandas dataframe/series in the function signature. This works even if the series/dataframe is passed in as a positional argument when the function is called. If None, assumes that the dataframe/series is the first argument of the decorated function.
head (Optional[int]) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail (Optional[int]) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample (Optional[int]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state (Optional[int]) – random seed for the sample argument.
lazy (bool) – if True, lazily evaluates the dataframe against all validation checks and raises a SchemaErrors. Otherwise, raises SchemaError as soon as one occurs.
inplace (bool) – if True, applies coercion to the object of validation, otherwise creates a copy of the data.
- Return type
Callable[[~F], ~F]
- Returns
wrapped function
- Example
Check the input of a decorated function.
>>> import pandas as pd
>>> import pandera as pa
>>>
>>> schema = pa.DataFrameSchema({"column": pa.Column(int)})
>>>
>>> @pa.check_input(schema)
... def transform_data(df: pd.DataFrame) -> pd.DataFrame:
...     df["doubled_column"] = df["column"] * 2
...     return df
>>>
>>> df = pd.DataFrame({
...     "column": range(5),
... })
>>>
>>> transform_data(df)
   column  doubled_column
0       0               0
1       1               2
2       2               4
3       3               6
4       4               8
See here for more usage details.