pandera.decorators.check_output
- pandera.decorators.check_output(schema, obj_getter=None, head=None, tail=None, sample=None, random_state=None, lazy=False, inplace=False)
Validate function output.
Similar to the input validator (check_input), but validates the output of the decorated function instead of its arguments.
- Parameters
schema (Union[DataFrameSchema, SeriesSchema]) – dataframe/series schema object.
obj_getter (Union[str, int, Callable, None]) – (Default value = None) if int, assumes that the output of the decorated function is a list-like object, where obj_getter is the index of the pandas dataframe/series to be validated. If str, expects that the output is a dict-like object, and obj_getter is the key pointing to the dataframe/series to be validated. If a callable is supplied, it receives the output of the decorated function and should return the dataframe/series to be validated.
head (Optional[int]) – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail (Optional[int]) – validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample (Optional[int]) – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state (Optional[int]) – random seed for the sample argument.
lazy (bool) – if True, lazily evaluates the dataframe against all validation checks and raises a SchemaErrors. Otherwise, raises SchemaError as soon as one occurs.
inplace (bool) – if True, applies coercion to the object of validation, otherwise creates a copy of the data.
- Return type
Callable[[~F], ~F]
- Returns
wrapped function
- Example
Check the output of a decorated function.
>>> import pandas as pd
>>> import pandera as pa
>>>
>>>
>>> schema = pa.DataFrameSchema(
...     columns={"doubled_column": pa.Column(int)},
...     checks=pa.Check(
...         lambda df: df["doubled_column"] == df["column"] * 2
...     )
... )
>>>
>>> @pa.check_output(schema)
... def transform_data(df: pd.DataFrame) -> pd.DataFrame:
...     df["doubled_column"] = df["column"] * 2
...     return df
>>>
>>> df = pd.DataFrame({"column": range(5)})
>>>
>>> transform_data(df)
   column  doubled_column
0       0               0
1       1               2
2       2               4
3       3               6
4       4               8
See the pandera documentation on decorators for more usage details.