pydantic

Current Version: v0.14

Data validation and settings management using python type hinting.

Define how data should be in pure, canonical python; validate it with pydantic.

PEP 484 introduced type hinting into python 3.5; PEP 526 extended that with syntax for variable annotation in python 3.6.

pydantic uses those annotations to validate that untrusted data takes the form you want.

There’s also support for an extension to dataclasses where the input data is validated.

Example:

from datetime import datetime
from typing import List
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None
    friends: List[int] = []

external_data = {'id': '123', 'signup_ts': '2017-06-01 12:22', 'friends': [1, '2', b'3']}
user = User(**external_data)
print(user)
# > User id=123 name='John Doe' signup_ts=datetime.datetime(2017, 6, 1, 12, 22) friends=[1, 2, 3]
print(user.id)
# > 123

(This script is complete, it should run “as is”)

What’s going on here:

  • id is of type int; the annotation only declaration tells pydantic that this field is required. Strings, bytes or floats will be coerced to ints if possible, otherwise an exception would be raised.
  • name is inferred as a string from the default, it is not required as it has a default.
  • signup_ts is a datetime field which is not required (None if it’s not supplied), pydantic will process either a unix timestamp int (e.g. 1496498400) or a string representing the date & time.
  • friends uses python’s typing system, it is required to be a list of integers, as with id integer-like objects will be converted to integers.
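
The difference between required and optional fields described above rests on ordinary python behaviour, which can be inspected without pydantic. What follows is a minimal sketch using only the standard library. UserSpec, required and coerced are illustrative names, and this is not a description of how pydantic is implemented internally.

```python
from datetime import datetime
from typing import List


class UserSpec:
    id: int                     # annotation only: no class attribute is created
    name = 'John Doe'           # default only: nothing is recorded in the annotations
    signup_ts: datetime = None  # annotation and default
    friends: List[int] = []     # annotation and default


hints = UserSpec.__annotations__

# An annotated name with no class attribute has no default, so it must be supplied.
required = [field for field in hints if not hasattr(UserSpec, field)]
print(required)

# The built-in int() accepts ints, numeric strings and bytes, which is the kind of
# conversion described for id and friends.
coerced = [int(item) for item in [1, '2', b'3']]
print(coerced)
```

Note that name does not appear in the annotations at all, which is why its type has to be inferred from the default value.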

If validation fails pydantic will raise an error with a breakdown of what was wrong:

from pydantic import ValidationError
try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except ValidationError as e:
    print(e.json())

"""
[
  {
    "loc": [
      "id"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": [
      "signup_ts"
    ],
    "msg": "invalid datetime format",
    "type": "type_error.datetime"
  },
  {
    "loc": [
      "friends",
      2
    ],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  }
]
"""

Rationale

So pydantic uses some cool new language feature, but why should I actually go and use it?

no brainfuck
no new schema definition micro-language to learn. If you know python (and perhaps skim read the type hinting docs) you know how to use pydantic.
plays nicely with your IDE/linter/brain
because pydantic data structures are just instances of classes you define; auto-completion, linting, mypy and your intuition should all work properly with your validated data.
dual use
pydantic’s BaseSettings class allows it to be used in both a “validate this request data” context and “load my system settings” context. The main difference being that system settings can have defaults changed by environment variables and more complex objects like DSNs and python objects are often required.
fast
In benchmarks pydantic is faster than all other tested libraries.
validate complex structures
use of recursive pydantic models, typing’s List and Dict etc. and validators allow complex data schemas to be clearly and easily defined and then checked.
extendible
pydantic allows custom data types to be defined or you can extend validation with methods on a model decorated with the validator decorator.

Install

Just:

pip install pydantic

pydantic has no required dependencies except python 3.6 or 3.7 (and the dataclasses package in python 3.6). If you’ve got python 3.6 and pip installed - you’re good to go.

If you want pydantic to parse json faster you can add ujson as an optional dependency. Similarly, pydantic’s email validation relies on email-validator:

pip install pydantic[ujson]
# or
pip install pydantic[email]
# or just
pip install pydantic[ujson,email]

Of course you can also install these requirements manually with pip install ....

Usage

PEP 484 Types

pydantic uses typing types to define more complex objects.

from typing import Dict, List, Optional, Set, Tuple, Union

from pydantic import BaseModel


class Model(BaseModel):
    simple_list: list = None
    list_of_ints: List[int] = None

    simple_tuple: tuple = None
    tuple_of_different_types: Tuple[int, float, str, bool] = None

    simple_dict: dict = None
    dict_str_float: Dict[str, float] = None

    simple_set: set = None
    set_bytes: Set[bytes] = None

    str_or_bytes: Union[str, bytes] = None
    none_or_str: Optional[str] = None

    compound: Dict[Union[str, bytes], List[Set[int]]] = None

print(Model(simple_list=['1', '2', '3']).simple_list)  # > ['1', '2', '3']
print(Model(list_of_ints=['1', '2', '3']).list_of_ints)  # > [1, 2, 3]

print(Model(simple_dict={'a': 1, b'b': 2}).simple_dict)  # > {'a': 1, b'b': 2}
print(Model(dict_str_float={'a': 1, b'b': 2}).dict_str_float)  # > {'a': 1.0, 'b': 2.0}

print(Model(simple_tuple=[1, 2, 3, 4]).simple_tuple)  # > (1, 2, 3, 4)
print(Model(tuple_of_different_types=[1, 2, 3, 4]).tuple_of_different_types)  # > (1, 2.0, '3', True)

(This script is complete, it should run “as is”)
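
To see why a nested annotation such as compound carries enough information to validate every level, it can help to take one apart by hand. The sketch below uses typing.get_origin and typing.get_args. These helpers were added in python 3.8, later than the versions this document targets, so they are shown only to illustrate the structure of the annotation and not as the mechanism pydantic uses.

```python
from typing import Dict, List, Set, Union, get_args, get_origin

compound = Dict[Union[str, bytes], List[Set[int]]]

# The outermost layer is a plain dict
print(get_origin(compound))

# Its arguments are the key type and the value type
key_type, value_type = get_args(compound)
print(get_args(key_type))      # the alternatives allowed for keys
print(get_origin(value_type))  # values are lists ...

(item_type,) = get_args(value_type)
print(get_origin(item_type), get_args(item_type))  # ... of sets of ints
```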

dataclasses

Note

New in version v0.14.0.

If you don’t want to use pydantic’s BaseModel you can instead get the same data validation on standard dataclasses (introduced in python 3.7).

Dataclasses work in python 3.6 using the dataclasses backport package.

from datetime import datetime
from pydantic.dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None


user = User(id='42', signup_ts='2032-06-21T12:00')
print(user)
# > User(id=42, name='John Doe', signup_ts=datetime.datetime(2032, 6, 21, 12, 0))

(This script is complete, it should run “as is”)

You can use all the standard pydantic field types and the resulting dataclass will be identical to the one created by the standard library dataclass decorator.

pydantic.dataclasses.dataclass’s arguments are the same as the standard decorator, except one extra key word argument validate_assignment which has the same meaning as Config.validate_assignment.
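
For comparison, here is a short sketch of what the standard library decorator does with the same input. It uses only the standard library, and PlainUser is an illustrative name. The point is that a plain dataclass records the annotations but never checks or converts the values, which is the gap the pydantic decorator fills.

```python
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass
class PlainUser:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None


plain = PlainUser(id='42', signup_ts='2032-06-21T12:00')

# Both values are stored exactly as passed: still strings, not int and datetime
print(repr(plain.id), repr(plain.signup_ts))

# The field definitions themselves are available for introspection
print([(f.name, f.type) for f in fields(PlainUser)])
```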

Currently validators don’t work on dataclasses, if it’s something you want please create an issue on github.

Choices

pydantic uses python’s standard enum classes to define choices.

from enum import Enum, IntEnum

from pydantic import BaseModel


class FruitEnum(str, Enum):
    pear = 'pear'
    banana = 'banana'


class ToolEnum(IntEnum):
    spanner = 1
    wrench = 2


class CookingModel(BaseModel):
    fruit: FruitEnum = FruitEnum.pear
    tool: ToolEnum = ToolEnum.spanner


print(CookingModel())
# > CookingModel fruit=<FruitEnum.pear: 'pear'> tool=<ToolEnum.spanner: 1>
print(CookingModel(tool=2, fruit='banana'))
# > CookingModel fruit=<FruitEnum.banana: 'banana'> tool=<ToolEnum.wrench: 2>
print(CookingModel(fruit='other'))
# will raise a validation error

(This script is complete, it should run “as is”)
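
The conversions in the example above (2 to ToolEnum.wrench, 'banana' to FruitEnum.banana) match what python’s enum classes do when called with a value. The sketch below shows that lookup using only the standard library. That pydantic relies on exactly this call is an assumption, not something stated in this document.

```python
from enum import Enum, IntEnum


class FruitEnum(str, Enum):
    pear = 'pear'
    banana = 'banana'


class ToolEnum(IntEnum):
    spanner = 1
    wrench = 2


# Calling the enum class with a value returns the matching member
print(FruitEnum('banana') is FruitEnum.banana)
print(ToolEnum(2) is ToolEnum.wrench)

# Mixing in str means members also compare equal to their raw value
print(FruitEnum.pear == 'pear')

# An unknown value raises ValueError
try:
    FruitEnum('other')
except ValueError:
    rejected = True
else:
    rejected = False
print(rejected)
```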

Validators

Custom validation and complex relationships between objects can be achieved using the validator decorator.

from pydantic import BaseModel, ValidationError, validator


class UserModel(BaseModel):
    name: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v


print(UserModel(name='samuel colvin', password1='zxcvbn', password2='zxcvbn'))
# > UserModel name='Samuel Colvin' password1='zxcvbn' password2='zxcvbn'

try:
    UserModel(name='samuel', password1='zxcvbn', password2='zxcvbn2')
except ValidationError as e:
    print(e)
"""
2 validation errors
name
  must contain a space (type=value_error)
password2
  passwords do not match (type=value_error)
"""

(This script is complete, it should run “as is”)

A few things to note on validators:

  • validators are “class methods”, the first value they receive here will be the UserModel not an instance of UserModel
  • their signature can be (cls, value) or (cls, value, *, values, config, field)
  • validator should either return the new value or raise a ValueError or TypeError
  • where validators rely on other values, you should be aware that:
    • Validation is done in the order fields are defined, eg. here password2 has access to password1 (and name), but password1 does not have access to password2. You should heed the warning below regarding field order and required fields.
    • If validation fails on another field (or that field is missing) it will not be included in values, hence if 'password1' in values and ... in this example.
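
The two points about ordering can be made concrete with a small simulation. This is a sketch in plain python, not pydantic’s implementation: validate_in_order, non_empty and checks are hypothetical names, and each check receives the value plus the values validated so far.

```python
def validate_in_order(data, checks):
    """Run one check per field, in definition order (hypothetical helper)."""
    values, errors = {}, {}
    for name, check in checks.items():
        if name not in data:
            errors[name] = 'field required'
            continue
        try:
            values[name] = check(data[name], values)
        except (ValueError, TypeError) as exc:
            errors[name] = str(exc)  # a failed field never reaches `values`
    return values, errors


def non_empty(v, values):
    if not v:
        raise ValueError('must not be empty')
    return v


def passwords_match(v, values):
    # password1 is only present in `values` if it validated successfully
    if 'password1' in values and v != values['password1']:
        raise ValueError('passwords do not match')
    return v


checks = {'password1': non_empty, 'password2': passwords_match}

mismatch = validate_in_order({'password1': 'zxcvbn', 'password2': 'zxcvbn2'}, checks)
print(mismatch)

missing = validate_in_order({'password2': 'zxcvbn2'}, checks)
print(missing)
```

In the second call password1 is missing, so it is absent from values and the comparison is skipped. Only the missing field is reported, which is the reason for the 'password1' in values guard.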

Pre and Whole Validators

Validators can do a few more complex things:

import json
from typing import List

from pydantic import BaseModel, ValidationError, validator


class DemoModel(BaseModel):
    numbers: List[int] = []
    people: List[str] = []

    @validator('people', 'numbers', pre=True, whole=True)
    def json_decode(cls, v):
        if isinstance(v, str):
            try:
                return json.loads(v)
            except ValueError:
                pass
        return v

    @validator('numbers')
    def check_numbers_low(cls, v):
        if v > 4:
            raise ValueError(f'number too large {v} > 4')
        return v

    @validator('numbers', whole=True)
    def check_sum_numbers_low(cls, v):
        if sum(v) > 8:
            raise ValueError('sum of numbers greater than 8')
        return v


print(DemoModel(numbers='[1, 1, 2, 2]'))
# > DemoModel numbers=[1, 1, 2, 2] people=[]

try:
    DemoModel(numbers='[1, 2, 5]')
except ValidationError as e:
    print(e)
"""
1 validation error
numbers -> 2
  number too large 5 > 4 (type=value_error)
"""

try:
    DemoModel(numbers=[3, 3, 3])
except ValidationError as e:
    print(e)
"""
1 validation error
numbers
  sum of numbers greater than 8 (type=value_error)
"""

(This script is complete, it should run “as is”)

A few more things to note:

  • a single validator can apply to multiple fields, either by defining multiple fields or by the special value '*' which means that validator will be called for all fields.
  • the keyword argument pre will cause validators to be called prior to other validation
  • the whole keyword argument will mean validators are applied to entire objects rather than individual values (applies for complex typing objects eg. List, Dict, Set)

Validate Always

For performance reasons by default validators are not called for fields where the value is not supplied. However there are situations where it’s useful or required to always call the validator, e.g. to set a dynamic default value.

from datetime import datetime

from pydantic import BaseModel, validator


class DemoModel(BaseModel):
    ts: datetime = None

    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()


print(DemoModel())
# > DemoModel ts=datetime.datetime(2017, 11, 8, 13, 59, 11, 723629)

print(DemoModel(ts='2017-11-08T14:00'))
# > DemoModel ts=datetime.datetime(2017, 11, 8, 14, 0)

(This script is complete, it should run “as is”)

You’ll often want to use this together with pre, since otherwise with always=True pydantic would try to validate the default None, which would cause an error.

Field Checks

On class creation validators are checked to confirm that the fields they specify actually exist on the model.

Occasionally however this is not wanted: when you define a validator to validate fields on inheriting models. In this case you should set check_fields=False on the validator.

Recursive Models

More complex hierarchical data structures can be defined using models as types in annotations themselves.

The ellipsis ... just means “Required”, the same as the annotation only declarations above.

from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    count: int = ...
    size: float = None

class Bar(BaseModel):
    apple = 'x'
    banana = 'y'

class Spam(BaseModel):
    foo: Foo = ...
    bars: List[Bar] = ...


m = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])
print(m)
# > Spam foo=<Foo count=4 size=None> bars=[<Bar apple='x1' banana='y'>, <Bar apple='x2' banana='y'>]
print(m.dict())
# {'foo': {'count': 4, 'size': None}, 'bars': [{'apple': 'x1', 'banana': 'y'}, {'apple': 'x2', 'banana': 'y'}]}

(This script is complete, it should run “as is”)

Schema Creation

Pydantic allows auto creation of schemas from models:

from enum import IntEnum
from pydantic import BaseModel, Schema

class FooBar(BaseModel):
    count: int
    size: float = None

class Gender(IntEnum):
    male = 1
    female = 2
    other = 3
    not_given = 4

class MainModel(BaseModel):
    """
    This is the description of the main model
    """
    foo_bar: FooBar = Schema(...)
    gender: Gender = Schema(
        None,
        alias='Gender',
        choice_names={3: 'Other Gender', 4: "I'd rather not say"}
    )
    snap: int = Schema(
        42,
        title='The Snap',
        description='this is the value of snap'
    )

    class Config:
        title = 'Main'

print(MainModel.schema())
# > {
#       'type': 'object',
#       'title': 'Main',
#       'properties': {
#           'foo_bar': {
#           ...
print(MainModel.schema_json(indent=2))

Outputs:

{
  "type": "object",
  "title": "Main",
  "description": "This is the description of the main model",
  "properties": {
    "foo_bar": {
      "type": "object",
      "title": "FooBar",
      "properties": {
        "count": {
          "type": "int",
          "title": "Count",
          "required": true
        },
        "size": {
          "type": "float",
          "title": "Size",
          "required": false
        }
      },
      "required": true
    },
    "Gender": {
      "type": "int",
      "title": "Gender",
      "required": false,
      "choices": [
        [1, "Male"],
        [2, "Female"],
        [3, "Other"],
        [4, "I'd rather not say"]
      ]
    },
    "snap": {
      "type": "int",
      "title": "The Snap",
      "required": false,
      "default": 42,
      "description": "this is the value of snap"
    }
  }
}

(This script is complete, it should run “as is”)

schema will return a dict of the schema, while schema_json will return a JSON representation of that.

“submodels” are recursively included in the schema.

The description for models is taken from the docstring of the class.

Enums are shown in the schema as choices, optionally the choice_names argument can be used to provide human friendly descriptions for the choices. If choice_names is omitted or misses values, descriptions will be generated by calling .title() on the name of the member.
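
The fallback rule for choice descriptions is easy to reproduce by hand. The following is a sketch of the rule as described above, using only the standard library. Colour and build_choices are hypothetical names and not part of pydantic.

```python
from enum import IntEnum


class Colour(IntEnum):  # hypothetical enum used only for this illustration
    red = 1
    dark_blue = 2


def build_choices(enum_cls, choice_names=None):
    """Pair each member value with a human friendly description."""
    choice_names = choice_names or {}
    return [[m.value, choice_names.get(m.value, m.name.title())] for m in enum_cls]


generated = build_choices(Colour)
print(generated)

overridden = build_choices(Colour, {2: 'Navy'})
print(overridden)
```

Because str.title() keeps underscores, a member called dark_blue gets the generated description 'Dark_Blue'. That is a good reason to supply choice_names for multi-word members.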

Optionally the Schema class can be used to provide extra information about the field, arguments:

  • default (positional argument), since the Schema is replacing the field’s default, its first argument is used to set the default, use ellipsis (...) to indicate the field is required
  • title if omitted field_name.title() is used
  • choice_names as described above
  • alias - the public name of the field.
  • ** any other keyword arguments (eg. description) will be added verbatim to the field’s schema

Instead of using Schema, the fields property of the Config class can be used to set all the arguments above except default.

The schema is generated by default using aliases as keys, it can also be generated using model property names not aliases with MainModel.schema/schema_json(by_alias=False).

Error Handling

Pydantic will raise ValidationError whenever it finds an error in the data it’s validating.

Note

Validation code should not raise ValidationError itself, but rather raise ValueError or TypeError (or subclasses thereof) which will be caught and used to populate ValidationError.

One exception will be raised regardless of the number of errors found, that ValidationError will contain information about all the errors and how they happened.

You can access these errors in several ways:

e.errors(): method will return a list of errors found in the input data.
e.json(): method will return a JSON representation of errors.
str(e): method will return a human readable representation of the errors.

Each error object contains:

loc: the error’s location as a list, the first item in the list will be the field where the error occurred, subsequent items will represent the field where the error occurred in sub models when they’re used.
type: a unique identifier of the error readable by a computer.
msg: a human readable explanation of the error.
ctx: an optional object which contains values required to render the error message.

To demonstrate that:

from typing import List
from pydantic import BaseModel, ValidationError, conint

class Location(BaseModel):
    lat = 0.1
    lng = 10.1

class Model(BaseModel):
    is_required: float
    gt_int: conint(gt=42)
    list_of_ints: List[int] = None
    a_float: float = None
    recursive_model: Location = None

data = dict(
    list_of_ints=['1', 2, 'bad'],
    a_float='not a float',
    recursive_model={'lat': 4.2, 'lng': 'New York'},
    gt_int=21,
)

try:
    Model(**data)
except ValidationError as e:
    print(e)
"""
5 validation errors
list_of_ints -> 2
  value is not a valid integer (type=type_error.integer)
a_float
  value is not a valid float (type=type_error.float)
is_required
  field required (type=value_error.missing)
recursive_model -> lng
  value is not a valid float (type=type_error.float)
gt_int
  ensure this value is greater than 42 (type=value_error.number.gt; limit_value=42)
"""

try:
    Model(**data)
except ValidationError as e:
    print(e.json())

"""
[
  {
    "loc": ["is_required"],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": ["gt_int"],
    "msg": "ensure this value is greater than 42",
    "type": "value_error.number.gt",
    "ctx": {
      "limit_value": 42
    }
  },
  {
    "loc": ["list_of_ints", 2],
    "msg": "value is not a valid integer",
    "type": "type_error.integer"
  },
  {
    "loc": ["a_float"],
    "msg": "value is not a valid float",
    "type": "type_error.float"
  },
  {
    "loc": ["recursive_model", "lng"],
    "msg": "value is not a valid float",
    "type": "type_error.float"
  }
]
"""

(This script is complete, it should run “as is”. json() has indent=2 set by default, but I’ve tweaked the JSON here and below to make it slightly more concise.)
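
Since each error is a plain mapping with loc, msg, type and optionally ctx, the output can be post-processed with ordinary python. The sketch below works on two entries copied from the JSON above, so it needs only the standard library. by_location is a hypothetical helper, not a pydantic function.

```python
import json

# Two entries copied from the e.json() output above
errors = json.loads('''
[
  {"loc": ["gt_int"], "msg": "ensure this value is greater than 42",
   "type": "value_error.number.gt", "ctx": {"limit_value": 42}},
  {"loc": ["recursive_model", "lng"], "msg": "value is not a valid float",
   "type": "type_error.float"}
]
''')


def by_location(errors):
    """Key each message by a dotted path built from loc (hypothetical helper)."""
    return {'.'.join(str(part) for part in err['loc']): err['msg'] for err in errors}


messages = by_location(errors)
print(messages)

# ctx is optional, so read it defensively
contexts = [err.get('ctx', {}) for err in errors]
print(contexts)
```

A dotted path such as recursive_model.lng is one convenient way to attach messages to form fields or API responses.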

In your custom data types or validators you should use TypeError and ValueError to raise errors:

from pydantic import BaseModel, ValidationError, validator

class Model(BaseModel):
    foo: str

    @validator('foo')
    def value_must_equal_bar(cls, v):
        if v != 'bar':
            raise ValueError('value must be "bar"')

        return v

try:
    Model(foo='ber')
except ValidationError as e:
    print(e.errors())

"""
[
    {
        'loc': ('foo',),
        'msg': 'value must be "bar"',
        'type': 'value_error',
    },
]
"""

(This script is complete, it should run “as is”)

You can also define your own error class with abilities to specify custom error code, message template and context:

from pydantic import BaseModel, PydanticValueError, ValidationError, validator

class NotABarError(PydanticValueError):
    code = 'not_a_bar'
    msg_template = 'value is not "bar", got "{wrong_value}"'

class Model(BaseModel):
    foo: str

    @validator('foo')
    def value_must_equal_bar(cls, v):
        if v != 'bar':
            raise NotABarError(wrong_value=v)
        return v

try:
    Model(foo='ber')
except ValidationError as e:
    print(e.json())
"""
[
  {
    "loc": ["foo"],
    "msg": "value is not \"bar\", got \"ber\"",
    "type": "value_error.not_a_bar",
    "ctx": {
      "wrong_value": "ber"
    }
  }
]
"""

(This script is complete, it should run “as is”)
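
The relationship between msg_template, ctx and the final msg can be sketched in a few lines. This assumes templates are filled in with str.format style placeholders, which is what the {wrong_value} syntax suggests. The render function, the default_templates dict and the override lookup are hypothetical and only illustrate the idea.

```python
default_templates = {
    'value_error.not_a_bar': 'value is not "bar", got "{wrong_value}"',
}


def render(error_type, ctx, overrides=None):
    """Fill the template for error_type with ctx, preferring an override if given."""
    template = (overrides or {}).get(error_type, default_templates[error_type])
    return template.format(**ctx)


default_msg = render('value_error.not_a_bar', {'wrong_value': 'ber'})
print(default_msg)

custom_msg = render(
    'value_error.not_a_bar',
    {'wrong_value': 'ber'},
    overrides={'value_error.not_a_bar': 'expected bar, not {wrong_value}'},
)
print(custom_msg)
```

Keeping the raw values in ctx rather than only in the rendered string lets a consumer of the error re-render or translate the message later.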

Exotic Types

Pydantic comes with a number of utilities for parsing or validating common objects.

import uuid
from decimal import Decimal
from pathlib import Path
from uuid import UUID

from pydantic import (DSN, UUID1, UUID3, UUID4, UUID5, BaseModel, DirectoryPath, EmailStr, FilePath, NameEmail,
                      NegativeFloat, NegativeInt, PositiveFloat, PositiveInt, PyObject, UrlStr, condecimal, confloat,
                      conint, constr)


class Model(BaseModel):
    cos_function: PyObject = None

    path_to_something: Path = None
    path_to_file: FilePath = None
    path_to_directory: DirectoryPath = None

    short_str: constr(min_length=2, max_length=10) = None
    regex_str: constr(regex='apple (pie|tart|sandwich)') = None
    strip_str: constr(strip_whitespace=True)

    big_int: conint(gt=1000, lt=1024) = None
    pos_int: PositiveInt = None
    neg_int: NegativeInt = None

    big_float: confloat(gt=1000, lt=1024) = None
    unit_interval: confloat(ge=0, le=1) = None
    pos_float: PositiveFloat = None
    neg_float: NegativeFloat = None

    email_address: EmailStr = None
    email_and_name: NameEmail = None

    url: UrlStr = None

    db_name = 'foobar'
    db_user = 'postgres'
    db_password: str = None
    db_host = 'localhost'
    db_port = '5432'
    db_driver = 'postgres'
    db_query: dict = None
    dsn: DSN = None
    decimal: Decimal = None
    decimal_positive: condecimal(gt=0) = None
    decimal_negative: condecimal(lt=0) = None
    decimal_max_digits_and_places: condecimal(max_digits=2, decimal_places=2) = None
    uuid_any: UUID = None
    uuid_v1: UUID1 = None
    uuid_v3: UUID3 = None
    uuid_v4: UUID4 = None
    uuid_v5: UUID5 = None

m = Model(
    cos_function='math.cos',
    path_to_something='/home',
    path_to_file='/home/file.py',
    path_to_directory='home/projects',
    short_str='foo',
    regex_str='apple pie',
    strip_str='   bar',
    big_int=1001,
    pos_int=1,
    neg_int=-1,
    big_float=1002.1,
    pos_float=2.2,
    neg_float=-2.3,
    unit_interval=0.5,
    email_address='Samuel Colvin <s@muelcolvin.com >',
    email_and_name='Samuel Colvin <s@muelcolvin.com >',
    url='http://example.com',
    decimal=Decimal('42.24'),
    decimal_positive=Decimal('21.12'),
    decimal_negative=Decimal('-21.12'),
    decimal_max_digits_and_places=Decimal('0.99'),
    uuid_any=uuid.uuid4(),
    uuid_v1=uuid.uuid1(),
    uuid_v3=uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org'),
    uuid_v4=uuid.uuid4(),
    uuid_v5=uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
)
print(m.dict())
"""
{
    'cos_function': <built-in function cos>,
    'path_to_something': PosixPath('/home'),
    'path_to_file': PosixPath('/home/file.py'),
    'path_to_directory': PosixPath('/home/projects'),
    'short_str': 'foo',
    'regex_str': 'apple pie',
    'strip_str': 'bar',
    'big_int': 1001,
    'pos_int': 1,
    'neg_int': -1,
    'big_float': 1002.1,
    'pos_float': 2.2,
    'neg_float': -2.3,
    'unit_interval': 0.5,
    'email_address': 's@muelcolvin.com',
    'email_and_name': <NameEmail("Samuel Colvin <s@muelcolvin.com>")>,
    'url': 'http://example.com',
    ...
    'dsn': 'postgres://postgres@localhost:5432/foobar',
    'decimal': Decimal('42.24'),
    'decimal_positive': Decimal('21.12'),
    'decimal_negative': Decimal('-21.12'),
    'decimal_max_digits_and_places': Decimal('0.99'),
    'uuid_any': UUID('ebcdab58-6eb8-46fb-a190-d07a33e9eac8'),
    'uuid_v1': UUID('c96e505c-4c62-11e8-a27c-dca90496b483'),
    'uuid_v3': UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e'),
    'uuid_v4': UUID('22209f7a-aad1-491c-bb83-ea19b906d210'),
    'uuid_v5': UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d'),
}
"""

(This script is complete, it should run “as is”)

Json Type

You can use the Json data type: pydantic will first parse the raw JSON string and then validate the parsed object against the defined Json structure, if one is provided.

from typing import List

from pydantic import BaseModel, Json, ValidationError

class SimpleJsonModel(BaseModel):
    json_obj: Json

class ComplexJsonModel(BaseModel):
    json_obj: Json[List[int]]

print(SimpleJsonModel(json_obj='{"b": 1}'))
# > SimpleJsonModel json_obj={'b': 1}

print(ComplexJsonModel(json_obj='[1, 2, 3]'))
# > ComplexJsonModel json_obj=[1, 2, 3]


try:
    ComplexJsonModel(json_obj=12)
except ValidationError as e:
    print(e)
"""
1 validation error
json_obj
  JSON object must be str, bytes or bytearray (type=type_error.json)
"""

try:
    ComplexJsonModel(json_obj='[a, b]')
except ValidationError as e:
    print(e)
"""
1 validation error
json_obj
  Invalid JSON (type=value_error.json)
"""

try:
    ComplexJsonModel(json_obj='["a", "b"]')
except ValidationError as e:
    print(e)
"""
2 validation errors
json_obj -> 0
  value is not a valid integer (type=type_error.integer)
json_obj -> 1
  value is not a valid integer (type=type_error.integer)
"""

(This script is complete, it should run “as is”)

Custom Data Types

You can also define your own data types. The class method get_validators will be called to get the validators used to parse and validate the input data.

from pydantic import BaseModel, ValidationError


class StrictStr(str):
    @classmethod
    def get_validators(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if not isinstance(v, str):
            raise ValueError(f'strict string: str expected not {type(v)}')
        return v


class Model(BaseModel):
    s: StrictStr


print(Model(s='hello'))
# > Model s='hello'

try:
    print(Model(s=123))
except ValidationError as e:
    print(e.json())
"""
[
  {
    "loc": [
      "s"
    ],
    "msg": "strict string: str expected not <class 'int'>",
    "type": "value_error"
  }
]
"""

(This script is complete, it should run “as is”)
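
Because get_validators is a generator, a type can yield more than one validator. The sketch below shows one plausible way such a chain could be consumed, with each validator receiving the previous one’s return value. LowerStr and run_validators are hypothetical names, and the loop is an illustration rather than pydantic’s own code.

```python
class LowerStr(str):
    """Hypothetical type: must be a str, stored in lower case."""

    @classmethod
    def get_validators(cls):
        # validators are consumed in the order they are yielded
        yield cls.check_is_str
        yield cls.to_lower

    @classmethod
    def check_is_str(cls, v):
        if not isinstance(v, str):
            raise ValueError(f'str expected not {type(v)}')
        return v

    @classmethod
    def to_lower(cls, v):
        return cls(v.lower())


def run_validators(type_, value):
    """Feed the value through each validator in turn (illustrative only)."""
    for validator in type_.get_validators():
        value = validator(value)
    return value


result = run_validators(LowerStr, 'Hello')
print(repr(result), type(result).__name__)

try:
    run_validators(LowerStr, 123)
except ValueError as exc:
    failure = str(exc)
print(failure)
```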

Helper Functions

Pydantic provides three classmethod helper functions on models for parsing data:

parse_obj: this is almost identical to the __init__ method of the model, except that if the object passed is not a dict, ValidationError will be raised (rather than python raising a TypeError).
parse_raw: takes a str or bytes, parses it as json or pickle data and then passes the result to parse_obj. The data type is inferred from the content_type argument, otherwise json is assumed.
parse_file: reads a file and passes the contents to parse_raw, if content_type is omitted it is inferred from the file’s extension.

import pickle
from datetime import datetime
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: datetime = None

m = User.parse_obj({'id': 123, 'name': 'James'})
print(m)
# > User id=123 name='James' signup_ts=None

try:
    User.parse_obj(['not', 'a', 'dict'])
except ValidationError as e:
    print(e)
# > error validating input
# > User expected dict not list (error_type=TypeError)

m = User.parse_raw('{"id": 123, "name": "James"}')  # assumes json as no content type passed
print(m)
# > User id=123 name='James' signup_ts=None

pickle_data = pickle.dumps({'id': 123, 'name': 'James', 'signup_ts': datetime(2017, 7, 14)})
m = User.parse_raw(pickle_data, content_type='application/pickle', allow_pickle=True)
print(m)
# > User id=123 name='James' signup_ts=datetime.datetime(2017, 7, 14, 0, 0)

(This script is complete, it should run “as is”)

Note

Since pickle allows complex objects to be encoded, to use it you need to explicitly pass allow_pickle to the parsing function.
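
The reason for the explicit opt-in is that unpickling untrusted data can execute arbitrary code. A decoding step guarded in this way might look like the sketch below. load_raw is a hypothetical helper written with the standard library only. It mirrors the behaviour described above and is not pydantic’s actual function.

```python
import json
import pickle


def load_raw(data, content_type=None, allow_pickle=False):
    """Decode raw input to a python object (hypothetical helper)."""
    if content_type is None or 'json' in content_type:
        return json.loads(data)  # json is assumed when no content type is given
    if 'pickle' in content_type:
        if not allow_pickle:
            # unpickling can run arbitrary code, so it has to be opted into
            raise RuntimeError('pickle decoding is disabled, pass allow_pickle=True')
        return pickle.loads(data)
    raise TypeError(f'unknown content type: {content_type}')


from_json = load_raw('{"id": 123, "name": "James"}')
print(from_json)

pickle_data = pickle.dumps({'id': 123, 'name': 'James'})
from_pickle = load_raw(pickle_data, content_type='application/pickle', allow_pickle=True)
print(from_pickle)

try:
    load_raw(pickle_data, content_type='application/pickle')
except RuntimeError:
    refused = True
else:
    refused = False
print(refused)
```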

Model Config

Behaviour of pydantic can be controlled via the Config class on a model.

Options:

anystr_strip_whitespace: whether to strip leading and trailing whitespace for str & byte types (default: False)
min_anystr_length: min length for str & byte types (default: 0)
max_anystr_length: max length for str & byte types (default: 2 ** 16)
validate_all: whether or not to validate field defaults (default: False)
ignore_extra: whether to ignore any extra values in input data (default: True)
allow_extra: whether or not to allow (and include on the model) any extra values in input data (default: False)
allow_mutation: whether or not models are faux-immutable, e.g. __setattr__ fails (default: True)
use_enum_values: whether to populate models with the value property of enums, rather than the raw enum - useful if you want to serialise model.dict() later (default: False)
fields: schema information on each field, this is equivalent to using the Schema class (default: None)
validate_assignment: whether to perform validation on assignment to attributes or not (default: False)
allow_population_by_alias: whether or not an aliased field may be populated by its name as given by the model attribute, rather than strictly the alias; please be sure to read the warning below before enabling this (default: False)
error_msg_templates: lets you override default error message templates. Pass in a dictionary with keys matching the error messages you want to override (default: {})
arbitrary_types_allowed: whether to allow arbitrary user types for fields (they are validated simply by checking if the value is an instance of that type). If False, RuntimeError will be raised on model declaration (default: False)
json_encoders: customise the way types are encoded to json, see JSON Serialisation for more details.

Warning

Think twice before enabling allow_population_by_alias! Enabling it could cause previously correct code to become subtly incorrect. As an example, say you have a field named card_number with the alias cardNumber. With population by alias disabled (the default), trying to parse an object with only the key card_number will fail. However, if you enable population by alias, the card_number field can now be populated from cardNumber or card_number, and the previously-invalid example object would now be valid. This may be desired for some use cases, but in others (like the one given here, perhaps!), relaxing strictness with respect to aliases could introduce bugs.

from pydantic import BaseModel, ValidationError


class Model(BaseModel):
    v: str

    class Config:
        max_anystr_length = 10
        error_msg_templates = {
            'value_error.any_str.max_length': 'max_length:{limit_value}',
        }


try:
    Model(v='x' * 20)
except ValidationError as e:
    print(e)
"""
1 validation error
v
  max_length:10 (type=value_error.any_str.max_length; limit_value=10)
"""

(This script is complete, it should run “as is”)
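
The warning about allow_population_by_alias above can be illustrated with a small lookup function. This is a plain python sketch of the rule as described in the warning. pick_value is a hypothetical helper and does not reflect how pydantic resolves aliases internally.

```python
def pick_value(data, name, alias, allow_population_by_alias=False):
    """Find the input value for one aliased field (hypothetical helper)."""
    if alias in data:
        return data[alias]
    if allow_population_by_alias and name in data:
        return data[name]
    raise KeyError(f'{alias} is required')


payload = {'card_number': '4242'}  # uses the attribute name, not the alias

try:
    pick_value(payload, 'card_number', 'cardNumber')
except KeyError:
    strict_result = 'rejected'
print(strict_result)

relaxed_result = pick_value(payload, 'card_number', 'cardNumber', allow_population_by_alias=True)
print(relaxed_result)
```

The same payload is rejected with the default setting and accepted once the option is enabled, which is exactly the change in behaviour the warning asks you to consider.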

Settings

One of pydantic’s most useful applications is to define default settings and allow them to be overridden by environment variables or keyword arguments (e.g. in unit tests).

This usage example comes last as it uses numerous concepts described above.

from typing import Set

from pydantic import BaseModel, DSN, BaseSettings, PyObject


class SubModel(BaseModel):
    foo = 'bar'
    apple = 1


class Settings(BaseSettings):
    redis_host = 'localhost'
    redis_port = 6379
    redis_database = 0
    redis_password: str = None

    auth_key: str = ...

    invoicing_cls: PyObject = 'path.to.Invoice'

    db_name = 'foobar'
    db_user = 'postgres'
    db_password: str = None
    db_host = 'localhost'
    db_port = '5432'
    db_driver = 'postgres'
    db_query: dict = None
    dsn: DSN = None

    # to override domains:
    # export MY_PREFIX_DOMAINS='["foo.com", "bar.com"]'
    domains: Set[str] = set()

    # to override more_settings:
    # export MY_PREFIX_MORE_SETTINGS='{"foo": "x", "apple": 1}'
    more_settings: SubModel = SubModel()

    class Config:
        env_prefix = 'MY_PREFIX_'  # defaults to 'APP_'
        fields = {
            'auth_key': {
                'alias': 'my_api_key'
            }
        }

(This script is complete, it should run “as is”)

Here redis_port could be modified via export MY_PREFIX_REDIS_PORT=6380 or auth_key by export my_api_key=6380.

Complex types like list, set, dict and submodels can be set by using JSON environment variables.
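
The override rule can be sketched without pydantic. The helper below looks up the prefixed, upper-cased name and decodes JSON for complex values. read_setting and fake_environ are hypothetical names. The sketch converts using the type of the default, whereas pydantic validates against the field’s annotation, and aliases are left out for brevity.

```python
import json


def read_setting(name, default, environ, env_prefix='APP_'):
    """Return the override for `name` from `environ`, else the default."""
    raw = environ.get(env_prefix + name.upper())
    if raw is None:
        return default
    if isinstance(default, (list, set, dict)):
        # complex values are written as JSON in the environment
        return type(default)(json.loads(raw))
    return type(default)(raw)


# A plain dict stands in for os.environ so the example has no side effects
fake_environ = {
    'MY_PREFIX_REDIS_PORT': '6380',
    'MY_PREFIX_DOMAINS': '["foo.com", "bar.com"]',
}

redis_port = read_setting('redis_port', 6379, fake_environ, env_prefix='MY_PREFIX_')
redis_host = read_setting('redis_host', 'localhost', fake_environ, env_prefix='MY_PREFIX_')
domains = read_setting('domains', set(), fake_environ, env_prefix='MY_PREFIX_')
print(redis_port, redis_host, sorted(domains))
```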

Dynamic model creation

There are some occasions where the shape of a model is not known until runtime. For this, pydantic provides the create_model method to allow models to be created on the fly.

from pydantic import BaseModel, create_model

DynamicFoobarModel = create_model('DynamicFoobarModel', foo=(str, ...), bar=123)


class StaticFoobarModel(BaseModel):
    foo: str
    bar: int = 123

Here StaticFoobarModel and DynamicFoobarModel are identical.

Fields are defined by either a tuple of the form (<type>, <default value>) or just a default value. The special keyword arguments __config__ and __base__ can be used to customise the new model. This includes extending a base model with extra fields.

from pydantic import BaseModel, create_model


class FooModel(BaseModel):
    foo: str
    bar: int = 123


BarModel = create_model('BarModel', apple='russet', banana='yellow', __base__=FooModel)
print(BarModel)
# > <class 'pydantic.main.BarModel'>
print(', '.join(BarModel.__fields__.keys()))
# > foo, bar, apple, banana

Usage with mypy

Pydantic works with mypy provided you use the “annotation only” version of required variables:

from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel, NoneStr

class Model(BaseModel):
    age: int
    first_name = 'John'
    last_name: NoneStr = None
    signup_ts: Optional[datetime] = None
    list_of_ints: List[int]

m = Model(age=42, list_of_ints=[1, '2', b'3'])
print(m.age)
# > 42

Model()
# will raise a validation error for age and list_of_ints

(This script is complete, it should run “as is”)

You can also run it through mypy with:

mypy --ignore-missing-imports --follow-imports=skip --strict-optional pydantic_mypy_test.py

Strict Optional

For your code to pass with --strict-optional you need to use Optional[] or an alias of Optional[] for all fields with a None default; this is standard with mypy.

Pydantic provides a few useful optional or union types:

  • NoneStr aka. Optional[str]
  • NoneBytes aka. Optional[bytes]
  • StrBytes aka. Union[str, bytes]
  • NoneStrBytes aka. Optional[StrBytes]

If these aren’t sufficient you can of course define your own.
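Defining your own alias needs nothing beyond the typing module. A minimal sketch follows; the names NoneInt, IntStr and NoneIntList are hypothetical and are not provided by pydantic.

```python
from typing import List, Optional, Union

# hypothetical aliases, built the same way as NoneStr or StrBytes
NoneInt = Optional[int]
IntStr = Union[int, str]
NoneIntList = Optional[List[int]]

# Optional[X] is just shorthand for Union[X, None]
print(NoneInt == Union[int, None])
# > True
```

Such an alias can then annotate a field with a None default, for example count: NoneInt = None.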

Required Fields and mypy

The ellipsis notation ... will not work with mypy; you need to use annotation only fields as in the example above.

Warning

Be aware that using annotation only fields will alter the order of your fields in metadata and errors: annotation only fields will always come first, but still in the order they were defined.

To get round this you can use Required (via from pydantic import Required) as an alias for the ellipsis or annotation only notation.

Faux Immutability

Models can be configured to be immutable via allow_mutation = False; this will prevent changing attributes of a model.

Warning

Immutability in python is never strict. If developers are determined/stupid they can always modify a so-called “immutable” object.

from pydantic import BaseModel


class FooBarModel(BaseModel):
    a: str
    b: dict

    class Config:
        allow_mutation = False


foobar = FooBarModel(a='hello', b={'apple': 'pear'})

try:
    foobar.a = 'different'
except TypeError as e:
    print(e)
    # > "FooBarModel" is immutable and does not support item assignment

print(foobar.a)
# > hello

print(foobar.b)
# > {'apple': 'pear'}

foobar.b['apple'] = 'grape'
print(foobar.b)
# > {'apple': 'grape'}

Trying to change a caused an error and it remains unchanged. However, the dict b is mutable, and the immutability of foobar doesn’t stop b being changed.
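Why this kind of immutability is only "faux" can be sketched without pydantic. The Frozen class below is a hypothetical stand-in that blocks attribute assignment; it is not how pydantic implements allow_mutation, but it shows the same two escape hatches.

```python
class Frozen:
    """Hypothetical stand-in for an immutable model (not pydantic's code)."""

    def __init__(self, a, b):
        # bypass our own __setattr__ while initialising
        object.__setattr__(self, 'a', a)
        object.__setattr__(self, 'b', b)

    def __setattr__(self, name, value):
        raise TypeError(f'"{type(self).__name__}" is immutable')


frozen = Frozen(a='hello', b={'apple': 'pear'})

try:
    frozen.a = 'different'
except TypeError as e:
    print(e)
    # > "Frozen" is immutable

# mutable values held by the object can still be changed in place
frozen.b['apple'] = 'grape'
print(frozen.b)
# > {'apple': 'grape'}

# and a determined developer can sidestep the guard entirely
object.__setattr__(frozen, 'a', 'different')
print(frozen.a)
# > different
```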

Copying

The dict method returns a dictionary containing the attributes of a model; sub-models are recursively converted to dicts. copy allows models to be duplicated, which is particularly useful for immutable models.

dict, copy, and json (described below) all take the optional include and exclude keyword arguments to control which attributes are returned or copied, respectively. copy also accepts two extra keyword arguments: update, a dict mapping attributes to new values that will be applied as the model is duplicated, and deep, which makes a deep copy of the model.

from pydantic import BaseModel

class BarModel(BaseModel):
    whatever: int

class FooBarModel(BaseModel):
    banana: float
    foo: str
    bar: BarModel

m = FooBarModel(banana=3.14, foo='hello', bar={'whatever': 123})

print(m.dict())
# (returns a dictionary)
# > {'banana': 3.14, 'foo': 'hello', 'bar': {'whatever': 123}}

print(m.dict(include={'foo', 'bar'}))
# > {'foo': 'hello', 'bar': {'whatever': 123}}

print(m.dict(exclude={'foo', 'bar'}))
# > {'banana': 3.14}

print(m.copy())
# > FooBarModel banana=3.14 foo='hello' bar=<BarModel whatever=123>

print(m.copy(include={'foo', 'bar'}))
# > FooBarModel foo='hello' bar=<BarModel whatever=123>

print(m.copy(exclude={'foo', 'bar'}))
# > FooBarModel banana=3.14

print(m.copy(update={'banana': 0}))
# > FooBarModel banana=0 foo='hello' bar=<BarModel whatever=123>

print(id(m.bar), id(m.copy().bar))
# normal copy gives the same object reference for `bar`
# > 140494497582280 140494497582280

print(id(m.bar), id(m.copy(deep=True).bar))
# deep copy gives a new object reference for `bar`
# > 140494497582280 140494497582856

Serialisation

pydantic has native support for serialisation to JSON and Pickle. You can of course serialise to any other format you like by processing the result of dict().

JSON Serialisation

The json() method will serialise a model to JSON; json() in turn calls dict() and serialises its result.

Serialisation can be customised on a model using the json_encoders config property: the keys should be types and the values should be functions which serialise that type (see the example below).

If this is not sufficient, json() takes an optional encoder argument which allows complete control over how non-standard types are encoded to JSON.

from datetime import datetime, timedelta
from pydantic import BaseModel
from pydantic.json import timedelta_isoformat

class BarModel(BaseModel):
    whatever: int

class FooBarModel(BaseModel):
    foo: datetime
    bar: BarModel

m = FooBarModel(foo=datetime(2032, 6, 1, 12, 13, 14), bar={'whatever': 123})
print(m.json())
# (returns a str)
# > {"foo": "2032-06-01T12:13:14", "bar": {"whatever": 123}}

class WithCustomEncoders(BaseModel):
    dt: datetime
    diff: timedelta

    class Config:
        json_encoders = {
            datetime: lambda v: (v - datetime(1970, 1, 1)).total_seconds(),
            timedelta: timedelta_isoformat,
        }

m = WithCustomEncoders(dt=datetime(2032, 6, 1), diff=timedelta(hours=100))
print(m.json())
# > {"dt": 1969660800.0, "diff": "P4DT4H0M0.000000S"}

(This script is complete, it should run “as is”)

By default timedeltas are encoded as a simple float of total seconds. timedelta_isoformat is provided as an optional alternative which implements ISO 8601 time diff encoding.
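A custom encoder function can be sketched with the standard library alone. The function below has the shape expected by the default argument of json.dumps. The assumption, not verified here, is that the encoder argument of json() is used in the same way. custom_encoder is a hypothetical name.

```python
import json
from datetime import datetime, timedelta


def custom_encoder(obj):
    """Hypothetical encoder: only called for objects json can't serialise itself."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, timedelta):
        return obj.total_seconds()
    raise TypeError(f'{type(obj).__name__} is not JSON serialisable')


data = {'when': datetime(2032, 6, 1, 12, 13, 14), 'diff': timedelta(hours=100)}
encoded = json.dumps(data, default=custom_encoder)
print(encoded)
# > {"when": "2032-06-01T12:13:14", "diff": 360000.0}
```

With a model, the same function would presumably be passed as m.json(encoder=custom_encoder).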

Pickle Serialisation

Using the same plumbing as copy(), pydantic models support efficient pickling and unpickling.

import pickle
from pydantic import BaseModel


class FooBarModel(BaseModel):
    a: str
    b: int


m = FooBarModel(a='hello', b=123)
print(m)
# > FooBarModel a='hello' b=123

data = pickle.dumps(m)
print(data)
# > b'\x80\x03c...'

m2 = pickle.loads(data)
print(m2)
# > FooBarModel a='hello' b=123

(This script is complete, it should run “as is”)

Abstract Base Classes

Pydantic models can be used alongside Python’s Abstract Base Classes (ABCs).

import abc
from pydantic import BaseModel


class FooBarModel(BaseModel, abc.ABC):
    a: str
    b: int

    @abc.abstractmethod
    def my_abstract_method(self):
        pass

(This script is complete, it should run “as is”)

Benchmarks

Below are the results of crude benchmarks comparing pydantic to other validation libraries.

Package                   Relative Performance  Mean validation time  std. dev.
pydantic                                        24.8μs                0.257μs
toasted-marshmallow       1.6x slower           39.5μs                0.134μs
marshmallow               1.9x slower           47.4μs                0.181μs
trafaret                  2.3x slower           56.1μs                0.253μs
django-restful-framework  16.0x slower          397.5μs               0.382μs

(See the benchmarks code for more details on the test case. Feel free to submit more benchmarks or improve an existing one.)
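The relative performance column can be reproduced from the mean validation times, assuming it is each package's mean time divided by pydantic's mean time and rounded to one decimal place. That is an inference from the numbers, not something stated by the benchmark code.

```python
# mean validation times in microseconds, copied from the table above
mean_times = {
    'pydantic': 24.8,
    'toasted-marshmallow': 39.5,
    'marshmallow': 47.4,
    'trafaret': 56.1,
    'django-restful-framework': 397.5,
}

baseline = mean_times['pydantic']
relative = {
    name: round(mean / baseline, 1)
    for name, mean in mean_times.items()
    if name != 'pydantic'
}

for name, ratio in relative.items():
    print(f'{name}: {ratio:.1f}x slower')
# > toasted-marshmallow: 1.6x slower
# > marshmallow: 1.9x slower
# > trafaret: 2.3x slower
# > django-restful-framework: 16.0x slower
```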

History

v0.14 (2018-10-02)

v0.13.1 (2018-09-21)

  • fix issue where int_validator doesn’t cast a bool to an int #264 by @nphyatt
  • add deep copy support for BaseModel.copy() #249, @gangefors

v0.13.0 (2018-08-25)

  • raise an exception if a field’s name shadows an existing BaseModel attribute #242
  • add UrlStr and urlstr types #236
  • timedelta json encoding ISO8601 and total seconds, custom json encoders #247, by @cfkanesan and @samuelcolvin
  • allow timedelta objects as values for properties of type timedelta (matches datetime etc. behavior) #247

v0.12.1 (2018-07-31)

  • fix schema generation for fields defined using typing.Any #237

v0.12.0 (2018-07-31)

  • add by_alias argument in .dict() and .json() model methods #205
  • add Json type support #214
  • support tuples #227
  • major improvements and changes to schema #213

v0.11.2 (2018-07-05)

  • add NewType support #115
  • fix list, set & tuple validation #225
  • separate out validate_model method, allow errors to be returned along with valid values #221

v0.11.1 (2018-07-02)

v0.11.0 (2018-06-28)

  • make list, tuple and set types stricter #86
  • breaking change: remove msgpack parsing #201
  • add FilePath and DirectoryPath types #10
  • model schema generation #190
  • JSON serialisation of models and schemas #133

v0.10.0 (2018-06-11)

  • add Config.allow_population_by_alias #160, thanks @bendemaree
  • breaking change: new errors format #179, thanks @Gr1N
  • breaking change: removed Config.min_number_size and Config.max_number_size #183, thanks @Gr1N
  • breaking change: correct behaviour of lt and gt arguments to conint etc. #188; for the old behaviour use le and ge #194, thanks @jaheba
  • added error context and ability to redefine error message templates using Config.error_msg_templates #183, thanks @Gr1N
  • fix typo in validator exception #150
  • copy defaults to model values, so different models don’t share objects #154

v0.9.1 (2018-05-10)

  • allow custom get_field_config on config classes #159
  • add UUID1, UUID3, UUID4 and UUID5 types #167, thanks @Gr1N
  • modify some inconsistent docstrings and annotations #173, thanks @YannLuo
  • fix type annotations for exotic types #171, thanks @Gr1N
  • re-use type validators in exotic types #171
  • scheduled monthly requirements updates #168
  • add Decimal, ConstrainedDecimal and condecimal types #170, thanks @Gr1N

v0.9.0 (2018-04-28)

  • tweak email-validator import error message #145
  • fix parse error of parse_date() and parse_datetime() when input is 0 #144, thanks @YannLuo
  • add Config.anystr_strip_whitespace and strip_whitespace kwarg to constr, by default the value is False #163, thanks @Gr1N
  • add ConstrainedFloat, confloat, PositiveFloat and NegativeFloat types #166, thanks @Gr1N

v0.8.0 (2018-03-25)

  • fix type annotation for inherit_config #139
  • breaking change: check for invalid field names in validators #140
  • validate attributes of parent models #141
  • breaking change: email validation now uses email-validator #142

v0.7.1 (2018-02-07)

  • fix bug with create_model modifying the base class

v0.7.0 (2018-02-06)

  • added compatibility with abstract base classes (ABCs) #123
  • add create_model method #113 #125
  • breaking change: rename .config to .__config__ on a model
  • breaking change: remove deprecated .values() on a model, use .dict() instead
  • remove use of OrderedDict and use simple dict #126
  • add Config.use_enum_values #127
  • add wildcard validators of the form @validator('*') #128

v0.6.4 (2018-02-01)

  • allow python date and time objects #122

v0.6.3 (2017-11-26)

  • fix direct install without README.rst present

v0.6.2 (2017-11-13)

  • errors for invalid validator use
  • safer check for complex models in Settings

v0.6.1 (2017-11-08)

  • prevent duplicate validators, #101
  • add always kwarg to validators, #102

v0.6.0 (2017-11-07)

  • assignment validation #94, thanks petroswork!
  • JSON in environment variables for complex types, #96
  • add validator decorators for complex validation, #97
  • deprecate values(...) and replace with .dict(...), #99

v0.5.0 (2017-10-23)

  • add UUID validation #89
  • remove index and track from error object (json) if they’re null #90
  • improve the error text when a list is provided rather than a dict #90
  • add benchmarks table to docs #91

v0.4.0 (2017-07-08)

  • show length in string validation error
  • fix aliases in config during inheritance #55
  • simplify error display
  • use unicode ellipsis in truncate
  • add parse_obj, parse_raw and parse_file helper functions #58
  • switch annotation only fields to come first in fields list not last

v0.3.0 (2017-06-21)

  • immutable models via config.allow_mutation = False, associated cleanup and performance improvement #44
  • immutable helper methods construct() and copy() #53
  • allow pickling of models #53
  • setattr is removed as __setattr__ is now intelligent #44
  • raise_exception removed, Models now always raise exceptions #44
  • instance method validators removed
  • django-restful-framework benchmarks added #47
  • fix inheritance bug #49
  • make str type stricter so list, dict etc are not coerced to strings. #52
  • add StrictStr which only allows strings as input #52

v0.2.1 (2017-06-07)

  • pypi and travis together messed up the deploy of v0.2; this should fix it

v0.2.0 (2017-06-07)

  • breaking change: values() on a model is now a method not a property, takes include and exclude arguments
  • allow annotation only fields to support mypy
  • add pretty to_string(pretty=True) method for models

v0.1.0 (2017-06-03)

  • add docs
  • add history