Implement Pydantic V2 protocol #54034

Closed
wants to merge 4 commits into from
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.1.0.rst
@@ -127,7 +127,7 @@ Other enhancements
- Many read/to_* functions, such as :meth:`DataFrame.to_pickle` and :func:`read_csv`, support forwarding compression arguments to lzma.LZMAFile (:issue:`52979`)
- Performance improvement in :func:`concat` with homogeneous ``np.float64`` or ``np.float32`` dtypes (:issue:`52685`)
- Performance improvement in :meth:`DataFrame.filter` when ``items`` is given (:issue:`52941`)
-
- Support Pydantic V2 protocol for :class:`Series` (:issue:`54034`)

.. ---------------------------------------------------------------------------
.. _whatsnew_210.notable_bug_fixes:
1 change: 1 addition & 0 deletions environment.yml
@@ -45,6 +45,7 @@ dependencies:
- py
- psycopg2>=2.9.3
- pyarrow>=7.0.0
- pydantic>=2.0
Member

there's a few more files where this needs to be added

- pymysql>=1.0.2
- pyreadstat>=1.1.5
- pytables>=3.7.0
18 changes: 18 additions & 0 deletions pandas/core/series.py
@@ -19,6 +19,7 @@
    Union,
    cast,
    overload,
    Type
)
import warnings
import weakref
@@ -37,6 +38,7 @@
)
from pandas._libs.lib import is_range_indexer
from pandas.compat import PYPY
from pandas.compat._optional import import_optional_dependency
from pandas.compat.numpy import function as nv
from pandas.errors import (
ChainedAssignmentError,
@@ -148,6 +150,10 @@
import pandas.plotting

if TYPE_CHECKING:
    import pydantic_core
    from pydantic import GetCoreSchemaHandler
    from pydantic_core.core_schema import CoreSchema

    from pandas._libs.internals import BlockValuesRefs
    from pandas._typing import (
        AggFuncType,
@@ -6215,3 +6221,15 @@ def cumsum(self, axis: Axis | None = None, skipna: bool = True, *args, **kwargs)
    @doc(make_doc("cumprod", 1))
    def cumprod(self, axis: Axis | None = None, skipna: bool = True, *args, **kwargs):
        return NDFrame.cumprod(self, axis, skipna, *args, **kwargs)

    @classmethod
    def __get_pydantic_core_schema__(cls, source_type: Type[Any], handler: GetCoreSchemaHandler) -> CoreSchema:
        pyd_core = cast("pydantic_core", import_optional_dependency("pydantic_core"))
Member

why are you casting to pydantic_core? is that a type?

Author

pydantic_core is a package: https://github.com/pydantic/pydantic-core - a dependency of Pydantic.

You can use cast() with the module to have your IDE understand what you are doing:

image
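
To make the exchange above concrete, here is a minimal sketch of the pattern being discussed: a module is imported lazily at runtime and `cast()` is used with a string so that an IDE can resolve its attributes. The standard-library `json` module stands in for `pydantic_core`, and the names `json_module` and `load_json_module` are hypothetical, chosen only for this illustration. At runtime `cast()` returns its second argument unchanged. Note that static type checkers such as mypy may reject a module name used as a type, which is likely the concern behind the reviewer's question; `types.ModuleType` is the conventional annotation for a module object.

```python
import importlib
from typing import TYPE_CHECKING, cast

if TYPE_CHECKING:
    # Only evaluated by type checkers and IDEs, never at runtime.
    import json as json_module


def load_json_module():
    # Import the module lazily, as an optional-dependency helper would.
    module = importlib.import_module("json")
    # cast() performs no runtime check; it simply returns `module`.
    return cast("json_module", module)


json_mod = load_json_module()
encoded = json_mod.dumps({"a": 1})
```

Because the cast is a no-op at runtime, the object returned is the very same module that `importlib` loaded.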

        core_schema = pyd_core.core_schema

        return core_schema.union_schema(
            [
                core_schema.is_instance_schema(Series),
                core_schema.no_info_plain_validator_function(Series)
            ]
        )
Comment on lines +6230 to +6235
Member

could you explain what these do please?
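
As a rough answer to the question above, the two schema branches can be read as "accept an existing instance as-is, otherwise pass the input to the constructor". The sketch below emulates that reading in plain Python and does not call pydantic at all. `Wrapper` and `validate_wrapper` are hypothetical names standing in for `Series` and for the validator that pydantic would build from the schema. It assumes the branches are tried left to right; pydantic's union schema defaults to a "smart" mode, so the real selection logic may differ.

```python
from typing import Any


class Wrapper:
    """Hypothetical stand-in for Series: wraps any iterable of values."""

    def __init__(self, data: Any) -> None:
        self.data = list(data)


def validate_wrapper(value: Any) -> Wrapper:
    # Branch 1, analogous to is_instance_schema(Series):
    # an existing instance is accepted unchanged.
    if isinstance(value, Wrapper):
        return value
    # Branch 2, analogous to no_info_plain_validator_function(Series):
    # any other input is handed to the constructor, and whatever the
    # constructor returns becomes the validated value.
    return Wrapper(value)


existing = Wrapper([1, 2, 3])
same = validate_wrapper(existing)      # instance branch
built = validate_wrapper([1, 2, 3])    # constructor branch
```

Under this reading, a model field annotated with the class would accept either a ready-made instance or any input the constructor can handle.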

20 changes: 20 additions & 0 deletions pandas/tests/test_downstream.py
@@ -274,3 +274,23 @@ def __radd__(self, other):

assert right.__add__(left) is NotImplemented
assert right + left is left


@td.skip_if_no("pydantic")
@td.skip_if_no("pydantic_core")
@pytest.mark.parametrize("series", [
    Series([1, 2, 3]),
    Series(np.array([1, 2, 3])),
Comment on lines +282 to +283
Member

these two are the same

Author

Yeah, I should have used [1, 2, 3] and np.array([1, 2, 3]) directly, without wrapping them in Series. I'll fix it.

    Series({'a': 1, 'b': 2, 'c': 3}),
    Series(3),
])
def test_pydantic_protocol(series: Series) -> None:
    from pydantic import BaseModel

    class Model(BaseModel):
        series: Series


    model = Model(series=series)
    assert model.model_dump() == {}
    assert model.model_json_schema() == {}
Comment on lines +295 to +296
Member

could you explain these two lines please?

Author

The test is not right. I couldn't set up pandas locally, so I was expecting the pipeline to show me the results 👀

Member