Add DataCatalog deprecation warning #4510

Open
wants to merge 3 commits into
base: main

ElenaKhaustova
Contributor

Description

Solves #4469

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
@ElenaKhaustova ElenaKhaustova marked this pull request as ready for review February 21, 2025 10:58
Signed-off-by: Elena Khaustova <[email protected]>
Contributor

@ankatiyar ankatiyar left a comment

LGTM!
The warning now shows up twice when you do kedro run, but I suppose that's because of the shallow copy?

@ElenaKhaustova
Contributor Author

That's because we recreate the catalog object to inject patterns for the old catalog:

catalog = catalog.shallow_copy(
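
To illustrate why recreating the object repeats the message, here is a minimal sketch. It assumes the warning is emitted from the constructor; `LegacyCatalog` and its `shallow_copy` signature are hypothetical stand-ins, not Kedro's actual implementation.

```python
import warnings


class LegacyCatalog:
    """Hypothetical stand-in for a catalog class that warns on construction."""

    def __init__(self, datasets=None):
        warnings.warn(
            "LegacyCatalog is deprecated.",
            DeprecationWarning,
            stacklevel=2,
        )
        self.datasets = dict(datasets or {})

    def shallow_copy(self, extra_datasets=None):
        # Building a new instance runs __init__ again, so the warning fires again.
        merged = {**self.datasets, **(extra_datasets or {})}
        return LegacyCatalog(merged)


def count_warnings_for_run():
    """Create a catalog, copy it once, and count the deprecation warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every occurrence
        catalog = LegacyCatalog({"a": 1})
        catalog = catalog.shallow_copy({"b": 2})
    deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
    return len(deprecations), catalog.datasets
```

In this sketch one construction plus one copy yields two warnings, which matches the behaviour described above.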

Member

@astrojuanlu astrojuanlu left a comment

I'm thinking: If a user is using 0.19.11 already, what would be the best way to make the code forward compatible?

Something like this?

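try:
    # Kedro 0.19.x: prefer the new catalog under the familiar name
    from kedro.io import KedroDataCatalog as DataCatalog
except ImportError:
    # Old Kedro, or Kedro >=1.x where only DataCatalog exists
    from kedro.io import DataCatalog

  • If old Kedro, then old DataCatalog
  • If 0.19.x Kedro, then KedroDataCatalog
  • If Kedro >=1.x, then new DataCatalog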

However, is KedroDataCatalog meant to be a 100 % compatible drop-in replacement?

I understand that this is not as terrible as the DataSet migration, but it's still worth thinking about what settings.py etc. should look like for folks who have an explicit DATA_CATALOG_CLASS set, and so on.
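
For the settings.py case, one possible shape is a small helper that picks whichever class name the installed version exposes. This is only a sketch: `resolve_catalog_class` is a hypothetical helper, not part of Kedro, and it selects a class by name only, so it does nothing about API differences between the classes.

```python
import importlib


def resolve_catalog_class(
    module_name="kedro.io",
    candidates=("KedroDataCatalog", "DataCatalog"),
):
    """Return the first class in `candidates` that `module_name` exposes."""
    module = importlib.import_module(module_name)
    for name in candidates:
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    raise ImportError(f"none of {candidates} found in {module_name}")
```

A settings.py could then, under these assumptions, contain `DATA_CATALOG_CLASS = resolve_catalog_class()` instead of naming one class explicitly.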

@ElenaKhaustova
Contributor Author

They are compatible now, but that won't be the case after the replacement, because we will remove all the temporary API kept for compatibility. Even if we suggest a forward-compatible import now, it will not help much because of the API changes.

Comment on lines +166 to +168
"`DataCatalog` has been deprecated and will be replaced by an improved alternative, `KedroDataCatalog`, in Kedro 1.0.0."
"After this change, the `DataCatalog` name will persist, but its functionality will align with `KedroDataCatalog`. "
"For more details, refer to the documentation: https://docs.kedro.org/en/stable/data/index.html#kedrodatacatalog-experimental-feature",
Member

Suggested change
- "`DataCatalog` has been deprecated and will be replaced by an improved alternative, `KedroDataCatalog`, in Kedro 1.0.0."
+ "`DataCatalog` has been deprecated and will be replaced by `KedroDataCatalog` in Kedro 1.0.0."
  "After this change, the `DataCatalog` name will persist, but its functionality will align with `KedroDataCatalog`. "
  "For more details, refer to the documentation: https://docs.kedro.org/en/stable/data/index.html#kedrodatacatalog-experimental-feature",

While I agree it's an improvement, it's up to the user to decide if it is 😉 This way the message is shorter as well.
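
One detail worth checking in messages built this way: as quoted here, the first literal ends without a trailing space, and Python joins adjacent string literals with no separator. The sketch below uses shortened versions of the strings, and `build_message` is a hypothetical helper shown only as one way to avoid the issue.

```python
def build_message(*parts):
    """Join message fragments with exactly one space between them."""
    return " ".join(part.strip() for part in parts)


# Adjacent literals are concatenated as-is, with no separator inserted.
implicit = (
    "will be replaced by `KedroDataCatalog` in Kedro 1.0.0."
    "After this change, the `DataCatalog` name will persist."
)

# Joining explicitly keeps the spacing independent of each fragment's ending.
explicit = build_message(
    "will be replaced by `KedroDataCatalog` in Kedro 1.0.0.",
    "After this change, the `DataCatalog` name will persist.",
)
```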

@astrojuanlu
Member

Got it. Then maybe what we can do is, in Kedro 1.0.x, keep KedroDataCatalog, but in a deprecated form. Something like:

>>> from kedro.io import KedroDataCatalog
This class is only kept for compatibility purposes and it will be removed in Kedro 2.0,
please do `from kedro.io import DataCatalog` instead

That way, code that's written to work with KedroDataCatalog in 0.19 will continue to work in 1.0 without changes.
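
One way such a deprecated alias could be wired up is a module-level `__getattr__` (PEP 562), which runs only when the old name is requested. The sketch below is an assumption about a possible mechanism, not how Kedro does or will do it; `demo_io` is a throwaway module standing in for `kedro.io`, and the `DataCatalog` class here is a placeholder.

```python
import sys
import types
import warnings


class DataCatalog:
    """Placeholder for the catalog class that keeps the `DataCatalog` name."""


def _deprecated_getattr(name):
    # Called only when normal attribute lookup on the module fails (PEP 562).
    if name == "KedroDataCatalog":
        warnings.warn(
            "`KedroDataCatalog` is only kept for compatibility purposes; "
            "use `DataCatalog` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return DataCatalog
    raise AttributeError(f"module 'demo_io' has no attribute {name!r}")


# Build a throwaway module named `demo_io` so the example runs without Kedro.
demo_io = types.ModuleType("demo_io")
demo_io.DataCatalog = DataCatalog
demo_io.__getattr__ = _deprecated_getattr
sys.modules["demo_io"] = demo_io


def import_old_name():
    """Import the deprecated alias and return it with the warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        from demo_io import KedroDataCatalog
    return KedroDataCatalog, [str(w.message) for w in caught]
```

With this shape, importing the new name stays silent, while importing the old name still works and emits the notice.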

WDYT?
