Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Installing kedro-telemetry has altered the order of execution of Hooks #2203

Closed
inigohidalgo opened this issue Nov 22, 2024 · 7 comments
Closed

Comments

@inigohidalgo
Copy link

inigohidalgo commented Nov 22, 2024

Description

Hello,

With yesterday's release of kedro-viz which set kedro-telemetry as a required dependency, our kedro pipelines are breaking because of a change in behavior of our Hooks

We have a hook which runs an after_context_created which initializes a singleton containing certain info about the context which we use later. This hook now isn't running at the same point as before and the singleton is being accessed from within the pipeline registry before it has had time to initialize itself correctly.

The fix on our end is simple as we can just pin to lower versions of viz which do not force telemtry, but I am curious as to what could be causing this issue for when we do upgrade to later kedro versions which also force telemetry.

Thanks

kedro==0.18.14
kedro-viz==10.1.0
kedro-telemetry==0.6.1

@inigohidalgo inigohidalgo changed the title 10.1.0 introduced a bug in Kedro execution Installing kedro-telemetry has altered the order of execution of Hooks Nov 22, 2024
@noklam
Copy link
Contributor

noklam commented Nov 22, 2024

kedro-telemetry also implement after_context_created, but it's unclear whether telemetry hook run first or your custom one (does it matters?). If you custom hook need to be initiated first, use the @try_first decorator from pluggy to guarantee that.

If you have logging enable try change it to DEBUG level that should give you some more trace about hooks/plugins.

@inigohidalgo
Copy link
Author

does it matters?

telemetry or our hook running first shouldn't matter, but what I am seeing from the logs is that the pipeline_registry function is running before our custom after_context_created hook is running. Is it possible the telemetry plugin does something with the registry?

@noklam
Copy link
Contributor

noklam commented Nov 22, 2024

@inigohidalgo Ah, that's possible. Would that be possible to use the try_first decorator to make sure your own hook get executed first? Telemetry does collect information from pipeline i.e. number of nodes/datasets

@DimedS
Copy link

DimedS commented Nov 22, 2024

@inigohidalgo hi! Telemetry runs within the same hook as your code. As @noklam mentioned, it starts by collecting information about the default pipeline from the pipeline registry. It's not entirely clear to me why this is causing an issue in your case. However, we currently offer several options to withdraw consent and disable telemetry, which you can safely use in production. You can find more details here:
How do I withdraw consent?

@inigohidalgo
Copy link
Author

Thanks @noklam @DimedS, it was indeed the fact that telemetry is running my register_pipelines before my hook had time to run.

Basically: we have our hook which saves a copy of the context into a Singleton which we then access inside our create_pipeline functions to build pipelines dynamically. The problem was that telemetry's call of register_pipelines came before our own hook had time to save that context, so when it was running telemetry -> register_pipelines -> create_pipeline the context wasn't available in my Singleton which was causing the issue.

Opting out with any of the methods listed there solves my issue. I haven't tried @noklam's suggestion to use the try_first decorator but I expect that would solve my issue as well.

@MatthiasRoels
Copy link

Basically: we have our hook which saves a copy of the context into a Singleton which we then access inside our create_pipeline functions to build pipelines dynamically.

Wait, how can you access the context in a create_pipeline function? Isn’t the pipeline registry called before the context is created?

@inigohidalgo
Copy link
Author

@MatthiasRoels I can give you more details next week if needed, but before we had kedro-telemetry installed, our after_context_created hook runs before the first time the register_pipelines function was called by kedro run. Not sure if behavior has changed recently but this is the behavior we see in kedro==0.18.14 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants