Ability to attach debugger to any code within DVC pipeline #5048
Comments
This is not automated at all but it is a solution that works:
Additional code:
import debugpy
debugpy.listen(("localhost", 6666))
debugpy.wait_for_client()
launch.json entry:
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug experiment",
            "type": "python",
            "request": "attach",
            "justMyCode": false,
            "subProcess": true,
            "port": 6666
        }
    ]
}
Demo: Screen.Recording.2024-01-16.at.12.45.22.pm.mov
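One possible refinement of the snippet above, as a minimal sketch rather than anything DVC or debugpy prescribes: gate the `debugpy` calls behind an environment variable so the stage only blocks when a debugging session is actually wanted. The variable name `DEBUG_STAGE` and the helper `maybe_wait_for_debugger` are hypothetical names chosen for illustration; the `debugpy.listen` and `debugpy.wait_for_client` calls and port 6666 are the ones from the comment above.

```python
import os


def maybe_wait_for_debugger(env_var="DEBUG_STAGE", host="localhost", port=6666):
    """Wait for a debugger to attach, but only when env_var is set to "1".

    DEBUG_STAGE is a hypothetical variable name, not something DVC defines.
    Returns True if we waited for a client, False if debugging was skipped.
    """
    if os.environ.get(env_var) != "1":
        return False

    # Imported lazily so that normal (non-debug) runs do not need debugpy.
    import debugpy

    debugpy.listen((host, port))
    debugpy.wait_for_client()  # blocks until the IDE attaches
    return True


if __name__ == "__main__":
    attached = maybe_wait_for_debugger()
    print("debugger attached" if attached else "running without debugger")
```

With this in place, the stage script runs unchanged by default and only pauses for the "Debug experiment" attach configuration when the variable is set to 1 in the environment of the run.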
I can make a tutorial and we can add it to the README/dvc.org if we think that would be useful.
Discussed with @skshetry that we might want to clarify the scope of this. Is it strictly about using IDE debugger tools? It might be worth clarifying this when publishing anything about it. It could give the wrong impression that adding breakpoints to your code won't work when running in DVC, and I don't think we should assume that the typical data scientist is familiar with debugging tools.
As an advanced (I hope) DVC user, I think it would be awesome to be able to run DVC as an "app", like a Flask server or a Spring application. I think the latter is the exact dream. Today, we have a huge Maybe I'm drifting into another issue that we have, but once we have the DVC "app" ability, maybe we could have annotations for StepInput and StepOutput to standardize the contracts between steps (making it more object-oriented and less file-oriented). I guess this could be the next level, since today it is heavy .yaml engineering.
Side note: we need to see if the same can be achieved in PyCharm (I expect it to be very similar, but it's been a while since I last touched it).
With a complicated DVC pipeline that has dynamic, parametrized dependencies, it's not easy to get the exact command needed to run a specific stage under a debugger outside of DVC.
On the other hand, users compare our experiments with a regular Notebook or even a basic script workflow. They no longer know how to pause and explore a data frame.
We need to research and find a way to mitigate this, either on the DVC side or on the extension side.