
Investigate saliency task - is memory leaking? #549

Open
gabegma opened this issue Apr 19, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@gabegma
Contributor

gabegma commented Apr 19, 2023

While profiling the memory, I noticed that every time we perform a backward pass, the memory usage increases and is never released.

@gabegma gabegma added this to Azimuth Apr 19, 2023
@gabegma gabegma converted this from a draft issue Apr 19, 2023
@gabegma gabegma added the bug Something isn't working label Apr 19, 2023
@gabegma gabegma mentioned this issue Apr 19, 2023
@gabegma gabegma changed the title Investigate saliency task - is memory linking? Investigate saliency task - is memory leaking? Apr 19, 2023
@Dref360
Contributor

Dref360 commented Apr 19, 2023

So we should just add `hf_pipeline.model.zero_grad()` after we remove the hooks?

https://github.com/ServiceNow/azimuth/blob/main/azimuth/modules/model_contracts/hf_text_classification.py#L169
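A minimal sketch of what this suggestion would look like, using a toy `nn.Linear` as a stand-in for `hf_pipeline.model` (the names here are illustrative, not Azimuth's actual code). After a saliency-style backward pass, the parameters hold `.grad` tensors; calling `zero_grad(set_to_none=True)` drops those tensors entirely so their memory can be freed, rather than just filling them with zeros.

```python
import torch
import torch.nn as nn

# Stand-in for hf_pipeline.model in the linked code.
model = nn.Linear(8, 2)
inputs = torch.randn(4, 8)

# Saliency computations run a backward pass, which populates .grad buffers.
logits = model(inputs)
logits.sum().backward()
assert model.weight.grad is not None  # gradients are retained after backward

# Proposed cleanup after removing the hooks: set_to_none=True releases the
# gradient tensors instead of zeroing them in place.
model.zero_grad(set_to_none=True)
assert model.weight.grad is None
```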

@gabegma
Contributor Author

gabegma commented Apr 19, 2023

I tried this and similar things, but I found nothing that worked. I'm not super knowledgeable about this area in general, though, so I welcome new ideas.

Here is an example of memory that gets allocated during the backward pass but never released. With every batch, more memory gets added: usually ~100 MiB, except for the first pass, where it's closer to ~300 MiB, as shown here.

[Screenshot: memory profile, Apr 19 2023, 16:06]

I did find that adding `with torch.no_grad():` to the prediction tasks helped; some memory was never released there either. I was planning to commit that, but I'm not sure what to do for custom pipelines that might not be PyTorch.
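One way to handle the non-PyTorch case might be to only enter `torch.no_grad()` when the pipeline's model is actually a `torch.nn.Module`, and use a no-op context otherwise. This is just a sketch of that idea; `predict` is a hypothetical helper, not an Azimuth API.

```python
from contextlib import nullcontext

import torch
import torch.nn as nn

def predict(model, inputs):
    # Disable autograd only for PyTorch models; custom pipelines that are
    # not nn.Modules fall back to a no-op context manager.
    ctx = torch.no_grad() if isinstance(model, nn.Module) else nullcontext()
    with ctx:
        return model(inputs)

model = nn.Linear(8, 2)
out = predict(model, torch.randn(4, 8))
# No computation graph was built, so no activations are retained in memory.
assert not out.requires_grad
```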
