I tried this and similar things, but found nothing that worked. I'm not very knowledgeable about this area in general, though, so I welcome new ideas.
Here is an example of memory that is allocated during the backward pass but never released. Some memory is added on every batch: usually ~100 MiB, except for the first pass, where it's closer to ~300 MiB, as shown here.
I did find that wrapping the prediction tasks in `with torch.no_grad():` helped; some memory was never released there either. I was planning on committing that, but I'm not sure what to do for custom pipelines that might not be PyTorch.
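For reference, this is roughly what that change looks like. The model and inputs below are stand-ins, not the project's actual pipeline; the point is that `torch.no_grad()` stops autograd from building a graph during pure prediction, so the activations it would otherwise retain are never allocated.

```python
import torch

# Hypothetical stand-ins for the real model and batch.
model = torch.nn.Linear(10, 2)
x = torch.randn(4, 10)

# Inside no_grad, the forward pass records no autograd graph, so no
# intermediate activations are kept alive for a backward pass that
# will never happen.
with torch.no_grad():
    preds = model(x)

# The output carries no graph reference.
assert preds.requires_grad is False
```

`torch.inference_mode()` is a slightly stricter alternative in recent PyTorch versions, but `no_grad` is the more portable choice if older versions must be supported.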
When profiling the memory, I noticed that each backward pass increases the memory used, and that memory is never released.
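I don't know yet what's responsible here, but for context, one common pattern that produces exactly this symptom is holding a reference to the loss tensor across batches (e.g. for logging), which keeps each batch's autograd graph alive. The code below is an illustrative sketch, not this project's code:

```python
import torch

# Hypothetical stand-in model.
model = torch.nn.Linear(10, 1)

losses = []
for _ in range(3):
    x = torch.randn(8, 10)
    loss = model(x).pow(2).mean()
    loss.backward()
    # Appending `loss` itself would keep a reference to the whole
    # autograd graph of every batch, so memory grows each iteration.
    # Using .item() (or .detach()) breaks that reference.
    losses.append(loss.item())
```

If something similar is happening internally, memory would grow by roughly one batch's worth of graph per iteration, which matches the per-batch increase described above.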