Hi Otaheri,

Thank you for your good work!

Have you encountered the problem that, when training GNet, GPU memory keeps increasing with every batch and eventually causes a CUDA out-of-memory error? The first few iterations show a large jump in GPU memory usage, and each subsequent batch adds roughly 2–4 MB. As far as I know, unless variables with gradients are intentionally kept around, memory release and allocation should reach a steady state after the first training iteration, i.e., there should be no further growth in GPU memory. Could you help me solve this problem? Thank you very much!
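For reference, the most common cause of this steady per-batch growth in PyTorch is accumulating a loss tensor (which still carries its autograd graph) across iterations instead of a detached Python number. A minimal sketch of the correct pattern, using a placeholder model and random data rather than the actual GNet code:

```python
import torch
import torch.nn.functional as F

# Placeholder model/optimizer standing in for GNet and its training setup.
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(3):
    x = torch.randn(8, 4)
    y = torch.randn(8, 1)

    loss = F.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

    # Leak pattern: `running_loss += loss` keeps every iteration's
    # computation graph alive, so memory grows each batch.
    # Detaching to a Python float releases the graph:
    running_loss += loss.item()
```

If the logging/accumulation code in the training loop sums raw loss tensors (or stores them in a list for later averaging), switching to `loss.item()` or `loss.detach()` usually stops the growth.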
@siddharthKatageri Hello, this issue may be caused by a version mismatch between PyTorch and its dependencies. I resolved the problem by adjusting my environment configuration. I hope this information helps you as well.
Hi @siddharthKatageri, @z050209, @Lin-ZN, did you find a solution for this? Unfortunately, I haven't been able to replicate it so far. Could you provide more details about your system and config so that we can try to reproduce the issue and find a solution? Also, if you have already found a fix, it would be great if you could share it. Thanks.