Hello! When I run train.py, I hit an out-of-memory error after a few epochs. It still happens even if I increase the number of GPUs, and I found that other people have run into this as well, but I don't know the cause. Could you offer some help? Thank you very much!
Here is the relevant output:
step 120, image: 005365.jpg, loss: 6.3531, fps: 3.71 (0.27s per batch)
TP: 0.00%, TF: 100.00%, fg/bg=(14/285)
rpn_cls: 0.6417, rpn_box: 0.0229, rcnn_cls: 1.9303, rcnn_box: 0.1354
step 130, image: 009091.jpg, loss: 4.8151, fps: 3.78 (0.26s per batch)
TP: 0.00%, TF: 100.00%, fg/bg=(22/277)
rpn_cls: 0.6486, rpn_box: 0.2012, rcnn_cls: 1.7988, rcnn_box: 0.1184
step 140, image: 008690.jpg, loss: 4.9961, fps: 3.55 (0.28s per batch)
TP: 0.00%, TF: 100.00%, fg/bg=(30/269)
rpn_cls: 0.6114, rpn_box: 0.0690, rcnn_cls: 1.4801, rcnn_box: 0.1088
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "train.py", line 138, in
loss.backward()
File "/usr/local/lib/python2.7/dist-packages/torch/tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/init.py", line 89, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
Try PyTorch 0.3.1 with cudatoolkit 8.0.
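Before and after downgrading, it is worth confirming which PyTorch/CUDA build the interpreter is actually picking up. A quick check, in plain Python and independent of this repo's code:

```python
import torch

# Print the PyTorch version and the CUDA toolkit it was built against,
# plus whether a GPU is actually visible to this interpreter.
print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```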
I was on version 0.4.1 as well and hit the same error (there may be a GPU memory leak in the code), so I downgraded PyTorch.
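For what it's worth, a common source of steadily growing GPU memory when 0.3-era code runs on PyTorch 0.4+ is accumulating the loss tensor itself instead of a plain Python number, which keeps every iteration's autograd graph alive. I haven't checked whether this repo's train.py does this; the snippet below is only a minimal sketch with placeholder names (model, loader, criterion, optimizer), not the actual training code:

```python
import torch

def train_one_epoch(model, loader, criterion, optimizer, device):
    """Placeholder training loop illustrating the loss-accumulation pitfall."""
    model.train()
    running_loss = 0.0
    for images, targets in loader:
        images, targets = images.to(device), targets.to(device)

        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()

        # Leak: `running_loss += loss` would keep each iteration's graph alive,
        # so GPU memory grows until backward() eventually fails with OOM.
        # Accumulating a plain Python float releases the graph each step.
        running_loss += loss.item()

    return running_loss / len(loader)
```

If the loss values are only needed for logging, detaching them (loss.detach()) or running evaluation code under torch.no_grad() likewise avoids retaining the graph.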