Program failed to train, I am using one GPU to run the program #21
Comments
First of all, "num train = 0, num_val = 0" looks strange. Are you sure that your DataLoader defined in https://github.com/ternaus/robot-surgery-segmentation/blob/master/dataset.py is correct?
Second, can you delete the "runs/debug" folder and try again?
Yes, I deleted the "runs/debug" folder and tried again. That solved the "RuntimeError: Error(s) in loading state_dict for DataParallel" problem, but I still get "num train = 0, num_val = 0". Log of python prepare_train_val.py:
And my folder arrangement is:
Can you share the dataset from surgery/data/train/instrument_dataset_1 and surgery/data/test/instrument_dataset_1?
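One quick way to narrow down "num train = 0, num_val = 0" is to count the files each dataset folder actually contains. The sketch below is not taken from the repository: count_files is a hypothetical helper, and both the "images" subfolder name and the path in the usage line are assumptions about the layout, to be adjusted to whatever prepare_train_val.py expects.

```python
from pathlib import Path


def count_files(root, subfolder="images"):
    """Map each instrument_dataset_* folder under root to its file count.

    `subfolder` is an assumed layout detail; pass None to count files
    directly inside each dataset folder.
    """
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts  # a missing root shows up as an empty result
    for dataset_dir in sorted(root.glob("instrument_dataset_*")):
        target = dataset_dir / subfolder if subfolder else dataset_dir
        if target.is_dir():
            counts[dataset_dir.name] = sum(1 for p in target.glob("*") if p.is_file())
        else:
            counts[dataset_dir.name] = 0  # expected folder is absent
    return counts


if __name__ == "__main__":
    # Hypothetical path; replace with the folder the training script reads.
    print(count_files("data/train"))
```

An empty result or a row of zeros would suggest the script is looking in a different folder than the one holding the images.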
So for anyone encountering this error: check whether you changed the problem type. See #3 (comment).
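The "Missing key(s) in state_dict" error means the parameter names the current model expects differ from those stored in the checkpoint, which can happen when an old checkpoint saved under a different configuration is picked up. Below is a minimal sketch for comparing the two sets of names before loading. strip_prefix and diff_keys are hypothetical helpers, not part of the repository or of PyTorch, and they work on plain lists of key names.

```python
def strip_prefix(keys, prefix="module."):
    """Drop the prefix that nn.DataParallel adds to parameter names."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]


def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key names, each sorted.

    missing: expected by the model but absent from the checkpoint.
    unexpected: present in the checkpoint but unknown to the model.
    """
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return sorted(model_set - ckpt_set), sorted(ckpt_set - model_set)


if __name__ == "__main__":
    # Illustrative names only, not read from a real checkpoint.
    model_keys = ["module.encoder.0.weight", "module.encoder.0.bias"]
    checkpoint_keys = ["module.conv1.weight", "module.conv1.bias"]
    missing, unexpected = diff_keys(strip_prefix(model_keys),
                                    strip_prefix(checkpoint_keys))
    print("missing:", missing)
    print("unexpected:", unexpected)
```

With PyTorch, the two inputs would be model.state_dict().keys() and state['model'].keys(), where state is the loaded checkpoint as in the traceback. If both lists come back non-empty, the checkpoint was most likely saved from a different model than the one being built.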
num train = 0, num_val = 0
Traceback (most recent call last):
  File "train.py", line 157, in <module>
    main()
  File "train.py", line 152, in main
    num_classes=num_classes
  File "/content/drive/My Drive/surgery/data/utils.py", line 56, in train
    model.load_state_dict(state['model'])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
    Missing key(s) in state_dict: "module.encoder.0.weight", "module.encoder.0.bias", ...
...................................................