Training is OK, but evaluation runs out of memory (OOM) #3

TheHonestBob · 2022-03-16T02:16:12Z

Thanks for your project. When I run sh finetune.sh, training works, but an OOM error occurs during evaluation, even though I set eval_batch_size=1. My GPU is a 2080 Ti with 11 GB of memory.

ZhuohanX · 2022-03-21T06:32:04Z

Hi,
Could you please tell me which package versions you are using to run the code?
I tried Python 3.7.0 with torch==1.4.0, but got the error that module 'torch.cuda' has no attribute 'amp'. I think this is because torch.cuda.amp was only added in torch 1.6.
I then switched to Python 3.8 with torch 1.7, but got the following error when running the fine-tuning:

  File "/home/zhuohanx/HINT/model/utils.py", line 294, in collate_fn
    batch_score.append([float(s.split()[2]) for s in score_list])
  File "/home/zhuohanx/HINT/model/utils.py", line 294, in <listcomp>
    batch_score.append([float(s.split()[2]) for s in score_list])
IndexError: list index out of range
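
For reference, the expression s.split()[2] raises this IndexError whenever an entry has fewer than three whitespace-separated fields, which happens regardless of the torch version. The following is a minimal sketch, not the project's code: parse_third_field is a hypothetical helper and the sample strings are made up, since the real data format is not shown in this thread. It reports the offending entry instead of failing with a bare IndexError.

```python
def parse_third_field(score_list):
    """Return the third whitespace-separated field of each entry as a float.

    Hypothetical helper: raises a ValueError that names the malformed entry
    rather than letting s.split()[2] fail with "list index out of range".
    """
    scores = []
    for i, s in enumerate(score_list):
        fields = s.split()
        if len(fields) < 3:
            raise ValueError(
                f"entry {i} has {len(fields)} field(s), expected at least 3: {s!r}"
            )
        scores.append(float(fields[2]))
    return scores


# Made-up sample data, for illustration only.
well_formed = ["a b 0.5", "c d 1.25"]
malformed = ["a b 0.5", "c d"]  # second entry has only two fields

print(parse_third_field(well_formed))
try:
    parse_third_field(malformed)
except ValueError as err:
    print("bad input:", err)
```

Running a check like this over the data file could show whether some lines are empty or use a different column layout than the code expects.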

I am not sure whether this is caused by the different package versions or by a bug in the code.
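
On the version side, torch.cuda.amp first shipped in PyTorch 1.6, so torch 1.4.0 cannot provide it. A small sketch of a version guard follows, assuming a version string in the usual "major.minor.patch" form; has_native_amp is a hypothetical name and is not part of torch or of this project.

```python
import re


def has_native_amp(torch_version: str) -> bool:
    """Return True if the version string is at least 1.6.

    Hypothetical helper: 1.6 is the release that introduced torch.cuda.amp.
    Only the leading "major.minor" part is parsed, so local suffixes such as
    "+cu110" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 6)


print(has_native_amp("1.4.0"))        # the first setup tried above
print(has_native_amp("1.7.0+cu110"))  # the second setup, with a local suffix
```

In a real script the argument would be torch.__version__, which lets the code fail early with a clear message on an older install.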

Thank you in advance!
