Code question #72

Open
triumph693 opened this issue Oct 21, 2021 · 5 comments

Comments

@triumph693

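Hello, I have a question: when training reached epoch 10, it suddenly ran out of memory. Is that because the code is memory-hungry, or is something else going on? Looking forward to your reply.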

@Capricorn231
Collaborator

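This happens because the backbone is frozen and does not take part in training for the first 10 epochs, so memory usage is low during those epochs. At epoch 10 the backbone is unfrozen and starts training, which increases memory usage. You can try changing BACKBONE.TRAIN_EPOCH in config.yaml from 10 to 0, so that the backbone trains from the very first epoch and any out-of-memory error shows up immediately. Once you have found a batch size that fits, change BACKBONE.TRAIN_EPOCH back to 10.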
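The freeze schedule described above can be sketched in a few lines of Python. This is only an illustration, not the repository's actual training loop: the function names, the 0-based epoch numbering, and the total of 20 epochs are all assumptions made for the example. It shows why setting BACKBONE.TRAIN_EPOCH to 0 makes the higher memory cost appear at the very start of training instead of at epoch 10.

```python
# Hypothetical sketch of a backbone freeze schedule.
# Names and the 0-based epoch convention are assumptions, not the repo's code.
from typing import Optional


def backbone_is_trainable(epoch: int, train_epoch: int) -> bool:
    """Return True once the backbone is unfrozen at the given epoch."""
    return epoch >= train_epoch


def first_unfrozen_epoch(train_epoch: int, total_epochs: int) -> Optional[int]:
    """Return the first epoch at which the backbone trains (and memory use rises)."""
    for epoch in range(total_epochs):
        if backbone_is_trainable(epoch, train_epoch):
            return epoch
    return None  # the backbone stays frozen for the whole run


if __name__ == "__main__":
    total_epochs = 20  # assumed value, for illustration only
    for train_epoch in (10, 0):
        unfrozen_at = first_unfrozen_epoch(train_epoch, total_epochs)
        print(f"TRAIN_EPOCH={train_epoch}: backbone first trains at epoch {unfrozen_at}")
```

With TRAIN_EPOCH set to 0, the first iteration already includes the backbone, so a batch size that survives a short run there should also survive the later epochs of a normal run.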

@triumph693
Author

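Hello, how long did training take to finish on your machine? My machine is not as well equipped, and it looks like training will take about 4 days here. Looking forward to your reply.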

@Capricorn231
Collaborator

On four 2080 Ti GPUs it takes roughly 1 to 2 days.

@triumph693
Author

OK, thank you.

@triumph693
Author

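Hello, I see that your paper says COCO, ImageNet DET, ImageNet VID, and YouTube-BB were used for training. How large are these datasets in total? I want to check whether I have enough disk space. I trained on the got10k dataset alone and found the results were not good. Did you train with the got10k dataset?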
