GPU memory usage fluctuates heavily during VQA downstream fine-tuning #128

Open
DeguangChen opened this issue Oct 18, 2024 · 9 comments

Comments

@DeguangChen

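For the downstream fine-tuning stage, I am fine-tuning on the dataset provided by the authors. I am using a single 3090, and GPU memory usage fluctuates heavily, especially during the first 20 fine-tuning steps. Have you run into this?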

@ZhangXJ199
Collaborator

GPU memory usage can indeed fluctuate during training. As long as the loss decreases normally, it is fine.
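To check how large the fluctuation actually is, memory can be sampled once per training step. The following is only a minimal sketch, assuming a PyTorch training loop: `MemoryTracker` and `cuda_memory_mib` are hypothetical helper names, not part of this project. `torch.cuda.memory_allocated()` reports memory held by live tensors, which is usually lower than the figure shown by `nvidia-smi`, because PyTorch's caching allocator keeps freed blocks reserved.

```python
from typing import Callable, Dict, List


def cuda_memory_mib() -> float:
    """Currently allocated CUDA tensor memory in MiB (0.0 if CUDA is unavailable)."""
    import torch  # imported lazily so the tracker itself has no hard dependency

    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.memory_allocated() / 2**20


class MemoryTracker:
    """Hypothetical helper: records one memory reading per training step."""

    def __init__(self, read_mib: Callable[[], float] = cuda_memory_mib):
        self.read_mib = read_mib
        self.samples: List[float] = []

    def record(self) -> float:
        """Take one reading and store it."""
        value = self.read_mib()
        self.samples.append(value)
        return value

    def summary(self) -> Dict[str, float]:
        """Return the number of readings plus their min, max, and spread."""
        if not self.samples:
            return {"steps": 0, "min": 0.0, "max": 0.0, "spread": 0.0}
        lo, hi = min(self.samples), max(self.samples)
        return {"steps": len(self.samples), "min": lo, "max": hi, "spread": hi - lo}
```

In a training loop, `tracker.record()` would be called after each optimizer step and `tracker.summary()` printed every few steps. If the maximum stays below the card's capacity and the loss keeps decreasing, the fluctuation is likely harmless.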

@DeguangChen
Author

Thank you for taking the time to reply.

@DeguangChen
Author

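One more question: could you release the pretrained model? I would like to fine-tune on downstream tasks starting from your pretrained model. Please!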

@ZhangXJ199
Collaborator

ZhangXJ199 commented Oct 19, 2024

We have already released some models on HF. See the model zoo section of the GitHub repository for details.

@DeguangChen
Author

[image]

Why do the gradients explode as soon as the language model is allowed to receive gradient updates? It is really frustrating.

@ZhangXJ199
Collaborator

In the pretraining stage we align image and text features using image-caption data. The language model needs to be frozen during this stage.
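Freezing the language model for the alignment stage can be sketched as follows. This is an illustration under stated assumptions, not the project's actual training code: `ToyVLM`, `connector`, `language_model`, and `freeze_language_model` are hypothetical names, and two small linear layers stand in for the real modules. The idea is to disable gradients on the language model's parameters and hand only the remaining trainable parameters to the optimizer.

```python
import torch
from torch import nn


class ToyVLM(nn.Module):
    """Hypothetical stand-in for a vision-language model (not the project's classes)."""

    def __init__(self, vision_dim: int = 8, hidden_dim: int = 16, vocab: int = 32):
        super().__init__()
        self.connector = nn.Linear(vision_dim, hidden_dim)   # image-to-text projection
        self.language_model = nn.Linear(hidden_dim, vocab)   # placeholder for the LLM

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        return self.language_model(self.connector(image_features))


def freeze_language_model(model: ToyVLM) -> list:
    """Disable gradients for the language model; return the still-trainable parameters."""
    for param in model.language_model.parameters():
        param.requires_grad_(False)
    return [p for p in model.parameters() if p.requires_grad]


torch.manual_seed(0)
model = ToyVLM()
trainable = freeze_language_model(model)
optimizer = torch.optim.AdamW(trainable, lr=1e-3)  # optimizer only sees the connector

# Keep copies of the weights to compare after one update step.
lm_before = model.language_model.weight.clone()
connector_before = model.connector.weight.clone()

# One training step on random data (illustrative only).
features = torch.randn(4, 8)
targets = torch.randint(0, 32, (4,))
loss = nn.functional.cross_entropy(model(features), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After this step the connector weights have changed while the language model weights are identical to `lm_before`, which is the behavior the alignment stage relies on.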

@DeguangChen
Author

Oh, thank you very much. Could you please accept my WeChat request?

@YingHuTsing
Collaborator

> Oh, thank you very much. Could you please accept my WeChat request?

Accepted.

@DeguangChen
Author

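> > Oh, thank you very much. Could you please accept my WeChat request?
>
> Accepted.

Thank you, I have joined. I will organize my questions and then get back to you.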
