> 3.3 Download the pretrained weights of the minimind language model (Baidu Netdisk or HuggingFace), put them in the ./out/ directory, and name the file *_llm.pth

1. Are these weights exactly the ones trained in the minimind project (https://github.com/jingyaogong/minimind?tab=readme-ov-file#%E8%AE%AD%E7%BB%83%E5%AE%8C%E6%88%90%E7%9A%84%E6%A8%A1%E5%9E%8B%E6%9D%83%E9%87%8D)? The vision model's Transformer has one extra VisionProj layer compared with the language model, so I'm not sure whether they can be loaded directly with state_dict = torch.load(ckp, map_location=args.device).
2. After loading the pretrained language-model parameters, fine-tuning the vision model may degrade or even collapse the language model's performance. Should the learning rate be deliberately lowered for this reason?
1. Yes, these are the pure language-model weights trained in the neighboring minimind repo; they can be used directly, just pass strict=False when loading.
2. Yes, the learning rate needs to be lowered.
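A minimal sketch of both points above. The model class, layer names, and sizes here are hypothetical stand-ins for the actual minimind-v architecture; the point is that strict=False lets load_state_dict skip keys (like the vision projection) that are absent from the language-model checkpoint, and that parameter groups let the pretrained backbone train at a smaller learning rate than the freshly initialized vision layer.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a vision-language model: a pretrained language
# part plus a VisionProj-style layer that the LLM checkpoint does not have.
class VLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(100, 32)     # pretrained in the language model
        self.vision_proj = nn.Linear(16, 32)   # new layer, absent from *_llm.pth

model = VLM()

# Simulate a language-model-only checkpoint (no vision_proj keys).
llm_state = {k: v for k, v in model.state_dict().items()
             if not k.startswith("vision_proj")}

# strict=False tolerates the missing keys; vision_proj keeps its random init.
result = model.load_state_dict(llm_state, strict=False)
print(result.missing_keys)   # the vision_proj weights, as expected

# Lower learning rate for the pretrained backbone to avoid collapsing it,
# larger one for the new vision layer (values are illustrative only).
optim = torch.optim.AdamW([
    {"params": model.embed.parameters(), "lr": 1e-5},
    {"params": model.vision_proj.parameters(), "lr": 1e-4},
])
```

In a real run the checkpoint would come from torch.load(ckp, map_location=args.device) as in the question; only the strict=False argument changes.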