Hello, I found a spot in the Chapter 7 code that could be optimized. In the tokenizer function, `padding='max_length'` can be removed, since it wastes compute. When a tokenizer is passed to the `Trainer` constructor from transformers, the default `data_collator` uses dynamic padding, padding each batch only to its own longest sequence, which saves compute.
Running on my CPU, the estimated time dropped from 36 hours to 2 hours (I did not run it to completion; this is the estimate shown by the progress bar).
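For reference, a minimal sketch of the change being suggested, not the book's exact code: the checkpoint name `bert-base-chinese` and the `"text"` column are placeholder assumptions; swap in whatever Chapter 7 actually uses.

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint

# Before: every example is padded to the model's maximum length, so much of
# the compute is spent attending over padding tokens.
def tokenize_static(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length")

# After: no padding at tokenization time. Each batch is padded by the
# collator only to that batch's longest sequence.
def tokenize_dynamic(examples):
    return tokenizer(examples["text"], truncation=True)

# Explicit form of the dynamic collator. If a tokenizer is passed to Trainer
# and no data_collator is given, Trainer falls back to this collator, so
# constructing it by hand is optional.
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```

The speedup comes from batch composition: with dynamic padding, a batch of short sentences stays short, instead of every batch being inflated to the global maximum length.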
The 36-to-2-hour figure is the training time on the SSC task.
Thank you for the suggestion; we will take it into account and optimize this in a future update. Thanks!