Why is it implemented this way in the code? The direct consequence is that the task is no longer long-context reading, but reading only model_max_length tokens of text. If the answer falls in the middle portion, even full attention cannot get it right. And if the answer falls near the beginning or the end, the truncation effectively denoises the input by stripping out the distracting middle. Either way, the results deviate from the true long-context setting.
Only Llama-3 does this, because Llama-3's context length is only 8k; Mistral's is 32k. The code is implemented this way because keeping both ends is relatively more reasonable than truncating only the tail or only the head.
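For readers following along, the middle-truncation pattern being discussed usually looks something like the sketch below. This is a minimal illustration, not the repository's actual code; the model name, `model_max_length` value, and the `truncate_middle` helper are assumptions for the example.

```python
# Sketch (assumed, not the repo's exact implementation): when the tokenized
# prompt exceeds model_max_length, keep the first half and the last half of
# the tokens and drop the middle.
from transformers import AutoTokenizer

model_max_length = 8192  # e.g. Llama-3's 8k context window (assumption)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

def truncate_middle(prompt: str) -> str:
    input_ids = tokenizer(prompt, truncation=False).input_ids
    if len(input_ids) <= model_max_length:
        return prompt
    half = model_max_length // 2
    # Keep the head and the tail; the middle of the document is discarded,
    # which is why answers located in the middle become unrecoverable.
    kept = input_ids[:half] + input_ids[-half:]
    return tokenizer.decode(kept, skip_special_tokens=True)
```

This is the trade-off the reply describes: compared with cutting only the head or only the tail, keeping both ends preserves the prompt instructions and the question, at the cost of losing whatever sits in the middle of the document.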