Replies: 1 comment
-
Yes, we are working to bring this feature to DeepSpeed soon.
-
It seems that padding every sequence to max_seq_len adds a lot of cost, especially in text generation tasks.
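To make the cost concrete, here is a rough sketch of how much of a batch ends up being padding. The helper `padding_overhead` and the prompt lengths are made up for illustration and are not part of DeepSpeed; it only counts token slots and says nothing about actual kernel time.

```python
def padding_overhead(lengths, max_seq_len):
    """Fraction of token slots that are padding when every sequence
    in the batch is padded to max_seq_len."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if any(n > max_seq_len for n in lengths):
        raise ValueError("a sequence is longer than max_seq_len")
    total_slots = len(lengths) * max_seq_len
    real_tokens = sum(lengths)
    return (total_slots - real_tokens) / total_slots


# Hypothetical prompt lengths for one generation batch.
lengths = [12, 30, 7, 55]

# Padding to a fixed max_seq_len of 512.
fixed = padding_overhead(lengths, max_seq_len=512)

# Padding only to the longest sequence in the batch.
dynamic = padding_overhead(lengths, max_seq_len=max(lengths))

print(f"fixed: {fixed:.1%} padding, dynamic: {dynamic:.1%} padding")
```

With short prompts like these, almost all of the fixed-length batch is padding, which is why padding to the longest sequence in the batch (or avoiding padding altogether) matters so much for generation.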