Hi,

Just wondering how well your implementation of KV Cache int8 PTQ works with FlashAttention's KV cache partitioning strategy. Do they work together? Are there any statistics on putting all the optimizations together?

Replies: 1 comment

- They can be used together.
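
A minimal NumPy sketch (not the maintainers' actual code) of why the two optimizations compose: int8 PTQ only changes how K/V are *stored*, while split-KV ("flash-decoding"-style) partitioning only changes how the softmax is *evaluated*. All names here (`quantize_int8`, `split_kv_attention`, `num_partitions`) are hypothetical, and symmetric per-tensor quantization is an assumption; real implementations typically use per-head or per-channel scales.

```python
# Hypothetical sketch: int8 KV-cache PTQ combined with partitioned
# (split-KV) attention. Quantization and partitioning are orthogonal:
# each KV tile is dequantized independently, and the per-partition
# softmax statistics are merged exactly with a running-max reduction.
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 PTQ: x ~= q * scale (assumed scheme)."""
    scale = np.abs(x).max() / 127.0 + 1e-12
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

def split_kv_attention(q_vec, k_q, k_s, v_q, v_s, num_partitions=4):
    """Single-query attention over an int8 KV cache, evaluated one
    partition at a time and merged with online-softmax statistics."""
    seq_len = k_q.shape[0]
    bounds = np.linspace(0, seq_len, num_partitions + 1, dtype=int)
    m, l = -np.inf, 0.0                     # running max / running sum-exp
    acc = np.zeros_like(q_vec, dtype=np.float32)
    for i in range(num_partitions):
        lo, hi = bounds[i], bounds[i + 1]
        k = dequantize(k_q[lo:hi], k_s)     # dequant happens per tile,
        v = dequantize(v_q[lo:hi], v_s)     # independent of the split
        s = k @ q_vec / np.sqrt(q_vec.shape[0])  # partial scores
        m_new = max(m, s.max())
        p = np.exp(s - m_new)
        correction = np.exp(m - m_new)      # rescale previous partials
        l = l * correction + p.sum()
        acc = acc * correction + p @ v
        m = m_new
    return acc / l

rng = np.random.default_rng(0)
d, n = 64, 1024
q_vec = rng.standard_normal(d).astype(np.float32)
k = rng.standard_normal((n, d)).astype(np.float32)
v = rng.standard_normal((n, d)).astype(np.float32)
k_q, k_s = quantize_int8(k)
v_q, v_s = quantize_int8(v)

out = split_kv_attention(q_vec, k_q, k_s, v_q, v_s)

# Reference: full-precision, unpartitioned attention.
s = k @ q_vec / np.sqrt(d)
p = np.exp(s - s.max())
ref = (p / p.sum()) @ v
print(np.max(np.abs(out - ref)))  # small: only PTQ error; the split is exact
```

The point the final print makes: the split-KV reduction is mathematically exact, so any accuracy difference versus full-precision attention comes from the quantization alone. That is why per-optimization accuracy statistics would be the useful thing to measure here, as the original question asks.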