Basis for the branch condition #17
Comments
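In PyramidKV's scheme, the KV cache budget of the bottom layer is close to (self.max_capacity_prompt - self.window_size) * 2. If q_len is smaller than (self.max_capacity_prompt - self.window_size) * 2, the bottom-layer budget would be larger than q_len, so this condition was added to prevent an error. In that branch the dynamic budget adjustment does not take effect. When the later else branch is taken, the dynamic budget adjustment does take effect.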
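The branch logic described in this reply can be sketched in plain Python. This is only an illustration, not the code from pyramidkv_utils.py: the function name `select_k` and the parameter `layer_budget` (standing in for the per-layer `max_capacity_prompt` used in the else branch) are hypothetical, and the `if` that precedes the `elif` in the repository is left out because it is not shown in this thread.

```python
def select_k(q_len, max_capacity_prompt, window_size, layer_budget):
    """Return the k that would be passed to topk for one layer.

    layer_budget is a hypothetical stand-in for the per-layer budget
    that the pyramid schedule assigns (largest at the bottom layer).
    """
    uniform_budget = max_capacity_prompt - window_size
    if q_len < uniform_budget * 2:
        # Short prompt: a bottom-layer budget close to 2 * uniform_budget
        # could exceed q_len, so fall back to the fixed uniform budget.
        return uniform_budget
    # Long prompt: the dynamically adjusted per-layer budget is used.
    return layer_budget


# Illustrative numbers (assumed, not taken from the thread).
short_k = select_k(q_len=150, max_capacity_prompt=128, window_size=32, layer_budget=180)
long_k = select_k(q_len=4096, max_capacity_prompt=128, window_size=32, layer_budget=180)
print(short_k, long_k)
```

With these numbers the threshold is (128 - 32) * 2 = 192. The 150-token prompt falls below it, so a bottom-layer budget of 180 would ask top-k for more entries than exist, and the fixed budget of 96 is used instead.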
Thank you very much for the prompt reply! I have one more question, about this line: if key_states.shape[-2] == kv_seq_len:
It probably does have a return value, but my code does not assign it to a variable. Just create a new variable to receive the return value.
Thank you for sharing the code. In pyramidkv_utils.py there is this branch:
elif q_len < (self.max_capacity_prompt - self.window_size) * 2: indices = attn_cache.topk(self.max_capacity_prompt - self.window_size, dim=-1).indices
What is the basis for choosing this condition? And what about the branch that follows it, else: indices = attn_cache.topk(max_capacity_prompt, dim=-1).indices? I hope the author can explain. Thanks!