Thank you for your great work!

I want to modify the attention map in the cross-attention during inference, but I cannot access the internals of the `flash_attn_varlen_kvpacked_func` function, so I implemented my own cross-attention. However, I found that its output differs somewhat from the output of `flash_attn_varlen_kvpacked_func`. Is there a better way to directly manipulate the attention map within `flash_attn_varlen_kvpacked_func`?
Or could there be an issue with the way I implemented the cross-attention computation? The differences between my implementation's output and the flash-attention output are shown in the figure below.
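For comparison, here is a minimal sketch of a naive (non-fused) equivalent of the varlen kvpacked call, assuming no dropout, no causal mask, and equal query/key head counts; the function name `naive_varlen_kvpacked_attn` is hypothetical, not part of the flash-attn API. The softmax is materialized explicitly, so the attention map can be edited in place:

```python
import torch

def naive_varlen_kvpacked_attn(q, kv, cu_seqlens_q, cu_seqlens_k, softmax_scale=None):
    """Reference cross-attention over packed variable-length sequences.

    q:  (total_q, nheads, headdim)     -- packed queries
    kv: (total_k, 2, nheads, headdim)  -- packed keys/values
    cu_seqlens_q / cu_seqlens_k: (batch + 1,) int32 cumulative sequence
        lengths, in the layout flash_attn_varlen_kvpacked_func expects.
    Returns: (total_q, nheads, headdim), same dtype as q.
    """
    if softmax_scale is None:
        softmax_scale = q.shape[-1] ** -0.5  # flash-attn default: 1/sqrt(headdim)
    out = torch.empty_like(q)
    for b in range(len(cu_seqlens_q) - 1):
        qs, qe = cu_seqlens_q[b], cu_seqlens_q[b + 1]
        ks, ke = cu_seqlens_k[b], cu_seqlens_k[b + 1]
        qi = q[qs:qe]                       # (seqlen_q, h, d)
        k, v = kv[ks:ke, 0], kv[ks:ke, 1]   # (seqlen_k, h, d) each
        # Accumulate scores and softmax in fp32: running the softmax in
        # fp16/bf16 is a common source of small mismatches vs. the kernel.
        scores = torch.einsum("qhd,khd->hqk", qi.float(), k.float()) * softmax_scale
        attn = scores.softmax(dim=-1)       # <- the attention map; modify it here
        out[qs:qe] = torch.einsum("hqk,khd->qhd", attn, v.float()).to(q.dtype)
    return out
```

Note that even a correct reference implementation will not match the fused kernel bit-for-bit: flash attention accumulates in fp32 with a different reduction order, so differences on the order of fp16/bf16 rounding (checked with `torch.allclose` at a loose tolerance such as `atol=1e-3` for fp16) are expected. Larger discrepancies usually point to a missing `1/sqrt(headdim)` scale, a dropped causal mask or dropout, or softmax computed in half precision.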