
How to modify the attention map in cross-attention during inference? #52

Open
yuxinchen-210210203 opened this issue Jan 22, 2025 · 0 comments

Comments

@yuxinchen-210210203

Thank you for your great work!
I want to modify the attention map in the cross-attention during inference, but I cannot access the internals of the flash_attn_varlen_kvpacked_func function, so I implemented my own cross-attention. However, its output differs to some extent from the output of flash_attn_varlen_kvpacked_func. Is there a better way to manipulate the attention map directly within flash_attn_varlen_kvpacked_func?
Or could there be an issue with my cross-attention implementation? The differences between my implementation's output and the reference output are shown in the figures below.

[Three screenshots comparing the output of the custom cross-attention implementation against the output of flash_attn_varlen_kvpacked_func]
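For reference, this is roughly the kind of naive replacement I mean. It is only a sketch: the function name naive_varlen_kvpacked_attention and the attn_map_hook argument are my own, and it assumes the documented flash-attn varlen kvpacked layout (q of shape (total_q, nheads, headdim), kv of shape (total_k, 2, nheads, headdim), with cu_seqlens_q / cu_seqlens_k as cumulative sequence lengths) and an equal number of query and key/value heads (no MQA/GQA):

```python
import torch

def naive_varlen_kvpacked_attention(q, kv, cu_seqlens_q, cu_seqlens_k,
                                    softmax_scale=None, attn_map_hook=None):
    """Naive reference for flash_attn_varlen_kvpacked_func's unpadded layout.

    q:  (total_q, nheads, headdim)
    kv: (total_k, 2, nheads, headdim)
    attn_map_hook, if given, receives the post-softmax attention map of shape
    (nheads, seqlen_q, seqlen_k) for each sequence in the batch and returns a
    (possibly modified) map of the same shape.
    """
    head_dim = q.shape[-1]
    if softmax_scale is None:
        softmax_scale = head_dim ** -0.5
    out = torch.empty_like(q)
    k, v = kv.unbind(dim=1)  # each: (total_k, nheads, headdim)
    for i in range(len(cu_seqlens_q) - 1):
        qs, qe = cu_seqlens_q[i], cu_seqlens_q[i + 1]
        ks, ke = cu_seqlens_k[i], cu_seqlens_k[i + 1]
        # (nheads, seqlen, headdim); do the math in fp32 to keep the numerics
        # close to FlashAttention's internal fp32 accumulation.
        qi = q[qs:qe].transpose(0, 1).float()
        ki = k[ks:ke].transpose(0, 1).float()
        vi = v[ks:ke].transpose(0, 1).float()
        scores = torch.matmul(qi, ki.transpose(-2, -1)) * softmax_scale
        attn = scores.softmax(dim=-1)      # (nheads, seqlen_q, seqlen_k)
        if attn_map_hook is not None:
            attn = attn_map_hook(attn, i)  # modify the attention map here
        oi = torch.matmul(attn, vi)        # (nheads, seqlen_q, headdim)
        out[qs:qe] = oi.transpose(0, 1).to(q.dtype)
    return out
```

Note that, as far as I understand, even a correct naive implementation will not match flash_attn_varlen_kvpacked_func bit-for-bit: FlashAttention accumulates the softmax and output in fp32 with a different reduction order, so small elementwise differences (roughly 1e-3 scale for fp16 inputs) are expected. Computing the reference in fp32, as above, usually shrinks the gap.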
