Recover attention scores #70
Comments
I don't believe that's possible because the order of computation is
In the Performer paper, the authors use a special "V", which is a diagonal matrix (one-hot indicators), so that the attention outputs simply equal the attention scores. I suggest reading the paragraphs around Figure 10 in the paper. However, I am having trouble implementing it, because it is awkward to pass both the attention scores and the results to other functions/classes at the same time.
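To make the one-hot "V" trick concrete, here is a minimal sketch in plain Python. It is not the code from this repository. The helper names (`feature_map`, `linear_attention`, `recover_attention`, `explicit_attention`) are hypothetical, and the elementwise `exp` feature map is only a stand-in for the Performer's random features. It assumes non-causal attention with the usual row normalization, and checks that running the linear-order computation with an identity matrix as V gives back the row-normalized attention matrix.

```python
import math
import random

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def feature_map(x):
    # Hypothetical positive feature map (elementwise exp).
    # Assumption: stands in for the Performer's random-feature map.
    return [[math.exp(v) for v in row] for row in x]

def linear_attention(q_prime, k_prime, v):
    # Computes D^{-1} Q' (K'^T V) without forming the L x L matrix Q' K'^T.
    kv = matmul(transpose(k_prime), v)            # (m x d_v)
    k_sum = [sum(col) for col in zip(*k_prime)]   # K'^T 1, length m
    out = matmul(q_prime, kv)                     # (L x d_v)
    norm = [sum(a * b for a, b in zip(row, k_sum)) for row in q_prime]
    return [[x / n for x in row] for row, n in zip(out, norm)]

def identity(n):
    # Each row is a one-hot indicator for one key position.
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def recover_attention(q_prime, k_prime):
    # With V = I, output row i is exactly the attention weights of query i.
    return linear_attention(q_prime, k_prime, identity(len(k_prime)))

def explicit_attention(q_prime, k_prime):
    # Quadratic reference: row-normalized Q' K'^T.
    scores = matmul(q_prime, transpose(k_prime))
    return [[s / sum(row) for s in row] for row in scores]

# Small demo: 4 positions, 3 feature dimensions.
random.seed(0)
q = feature_map([[random.gauss(0, 1) for _ in range(3)] for _ in range(4)])
k = feature_map([[random.gauss(0, 1) for _ in range(3)] for _ in range(4)])
recovered = recover_attention(q, k)
reference = explicit_attention(q, k)
```

Because the weights come from a second call with a different V, the normal forward pass does not need to return two things at once. The trade-off is that the recovered matrix is L x L, so this is only practical for inspection or visualization, not for training.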
@lucidrains Could you please help us with the implementation for obtaining the attention weights?
Is it possible to recover the attention scores from the Fast Attention module?