fix latex (huggingface#928)
kashif authored Mar 13, 2023
1 parent 181c0f4 commit 3e55cb5
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion informer.md
@@ -60,7 +60,7 @@ $$
\textrm{ProbSparseAttention}(Q, K, V) = \textrm{softmax}(\frac{Q_{reduce}K^T}{\sqrt{d_k}} )V
$$

-where the \\(Q_{reduce}\\) matrix only selects the Top \\(u)\\ "active" queries. Here, \\(u = c \cdot \log L_Q\\) and \\(c\\) called the _sampling factor_ hyperparameter for the ProbSparse attention. Since \\(Q_{reduce}\\) selects only the Top \\(u\\) queries, its size is \\(c\cdot \log L_Q \times d\\), so the multiplication \\(Q_{reduce}K^T\\) takes only \\(O(L_K \log L_Q) = O(T \log T)\\).
+where the \\(Q_{reduce}\\) matrix only selects the Top \\(u\\) "active" queries. Here, \\(u = c \cdot \log L_Q\\) and \\(c\\) called the _sampling factor_ hyperparameter for the ProbSparse attention. Since \\(Q_{reduce}\\) selects only the Top \\(u\\) queries, its size is \\(c\cdot \log L_Q \times d\\), so the multiplication \\(Q_{reduce}K^T\\) takes only \\(O(L_K \log L_Q) = O(T \log T)\\).

This is good! But how can we select the \\(u\\) "active" queries to create \\(Q_{reduce}\\)? Let's define the _Query Sparsity Measurement_.
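For readers of this hunk, the size and cost claims in the changed paragraph can be checked with a small pure-Python sketch. It is not taken from `informer.md` or from any library. The helper names (`num_active_queries`, `score_matrix_cost`, `reduced_scores`) are hypothetical, and so are the choices to use the natural log, round up, and clamp \\(u\\) to the range \\([1, L_Q]\\). The indices of the "active" queries are passed in by hand, since the hunk does not show how they are chosen.

```python
import math


def num_active_queries(L_Q: int, c: int = 5) -> int:
    """u = c * log(L_Q). Natural log, ceil, and the clamp to [1, L_Q] are assumptions."""
    return max(1, min(L_Q, math.ceil(c * math.log(L_Q))))


def score_matrix_cost(num_queries: int, L_K: int, d: int) -> int:
    """Multiply-adds needed to form a (num_queries x L_K) score matrix from d-dim vectors."""
    return num_queries * L_K * d


def reduced_scores(Q, K, active_idx):
    """Compute Q_reduce K^T / sqrt(d) using only the query rows listed in active_idx.

    Q and K are lists of equal-length rows; active_idx is supplied by the caller
    (the selection rule itself is outside the scope of this sketch).
    """
    scale = math.sqrt(len(K[0]))
    return [
        [sum(q * k for q, k in zip(Q[i], key)) / scale for key in K]
        for i in active_idx
    ]


if __name__ == "__main__":
    L_Q = L_K = 1024
    d = 64
    u = num_active_queries(L_Q, c=5)
    print("active queries u:", u)
    print("full score cost:   ", score_matrix_cost(L_Q, L_K, d))
    print("reduced score cost:", score_matrix_cost(u, L_K, d))
```

The reduced score matrix has one row per selected query and one column per key, so its cost grows with \\(u \cdot L_K\\) rather than \\(L_Q \cdot L_K\\), which is the \\(O(L_K \log L_Q)\\) behaviour the paragraph describes.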
