
Code implementing the top-k functionality #2

Open

z972778371 opened this issue Mar 3, 2022 · 8 comments

Comments

@z972778371

Hello, I'd like to ask: is the code implementing the top-k functionality all contained in the SparseActivatedMultiheadAttention class in sparse_activated_multihead_attention.py?

@zhaoguangxiang
Collaborator

Yes
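
For context, the mechanism being confirmed here is explicit top-k selection over the attention logits: keep the k largest scores per query, mask the rest to -inf, and normalize. A minimal PyTorch sketch of the idea, as an illustration rather than the repo's exact code:

```python
import torch

def topk_attention(attn_logits: torch.Tensor, k: int) -> torch.Tensor:
    """attn_logits: (bsz * num_heads, tgt_len, src_len) raw QK^T scores."""
    if k >= attn_logits.size(-1):
        return torch.softmax(attn_logits, dim=-1)  # nothing to prune
    # the k-th largest logit in each query row is the cutoff threshold
    kth = attn_logits.topk(k, dim=-1).values[..., -1:]
    masked = attn_logits.masked_fill(attn_logits < kth, float("-inf"))
    return torch.softmax(masked, dim=-1)
```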

@z972778371
Author

> Yes

Hello, I have a few questions about the top-k code that I'd like to ask you:
1. First, I don't understand what many of the parameters in the code are for.
   1) For example, self.onnx_trace, entmax, bmm_fp16_support, cur_san_active, etc.

2. Line 260 of the code has attn_weights = self.apply_sparse_mask(attn_weights, tgt_len, src_len, bsz).
Looking at the definition of apply_sparse_mask, it just returns attn_weights without any processing. What is this step for? (See the hook sketch after this list.)
3. The entmax in the code uses TF. Can the PyTorch version from the original paper be used as a drop-in replacement for the entmax in the code? (See the usage example after this list.)
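
On question 2: in stock fairseq, MultiheadAttention.apply_sparse_mask is an identity hook that subclasses may override with a fixed sparse pattern; when nothing overrides it, the call is a no-op. Paraphrased:

```python
# Pass-through hook in fairseq's MultiheadAttention (paraphrased):
# subclasses with fixed sparse patterns override it; otherwise the
# attention weights come back unchanged.
def apply_sparse_mask(self, attn_weights, tgt_len: int, src_len: int, bsz: int):
    return attn_weights
```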
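On question 3: the deep-spin entmax package offers a PyTorch entmax15 that can be dropped in wherever softmax is applied over the key dimension; whether it exactly matches the TF implementation used here should still be verified numerically. The interface looks like this:

```python
import torch
from entmax import entmax15  # pip install entmax

logits = torch.randn(2, 4, 8)    # (bsz, tgt_len, src_len)
attn = entmax15(logits, dim=-1)  # sparse output: many exact zeros
assert torch.allclose(attn.sum(dim=-1), torch.ones(2, 4))
```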

@zhaoguangxiang
Collaborator

zhaoguangxiang commented Mar 6, 2022 via email

@z972778371
Author

Thank you very much for your reply.
I have a few more questions about your code; I'd be extremely grateful if you could help me answer them ^_^
Currently, my model's attention_mask only masks the text padding positions to -∞; I'd like to introduce sparse attention to see whether it brings a further improvement.
PS: My code splits the attention computation, encoder, decoder, and transformer into four Python files, so I may need to call your code in separate pieces.
1. What is the args parameter in your code?
There are self.args = args, self.div = args.div, and self.lb = args.lb; the boolean cur_san_active and self.entmax are also derived from args.use_att, so I'd like to know how the args values are set.
2. The values of self.div and self.lb determine the variable top_k. Are these two parameters set manually in args, or somewhere else? (A sketch follows this list.)
3. Lines 297-312 of the code choose which form of normalization to use based on self.entmax. The original paper proposes 1.5-entmax, so in practice is args.entmax set to 2?
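
To make question 2 concrete, one plausible reading of div and lb (the repo's exact formula may differ): div scales the number of kept keys with source length, and lb puts a floor under it for short inputs.

```python
# Hypothetical sketch; the repo's actual formula may differ.
def compute_top_k(src_len: int, div: float, lb: int) -> int:
    return max(int(src_len / div), lb)

compute_top_k(100, div=4, lb=10)  # -> 25 keys kept per query
compute_top_k(16,  div=4, lb=10)  # -> 10: lb is the floor for short inputs
```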

@zhaoguangxiang
Collaborator

  1. args is set in fairseq/model/transformer
  2. they are set by you
  3. Yes
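
In fairseq, such hyperparameters are typically registered in the model's static add_args method and reach the module packed into the args namespace. A hypothetical sketch with illustrative flag names, not necessarily the repo's actual ones:

```python
import argparse

class SparseTransformerModel:  # stand-in for a fairseq model class
    @staticmethod
    def add_args(parser: argparse.ArgumentParser) -> None:
        # Hypothetical flags; the repo's real names/defaults may differ.
        parser.add_argument('--div', type=float, default=4.0,
                            help='keep roughly src_len/div keys per query')
        parser.add_argument('--lb', type=int, default=10,
                            help='lower bound on the number of kept keys')

parser = argparse.ArgumentParser()
SparseTransformerModel.add_args(parser)
args = parser.parse_args(['--div', '2', '--lb', '16'])
print(args.div, args.lb)  # these values then reach the attention module
```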

@z972778371
Author

Hello, may I ask how one chooses between entmax15 and top-k?
In your sparse_activated_multihead_attention.py, entmax and top-k are mutually exclusive. In your testing experience, what situations is each suited to? (A small selection sketch follows.)
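
The either/or choice the question describes can be pictured as a single branch over the normalization step (a sketch under the assumptions above, not the repo's code): entmax15 learns its own sparsity pattern from the data, while top-k imposes a hard per-query budget.

```python
import torch
from entmax import entmax15

def normalize_weights(attn_logits: torch.Tensor, use_entmax: bool, top_k: int):
    if use_entmax:
        return entmax15(attn_logits, dim=-1)  # data-driven sparsity
    # fixed budget: keep only the top_k logits per query
    kth = attn_logits.topk(top_k, dim=-1).values[..., -1:]
    masked = attn_logits.masked_fill(attn_logits < kth, float("-inf"))
    return torch.softmax(masked, dim=-1)
```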

@zhaoguangxiang
Collaborator

zhaoguangxiang commented Mar 8, 2022 via email

@z972778371
Author

Thank you very much for your patient answer, which helps me a lot.
