Hi, I have a question regarding the sequence of decoder operations in your config file.

Based on your code, it looks like the sequence of operations is: self_attn -> cross_attn -> ffn -> MultiScaleDeformableAttention. However, when I read the paper, my understanding was that the sequence should be: MultiheadSelfAttention -> MultiScaleDeformableAttention -> ffn.

https://github.com/Sense-X/Co-DETR/blob/2d59a3038533d00732275a0f5d31cf5ff0b540ad/projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py#L68C1-L89C56

Could you explain if I misunderstood the paper or code?

We follow the decoder design of Deformable-DETR. In Deformable-DETR, the correct decoder operation order is: self_attn -> cross_attn -> ffn. Specifically, self_attn is implemented by MultiheadAttention, and cross_attn is the MultiScaleDeformableAttention.
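For reference, here is a minimal sketch of what such a decoder-layer config looks like in the mmcv/mmdet config style (field names follow the mmcv `BaseTransformerLayer` convention; the exact values are illustrative, not copied from the repo):

```python
# Sketch of a Deformable-DETR-style decoder layer config (mmcv/mmdet style).
# The attn_cfgs list is consumed in the order given by operation_order:
# the first '*_attn' entry binds to the first attention cfg, the second
# '*_attn' entry to the second.
decoder_layer = dict(
    type='DetrTransformerDecoderLayer',
    attn_cfgs=[
        # self_attn: plain multi-head attention over the object queries
        dict(type='MultiheadAttention', embed_dims=256, num_heads=8, dropout=0.0),
        # cross_attn: multi-scale deformable attention into the image features
        dict(type='MultiScaleDeformableAttention', embed_dims=256),
    ],
    feedforward_channels=1024,
    ffn_dropout=0.0,
    # Execution order: self-attention, then deformable cross-attention,
    # then the FFN, each followed by a LayerNorm.
    operation_order=('self_attn', 'norm', 'cross_attn', 'norm', 'ffn', 'norm'))
```

So the sequence the config describes is exactly the paper's self-attention -> deformable cross-attention -> FFN; the MultiScaleDeformableAttention entry is not a fourth step, it is the implementation of the cross_attn step.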
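And a simplified view of how a layer with that operation_order executes (a hypothetical helper, ignoring attention masks, positional embeddings, and dropout for brevity):

```python
# Sketch of one decoder layer's forward pass under the operation_order
# ('self_attn', 'norm', 'cross_attn', 'norm', 'ffn', 'norm'), post-norm style.
def decoder_layer_forward(query, memory, self_attn, cross_attn, ffn, norms):
    # 1. self_attn: object queries attend to each other (MultiheadAttention)
    query = norms[0](query + self_attn(query, query, query))
    # 2. cross_attn: queries attend to the multi-scale image features
    #    (MultiScaleDeformableAttention)
    query = norms[1](query + cross_attn(query, memory))
    # 3. ffn: position-wise feed-forward network
    query = norms[2](query + ffn(query))
    return query
```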