README.md
faster_rcnn_r50_fpn_attention_0010_1x_coco.py
faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco.py
faster_rcnn_r50_fpn_attention_1111_1x_coco.py
faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco.py
metafile.yml

Empirical Attention

An Empirical Study of Spatial Attention Mechanisms in Deep Networks

Abstract

Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors and methods for computing attention from these factors affect performance. Toward a better general understanding of attention mechanisms, we present an empirical study that ablates various spatial attention elements within a generalized attention formulation, encompassing the dominant Transformer attention as well as the prevalent deformable convolution and dynamic convolution modules. Conducted on a variety of applications, the study yields significant findings about spatial attention in deep networks, some of which run counter to conventional understanding. For example, we find that the query and key content comparison in Transformer attention is negligible for self-attention, but vital for encoder-decoder attention. A proper combination of deformable convolution with key content only saliency achieves the best accuracy-efficiency tradeoff in self-attention. Our results suggest that there exists much room for improvement in the design of attention mechanisms.

Results and Models

Backbone	Attention Component	DCN	Lr schd	Mem (GB)	Inf time (fps)	box AP	Config	Download
R-50	1111	N	1x	8.0	13.8	40.0	config	model | log
R-50	0010	N	1x	4.2	18.4	39.1	config	model | log
R-50	1111	Y	1x	8.0	12.7	42.1	config	model | log
R-50	0010	Y	1x	4.2	17.1	42.0	config	model | log
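
The Attention Component column is a four-character binary string. The sketch below shows one way to read it and to compare the rows above. It assumes each bit toggles one of the four terms studied in the paper, in the order query and key content, query content and relative position, key content only, relative position only. That bit order is an assumption and should be checked against the attention module's own documentation. The helper name `decode_attention_type` is hypothetical and not part of any library. The numbers in `RESULTS` are copied from the table.

```python
from typing import List

# ASSUMPTION: bit order follows the order in which the paper lists its four
# attention terms. Verify against the attention module's documentation.
TERMS = (
    "query and key content",
    "query content and relative position",
    "key content only",
    "relative position only",
)


def decode_attention_type(code: str) -> List[str]:
    """Return the names of the terms enabled by a 4-character binary string."""
    if len(code) != len(TERMS) or set(code) - {"0", "1"}:
        raise ValueError(f"expected a 4-character binary string, got {code!r}")
    return [name for bit, name in zip(code, TERMS) if bit == "1"]


# Rows copied from the results table:
# (attention component, DCN, memory in GB, inference speed in fps, box AP)
RESULTS = [
    ("1111", False, 8.0, 13.8, 40.0),
    ("0010", False, 4.2, 18.4, 39.1),
    ("1111", True, 8.0, 12.7, 42.1),
    ("0010", True, 4.2, 17.1, 42.0),
]

if __name__ == "__main__":
    for code, dcn, mem_gb, fps, box_ap in RESULTS:
        terms = ", ".join(decode_attention_type(code))
        dcn_label = "with DCN" if dcn else "without DCN"
        print(f"{code} ({dcn_label}): {terms}")
        print(f"  {mem_gb} GB, {fps} fps, {box_ap} box AP")
```

Read this way, the `0010` rows keep only the key content term. In the table they use roughly half the memory of the `1111` rows and run faster, at a cost of 0.9 box AP without DCN and 0.1 box AP with DCN.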

Citation

@article{zhu2019empirical,
  title={An Empirical Study of Spatial Attention Mechanisms in Deep Networks},
  author={Zhu, Xizhou and Cheng, Dazhi and Zhang, Zheng and Lin, Stephen and Dai, Jifeng},
  journal={arXiv preprint arXiv:1904.05873},
  year={2019}
}
