How can I produce a visualized image from the Shunted Transformer output? #2

MingfangDeng · 2022-03-04T07:35:09Z

I want to see the result (an image) after the Shunted Transformer, but the output is a tensor and its channel count is not 3. How can I obtain an image from the Shunted Transformer output?
I would appreciate any help. Thank you very much.

OliverRensu · 2022-03-18T02:51:35Z

Hi, we follow the code in these repositories: https://github.com/facebookresearch/dino/blob/main/visualize_attention.py and https://github.com/hila-chefer/Transformer-Explainability
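
For the original question of a tensor whose channel count is not 3, one simple option that is independent of the linked repositories is to collapse the channels into a single heatmap and upsample it to the image size. The sketch below is an illustration only, not the authors' method: `feature_to_heatmap` is a hypothetical helper, and it assumes the feature tensor has already been converted to a NumPy array of shape `(C, H, W)`.

```python
import numpy as np

def feature_to_heatmap(feat, out_hw):
    """Collapse a (C, H, W) feature map into an (out_h, out_w) map in [0, 1].

    Hypothetical helper for illustration; not part of the Shunted Transformer code.
    """
    heat = feat.mean(axis=0)                  # average over channels -> (H, W)
    heat = heat - heat.min()                  # shift so the minimum is 0
    heat = heat / (heat.max() + 1e-8)         # scale into [0, 1]
    out_h, out_w = out_hw
    # Nearest-neighbour upsampling: map each output pixel to a source cell.
    rows = np.arange(out_h) * heat.shape[0] // out_h
    cols = np.arange(out_w) * heat.shape[1] // out_w
    return heat[rows][:, cols]
```

The returned array can then be overlaid on the input image, for example with matplotlib's `imshow` and an `alpha` value below 1.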

rayleizhu · 2022-04-06T11:51:21Z

As far as I know, each heatmap corresponds to a query point, so what is the query point for the visualization in Figure 3?
Which level of attention block did you use for the visualization?
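
To make the idea of a per-query heatmap concrete, here is a minimal sketch of how one row of an attention matrix can be turned into a spatial map. It is not taken from the repository: `query_attention_map` is a hypothetical helper, and it assumes a single softmax attention array of shape `(heads, N_q, N_k)` in which all heads share one key grid. If different heads attend to key grids of different resolutions, each group would need to be handled separately.

```python
import numpy as np

def query_attention_map(attn, query_rc, q_hw, k_hw):
    """Return the (k_h, k_w) attention map of one query point, averaged over heads.

    attn:     (heads, N_q, N_k) softmax attention from one block (assumed layout)
    query_rc: (row, col) of the query on the query grid
    q_hw:     (q_h, q_w) query grid size, with q_h * q_w == N_q
    k_hw:     (k_h, k_w) key grid size, with k_h * k_w == N_k
    """
    q_h, q_w = q_hw
    k_h, k_w = k_hw
    heads, n_q, n_k = attn.shape
    assert n_q == q_h * q_w and n_k == k_h * k_w
    row, col = query_rc
    q_idx = row * q_w + col                   # row-major index of the query token
    return attn[:, q_idx, :].mean(axis=0).reshape(k_h, k_w)
```

Changing `query_rc` changes the heatmap, which is why the choice of query point and attention block matters for reproducing a figure.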

go-ahead-maker · 2022-10-28T06:35:18Z

I found that in DINO the default backbone is ViT or DeiT, which uses the CLS token to represent image-level information, so the visualization corresponds to the attention of the CLS token.
The Shunted Transformer does not use a CLS token, so it may be difficult to use the attention map to reflect the RoI of the whole image. Grad-CAM may be an alternative.
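
As a rough illustration of the Grad-CAM alternative, the sketch below computes the Grad-CAM map from a layer's activations and the gradients of a class score with respect to them. It is a minimal sketch under assumptions, not code from the repository: `grad_cam` is a hypothetical helper, both inputs are assumed to be NumPy arrays of shape `(C, H, W)`, and capturing them from the model (for example with forward and backward hooks, and reshaping tokens of shape `(N, C)` into `(C, H, W)`) is left out.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM map in [0, 1] from one layer's activations and gradients.

    activations: (C, H, W) feature map of the chosen layer
    gradients:   (C, H, W) gradient of the class score w.r.t. activations
    """
    weights = gradients.mean(axis=(1, 2))             # per-channel weight, shape (C,)
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum -> (H, W)
    cam = np.maximum(cam, 0)                          # ReLU keeps positive evidence
    return cam / (cam.max() + 1e-8)                   # scale into [0, 1]
```

Because the map depends only on a class score and a feature layer, it does not require a CLS token.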
