How can I produce the visualized image after Shunted Transformer? #2

Open
MingfangDeng opened this issue Mar 4, 2022 · 3 comments

Comments

@MingfangDeng

I want to see the result (an image) after the Shunted Transformer, but the output is a tensor and its channel count is not 3. How can I
obtain an image from the Shunted Transformer output?
I would appreciate any help. Thank you very much.
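One common way to look at a multi-channel feature map is to collapse the channels into a single heatmap. The sketch below is only an illustration, not the authors' visualization method: it assumes the stage output has already been converted to a NumPy array of shape (C, H, W), for example with `.detach().cpu().numpy()` on a PyTorch tensor, and the function name `feature_map_to_heatmap` is hypothetical.

```python
import numpy as np

def feature_map_to_heatmap(features, out_size):
    """Collapse a (C, H, W) feature map into an (out_h, out_w) uint8 heatmap."""
    features = np.asarray(features, dtype=np.float32)
    if features.ndim != 3:
        raise ValueError("expected a (C, H, W) array")
    # Average over channels to get a single activation map.
    act = features.mean(axis=0)
    # Min-max normalize to [0, 1]; a constant map becomes all zeros.
    lo, hi = act.min(), act.max()
    act = (act - lo) / (hi - lo) if hi > lo else np.zeros_like(act)
    # Nearest-neighbour upsampling to the requested output size.
    out_h, out_w = out_size
    rows = np.arange(out_h) * act.shape[0] // out_h
    cols = np.arange(out_w) * act.shape[1] // out_w
    upsampled = act[rows][:, cols]
    return (upsampled * 255).round().astype(np.uint8)

# Assumed example input: a random 64-channel 7x7 feature map.
rng = np.random.default_rng(0)
heatmap = feature_map_to_heatmap(rng.standard_normal((64, 7, 7)), (224, 224))
```

The resulting array can be shown with any image viewer, for instance `matplotlib.pyplot.imshow(heatmap, cmap="jet")`, or blended over the input image. Averaging over channels is one choice among several; taking the maximum or a single channel would also work.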

@rayleizhu

rayleizhu commented Apr 6, 2022

  1. As far as I know, each heatmap corresponds to a query point. So what is the query point for the visualization in Figure 3?
  2. Which level of attention block did you use for the visualization?
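To make the notion of a per-query heatmap concrete, here is a minimal sketch of extracting the attention row of one query point and reshaping it onto the key grid. It assumes the attention weights of one block are available as an array of shape (heads, N_q, N_k) and that all heads share the same key grid; the function name and the grid sizes are hypothetical and not taken from the repository. If the heads attend to keys at different resolutions, each group of heads would have to be handled separately.

```python
import numpy as np

def query_attention_map(attn, query_yx, q_hw, k_hw):
    """Head-averaged attention of one query, reshaped to a (k_h, k_w) map.

    attn: (heads, q_h * q_w, k_h * k_w) attention weights of one block.
    """
    attn = np.asarray(attn, dtype=np.float32)
    q_h, q_w = q_hw
    k_h, k_w = k_hw
    if attn.ndim != 3 or attn.shape[1:] != (q_h * q_w, k_h * k_w):
        raise ValueError("attention shape does not match the given grids")
    y, x = query_yx
    q_index = y * q_w + x  # row-major index of the query token
    # Average the selected query's attention row over heads.
    row = attn[:, q_index, :].mean(axis=0)
    return row.reshape(k_h, k_w)

# Assumed example: 2 heads, a 4x4 query grid and a 2x2 (downsampled) key grid.
rng = np.random.default_rng(0)
logits = rng.standard_normal((2, 16, 4))
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax
amap = query_attention_map(attn, query_yx=(1, 2), q_hw=(4, 4), k_hw=(2, 2))
```

The map has the resolution of the key grid, so it usually needs upsampling before it can be overlaid on the input image.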

@go-ahead-maker

I find that in DINO the default backbone is ViT or DeiT, which uses the CLS token to represent image-level information, so the visualization there corresponds to the attention of the CLS token.
The Shunted Transformer does not use a CLS token, so it may be difficult to use the attention map to reflect the RoI of the whole image. Grad-CAM may be an alternative.
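For reference, the core Grad-CAM computation is small enough to sketch directly. The block below assumes the activations of a chosen layer and the gradients of a class score with respect to them have already been captured as (C, H, W) arrays (in PyTorch this could be done with forward and backward hooks); the function name `grad_cam` and the toy inputs are illustrative only, and this is not code from the Shunted Transformer repository.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM map from (C, H, W) activations and same-shaped gradients."""
    activations = np.asarray(activations, dtype=np.float32)
    gradients = np.asarray(gradients, dtype=np.float32)
    if activations.ndim != 3 or activations.shape != gradients.shape:
        raise ValueError("expected two (C, H, W) arrays of equal shape")
    # Channel weights: global average of the gradients over space.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the activation channels, followed by ReLU.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Scale to [0, 1] unless the map is entirely zero.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy inputs: only channel 0 receives gradient, so the map follows channel 0.
acts = np.stack([np.eye(3), np.ones((3, 3))])
grads = np.stack([np.ones((3, 3)), np.zeros((3, 3))])
cam = grad_cam(acts, grads)
```

Because Grad-CAM relies on gradients of the class score rather than on a CLS token, it applies to backbones without one. The output is at the resolution of the chosen layer and is normally upsampled to the input size before overlaying.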
