
How to calculate the mean similarity #4

Open

shipengai opened this issue Jul 11, 2023 · 2 comments
Comments

@shipengai

Hello, is there code to calculate the mean similarity mentioned in this paper?

@jacklishufan
Collaborator

jacklishufan commented Jul 11, 2023

Hi,
Currently, we do not have plans to release this code. We might release it after releasing the training and evaluation code and more checkpoints. However, I can provide a basic overview of our pipeline.

To obtain the text similarity, you can use

import torch

# get_openseg_labels and create_queries_and_maps come from this repo's codebase
test_categories = get_openseg_labels("coco_panoptic", prompt_engineered=False)
expression, positive_map_idx_token = create_queries_and_maps(test_categories, demo.predictor.tokenizer)

# encode the concatenated class-name prompt once
with torch.no_grad():
    text_features = demo.predictor.model.forward_text([expression], 'cuda')

# average the token embeddings that belong to each class name
text_feature_words = []
for k, v in positive_map_idx_token.items():
    text_feature_words.append(text_features['hidden'][0, v, :].detach().cpu().mean(0))
text_feature_words = torch.stack(text_feature_words)
text_feature_words = torch.nn.functional.normalize(text_feature_words, dim=-1)

# for unit vectors ||a - b||^2 = 2 - 2 <a, b>, so cosine similarity = 1 - 0.5 * ||a - b||^2
dist_text = torch.cdist(text_feature_words, text_feature_words) ** 2  # squared pairwise distances
dist_text = 0.5 * (2.0 - dist_text)  # N_CLS x N_CLS cosine similarities

Then you can visualize dist_text, which has shape N_CLS x N_CLS.
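
For the visualization itself, a minimal matplotlib sketch (assuming each entry of test_categories is a dict with a "name" field; adjust to the actual label format):

import matplotlib.pyplot as plt

class_names = [c["name"] for c in test_categories]  # assumed field name
sim = dist_text.numpy()  # N_CLS x N_CLS similarity matrix from above

fig, ax = plt.subplots(figsize=(12, 12))
im = ax.imshow(sim, cmap="viridis")
ax.set_xticks(range(len(class_names)))
ax.set_yticks(range(len(class_names)))
ax.set_xticklabels(class_names, rotation=90, fontsize=4)
ax.set_yticklabels(class_names, fontsize=4)
fig.colorbar(im, ax=ax)
fig.savefig("text_similarity.png", dpi=300)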

The visual features are more involved: extracting them requires considerable hacking into the data loading and model inference process.

The first step is to sample N annotations for each class; then, for each image, the second code snippet below extracts the feature map of that image.
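
A minimal sketch of the sampling step, assuming the annotations are available as a flat list of dicts with a category_id field (hypothetical structure; adapt to the actual dataset dicts):

import random
from collections import defaultdict

N = 50  # annotations sampled per class (arbitrary choice)
by_class = defaultdict(list)
for ann in annotations:  # "annotations" is the hypothetical flat annotation list
    by_class[ann["category_id"]].append(ann)
sampled = {cls: random.sample(anns, min(N, len(anns))) for cls, anns in by_class.items()}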

batch = mapper(batch)  # mapper is a DatasetMapper instance
samples = demo.predictor.model.preprocess_image([batch])
samples = nested_tensor_from_tensor_list(samples, size_divisibility=32)

# run only the backbone and keep the last (coarsest) feature level
with torch.no_grad():
    features, _ = demo.predictor.model.detr.detr.backbone(samples)
img_features, mask = features[-1].decompose()
img_features = img_features.cpu()  # 1 x C x H x W

Then you want to get the ground truth mask and resize it to the same size as the feature map

import torch.nn.functional as F
msk = batch['pan_seg_gt'] == instance_id  # binary ground-truth mask, H x W
mask_up = F.interpolate(msk.float()[None, None], img_features.shape[-2:], mode='area')  # 1 x 1 x H x W

The final feature of this mask can be obtained through mask pooling

mask_up = mask_up / mask_up.sum()  # normalize so the pooling averages over the mask area
out = torch.einsum('bchw,bdhw->bdc', img_features, mask_up)[0][0]  # final mask feature of shape C

Then you need to save out for each selected annotation, average by class, and visualize.
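
A rough sketch of that aggregation, assuming the pooled features were collected in a dict class_feats mapping each class id to the list of its saved out vectors (hypothetical name):

# class_feats: {class_id: [out_1, out_2, ...]} collected over the sampled annotations
mean_feats = torch.stack([torch.stack(v).mean(0) for v in class_feats.values()])
mean_feats = torch.nn.functional.normalize(mean_feats, dim=-1)

# same construction as dist_text above
dist_vis = 0.5 * (2.0 - torch.cdist(mean_feats, mean_feats) ** 2)  # N_CLS x N_CLS

dist_vis can then be visualized in the same way as dist_text.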
Let me know if you have more questions.

@shipengai
Author

Thanks for your reply! I will try it.
