
How to calculate the mean similarity #4

Open

shipengai opened this issue Jul 11, 2023 · 2 comments
Comments

@shipengai

Hello, is there code to calculate the mean similarity mentioned in this paper?

@jacklishufan
Collaborator

jacklishufan commented Jul 11, 2023

Hi,
Currently, we do not have plans to release this code. We might release it after releasing the training and evaluation code and more checkpoints. However, I can provide a basic overview of our pipeline.

To obtain the text similarity, you can use

import torch

# get_openseg_labels and create_queries_and_maps come from this repo's codebase
test_categories = get_openseg_labels("coco_panoptic", prompt_engineered=False)
expression, positive_map_idx_token = create_queries_and_maps(test_categories, demo.predictor.tokenizer)

# encode the concatenated class-name prompt once
with torch.no_grad():
    text_features = demo.predictor.model.forward_text([expression], 'cuda')

# average the token embeddings that belong to each class name
text_feature_words = []
for k, v in positive_map_idx_token.items():
    text_feature_words.append(text_features['hidden'][0, v, :].detach().cpu().mean(0))
text_feature_words = torch.stack(text_feature_words)
text_feature_words = torch.nn.functional.normalize(text_feature_words, dim=-1)

# for unit vectors ||a - b||^2 = 2 - 2 <a, b>, so cosine similarity = 1 - 0.5 * ||a - b||^2
dist_text = torch.cdist(text_feature_words, text_feature_words) ** 2  # squared pairwise distances
dist_text = 0.5 * (2.0 - dist_text)  # N_CLS x N_CLS cosine similarities

Then you can visualize dist_text, which has shape N_CLS x N_CLS.
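
For the visualization itself, a minimal matplotlib sketch (assuming each entry of test_categories is a dict with a "name" field; adjust to the actual label format):

import matplotlib.pyplot as plt

class_names = [c["name"] for c in test_categories]  # assumed field name
sim = dist_text.numpy()  # N_CLS x N_CLS similarity matrix from above

fig, ax = plt.subplots(figsize=(12, 12))
im = ax.imshow(sim, cmap="viridis")
ax.set_xticks(range(len(class_names)))
ax.set_yticks(range(len(class_names)))
ax.set_xticklabels(class_names, rotation=90, fontsize=4)
ax.set_yticklabels(class_names, fontsize=4)
fig.colorbar(im, ax=ax)
fig.savefig("text_similarity.png", dpi=300)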

The visual features are more involved: extracting them requires considerable hacking into the data loading and model inference process.

The first step is to sample N annotations for each class; then, for each image, the second code snippet below extracts the feature map of that image.
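
A minimal sketch of the sampling step, assuming the annotations are available as a flat list of dicts with a category_id field (hypothetical structure; adapt to the actual dataset dicts):

import random
from collections import defaultdict

N = 50  # annotations sampled per class (arbitrary choice)
by_class = defaultdict(list)
for ann in annotations:  # "annotations" is the hypothetical flat annotation list
    by_class[ann["category_id"]].append(ann)
sampled = {cls: random.sample(anns, min(N, len(anns))) for cls, anns in by_class.items()}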

batch = mapper(batch)  # mapper is a DatasetMapper instance
samples = demo.predictor.model.preprocess_image([batch])
samples = nested_tensor_from_tensor_list(samples, size_divisibility=32)

# run only the backbone and keep the last (coarsest) feature level
with torch.no_grad():
    features, _ = demo.predictor.model.detr.detr.backbone(samples)
img_features, mask = features[-1].decompose()
img_features = img_features.cpu()  # 1 x C x H x W

Then you want to get the ground truth mask and resize it to the same size as the feature map

import torch.nn.functional as F
msk = batch['pan_seg_gt'] == instance_id  # binary ground-truth mask, H x W
mask_up = F.interpolate(msk.float()[None, None], img_features.shape[-2:], mode='area')  # 1 x 1 x H x W

The final feature of this mask can be obtained through mask pooling

mask_up = mask_up / mask_up.sum()  # normalize so the pooling averages over the mask area
out = torch.einsum('bchw,bdhw->bdc', img_features, mask_up)[0][0]  # final mask feature of shape C

Then you need to save out for each selected annotation, average by class, and visualize.
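
A rough sketch of that aggregation, assuming the pooled features were collected in a dict class_feats mapping each class id to the list of its saved out vectors (hypothetical name):

# class_feats: {class_id: [out_1, out_2, ...]} collected over the sampled annotations
mean_feats = torch.stack([torch.stack(v).mean(0) for v in class_feats.values()])
mean_feats = torch.nn.functional.normalize(mean_feats, dim=-1)

# same construction as dist_text above
dist_vis = 0.5 * (2.0 - torch.cdist(mean_feats, mean_feats) ** 2)  # N_CLS x N_CLS

dist_vis can then be visualized in the same way as dist_text.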
Let me know if you have more questions.

@shipengai
Author

Thanks for your reply! I will try it.
