
Config files for F-ViT from OpenAI-CLIP #13

Open
yhosoya66 opened this issue Mar 4, 2024 · 2 comments

@yhosoya66

Hi, thank you for your great work. I have two questions.

Firstly, would it be possible to share config files for training F-ViT models from OpenAI-CLIP rather than EVA-CLIP?
(The file names below are just examples.)

  • fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_openai_original.py
  • fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_openai_clipself_patches.py
  • ...

Specifically, the file fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_openai_clipself_patches.py would use the pre-trained weights openai_vitb16_coco_clipself_patches.pt for initialization, instead of eva_vitb16_coco_clipself_patches.pt (a rough sketch of the difference I have in mind is below).
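To illustrate what I mean, here is my own rough sketch of how I imagine the backbone entry of that config would look; the backbone type name and whether pretrained can point at a local checkpoint are my assumptions, not something from this repo:

# Rough sketch of what I have in mind (not an actual config from this repo).
# I assume a backbone class that wraps the OpenAI-CLIP ViT and that
# `pretrained` may point at the local CLIPSelf-distilled checkpoint.
model = dict(
    type='FViT',
    backbone=dict(
        type='CLIPViT',  # placeholder name for the OpenAI-CLIP ViT backbone
        model_name='ViT-B-16',
        pretrained='checkpoints/openai_vitb16_coco_clipself_patches.pt',
        cache_dir='checkpoints'))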

Secondly, to reproduce openai_vitb16_coco_clipself_patches.pt (i.e., the CLIPSelf pre-trained ViT from OpenAI-CLIP instead of EVA-CLIP), which model_name did you use? For example, when generating text embeddings for OpenAI-CLIP, are the following settings identical to yours?

# --model_name: ViT-B-16 instead of EVA02-CLIP-B-16
# --pretrained: openai instead of eva
python tools/dump_coco_openclip_feature.py \
    ... \
    --model_name ViT-B-16 \
    --pretrained openai \
    ...
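For concreteness, this is a minimal sketch of the text-embedding step I have in mind with open_clip (my own sketch, not your dump script; the prompt template and category names are placeholders):

import torch
import open_clip

# My own sketch of text embeddings with OpenAI CLIP ViT-B/16;
# the prompt template and class names are placeholders.
model, _, _ = open_clip.create_model_and_transforms(
    'ViT-B-16', pretrained='openai', cache_dir='checkpoints')
tokenizer = open_clip.get_tokenizer('ViT-B-16')

class_names = ['person', 'bicycle', 'car']  # placeholder COCO categories
tokens = tokenizer([f'a photo of a {name}' for name in class_names])

with torch.no_grad():
    text_embeddings = model.encode_text(tokens)
    text_embeddings = text_embeddings / text_embeddings.norm(dim=-1, keepdim=True)

print(text_embeddings.shape)  # torch.Size([3, 512]) for ViT-B-16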

Thanks for your help.

@wusize
Owner

wusize commented Mar 4, 2024

Hi! Please refer to the scripts of another work of mine. I believe those will help.

@yhosoya66
Author

Hi, thank you for the useful information.

Following your suggestion, I took the following steps:

  1. Based on the repository you linked above, I created models/clip_vit.py.
  2. I also created configs/ov_coco/fvit_vitb16_upsample_fpn_bs64_3e_ovcoco_openai_original.py, as shown below:
model = dict(
    type='FViT',
    backbone=dict(
        type='CLIPViT',
        model_name='ViT-B-16', 
        pretrained='openai',
        cache_dir='checkpoints', 
        norm_cfg=norm_cfg,
        out_indices=[3, 5, 7, 11]),
    neck=dict(
        ...
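As a quick sanity check (my own sketch, assuming the backbone ultimately loads its weights through open_clip), the ViT-B-16 / openai pair can be verified to resolve and cache under the checkpoints directory:

import open_clip

# Sanity check: confirm the model_name/pretrained pair used in the config
# above is a valid open_clip combination and caches under 'checkpoints'.
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-B-16', pretrained='openai', cache_dir='checkpoints')
n_params = sum(p.numel() for p in model.visual.parameters())
print(f'{type(model.visual).__name__}: {n_params / 1e6:.1f}M visual parameters')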

I hope this works as expected.
Thank you.
