ComCLIP: Training-Free Compositional Image and Text Matching
ComVG & SVO_Probes:
- baseline.ipynb: CLIP baseline on both datasets
- ComCLIP.ipynb: main ComCLIP algorithm
- OpenCLIP.ipynb: ComCLIP algorithm with the OpenCLIP backbone
- parse_image.py: helper functions to create subimages
- match_relation.ipynb: GPT prompt to match dense captions to subjects, objects, and predicates
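The subimage helpers crop one region per parsed entity so that CLIP can score each entity against its text counterpart separately. A minimal sketch of that cropping step, assuming images as NumPy arrays and boxes as (top, left, bottom, right) pixel tuples; the function name and box format here are illustrative, not the actual parse_image.py API:

```python
import numpy as np

def crop_subimages(image, boxes):
    """Cut one subimage per bounding box.

    image: H x W x 3 array; boxes: (top, left, bottom, right) tuples.
    Illustrative only; the real helpers live in parse_image.py.
    """
    return [image[t:b, l:r] for (t, l, b, r) in boxes]

# Toy 224x224 image with two hypothetical entity regions.
img = np.zeros((224, 224, 3), dtype=np.uint8)
subs = crop_subimages(img, [(0, 0, 112, 112), (112, 112, 224, 224)])
print([s.shape for s in subs])  # [(112, 112, 3), (112, 112, 3)]
```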
flickr30k_mscoco:
- CLIP_ComCLIP.ipynb: ComCLIP and CLIP retrieval on both datasets
- parse_image.py: helper functions to create subimages
- parse_relation.ipynb: GPT prompt to parse subjects, objects, predicates, and their connections in the text
- match_relation.ipynb: GPT prompt to match dense captions to subjects, objects, and predicates
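The relation notebooks drive GPT with plain text prompts. A hedged sketch of what such a parsing prompt could look like; the template and helper name below are an assumption, not the exact prompt used in parse_relation.ipynb:

```python
def build_parse_prompt(caption: str) -> str:
    """Illustrative prompt asking GPT to extract (subject, predicate,
    object) triplets and their connections from a caption.
    The actual prompt lives in parse_relation.ipynb."""
    return (
        "Parse the sentence into (subject, predicate, object) triplets "
        "and state how they are connected.\n"
        f"Sentence: {caption}\n"
        "Triplets:"
    )

print(build_parse_prompt("a dog chasing a red ball"))
```

The returned string would then be sent to the GPT API; the completion gives the entity triplets that parse_image.py turns into subimages.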
VL-checklist:
- ComBLIP_BLIP.ipynb: main ComBLIP algorithm and the BLIP2 baseline
- ComCLIP_CLIP.ipynb: main ComCLIP algorithm and the CLIP baseline
- parse_image.py: helper functions to create subimages
- parse_relation.ipynb: GPT prompt to parse subjects, objects, predicates, and their connections in the text
- match_relation.ipynb: GPT prompt to match dense captions to subjects, objects, and predicates
winoground:
- ComBLIP_BLIP.ipynb: main ComBLIP algorithm and the BLIP2 baseline on Winoground
- ComCLIP_CLIP.ipynb: main ComCLIP algorithm and the CLIP baseline on Winoground
- parse_image.py: helper functions to create subimages
- parse_relation.ipynb: GPT prompt to parse subjects, objects, predicates, and their connections in the text
- match_relation.ipynb: GPT prompt to match dense captions to subjects, objects, and predicates
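Across all four benchmarks the ComCLIP/ComBLIP notebooks share the same training-free idea: score each entity subimage against the text and fold that entity-level evidence back into the global image-text score. A numerical sketch of one such fusion using dummy embeddings; the softmax weighting below is an assumption for illustration, and the exact combination is in the ComCLIP notebooks:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def comclip_match(global_img_emb, subimage_embs, text_emb):
    """Weight each entity subimage embedding by its similarity to the
    text, add the weighted entities back into the global image embedding,
    and score the fused embedding against the text.
    Softmax weighting here is illustrative, not the notebooks' formula."""
    sims = np.array([cosine(s, text_emb) for s in subimage_embs])
    weights = np.exp(sims) / np.exp(sims).sum()  # softmax over entities
    fused = global_img_emb + (weights[:, None] * subimage_embs).sum(axis=0)
    return cosine(fused, text_emb)

# Dummy 16-d embeddings for one image, three entity subimages, one caption.
rng = np.random.default_rng(0)
score = comclip_match(rng.normal(size=16), rng.normal(size=(3, 16)),
                      rng.normal(size=16))
print(score)  # a cosine similarity in [-1, 1]
```

Because the fusion only reweights frozen embeddings, no fine-tuning of CLIP or BLIP2 is needed, which is what makes the method training-free.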