## Preparing Data for Mamba-YOLO-World

### Overview

For pre-training Mamba-YOLO-World, we adopt several datasets, as listed in the table below:

| Data | Samples | Type | Boxes |
| :-- | :--: | :--: | :--: |
| Objects365v1 | 609k | detection | 9,621k |
| GQA | 621k | grounding | 3,681k |
| Flickr | 149k | grounding | 641k |

### Dataset Directory

We put all data into the `data` directory, structured as follows:

```
├── coco
│   ├── annotations
│   │   ├── instances_val2017.json
│   │   └── instances_train2017.json
│   ├── lvis
│   │   └── lvis_v1_minival_inserted_image_name.json
│   ├── train2017
│   └── val2017
├── flickr
│   ├── final_flickr_separateGT_train.json
│   └── images
├── mixed_grounding
│   ├── final_mixed_train_no_coco.json
│   └── images
├── objects365v1
│   ├── objects365_train.json
│   └── train
└── texts
```

**NOTE:** We strongly suggest that you check the directories and paths in the dataset part of the config file, especially the values of `ann_file`, `data_root`, and `data_prefix`.
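For reference, the snippet below is a minimal sketch of how these three values typically appear in an MMYOLO-style dataset config, assuming Mamba-YOLO-World follows the YOLO-World convention. The dataset/wrapper type names, the class-text file, and the pipeline are assumptions for illustration, so match them against the actual config files in this repo:

```python
# Minimal sketch of the dataset section of an MMYOLO-style config (illustrative;
# check the actual configs in this repo for exact dataset types and pipelines).
train_pipeline = []  # placeholder; the real training pipeline is defined in the config

obj365v1_train_dataset = dict(
    type='MultiModalDataset',              # assumed wrapper pairing images with class texts
    dataset=dict(
        type='YOLOv5Objects365V1Dataset',
        data_root='data/objects365v1/',    # matches the directory tree above
        ann_file='objects365_train.json',  # annotation file from the table below
        data_prefix=dict(img='train/'),    # image sub-directory under data_root
        filter_cfg=dict(filter_empty_gt=False, min_size=32)),
    class_text_path='data/texts/obj365v1_class_texts.json',  # assumed file name
    pipeline=train_pipeline)
```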

We provide the annotation files for the pre-training data in the table below:

| Data | Images | Annotation File |
| :-- | :-- | :-- |
| Objects365v1 | Objects365 `train` | `objects365_train.json` |
| MixedGrounding | GQA | `final_mixed_train_no_coco.json` |
| Flickr30k | Flickr30k | `final_flickr_separateGT_train.json` |
| LVIS-minival | COCO `val2017` | `lvis_v1_minival_inserted_image_name.json` |
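Before launching pre-training, a quick path check such as the following sketch can confirm these annotation files are in place. The paths mirror the directory tree above; this script is illustrative and not part of the repo, so adjust `data_root` if your layout differs:

```python
from pathlib import Path

# Sanity check: verify that the pre-training annotation files sit where the
# configs expect them (paths follow the directory tree above).
data_root = Path('data')
expected = [
    data_root / 'objects365v1' / 'objects365_train.json',
    data_root / 'mixed_grounding' / 'final_mixed_train_no_coco.json',
    data_root / 'flickr' / 'final_flickr_separateGT_train.json',
    data_root / 'coco' / 'lvis' / 'lvis_v1_minival_inserted_image_name.json',
]
for path in expected:
    status = 'OK     ' if path.is_file() else 'MISSING'
    print(f'{status} {path}')
```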

**Acknowledgement:** We sincerely thank GLIP and mdetr for providing the annotation files for pre-training.