All datasets should be downloaded or soft-linked to `./data/`. Alternatively, you can modify the `data_root` value in the config files.
Please download CLEVRTex from their project page to `./data/CLEVRTex/`. Specifically, you need to download 5 files, `ClevrTex (part 1, 4.7 GB)` through `ClevrTex (part 5, 4.7 GB)`, which will be saved as `clevrtex_full_part1.tar.gz` through `clevrtex_full_part5.tar.gz`.
Unzip them with:

```
cat clevrtex_full_part*.tar.gz | tar -xzvf -
```
You will get a directory named `./data/CLEVRTex/clevrtex_full/`.
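To double-check that the extraction succeeded, here is a minimal sketch (a sanity check only, assuming the `0/` to `49/` chunk layout shown in the directory tree at the end of this page):

```python
from pathlib import Path

# Sanity check (not part of the training code): after extracting all
# 5 parts, clevrtex_full/ should contain 50 chunk folders named 0/ to 49/.
root = Path('./data/CLEVRTex/clevrtex_full')
chunks = [p for p in root.iterdir() if p.is_dir()]
print(f'Found {len(chunks)} chunk folders (expected 50)')
```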
We follow torchvision to download and process CelebA. You can directly call torchvision to download it with:

```python
from torchvision.datasets import CelebA

dataset = CelebA('./data/CelebA', download=True)
```
Make sure you get a directory named `./data/CelebA/celeba/`.
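Note that torchvision fetches CelebA from Google Drive, which occasionally hits download quotas; if the call fails, retry later or place the files manually. To verify the result, you can reload the dataset without the download flag (a sanity check only, not part of the pipeline):

```python
from torchvision.datasets import CelebA

# Raises RuntimeError if files under ./data/CelebA/celeba/ are missing
# or fail the integrity check.
dataset = CelebA('./data/CelebA', split='train', download=False)
print(f'{len(dataset)} training images found')
```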
Please use the provided download script `download_movi.py`. Download MOVi-D with:

```
python download_movi.py --out_path ./data/MOVi --level d --image_size 128
```

Download MOVi-E with:

```
python download_movi.py --out_path ./data/MOVi --level e --image_size 128
```
This will save the datasets to `./data/MOVi/MOVi-D/` and `./data/MOVi/MOVi-E/`.
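To inspect what was downloaded, here is a small loading sketch; the `*.png` pattern and episode folder name are assumptions, so check one folder to see the actual naming produced by `download_movi.py`:

```python
from pathlib import Path
from PIL import Image

# Hypothetical spot-check of one MOVi-D training episode; each episode
# folder holds video frames and per-frame masks.
episode = Path('./data/MOVi/MOVi-D/train/00000000')
files = sorted(episode.glob('*.png'))  # assumed extension
print(f'{len(files)} files, e.g. {files[0].name}, size {Image.open(files[0]).size}')
```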
Download their `.tar.gz` files from Google Drive and unzip them. Please rename them to `MOVi-Solid/` and `MOVi-Tex/`, and put them under `./data/MOVi/`.
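If you prefer to script the renaming, a minimal sketch; the source folder names here are placeholders for whatever the archives actually extract to:

```python
from pathlib import Path

# Hypothetical extracted names -> the names expected by the dataloader.
renames = {'movi_solid': 'MOVi-Solid', 'movi_tex': 'MOVi-Tex'}
root = Path('./data/MOVi')
for src, dst in renames.items():
    if (root / src).is_dir():
        (root / src).rename(root / dst)
```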
Please download Physion from their GitHub repo. Specifically, we only need 2 files, containing the videos and the label files. The HDF5 files containing additional vision data, such as depth maps and segmentation masks, are not needed.
- Download `PhysionTest-Core` (the 270 MB one) with the link, and unzip it to a folder named `PhysionTestMP4s/`
- Download `PhysionTrain-Dynamics` (the 770 MB one) with the link, and unzip it to a folder named `PhysionTrainMP4s/`
- Download the labels for the readout subset here, and put it under `PhysionTrainMP4s/`
To speed up data loading, we extract frames from the videos, specifically all videos under `PhysionTrainMP4s/` and `PhysionTestMP4s/*/mp4s-redyellow/`. Please run the provided script:

```
python scripts/data_utils/physion_video2frames.py
```

You can modify a few parameters in that file, such as `data_root` and the number of processes to parallelize over, `NUM_WORKERS`.
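For reference, the core of such a video-to-frames conversion looks roughly like the OpenCV sketch below; this is an illustrative version, not the repo's actual implementation:

```python
import cv2
from pathlib import Path

def extract_frames(video_path: str, out_dir: str):
    """Dump every frame of one .mp4 as numbered .jpg files."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of the video
            break
        cv2.imwrite(f'{out_dir}/{idx:06d}.jpg', frame)
        idx += 1
    cap.release()
```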
We use the `trainaug` subset, which is widely adopted in previous unsupervised segmentation works. Please download the processed dataset from Google Drive (credit to this great repo, from which we also borrow the VOC dataloader code).
Unzip the downloaded `tgz` file. We do not need the folders with `saliency` in the name. Please only take the `images/`, `SegmentationClass/`, `SegmentationClassAug/`, and `sets/` folders and place them under `./data/VOC/`.
Finally, we also need the instance segmentation masks for evaluation. Please download this file, unzip it, take the `VOCdevkit/VOC2012/SegmentationObject/` folder, and place it under `./data/VOC/`.
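Once everything is in place, you can spot-check one image/mask pair. The sketch below assumes `images/` holds `.jpg` images with the same file stems as the `.png` masks, which is the standard VOC naming:

```python
from pathlib import Path
from PIL import Image

# Spot-check: each instance mask should have a matching image
# (same stem, .jpg vs .png).
voc = Path('./data/VOC')
mask_path = sorted((voc / 'SegmentationObject').glob('*.png'))[0]
img = Image.open(voc / 'images' / f'{mask_path.stem}.jpg')
mask = Image.open(mask_path)
print(mask_path.stem, img.size, mask.size)
```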
Please download the data from their website. Specifically, we need `2017 Train images [118K/18GB]`, `2017 Val images [5K/1GB]`, and `2017 Train/Val annotations [241MB]`.
Unzip them and you will get 2 image folders, `train2017/` and `val2017/`, and 2 annotation files, `instances_train2017.json` and `instances_val2017.json`. Please put the image folders under `./data/COCO/images/`, and the annotation json files under `./data/COCO/annotations/`.
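You can verify the layout with `pycocotools` (commonly used by COCO dataloaders); a minimal check:

```python
from pycocotools.coco import COCO

# Builds the annotation index; this fails loudly if the json is
# missing or in the wrong place.
coco = COCO('./data/COCO/annotations/instances_val2017.json')
print(f'{len(coco.getImgIds())} val images, {len(coco.getCatIds())} categories')
```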
- When you train on some datasets for the first time, the code will cache some index files to be used in later training runs, usually under `datasets/splits/`.
- The final `data/` directory should look like this:
```
data/
├── CLEVRTex/
│   └── clevrtex_full/
│       ├── 0/  # folder with images and other annotations (not used here)
│       ├── 1/
│       ├── ...
│       └── 49/
├── CelebA/
│   └── celeba/
│       ├── img_align_celeba/  # lots of images
│       └── list_eval_partition.txt  # data split
├── MOVi/
│   ├── MOVi-D/
│   │   ├── train/
│   │   │   ├── 00000000/  # folder with video frames and per-frame masks
│   │   │   ├── 00000001/
│   │   │   └── ...
│   │   ├── validation/
│   │   └── test/
│   ├── MOVi-E/
│   │   ├── train/
│   │   ├── validation/
│   │   └── test/
│   ├── MOVi-Solid/
│   │   ├── train/
│   │   ├── val/
│   │   └── test/
│   └── MOVi-Tex/
│       ├── train/
│       ├── val/
│       └── test/
├── Physion/
│   ├── PhysionTestMP4s/
│   │   ├── Collide/  # 8 scenarios
│   │   ├── Contain/
│   │   ├── ...
│   │   ├── Support/
│   │   └── labels.csv  # test subset labels
│   └── PhysionTrainMP4s/
│       ├── Collide_readout_MP4s/  # 8 scenarios x 2 subsets (training, readout)
│       ├── Collide_training_MP4s/
│       ├── ...
│       ├── Support_readout_MP4s/
│       ├── Support_training_MP4s/
│       └── readout_labels.csv  # readout subset labels
├── VOC/
│   ├── images/  # lots of images
│   ├── SegmentationClass/  # lots of masks
│   ├── SegmentationClassAug/
│   ├── SegmentationObject/
│   └── sets/  # data split
└── COCO/
    ├── images/
    │   ├── train2017/  # lots of images
    │   └── val2017/
    └── annotations/
        ├── instances_train2017.json
        └── instances_val2017.json
```
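As a final sanity check, here is a short sketch that verifies the top-level folders from the tree above exist (trim the list if you only use a subset of the datasets):

```python
from pathlib import Path

expected = [
    'CLEVRTex/clevrtex_full', 'CelebA/celeba',
    'MOVi/MOVi-D', 'MOVi/MOVi-E', 'MOVi/MOVi-Solid', 'MOVi/MOVi-Tex',
    'Physion/PhysionTestMP4s', 'Physion/PhysionTrainMP4s',
    'VOC/SegmentationObject', 'COCO/annotations',
]
missing = [d for d in expected if not (Path('./data') / d).is_dir()]
print('All datasets found!' if not missing else f'Missing: {missing}')
```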