# Running MCAN on SQA3D

## Data preparation for BEV images

  1. Download Blender.

  2. Download the ScanNetV2 dataset and put (or link) `scans/` under `../assets/data/scannet/scans/` (please follow the ScanNet instructions for downloading the dataset).

  3. Use the following command to render the input images for the MCAN model (a scripted version of steps 2 and 3 is sketched below):

```sh
cd ../utils
blender -b file.blend --python mesh2img.py
```

For convenience, you can also download the images rendered by us from here.
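If you prefer to script steps 2 and 3 yourself, a minimal sketch follows. The ScanNet download location is a placeholder you need to adjust, and the Blender invocation is the same one shown above.

```sh
# Sketch of steps 2-3 above. Adjust SCANNET_SCANS to wherever you downloaded ScanNetV2.
SCANNET_SCANS=/path/to/scannet/scans

# Step 2: link the ScanNet scans into the expected location.
mkdir -p ../assets/data/scannet
ln -s "$SCANNET_SCANS" ../assets/data/scannet/scans

# Step 3: render the BEV input images for MCAN with headless Blender (-b = background mode).
cd ../utils
blender -b file.blend --python mesh2img.py
```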

  1. Download the pretrained vision backbones and other files from here and extract them to `./cache`.
  2. Download the preprocessed SpaCy embedding and then run:
```sh
pip install path/en_core_web_lg-1.2.0.tar.gz
```
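As a quick sanity check that the embedding package installed correctly, you can try loading it. This assumes the standard spaCy model-package loading API; the training code may consume the vectors differently.

```sh
# Load the pip-installed model and print the vector shape of a sample token.
python -c "import spacy; nlp = spacy.load('en_core_web_lg'); print(nlp('chair')[0].vector.shape)"
```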

## Training

```sh
python train_sqa.py --config-file train_sqa_mcan.yaml
```

## Evaluation

```sh
python train_sqa.py --test_only --config-file train_sqa_mcan.yaml --test_model <model_path>
```

`<model_path>` is the path to the model checkpoint you want to evaluate.
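For example, if the released MCAN checkpoint (see below) has been saved to `./cache/MCAN.pth` (an illustrative location, not one prescribed by this README), the evaluation command becomes:

```sh
# ./cache/MCAN.pth is only an illustrative path; substitute your own checkpoint location.
python train_sqa.py --test_only --config-file train_sqa_mcan.yaml --test_model ./cache/MCAN.pth
```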

## Pretrained models

- Pretrained models can be downloaded here. The correspondence between the models and the results in the paper is as follows:

| Model file | Model in the paper | Results |
| --- | --- | --- |
| `MCAN.pth` | MCAN | 43.42 |

Note that due to a slight change in the codebase, the evaluated results are slightly higher (around 1%) than those presented in the paper.

## Acknowledgements

We would like to thank MCAN and RelViT for their useful code bases.