# Running MCAN on SQA3D

## Data preparation for BEV images

  1. Download Blender.

  2. Download the ScanNetV2 dataset and put (or link) `scans/` under `../assets/data/scannet/scans/` (please follow the ScanNet instructions for downloading the dataset).

  3. Use the following command to render the input images for the MCAN model (a scripted version of steps 2 and 3 is sketched below):

```sh
cd ../utils
blender -b file.blend --python mesh2img.py
```

For convenience, you can also download the images rendered by us from here.
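If you prefer to script steps 2 and 3 yourself, a minimal sketch follows. The ScanNet download location is a placeholder you need to adjust, and the Blender invocation is the same one shown above.

```sh
# Sketch of steps 2-3 above. Adjust SCANNET_SCANS to wherever you downloaded ScanNetV2.
SCANNET_SCANS=/path/to/scannet/scans

# Step 2: link the ScanNet scans into the expected location.
mkdir -p ../assets/data/scannet
ln -s "$SCANNET_SCANS" ../assets/data/scannet/scans

# Step 3: render the BEV input images for MCAN with headless Blender (-b = background mode).
cd ../utils
blender -b file.blend --python mesh2img.py
```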

  1. Download the pretrained vision backbones and other files from here and extract them to `./cache`.
  2. Download the preprocessed SpaCy embedding and then run:
```sh
pip install path/en_core_web_lg-1.2.0.tar.gz
```
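As a quick sanity check that the embedding package installed correctly, you can try loading it. This assumes the standard spaCy model-package loading API; the training code may consume the vectors differently.

```sh
# Load the pip-installed model and print the vector shape of a sample token.
python -c "import spacy; nlp = spacy.load('en_core_web_lg'); print(nlp('chair')[0].vector.shape)"
```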

## Training

```sh
python train_sqa.py --config-file train_sqa_mcan.yaml
```

## Evaluation

```sh
python train_sqa.py --test_only --config-file train_sqa_mcan.yaml --test_model <model_path>
```

`<model_path>` is the path to the model checkpoint you want to evaluate.
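For example, if the released MCAN checkpoint (see below) has been saved to `./cache/MCAN.pth` (an illustrative location, not one prescribed by this README), the evaluation command becomes:

```sh
# ./cache/MCAN.pth is only an illustrative path; substitute your own checkpoint location.
python train_sqa.py --test_only --config-file train_sqa_mcan.yaml --test_model ./cache/MCAN.pth
```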

## Pretrained models

- Pretrained models can be downloaded here. The correspondence between the models and the results in the paper is as follows:

| Model file | Model in the paper | Results |
| --- | --- | --- |
| `MCAN.pth` | MCAN | 43.42 |

Note that due to a slight change in the codebase, the evaluated results are slightly higher (around 1%) than those presented in the paper.

## Acknowledgements

We would like to thank MCAN and RelViT for their useful code bases.