- Download Blender.
- Download the ScanNetV2 dataset and put (or link) `scans/` under `../assets/data/scannet/scans/` (please follow the ScanNet instructions for downloading the ScanNet dataset).
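  For example, a symlink avoids copying the data; this is a minimal sketch in which `/path/to/scannet` is a placeholder for your actual download location:

  ```bash
  # /path/to/scannet is a placeholder; point it at your ScanNet download.
  mkdir -p ../assets/data/scannet
  ln -s /path/to/scannet/scans ../assets/data/scannet/scans
  ```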
- Use the following command to render the input images for the MCAN model:

  ```bash
  cd ../utils
  blender -b file.blend --python mesh2img.py
  ```

  For convenience, you can download the images rendered by us from here.
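  If the rendering step fails, first confirm that Blender is installed and reachable from the shell; a minimal headless check (assuming only that `blender` is on `PATH`):

  ```bash
  # Confirm Blender is on PATH and can run without a GUI.
  blender --version
  # -b runs Blender headless; --python-expr executes a Python expression inside Blender.
  blender -b --python-expr "import bpy; print(bpy.app.version_string)"
  ```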
- Download the pretrained vision backbones and other files from here and extract them to `./cache`.
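  A typical extraction step, assuming the download is a gzipped tarball named `cache.tar.gz` (the actual filename may differ):

  ```bash
  # cache.tar.gz is a hypothetical filename; substitute the file you downloaded.
  mkdir -p ./cache
  tar -xzf cache.tar.gz -C ./cache
  ```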
- Download the preprocessed SpaCy embedding and then run:

  ```bash
  pip install path/en_core_web_lg-1.2.0.tar.gz
  ```
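  To verify the embedding installed correctly, a quick check (assuming the package registers under the name `en_core_web_lg`):

  ```bash
  # Load the model and print the dimensionality of a word vector.
  python -c "import spacy; nlp = spacy.load('en_core_web_lg'); print(nlp.vocab['chair'].vector.shape)"
  ```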
- To train the model, run:

  ```bash
  python train_sqa.py --config-file train_sqa_mcan.yaml
  ```

- To evaluate a trained model, run:

  ```bash
  python train_sqa.py --test_only --config-file train_sqa_mcan.yaml --test_model <model_path>
  ```

  where `<model_path>` is the path to the trained model checkpoint.
- Pretrained models can be downloaded here. The correspondence between the models and the results in the paper is as follows:

  | Model file | Model in the paper | Result |
  | ---------- | ------------------ | ------ |
  | MCAN.pth   | MCAN               | 43.42  |
  Note that due to a slight change in the codebase, the evaluated results are slightly higher (by around 1%) than those presented in the paper.
We would like to thank MCAN and RelViT for their useful codebases.