DM-Calib is a diffusion-based approach for estimating pinhole camera intrinsic parameters from a single input image. We introduce a new image-based representation, termed Camera Image, which losslessly encodes the numerical camera intrinsics and integrates seamlessly with the diffusion framework. Using this representation, we reformulate the problem of estimating camera intrinsics as the generation of a dense Camera Image conditioned on an input image. By fine-tuning a stable diffusion model to generate a Camera Image from a single RGB input, we can extract camera intrinsics via a RANSAC operation. We further demonstrate that our monocular calibration method enhances performance across various 3D tasks, including zero-shot metric depth estimation, 3D metrology, pose estimation and sparse-view reconstruction.
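To illustrate how intrinsics might be recovered from a dense per-pixel map with RANSAC, here is a minimal sketch. It does not use DM-Calib's actual Camera Image encoding: as a stand-in, it assumes each pixel stores the normalized ray coordinates `((u - cx) / fx, (v - cy) / fy)`, so that `u = fx * a + cx` is a line whose slope and intercept give the focal length and principal point. The function names (`make_ray_map`, `ransac_line`, `intrinsics_from_ray_map`) are hypothetical and not part of this repository.

```python
import numpy as np

def make_ray_map(h, w, fx, fy, cx, cy):
    """Assumed stand-in encoding: per-pixel ((u - cx) / fx, (v - cy) / fy)."""
    v, u = np.mgrid[0:h, 0:w].astype(np.float64)
    return np.stack([(u - cx) / fx, (v - cy) / fy], axis=-1)

def ransac_line(x, y, iters=200, thresh=1.0, rng=None):
    """Robustly fit y = a * x + b and return (a, b)."""
    rng = np.random.default_rng(rng)
    best_inliers, best_count = None, -1
    for _ in range(iters):
        # Two points define a candidate line.
        i, j = rng.choice(x.size, size=2, replace=False)
        if abs(x[i] - x[j]) < 1e-12:
            continue  # degenerate sample (vertical line)
        a = (y[i] - y[j]) / (x[i] - x[j])
        b = y[i] - a * x[i]
        inliers = np.abs(a * x + b - y) < thresh  # residual in pixels
        count = int(inliers.sum())
        if count > best_count:
            best_count, best_inliers = count, inliers
    if best_inliers is None:
        raise ValueError("RANSAC found no valid sample")
    # Refine with least squares on the best inlier set.
    a, b = np.polyfit(x[best_inliers], y[best_inliers], 1)
    return float(a), float(b)

def intrinsics_from_ray_map(rays, **kwargs):
    """Recover (fx, fy, cx, cy) from an H x W x 2 ray map."""
    h, w, _ = rays.shape
    v, u = np.mgrid[0:h, 0:w].astype(np.float64)
    fx, cx = ransac_line(rays[..., 0].ravel(), u.ravel(), **kwargs)
    fy, cy = ransac_line(rays[..., 1].ravel(), v.ravel(), **kwargs)
    return fx, fy, cx, cy
```

Fitting each axis separately keeps the minimal sample at two points, so a few hundred iterations are usually enough even when a noticeable fraction of pixels are wrong.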
- [2024/11.27]: 🔥 We release the DM-Calib paper on arXiv!
- [2024/12.06]: 🔥 We release the DM-Calib inference code!
For more required dependencies, please refer to requirements.txt.
Download our pretrained model from here.
python DMCalib/infer.py \
--pretrained_model_path MODEL_PATH \
--input_dir example/outdoor \
--output_dir output/outdoor \
--scale_10 --domain_specify \
--seed 666 --domain outdoor \
--run_depth --save_pointcloud
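The output format written by `--run_depth` and `--save_pointcloud` is not described here. As a rough illustration of how estimated intrinsics and a metric depth map combine into a point cloud under the pinhole model, the following sketch back-projects each pixel to 3D. The function `backproject` is hypothetical and is not the repository's implementation.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project an H x W metric depth map to an N x 3 point cloud.

    Pinhole model: X = (u - cx) / fx * Z, Y = (v - cy) / fy * Z, Z = depth.
    Pixels with non-finite or non-positive depth are dropped.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w].astype(np.float64)
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    valid = np.isfinite(pts).all(axis=1) & (pts[:, 2] > 0)
    return pts[valid]
```

Because the focal length scales the X and Y coordinates directly, errors in the estimated intrinsics distort the resulting geometry.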
Most of our training and testing datasets are from MonoCalib.
Additional training data come from Taskonomy, Hypersim, TartanAir, Virtual KITTI 2, Argoverse2, and Waymo.
- Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation. arXiv, GitHub.
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image. arXiv, GitHub.
- DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation. arXiv, GitHub.
The current model for metric depth prediction does not effectively segment elements such as the sky, and it generally underperforms on outdoor monuments due to limited training data. We plan to address these limitations in future work.
This project is released under the creativeml-openrail-m license, the same license as SD15. If you have any questions about usage, please contact us first.
If you find our work helpful, please cite our paper:
@misc{deng2024boost3dreconstructionusing,
title={Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration},
author={Junyuan Deng and Wei Yin and Xiaoyang Guo and Qian Zhang and Xiaotao Hu and Weiqiang Ren and Xiaoxiao Long and Ping Tan},
year={2024},
eprint={2411.17240},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.17240},
}