diff --git a/README.md b/README.md
index 9c41dec..97658dd 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,47 @@
 # Rethinking Inductive Biases for Surface Normal Estimation
+
+
+
 
 Official implementation of the paper
 
 > **Rethinking Inductive Biases for Surface Normal Estimation**
 >
 > CVPR 2024 (to appear)
 >
-> [Gwangbin Bae](https://baegwangbin.com) and [Andrew J. Davison](https://www.doc.ic.ac.uk/~ajd/)
+> Gwangbin Bae and Andrew J. Davison
 >
-> [[arXiv]]() [[project page]]()
\ No newline at end of file
+> [paper.pdf] [arXiv (coming soon)] [project page]
+
+## Abstract
+
+Despite the growing demand for accurate surface normal estimation models, existing methods use general-purpose dense prediction models, adopting the same inductive biases as other tasks. In this paper, we discuss the **inductive biases** needed for surface normal estimation and propose to **(1) utilize the per-pixel ray direction** and **(2) encode the relationship between neighboring surface normals by learning their relative rotation**. The proposed method can generate **crisp — yet piecewise smooth — predictions** for challenging in-the-wild images of arbitrary resolution and aspect ratio. Compared to a recent ViT-based state-of-the-art model, our method shows a stronger generalization ability, despite being trained on an orders-of-magnitude smaller dataset.
+
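For context, the "per-pixel ray direction" in the abstract is the viewing ray through each pixel of the camera. Below is a minimal sketch, assuming a distortion-free pinhole camera with intrinsics fx, fy, cx, cy; the function name and the pixel-center convention are illustrative, not code from this repository.

```python
# Illustrative sketch only (not from this repository): per-pixel viewing rays
# for a distortion-free pinhole camera with intrinsics fx, fy, cx, cy.
import numpy as np

def pixel_ray_directions(height, width, fx, fy, cx, cy):
    """Return a (height, width, 3) array of unit viewing rays in camera coordinates."""
    u, v = np.meshgrid(np.arange(width), np.arange(height))  # pixel grid, shape (height, width)
    x = (u + 0.5 - cx) / fx   # back-project pixel centers onto the
    y = (v + 0.5 - cy) / fy   # normalized image plane (z = 1)
    z = np.ones_like(x)
    rays = np.stack([x, y, z], axis=-1)
    return rays / np.linalg.norm(rays, axis=-1, keepdims=True)
```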
+
+## Getting Started
+
+Start by installing the dependencies.
+
+```
+conda create --name DSINE python=3.10
+conda activate DSINE
+
+conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
+conda install opencv
+python -m pip install geffnet
+python -m pip install glob2
+```
+
+Then, download the model weights from this link and save them under `./checkpoints/`.
+
+## Test on images
+
+* Run `python test.py` to generate predictions for the images under `./samples/img/`. The results will be saved under `./samples/output/`.
+* Our model assumes known camera intrinsics, but providing approximate intrinsics still gives good results. For some images in `./samples/img/`, the corresponding camera intrinsics (fx, fy, cx, cy, assuming a perspective camera with no distortion) are provided as a `.txt` file. If such a file does not exist, the intrinsics will be approximated by assuming a $60^\circ$ field of view (a short sketch of this fallback is given below).
\ No newline at end of file
diff --git a/docs/index.html b/docs/index.html
index f28ecf9..b0e0f3b 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -12,12 +12,16 @@
+
-
+      MathJax.Hub.Config({
+        tex2jax: {
+          inlineMath: [ ['$','$'], ["\\(","\\)"] ],
+          processEscapes: true
+        }
+      });
+
+
@@ -110,7 +114,7 @@
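For context, the $60^\circ$ field-of-view fallback described under "Test on images" amounts to choosing a focal length consistent with that field of view and placing the principal point at the image center. Below is a minimal sketch, assuming a distortion-free pinhole camera with square pixels and taking the field of view to be horizontal; the function name and these choices are illustrative, not the repository's actual implementation.

```python
# Illustrative sketch only (not from this repository): approximate pinhole
# intrinsics from an assumed 60-degree field of view when no .txt file exists.
import numpy as np

def intrinsics_from_fov(height, width, fov_deg=60.0):
    """Return approximate (fx, fy, cx, cy) for a distortion-free pinhole camera."""
    fov_rad = np.deg2rad(fov_deg)
    fx = (width / 2.0) / np.tan(fov_rad / 2.0)  # focal length from the assumed horizontal FOV
    fy = fx                                     # square pixels assumed
    cx, cy = width / 2.0, height / 2.0          # principal point at the image center
    return fx, fy, cx, cy

# Example: approximate intrinsics for a 480x640 image.
print(intrinsics_from_fov(480, 640))
```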