This repository contains the code for my bachelor's project.
Over the last few years, Neural Radiance Fields (NeRFs) have achieved breakthrough results in Novel View Synthesis (NVS) and Neural Scene Representation (NSR). This work reviews the main advancements made to NeRFs, specifically speed improvements during training, covering literature up to January 2022. The various methods are compared in general terms and with a focus on indoor scenes captured with a conventional smartphone, as this represents a real-world use case of NeRFs that violates some of the simplifying assumptions made by existing methods. The main improvements are evaluated with respect to ray sampling, input encoding, regularisation, network structure and task formulation. The underlying geometric structure represented by different NeRF models is investigated to show which 3D scene features enable fast convergence and which inhibit the optimisation process. This analysis is then used to implement an architecture that extends the capabilities of a state-of-the-art model (instant-ngp) and can represent a challenging monocular RGB sequence within a couple of minutes. Experimental results with different architectures are provided, with a specific focus on implementation considerations for making the best use of GPUs during training.
- Clone this repository using
git clone https://github.com/mebenstein/bachelor_proj.git --recurse-submodules
- Set up an Anaconda environment and gather all requirements with
conda env update --file env.yml --prune
- Download the pre-trained weights for DeblurGANv2 from their repository and save them in the deblurganv2 folder as best_fpn.h5
- Create a directory for your test sequence
mkdir data/test
mv video.mp4 data/test/
- Generate a deblurred video
cd deblurganv2
python predict.py ../data/test/video.mp4 --video
mv submit/video_deblur.mp4 ../data/test/
cd ..
- Convert video to frames
cd data/test
ffmpeg -i video_deblur.mp4 -vf fps=30 unfiltered_images/%04d.png
- Filter images based on blurriness
python ../../image_filtering.py unfiltered_images images <frame_interval:int>
- Estimate poses with COLMAP
python ../../torch-ngp/colmap2nerf.py --images images --colmap_matcher exhaustive --run_colmap
cd ../../
- Train representation. Execute
python learn_scene.py --help
to see the available parameters. The weights will be stored in a workspace folder. (During the first execution, PyTorch bindings will be compiled, so it may take a while to start.)
- View representation. Execute
python view_scene.py --help
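The blurriness filtering step above can be illustrated with a small sketch. It is not the code in image_filtering.py. It assumes a common sharpness measure (the variance of a discrete Laplacian) and assumes that the frame interval means "keep the sharpest frame out of every N consecutive frames". The function names laplacian_variance and select_sharpest are hypothetical.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Sharpness score: variance of a 4-neighbour Laplacian over the image interior.

    Blurry frames have weak edges, so their Laplacian response varies little.
    """
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def select_sharpest(scores, frame_interval: int):
    """Return the index of the sharpest frame in each window of frame_interval frames."""
    if frame_interval < 1:
        raise ValueError("frame_interval must be >= 1")
    keep = []
    for start in range(0, len(scores), frame_interval):
        window = scores[start:start + frame_interval]
        keep.append(start + int(np.argmax(window)))
    return keep
```

In practice, each extracted frame would be loaded as a greyscale array with an image library of your choice, scored with laplacian_variance, and the frames at the indices returned by select_sharpest would be copied to the images folder.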
- misc contains various Jupyter Notebooks used for benchmarking and other tests for the report
- grid_encoding_test.py contains an experimental implementation of the neural hash encoding that was created in order to help me understand the technology better
- kilonerf.py includes a smaller KiloNeRF benchmark
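For readers unfamiliar with the neural hash encoding mentioned above, here is a minimal NumPy sketch of the idea behind the multiresolution hash encoding in instant-ngp. It is an illustration under simplifying assumptions, not the code in grid_encoding_test.py: every level is hashed (instant-ngp indexes coarse levels densely), the feature tables are plain arrays rather than trainable parameters, and the names hash_coords and encode as well as the defaults n_min and n_max are chosen for this example only.

```python
import numpy as np

# Per-dimension multipliers of the spatial hash used in the instant-ngp paper.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)


def hash_coords(coords: np.ndarray, table_size: int) -> np.ndarray:
    """Map non-negative integer grid coordinates (N, 3) to indices in [0, table_size)."""
    c = coords.astype(np.uint64) * PRIMES
    h = (c[:, 0] ^ c[:, 1] ^ c[:, 2]) % np.uint64(table_size)
    return h.astype(np.int64)


def encode(x: np.ndarray, tables, n_min: int = 16, n_max: int = 512) -> np.ndarray:
    """Encode points x (N, 3) in [0, 1] using one (T, F) feature table per level.

    Returns an (N, levels * F) array of trilinearly interpolated features.
    """
    levels = len(tables)
    # Geometric growth factor between the coarsest and finest resolution.
    growth = np.exp((np.log(n_max) - np.log(n_min)) / max(levels - 1, 1))
    out = []
    for level, table in enumerate(tables):
        res = int(np.floor(n_min * growth ** level))
        scaled = x * res
        base = np.floor(scaled).astype(np.int64)  # lower corner of the voxel
        frac = scaled - base                      # position inside the voxel
        feat = np.zeros((x.shape[0], table.shape[1]))
        for corner in range(8):
            offset = np.array([(corner >> i) & 1 for i in range(3)])
            idx = hash_coords(base + offset, table.shape[0])
            # Trilinear weight of this corner; the eight weights sum to one.
            w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=1)
            feat += w[:, None] * table[idx]
        out.append(feat)
    return np.concatenate(out, axis=1)
```

With, for example, four tables of shape (1024, 2), encode returns eight features per point. In a trainable version the tables would be optimised jointly with the network that consumes the encoded features.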
You can download the data used for the report here.
Please see the LICENSE file.