
CenterFusion++

This repository contains the implementation of the master's thesis project Camera-Radar Sensor Fusion using Deep Learning from Johannes Kübel and Julian Brandes.

The thesis is available for download in the Chalmers Open Digital Repository.


Introduction

This work is based on CenterFusion, the frustum-proposal-based radar and camera sensor fusion approach proposed by Nabati et al. We introduce two major changes to the existing network architecture:

  1. Early Fusion (EF): a projection of the radar point cloud into the image plane. The projected radar point image features (default: depth, velocity components in x and z, and RCS value) are concatenated to the RGB image channels as a new input to the image-based backbone of the network architecture. EF introduces robustness against camera sensor failure and challenging environmental conditions (e.g. rain/night/fog).
  2. Learned Frustum Association (LFANet): The second major change concerns the frustum-proposal-based association between camera detections and the radar point cloud. Instead of selecting the closest radar point to associate with the detection obtained from the backbone and primary heads, we propose a network termed LFANet that outputs an artificial radar point r* representing all the radar points in the frustum. LFANet is trained to output the depth to the center of the bounding box associated with the radar point as well as the corresponding radial velocity. The outputs of LFANet are then used as the new channels in the heatmap introduced by Nabati et al.

We combine these two changes to obtain CenterFusion++.
The following figure displays the modified network architecture at a high level; a minimal code sketch of the two ideas follows the figure.

CenterFusion++ Overview
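
To make the two ideas above concrete, here is a minimal, self-contained sketch. It is not the repository's actual code; the tensor shapes, channel counts and the toy LFANet-style head are illustrative assumptions only.

    # Minimal conceptual sketch of Early Fusion and the LFANet idea (illustrative only).
    import torch
    import torch.nn as nn

    # 1) Early Fusion: stack the projected radar channels (depth, velocity in x and z,
    #    RCS) onto the RGB image so the backbone receives a 7-channel input.
    rgb = torch.rand(3, 448, 800)            # normalized camera image
    radar_img = torch.zeros(4, 448, 800)     # sparse image of projected radar features
    fused_input = torch.cat([rgb, radar_img], dim=0)   # shape: (7, 448, 800)

    # 2) LFANet idea: instead of taking the closest radar point in the frustum, learn a
    #    mapping from all radar points in the frustum to one artificial point r*
    #    (depth to the bounding box center, radial velocity).
    class ToyLFAHead(nn.Module):
        def __init__(self, point_dim=4, hidden=64):
            super().__init__()
            self.point_mlp = nn.Sequential(nn.Linear(point_dim, hidden), nn.ReLU())
            self.out = nn.Linear(hidden, 2)           # (depth, radial velocity)

        def forward(self, frustum_points):            # (N, point_dim) radar points
            feats = self.point_mlp(frustum_points)    # per-point features
            pooled = feats.max(dim=0).values          # aggregate over the frustum
            return self.out(pooled)                   # artificial radar point r*

    r_star = ToyLFAHead()(torch.rand(12, 4))          # e.g. 12 radar points in a frustum
    print(fused_input.shape, r_star.shape)            # (7, 448, 800) and (2,)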

Installation

The code has been tested on Ubuntu 20.04 with Python 3.7.11, CUDA 11.3.1 and PyTorch 1.10.2.
We used conda for package management; the conda environment file is provided in <CFPP_ROOT>/experiments/centerfusionpp.yml.
For installation, follow these steps:

  1. Clone the repository with the --recursive option. We'll call the directory that you cloned the repository into <CFPP_ROOT>:

    git clone --recursive https://github.com/brandesjj/centerfusionpp
  2. Install conda by following the instructions on their website. We use Anaconda; installation can be done via:

    ./<Anaconda_file.sh>
  3. Create a new conda environment (optional):

    conda env create -f <CFPP_ROOT>/experiments/centerfusionpp.yml   

    Restart the shell and activate the conda environment:

    conda activate centerfusionpp
  4. Build the deformable convolution library:

    cd <CFPP_ROOT>/src/lib/model/networks/DCNv2
    ./make.sh

    Note: If the DCNv2 folder does not exist in the networks directory, it can be downloaded using this command:

    cd <CFPP_ROOT>/src/lib/model/networks
    git clone https://github.com/lbin/DCNv2/

    Note that this repository uses a slightly different DCNv2 repository than CenterFusion, since the original caused problems with our CUDA/PyTorch versions.

Additionally, a Dockerfile to build a Docker container with all the necessary packages is included in the repository.
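
After installation, a quick way to verify that the environment sees PyTorch and CUDA before building or running anything. This is a generic check, not part of the repository:

    # Generic sanity check before compiling/running the DCNv2 CUDA extension.
    import torch

    print(torch.__version__)           # expected: 1.10.2 (see the versions above)
    print(torch.version.cuda)          # expected: 11.3 (CUDA 11.3.x)
    print(torch.cuda.is_available())   # must be True to build and run the CUDA kernels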

Dataset Preparation

CenterFusion++ was trained and validated using the nuScenes dataset only. Previous work (e.g. CenterTrack) also uses other datasets (e.g. KITTI), but support for these is not implemented in CenterFusion++. However, the original files that can be used to convert these datasets into the correct data format have not been removed from this repository.

To download the dataset to your local machine follow these steps:

  1. Download the nuScenes dataset from nuScenes website.

  2. Extract the downloaded files in the <CFPP_ROOT>/data/nuscenes directory. You should have the following directory structure after extraction:

    <CFPP_ROOT>
    `-- data
        |-- nuscenes
        |   |-- maps
        |   |-- samples
        |   |   |-- CAM_BACK
        |   |   |   |-- xxx.jpg
        |   |   |   `-- ...
        |   |   |-- CAM_BACK_LEFT
        |   |   |-- CAM_BACK_RIGHT
        |   |   |-- CAM_FRONT
        |   |   |-- CAM_FRONT_LEFT
        |   |   |-- CAM_FRONT_RIGHT
        |   |   |-- RADAR_BACK_LEFT
        |   |   |   |-- xxx.pcd
        |   |   |   `-- ...
        |   |   |-- RADAR_BACK_RIGHT
        |   |   |-- RADAR_FRONT
        |   |   |-- RADAR_FRONT_LEFT
        |   |   `-- RADAR_FRONT_RIGHT
        |   |-- sweeps
        |   |-- v1.0-mini
        |   |-- v1.0-test
        |   `-- v1.0-trainval
        `-- annotations
          
    

    In this work, not all the data available from nuScenes is required. To save disk space, you can skip the LiDAR data in /samples and /sweeps and the CAM_(..) folders in /sweeps.
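
To quickly verify the extracted data, you can load it with the nuScenes devkit. This is a minimal sketch; it assumes the nuscenes-devkit package is installed (it is required by the conversion script anyway) and that the v1.0-mini split was downloaded:

    # Quick check that the extracted nuScenes data can be loaded (run from <CFPP_ROOT>).
    from nuscenes.nuscenes import NuScenes

    nusc = NuScenes(version='v1.0-mini', dataroot='data/nuscenes', verbose=True)
    print(len(nusc.scene), 'scenes,', len(nusc.sample), 'samples')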

Now you can create the necessary annotations by running the convert_nuScenes.py script, which converts the nuScenes dataset to the required COCO format:

cd <CFPP_ROOT>/src/tools
python convert_nuScenes.py

The script provides several settings; they are explained in the first block of the code.
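
As a quick sanity check on the output, you can inspect one of the generated COCO-style files. The annotation path and file name below are assumptions; check wherever convert_nuScenes.py wrote its output for the actual names:

    # Inspect a generated COCO-style annotation file (run from <CFPP_ROOT>; the path
    # and file name are assumptions -- adjust to the generated annotations folder).
    import json

    with open('data/nuscenes/annotations/mini_train.json') as f:
        anns = json.load(f)

    print(list(anns.keys()))   # COCO-style keys, e.g. images, annotations, categories
    print(len(anns['images']), 'images,', len(anns['annotations']), 'annotations')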

Pretrained Models

The pre-trained models can be downloaded from the links given in the following table:

| Model | GPUs | Backbone | Val NDS | Val mAP |
| --- | --- | --- | --- | --- |
| centerfusionpp.pth | 2x NVIDIA A100 | EF | 0.4512 | 0.3209 |
| centerfusion_lfa.pth | 2x NVIDIA A100 | CenterNet170 | 0.4407 | 0.3219 |
| earlyfusion.pth | 2x NVIDIA A100 | DLA34 | 0.3954 | 0.3159 |


Training

Train on local machine

The scripts in <CFPP_ROOT>/experiments/ can be used to train the network. There is one script for training on 1 GPU and another for training on 2 GPUs.

cd <CFPP_ROOT>
bash experiments/train.sh

The --train_split parameter determines the training set, which can be mini_train or train. The --load_model parameter can be set to continue training from a pretrained model, or removed to start training from scratch. You can modify the parameters in the script as needed, or add more supported parameters from <CFPP_ROOT>/src/lib/opts.py.

The script creates a log folder in

<CFPP_ROOT>/exp/ddd/<exp_id>/logs_<time_stamp>

where <time_stamp> is the time stamp of the training run and the default for <exp_id> is centerfusionpp.
The log folder contains an event file for Tensorboard, a log.txt file with a brief summary of the training process, and an opt.txt file containing the specified options.
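
The scalars logged to the Tensorboard event file can also be inspected programmatically. A minimal sketch, assuming the tensorboard package from the environment and the default experiment id:

    # List the scalar metrics written to the event file of the most recent run.
    import glob
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    log_dir = sorted(glob.glob('exp/ddd/centerfusionpp/logs_*'))[-1]   # latest log folder
    acc = EventAccumulator(log_dir)
    acc.Reload()
    print(acc.Tags()['scalars'])   # names of the scalar metrics logged during training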

Testing

Download the pre-trained model into the <CFPP_ROOT>/models directory and use the <CFPP_ROOT>/experiments/test.sh script to run the evaluation:

cd <CFPP_ROOT>
bash experiments/test.sh

Make sure the --load_model parameter in the script provides the path to the downloaded pre-trained model. The --val_split parameter determines the validation set, which can be mini_val, val or test. You can modify the parameters in the script as needed, or add more supported parameters from <CFPP_ROOT>/src/lib/opts.py.

Citation

To reference this work, please use the following:

@mastersthesis{CenterFusionPP,
author = {K{\"u}bel, Johannes and Brandes, Julian},
title = {Camera-Radar Sensor Fusion using Deep Learning},
school = {Chalmers University of Technology},
year = 2022,
note = {Available: https://hdl.handle.net/20.500.12380/305503}}

References

CenterFusion++ builds on the following works:


@inproceedings{9423268,
author={Nabati, Ramin and Qi, Hairong},
booktitle={2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
title={CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection},
year={2021},
pages={1526-1535},
doi={10.1109/WACV48630.2021.00157}}

@inproceedings{zhou2019objects,
title={Objects as Points},
author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
booktitle={arXiv preprint arXiv:1904.07850},
year={2019}
}

@article{zhou2020tracking,
title={Tracking Objects as Points},
author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
journal={ECCV},
year={2020}
}

@inproceedings{nuscenes2019,
title={{nuScenes}: A multimodal dataset for autonomous driving},
author={Holger Caesar and Varun Bankiti and Alex H. Lang and Sourabh Vora and Venice Erin Liong and Qiang Xu and Anush Krishnan and Yu Pan and Giancarlo Baldan and Oscar Beijbom},
booktitle={CVPR},
year={2020}
}


License

CenterFusion++ is based on CenterFusion and is released under the MIT License. See NOTICE for license information on other libraries used in this project.