MotionBEV: Online LiDAR Moving Object Segmentation with Bird's Eye View Based Appearance and Motion Features (RAL'23)
[Paper | ArXiv] [Paper | IEEEXplore] [Video | YouTube] [Video | Bilibili]
PyTorch implementation for LiDAR moving object segmentation framework MotionBEV (RAL'23).
B. Zhou, J. Xie, Y. Pan, J. Wu and C. Lu, "MotionBEV: Attention-Aware Online LiDAR Moving Object Segmentation With Bird's Eye View Based Appearance and Motion Features," in IEEE Robotics and Automation Letters, vol. 8, no. 12, pp. 8074-8081, Dec. 2023, doi: 10.1109/LRA.2023.3325687.
MotionBEV is a simple yet effective framework for LiDAR moving object segmentation. We extract spatio-temporal information from consecutive LiDAR scans in the bird's eye view (BEV) domain, and fuse appearance and motion features with multi-modality co-attention modules.
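For intuition, the fusion step can be illustrated with a toy gating scheme (a hand-wavy numpy sketch, not the paper's actual module: the real co-attention weights are learned, while here each branch is simply re-weighted by channel statistics of the other):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def co_attention_fuse(appearance, motion):
    """Toy cross-modal fusion: each (C, H, W) branch is gated by
    per-channel statistics of the other branch, then the two are summed.
    A hypothetical stand-in for the learned co-attention modules."""
    a_gate = sigmoid(motion.reshape(motion.shape[0], -1).mean(axis=1))       # (C,)
    m_gate = sigmoid(appearance.reshape(appearance.shape[0], -1).mean(axis=1))
    return appearance * a_gate[:, None, None] + motion * m_gate[:, None, None]
```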
Overview of MotionBEV.
Visualization of MOS results on SemanticKITTI validation set.
MotionBEV is able to perform LiDAR-MOS with both mechanical LiDAR, such as the Velodyne HDL-64, and solid-state LiDAR with a small FoV and non-repetitive scanning mode, such as the Livox Avia.
Visualization of MOS results on SipailouCampus validation set.
This code is tested on Ubuntu 18.04 with Python 3.7, CUDA 11.6 and PyTorch 1.13.0.
Install the following dependencies:
- numpy==1.21.6
- pytorch==1.13.0+cu116
- tqdm==4.65.0
- pyyaml==6.0
- strictyaml==1.7.3
- icecream==2.1.3
- scipy==1.7.3
- numba==0.56.4
- torch-scatter==2.1.1+pt113cu116
- dropblock==0.3.0
Download the SemanticKITTI dataset here. Extract everything into the same folder. The data file structure should look like this:
path_to_KITTI/
└── sequences/
    ├── 00/
    │   ├── calib.txt            # Calibration file.
    │   ├── poses.txt            # Odometry poses.
    │   ├── velodyne/            # Unzip from KITTI Odometry Benchmark Velodyne point clouds.
    │   │   ├── 000000.bin
    │   │   ├── 000001.bin
    │   │   └── ...
    │   └── labels/              # Unzip from SemanticKITTI label data.
    │       ├── 000000.label
    │       ├── 000001.label
    │       └── ...
    ├── ...
    └── 21/
        └── ...
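Before training, the layout above can be sanity-checked with a small stdlib-only script (a hypothetical helper, not part of this repo):

```python
import os

def check_kitti_layout(root, sequences=("00",)):
    """Return True if each sequence folder contains the files and
    subfolders expected by the SemanticKITTI layout above."""
    for seq in sequences:
        seq_dir = os.path.join(root, "sequences", seq)
        for name in ("calib.txt", "poses.txt", "velodyne", "labels"):
            path = os.path.join(seq_dir, name)
            if not os.path.exists(path):
                print("missing:", path)
                return False
    return True
```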
If you want to use the KITTI-road dataset, please follow MotionSeg3D, and put all extra sequences in the folder path_to_KITTI/sequences.
We also provide our own dataset, collected with a Livox Avia, here (Google Drive). There are 8 sequences in total, and the data file structure is the same as SemanticKITTI's.
NOTE: DO NOT put the Sipailou sequences in SemanticKITTI's folder.
To speed up training, we first generate motion features for all scans.
Specify paths in data_preparing_polar_sequential.yaml
scan_folder: 'your_path/path_to_KITTI/'
residual_image_folder: 'your_path/mos/residual-polar-sequential-480-360/'
Run:
cd utils/generate_residual/utils
python auto_gen_polar_sequential_residual_images.py
Then we obtain motion features with N channels for all scans.
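The idea behind these motion features can be sketched as a per-cell difference between consecutive scans projected onto a polar BEV grid. Below is a simplified numpy illustration, not the repo's generation script (which additionally handles pose alignment and multi-frame residuals); the 480x360 grid size is taken from the residual folder name above.

```python
import numpy as np

def polar_bev_height(points, grid=(480, 360), r_max=50.0):
    """Project an (N, 3) point cloud onto a polar BEV grid,
    keeping the maximum height per (range, azimuth) cell."""
    r = np.hypot(points[:, 0], points[:, 1])
    theta = np.arctan2(points[:, 1], points[:, 0])        # [-pi, pi)
    r_idx = np.clip((r / r_max * grid[0]).astype(int), 0, grid[0] - 1)
    t_idx = np.clip(((theta + np.pi) / (2 * np.pi) * grid[1]).astype(int),
                    0, grid[1] - 1)
    bev = np.full(grid, -np.inf)
    np.maximum.at(bev, (r_idx, t_idx), points[:, 2])      # max height per cell
    bev[np.isinf(bev)] = 0.0                              # empty cells -> 0
    return bev

def residual_feature(prev_points, curr_points):
    """One motion-feature channel: absolute per-cell height change
    between two (pose-aligned) consecutive scans."""
    return np.abs(polar_bev_height(curr_points) - polar_bev_height(prev_points))
```

Stacking such residuals over the last N scans yields an N-channel motion tensor per frame.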
TODO
- The motion features in the BEV domain are back-projected to 3D space after generation, then projected to 2D space again during training. This consumes considerable time and disk space; we plan to keep them in 2D space in the future.
- C++ implementation of motion-feature generation.
pretrained models:
MotionBEV-kitti-val-76.54.pt
MotionBEV-kitti-road-test-74.88.pt
The filename format is [model name]-[dataset]-[split]-[IoU].
Specify params in MotionBEV-semantickitti.yaml
data_path: "your_path/path_to_KITTI"
residual_path: "your_path/mos/residual-polar-sequential-480-360/" #"/media/ubuntu/4T/KITTI/mos/residual-polar-sequential-480-360"
model_load_path: "pretain/MotionBEV-kitti-val-76.54.pt"
Run
python infer_SemanticKITTI.py
The predictions will be saved in the folder prediction_save_dir.
You may want to modify these params in MotionBEV-semantickitti.yaml
data_path: "your_path/path_to_KITTI"
residual_path: "your_path/mos/residual-polar-sequential-480-360/" #"/media/ubuntu/4T/KITTI/mos/residual-polar-sequential-480-360"
model_load_path: "" # none for training from scratch
batch_size: 8
eval_every_n_steps: 1048 #1411 #1048
drop_few_static_frames: True # drop some static frames during training; speeds up training while slightly reducing accuracy
Run
python train_SemanticKITTI.py
generate motion features:
cd utils/generate_residual/utils
python auto_gen_livox_sequential_residual_images.py
pretrained models:
MotionBEV-livox-val-89.22.pt
infer:
python infer_livox.py
train:
python train_livox.py
Follow semantic-kitti-api.
Or run:
python utils/evaluate_mos.py -d your_path/path_to_KITTI -p your_path/path_to_predictions -s valid
Install open3d for visualization.
python utils/visualize_mos.py -d your_path/path_to_KITTI -p your_path/path_to_predictions -s 08
Please cite our paper if this code benefits your research:
@ARTICLE{motionbev2023,
author={Zhou, Bo and Xie, Jiapeng and Pan, Yan and Wu, Jiajie and Lu, Chuanzhao},
journal={IEEE Robotics and Automation Letters},
title={MotionBEV: Attention-Aware Online LiDAR Moving Object Segmentation With Bird's Eye View Based Appearance and Motion Features},
year={2023},
volume={8},
number={12},
pages={8074-8081},
doi={10.1109/LRA.2023.3325687}}
We thank the authors of the open-source codebases PolarSeg, LiDAR-MOS, MotionSeg3D, Motion-Guided-Attention and AMC-Net.