This project implements 3D object detection by combining CenterNet, Feature Pyramid Networks (FPN), and PointPillars.
If you use the KITTI dataset, download it from the KITTI 3D object detection benchmark (http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d). You need the data listed below:
- Velodyne point clouds (29 GB)
- Training labels of object data set (5 MB)
- Camera calibration matrices of object data set (16 MB)
- Left color images of object data set (12 GB) (for visualization only)
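After downloading, a quick way to check that everything is unpacked where the loaders expect it (per the folder structure shown later in this README) is a sketch like the following; the `dataset/kitti` root is an assumption from that tree, and the frame counts are the standard KITTI ones:

```python
from pathlib import Path

KITTI_ROOT = Path("dataset/kitti")   # assumed root, matching the tree below

def check_split(split: str, need_labels: bool) -> None:
    folders = ["velodyne", "calib", "image_2"] + (["label_2"] if need_labels else [])
    for name in folders:
        d = KITTI_ROOT / split / name
        if d.is_dir():
            print(f"{d}: {len(list(d.iterdir()))} files")
        else:
            print(f"{d}: MISSING")

check_split("training", need_labels=True)   # KITTI ships 7481 training frames
check_split("testing", need_labels=False)   # and 7518 testing frames
```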
To run the demo:

```
python demo_2_sides.py
```
In our case we got better results using PointPillars. PointPillars discretizes the point cloud into a grid in the x-y plane, groups the points in each cell into a vertical column (a pillar, i.e. a voxel of unlimited height), and encodes each pillar into a feature vector, producing a pseudo-image the 2D backbone can consume. This PP network is added in front of the backbone network. Unlike the original work, we use the anchor-free CenterNet head together with an FPN detection network.
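For intuition, here is a minimal sketch of the pillarization step only, not the code used in this repo: points are bucketed by their x-y grid cell, and each non-empty cell becomes a pillar that the PP encoder later turns into a feature vector. The grid ranges, cell size, and per-pillar point cap below are placeholder values in the spirit of the paper:

```python
import numpy as np

def pointcloud_to_pillars(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
                          cell=0.16, max_pts=32):
    """points: (N, 4) float array of x, y, z, reflectance."""
    # Keep only points inside the detection range.
    keep = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[keep]
    # Integer x-y grid coordinates: every point maps to exactly one pillar.
    ix = ((pts[:, 0] - x_range[0]) / cell).astype(np.int32)
    iy = ((pts[:, 1] - y_range[0]) / cell).astype(np.int32)
    pillars = {}
    for row, key in zip(pts, zip(ix.tolist(), iy.tolist())):
        bucket = pillars.setdefault(key, [])
        if len(bucket) < max_pts:   # cap the points kept per pillar, as in the paper
            bucket.append(row)
    return pillars  # {(ix, iy): [points]} -> input to the pillar feature network
```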
To train the model with the PointPillars encoder:

```
python train_pp.py
```

To train the baseline model without it:

```
python train.py
```
In order to test the model, you first need to decode the predictions and save them in KITTI format. To do that, run

```
python test.py
```
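For reference, "KITTI format" here means one .txt file per frame with one line per detection, following the official result column layout. A minimal writer might look like this; the structure of the decoded prediction dicts is an assumption, while the column order is KITTI's:

```python
def write_kitti_result(path, detections):
    """detections: assumed list of dicts with keys name, score,
    bbox=(x1, y1, x2, y2), dims=(h, w, l), loc=(x, y, z in camera frame), ry."""
    with open(path, "w") as f:
        for d in detections:
            # Columns: type, truncated, occluded, alpha, 2D bbox (4 values),
            # dimensions h w l, location x y z, rotation_y, score.
            # -1 -1 -10 mark truncation, occlusion and alpha as not provided.
            f.write("{} -1 -1 -10 {:.2f} {:.2f} {:.2f} {:.2f} "
                    "{:.2f} {:.2f} {:.2f} {:.2f} {:.2f} {:.2f} {:.2f} {:.4f}\n"
                    .format(d["name"], *d["bbox"], *d["dims"], *d["loc"],
                            d["ry"], d["score"]))
```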
Then use the C++ code in the cpp folder to evaluate the saved results. Note that if you use the KITTI dataset, the official test set has no public labels, so it is recommended to evaluate on the validation set. Compile the evaluator with

```
g++ -O3 -DNDEBUG -o evaluate_object evaluate_object.cpp
```
Eventually you should be able to plot the PR curves for the 2D/BEV and 3D bounding boxes, and you can use the same output files to compute the mAP.
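As a sketch of the mAP step: KITTI's R40 metric averages interpolated precision at 40 equally spaced recall thresholds. Given the PR points, the AP computation looks like this (parsing the evaluator's output files into the two arrays is left out):

```python
import numpy as np

def ap_r40(recall, precision):
    """recall, precision: 1-D arrays of PR-curve points (same length)."""
    recall = np.asarray(recall)
    precision = np.asarray(precision)
    ap = 0.0
    for r in np.linspace(1.0 / 40, 1.0, 40):   # 40 recall thresholds: 1/40 ... 1
        above = precision[recall >= r]
        ap += (above.max() if above.size else 0.0) / 40  # interpolated precision
    return ap
```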
The repository expects the following folder structure:

```
${ROOT}
├── checkpoints/
│   └── fpn_resnet_18/
│       └── fpn_resnet_18_epoch_300.pth
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── test.txt
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/ (left color camera)
│       │   ├── calib/
│       │   ├── label_2/
│       │   └── velodyne/
│       ├── testing/
│       │   ├── image_2/ (left color camera)
│       │   ├── calib/
│       │   └── velodyne/
│       └── classes_names.txt
└── sfa/
    ├── README.md
    └── requirements.txt
```
References:

[1] CenterNet: Objects as Points (paper, PyTorch implementation)
[2] PointPillars: Fast Encoders for Object Detection from Point Clouds (paper, PyTorch implementation)
[3] Super Fast and Accurate 3D Object Detection (PyTorch implementation)
[4] Feature Pyramid Networks for Object Detection