August 2019
tl;dr: Detect 2D oriented bbox with BEV maps by adding angle regression to YOLO.
The paper is clearly written, though the innovation is limited. However, the performance is really nice -- this is exactly the type of paper industry likes.
It runs at 50 fps, about half the speed of PointPillars, which achieves 115 fps.
- Add angle regression to YOLO.
- IoU calculation is updated to accommodate oriented bbox.
- The input encoding is based on MV3D.
- Each grid cell has only five anchor bboxes with different headings. The anchors do not span a full grid of anchor parameters but rather a finite combination of them.
- Angle loss is only effective when the oriented bbox IoU is larger than a threshold.
- Almost 10 times faster than VoxelNet, at 50 fps. In comparison, PointPillars achieves 115 fps.
- FOV is 40 m x 80 m (same as radar). The BEV image is 512 x 1024.
- RGB map encoded by height, intensity and density.
- The camera FOV is only about 90 degrees (similar to radar). The heatmap of GT is very helpful. Output outside the FOV is filtered before evaluation.
- GitHub repos of unofficial implementations: here and here with uncertainty and here
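
The BEV input encoding noted above (40 m x 80 m area, 512 x 1024 image, channels from height, intensity and density) can be sketched roughly as follows. This is a minimal illustration rather than the paper's code: the function name `encode_bev`, the axis assignment (x forward over 40 m, y lateral over 80 m), the height range of -2 m to 1.25 m, the use of per-cell maxima, and the `log(64)` density normalization are all assumptions.

```python
import numpy as np

def encode_bev(points, x_range=(0.0, 40.0), y_range=(-40.0, 40.0),
               z_range=(-2.0, 1.25), shape=(512, 1024)):
    """Rasterize an (N, 4) point cloud [x, y, z, intensity] into a
    (3, H, W) map with channels [density, height, intensity].
    All ranges and normalizations are illustrative assumptions."""
    points = np.asarray(points, dtype=np.float64)
    h, w = shape
    x, y, z, inten = points[:, 0], points[:, 1], points[:, 2], points[:, 3]

    # Keep only points inside the chosen region of interest.
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z, inten = x[keep], y[keep], z[keep], inten[keep]

    # Map metric coordinates to integer grid cells.
    rows = ((x - x_range[0]) / (x_range[1] - x_range[0]) * h).astype(np.int64)
    cols = ((y - y_range[0]) / (y_range[1] - y_range[0]) * w).astype(np.int64)
    rows = np.clip(rows, 0, h - 1)
    cols = np.clip(cols, 0, w - 1)

    height = np.zeros((h, w))
    intensity = np.zeros((h, w))
    count = np.zeros((h, w))

    # Per-cell max normalized height, max intensity, and point count.
    z_norm = (z - z_range[0]) / (z_range[1] - z_range[0])
    np.maximum.at(height, (rows, cols), z_norm)
    np.maximum.at(intensity, (rows, cols), inten)
    np.add.at(count, (rows, cols), 1.0)

    # Log-scaled density, capped at 1 (assumed normalization).
    density = np.minimum(1.0, np.log(count + 1.0) / np.log(64.0))
    return np.stack([density, height, intensity]).astype(np.float32)
```

With these defaults each cell covers roughly 8 cm x 8 cm, and the resulting three-channel map can be fed to an image detector like an ordinary RGB image.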
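
To make the oriented-bbox IoU point above concrete, here is a minimal pure-Python sketch that computes IoU between two rotated rectangles by clipping one against the other (Sutherland-Hodgman) and measuring the overlap with the shoelace formula. The box convention `(cx, cy, w, l, yaw)` and all function names are assumptions for illustration; the paper and the unofficial repos may compute this differently.

```python
import math

def box_corners(cx, cy, w, l, yaw):
    """Corners of a rotated rectangle in counter-clockwise order.
    Assumed convention: l along the heading, w across it, yaw in radians."""
    c, s = math.cos(yaw), math.sin(yaw)
    local = [(l / 2, w / 2), (-l / 2, w / 2), (-l / 2, -w / 2), (l / 2, -w / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in local]

def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    area = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def _side(a, b, p):
    """Signed cross product: >= 0 when p lies left of (or on) edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def clip_polygon(subject, clip):
    """Sutherland-Hodgman: clip `subject` by a convex, CCW `clip` polygon."""
    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        inp, output = output, []
        if not inp:
            break
        for j in range(len(inp)):
            p, q = inp[j - 1], inp[j]
            dp, dq = _side(a, b, p), _side(a, b, q)
            if (dp >= 0) != (dq >= 0):
                # Edge p->q crosses the clip line: add the crossing point.
                t = dp / (dp - dq)
                output.append((p[0] + t * (q[0] - p[0]),
                               p[1] + t * (q[1] - p[1])))
            if dq >= 0:
                output.append(q)
    return output

def rotated_iou(box_a, box_b):
    """IoU of two oriented boxes given as (cx, cy, w, l, yaw)."""
    inter_poly = clip_polygon(box_corners(*box_a), box_corners(*box_b))
    inter = polygon_area(inter_poly) if len(inter_poly) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union
```

For example, `rotated_iou((0, 0, 2, 4, 0), (0, 0, 2, 4, math.pi / 2))` compares a box with a copy of itself rotated by 90 degrees; the overlap is a 2 x 2 square, so the IoU is about 1/3.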