The experiments are run with PyTorch 1.7.0, CUDA 10.1, and cuDNN 7.6.
Training is conducted on 8 Tesla V100 GPUs.
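As a quick sanity check, you can verify these versions from Python (the expected values are the ones listed above):

```python
import torch

print(torch.__version__)               # expect 1.7.0
print(torch.version.cuda)              # expect 10.1
print(torch.backends.cudnn.version())  # expect a 7.6.x build, e.g. 7605
print(torch.cuda.device_count())       # expect 8 for the setup above
```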
For the fade strategy proposed by PointAugmenting (disabling the copy-and-paste augmentation for the last 5 epochs), we currently implement it by manually stopping training at epoch 15 and resuming without the copy-and-paste augmentation; a sketch of automating this with a training hook follows. If you find a more elegant way to implement this strategy, please let us know; we would really appreciate it. The fade strategy removes many false positives and improves the mAP remarkably, especially for TransFusion-L, while having less influence on TransFusion.
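For reference, here is a minimal sketch of automating the fade with an mmcv hook instead of the manual stop-and-resume. It assumes an mmdetection3d-style setup where the copy-and-paste step is the `ObjectSample` transform; the hook name, epoch arithmetic, and wrapper-unwrapping logic are our own and untested:

```python
from mmcv.runner import HOOKS, Hook


@HOOKS.register_module()
class FadeObjectSampleHook(Hook):
    """Drop the ObjectSample (copy-and-paste) transform for the last epochs."""

    def __init__(self, total_epochs=20, fade_epochs=5):
        # Fade kicks in at epoch index `total_epochs - fade_epochs`
        # (epochs are 0-indexed inside the runner).
        self.start_epoch = total_epochs - fade_epochs

    def before_train_epoch(self, runner):
        if runner.epoch < self.start_epoch:
            return
        dataset = runner.data_loader.dataset
        # Unwrap dataset wrappers such as CBGSDataset / RepeatDataset.
        while hasattr(dataset, 'dataset'):
            dataset = dataset.dataset
        # Filter the copy-and-paste step out of the training pipeline.
        dataset.pipeline.transforms = [
            t for t in dataset.pipeline.transforms
            if type(t).__name__ != 'ObjectSample'
        ]
```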
Pretrained 2D Backbones
DLA34: Following PointAugmenting, we directly reuse the checkpoint pretrained on the monocular 3D detection task provided by CenterNet.
ResNet50 on instance segmentation: We acquire the model pretrained on nuImages from MMDetection3D.
ResNet50 on 2D detection: We train a model using the instance segmentation config but with the mask head removed (see the sketch below).
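The change amounts to a small config override. A minimal sketch in mmdetection's Python config style, where the base file name is a placeholder for the nuImages instance segmentation config:

```python
# Hypothetical config: reuse the instance segmentation recipe but drop the
# mask branch, leaving a plain Faster R-CNN-style 2D detector.
_base_ = ['./mask_rcnn_r50_fpn_1x_nuim.py']  # assumed base config name

model = dict(
    roi_head=dict(
        mask_roi_extractor=None,  # setting to None disables the mask branch
        mask_head=None))
```

The training pipeline may also need `with_mask=False` in its annotation-loading step so that no mask targets are produced.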
nuScenes 3D Detection
All LiDAR-only models are trained for 20 epochs; the fusion-based models are then trained for 6 more epochs starting from the pretrained LiDAR backbone. We freeze the weights of the LiDAR backbone to save GPU memory.
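A minimal sketch of what the freezing amounts to, assuming mmdetection3d-style attribute names (`pts_voxel_encoder`, `pts_backbone`, etc.) for the LiDAR branch:

```python
def freeze_lidar_branch(model):
    """Freeze the pretrained LiDAR branch before fusion fine-tuning."""
    for name in ['pts_voxel_encoder', 'pts_middle_encoder',
                 'pts_backbone', 'pts_neck']:
        module = getattr(model, name, None)  # attribute names are assumptions
        if module is None:
            continue
        for param in module.parameters():
            param.requires_grad = False
        # Keep BatchNorm running statistics fixed as well; note that a later
        # model.train() call would undo this unless train() is overridden.
        module.eval()
```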
We use 300 object queries during inference for the online submission, which gives slightly better performance. We do not use any test-time augmentation or model ensembling.
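Because TransFusion initializes queries from a predicted heatmap rather than learned embeddings, the query count can be changed at inference without retraining. A minimal sketch of the override, assuming the head exposes a `num_proposals` argument as in the released configs:

```python
# Hypothetical test-time config override: raise the query count to 300.
# 'num_proposals' mirrors the argument name in TransFusion's head config;
# verify against the config you are actually using.
model = dict(pts_bbox_head=dict(num_proposals=300))
```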