# GTR model zoo

## Introduction

This file documents the collection of models reported in our paper. Our models are trained on a DGX machine with eight 32GB V100 GPUs; most models use 4 GPUs.

## How to Read the Tables

The "Name" column contains a link to the config file. To train a model, run

python train_net.py --num-gpus 4 --config-file /path/to/config/name.yaml

To evaluate a model with trained or pretrained weights, run

```
python train_net.py --config-file /path/to/config/name.yaml --eval-only MODEL.WEIGHTS /path/to/weight.pth
```
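For example, evaluating the GTR_MOT_FPN model from the table below might look like the following. This is only a sketch: the config path `configs/GTR_MOT_FPN.yaml` and the checkpoint filename are assumptions about where you placed the files, not paths confirmed here.

```
# Hypothetical paths -- adjust to your checkout and downloaded checkpoint.
python train_net.py --config-file configs/GTR_MOT_FPN.yaml \
    --eval-only MODEL.WEIGHTS models/GTR_MOT_FPN.pth
```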

## MOT

### Validation set

| Name | MOTA | IDF1 | HOTA | DetA | AssA | Download |
|------|------|------|------|------|------|----------|
| GTR_MOT_FPN | 71.3 | 75.9 | 63.0 | 60.4 | 66.2 | model |
| GTR_MOT_FPN (local) | 71.1 | 74.2 | 62.1 | 60.2 | 64.4 | same as above |

### Test set

| Name | MOTA | IDF1 | HOTA | DetA | AssA | Download |
|------|------|------|------|------|------|----------|
| GTR_MOTFull_FPN | 75.3 | 71.5 | 59.1 | 61.6 | 57.0 | model |

#### Note

- The validation set follows the half-half training-set split from CenterTrack.
- All models are finetuned from a detection-only model trained on CrowdHuman (config, model). Download or train that model and place it at `GTR_ROOT/models/CH_FPN_1x.pth` before training. Training the detection-only model takes ~12 hours on 4 GPUs.
- Training GTR takes ~3 hours on 4 V100 GPUs (32GB memory).
- GTR_MOT_FPN is our model with a temporal window size of 32. It needs more than 12GB of GPU memory at test time. To change the temporal window size, append `INPUT.VIDEO.TEST_LEN 16` to the command (see the example after this list).
- GTR_MOT_FPN (local) is our local-tracker baseline, which applies FairMOT association to our detections and features. To run it, append `VIDEO_TEST.LOCAL_TRACK True` to the command (see the example after this list).
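As a concrete illustration of the two overrides above, the evaluation variants might be invoked as follows. This is a sketch only: the config path and checkpoint filename are assumed, so substitute your own.

```
# Evaluate GTR_MOT_FPN with a shorter temporal window (16 instead of 32)
# to reduce GPU memory usage at test time. Paths are hypothetical.
python train_net.py --config-file configs/GTR_MOT_FPN.yaml \
    --eval-only MODEL.WEIGHTS models/GTR_MOT_FPN.pth \
    INPUT.VIDEO.TEST_LEN 16

# Run the local-tracker baseline (FairMOT-style association) on the same
# detections and features.
python train_net.py --config-file configs/GTR_MOT_FPN.yaml \
    --eval-only MODEL.WEIGHTS models/GTR_MOT_FPN.pth \
    VIDEO_TEST.LOCAL_TRACK True
```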

## TAO

| Name | Validation mAP | Test mAP | Download |
|------|----------------|----------|----------|
| GTR_TAO_DR2101 | 22.5 | 20.1 | model |

#### Note

- The model is evaluated on TAO keyframes only, which are sampled at ~1 frame per second.
- Our model is trained on LVIS+COCO only. The TAO training set is not used anywhere.
- Our model is finetuned from a detection-only CenterNet2 model trained on LVIS+COCO (config, model). Download or train that model and place it at `GTR_ROOT/models/C2_LVISCOCO_DR2101_4x.pth` before training (see the example after this list). Training the detection-only model takes ~3 days on 8 GPUs.
- Training GTR takes ~13 hours on 4 V100 GPUs (32GB memory).
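For reference, placing the pretrained detector and launching TAO training could look like this. The config filename `configs/GTR_TAO_DR2101.yaml` and the download location are placeholders assumed for illustration, not paths confirmed by this file.

```
# Hypothetical paths -- put the downloaded detection-only checkpoint where
# the GTR config expects it, then start finetuning.
mkdir -p models
mv ~/Downloads/C2_LVISCOCO_DR2101_4x.pth models/C2_LVISCOCO_DR2101_4x.pth
python train_net.py --num-gpus 4 --config-file configs/GTR_TAO_DR2101.yaml
```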