
Compound Triangulation & Co-fixing

This is the official implementation of the TMM paper Joint-Limb Compound Triangulation With Co-Fixing for Stereoscopic Human Pose Estimation.

Note: This repository has been extended with code for MHAD and joint training.

File Structure

  • experiments: configuration files, all in YAML format.
  • lib: main code.
    • dataset: the dataloaders.
    • models: network model files.
    • utils: tools, functions, data format, etc.
  • backbone_pretrain.py: script to pretrain the 2D backbone before end-to-end (E2E) training.
  • config.py: configuration processors and the default config.
  • main.py: script for E2E training and testing.

How to Use

For convenience, we refer to the root directory of this repo as ${ROOT}.

Install Dependencies

First install the latest PyTorch (torch) build that matches your CUDA version, then install the listed requirements. Note that this repository was tested with torch 1.13.0 and CUDA 11.7.

cd ${ROOT}
pip install -r requirement.txt
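
For example, assuming CUDA 11.7 (the tested setup), the tested torch build can be installed from the official PyTorch wheel index; check pytorch.org for the command matching your own CUDA version:

pip install torch==1.13.0+cu117 --extra-index-url https://download.pytorch.org/whl/cu117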

Prepare Data

Human3.6M

  • Follow this guide to prepare the image data and labels. Referring to the fetched data directory as ${H36M_ROOT}, its layout should look like this:
    ${H36M_ROOT}
        |-- processed
        |     |-- S1/
        |     ...
        |-- extra
        |     |-- human36m-multiview-labels-GTbboxes.npy
        |     ...
        ...
  • Generate the monocular labels at ${H36M_ROOT}/extra/human36m-monocular-labels-GTbboxes.npy by running:
    python lib/dataset/convert-multiview-to-monocular.py ${H36M_ROOT}/extra
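
As a quick sanity check, you can load the generated file and list its top-level keys; this assumes (as with the multiview labels) that it is a numpy-pickled dict:

python -c "import numpy as np; d = np.load('${H36M_ROOT}/extra/human36m-monocular-labels-GTbboxes.npy', allow_pickle=True).item(); print(d.keys())"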

Total Capture

  • Use the Total Capture Toolbox to prepare the data. Suppose the processed data root directory is ${TC_ROOT} (usually TotalCapture-Toolbox/data); it should look like this:
${TC_ROOT}
    |-- annot
    |     |-- totalcapture_train.pkl
    |     `-- totalcapture_validation.pkl
    `-- images
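
As a quick sanity check (assuming each .pkl file holds a pickled sequence of per-frame annotation records, as produced by the toolbox), you can count the entries:

python -c "import pickle; a = pickle.load(open('${TC_ROOT}/annot/totalcapture_train.pkl', 'rb')); print(len(a))"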

Pretrained Models

Here we provide the weights for the ResNet-152 backbone used in our model.

Create a folder named pretrained under ${ROOT} and place the weights in it. If you want the backbone pretraining step to work out of the box, the folder should look like this:

pretrained
    |-- from_lt/pose_resnet_4.5_pixels_human36m.pth
    `-- pytorch/imagenet/resnet152-b121ed2d.pth
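
For reference, a minimal sketch that creates this layout, assuming the two weight files were downloaded to the current directory:

mkdir -p pretrained/from_lt pretrained/pytorch/imagenet
mv pose_resnet_4.5_pixels_human36m.pth pretrained/from_lt/
mv resnet152-b121ed2d.pth pretrained/pytorch/imagenet/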

We also provide the 4-view weights for Human3.6M and Total Capture, which reproduce the results in the paper.

Place the weights in a directory of your choice; we will refer to this path as ${weight_path} in the Testing stage below.

Training

We train the model in two steps: first, pretrain the 2D backbone, which outputs the joint confidence heatmaps and the LOF; then, train the model end-to-end for better accuracy.

For pretraining, run:

python backbone_pretrain.py --cfg experiments/ResNet${n_layers}/${dataset}-${resolution}-backbone.yaml --runMode train

Then, to train the model end-to-end:

python main.py --cfg experiments/ResNet${n_layers}/${dataset}-${resolution}.yaml --runMode train
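
For example, substituting the values listed under Testing below (and assuming the corresponding YAML files exist under experiments/ResNet152), the two steps for Human3.6M at 384x384 resolution would be:

python backbone_pretrain.py --cfg experiments/ResNet152/human3.6m-384x384-backbone.yaml --runMode train
python main.py --cfg experiments/ResNet152/human3.6m-384x384.yaml --runMode train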

Testing

python main.py \
     --cfg experiments/ResNet${n_layers}/${dataset}-${resolution}.yaml \
     --runMode test \
     -w ${weight_path}

For this repo, the provided configurations use n_layers=152, dataset=human3.6m | totalcapture, and resolution=384x384 | 320x320.
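
Concretely, testing the Human3.6M model at 384x384 with downloaded weights would look like:

python main.py \
     --cfg experiments/ResNet152/human3.6m-384x384.yaml \
     --runMode test \
     -w ${weight_path}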

Note: If you wish to train or test using multiple GPUs, please specify the GPU ids in the config file. By default, the script only uses GPU 0 for training/testing.
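
Independently of the config file, you can also limit which physical devices the process sees with the standard CUDA environment variable; the GPU ids in the config then index into the visible devices. For example, to expose GPUs 0 and 1:

CUDA_VISIBLE_DEVICES=0,1 python main.py --cfg experiments/ResNet152/human3.6m-384x384.yaml --runMode train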

Citation

If you use our code, please cite us with:

@article{zhuo2024compound,
  author={Chen, Zhuo and Wan, Xiaoyue and Bao, Yiming and Zhao, Xu},
  journal={IEEE Transactions on Multimedia}, 
  title={Joint-Limb Compound Triangulation With Co-Fixing for Stereoscopic Human Pose Estimation}, 
  year={2024},
  pages={1-11},
  doi={10.1109/TMM.2024.3410514}}
