The official PyTorch implementation of the paper "MotionGPT: Human Motion Synthesis with Improved Diversity and Realism via GPT-3 Prompting".
If you find this code useful in your research, please cite:
@inproceedings{ribeiro2024motiongpt,
  title={MotionGPT: Human Motion Synthesis with Improved Diversity and Realism via GPT-3 Prompting},
  author={Ribeiro-Gomes, Jose and Cai, Tianhui and Milacski, Zolt{\'a}n A and Wu, Chen and Prakash, Aayush and Takagi, Shingo and Aubel, Amaury and Kim, Daeil and Bernardino, Alexandre and De La Torre, Fernando},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={5070--5080},
  year={2024}
}
This repository currently contains instructions for inference only; training and data preparation code will follow shortly.
This code was tested on Ubuntu 18.04 LTS and requires:
- Python 3.7
- Anaconda3 or Miniconda3
- CUDA-capable GPU (tested on an NVIDIA RTX A4000 16GB)
Install ffmpeg (if not already installed):
sudo apt update
sudo apt install ffmpeg
Set up the conda environment:
conda env create -f environment.yml
conda activate motiongpt
python -m spacy download en_core_web_sm
pip install git+https://github.com/openai/CLIP.git
pip install sentence_transformers
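Before moving on, you can sanity-check the environment with a short Python snippet (a minimal sketch; it only imports the packages installed above and confirms the GPU is visible):
import torch
import clip                    # installed from the OpenAI CLIP repo above
import spacy
import sentence_transformers

# Confirm PyTorch can see a CUDA-capable GPU.
print("CUDA available:", torch.cuda.is_available())

# Fails here if the en_core_web_sm download above did not complete.
nlp = spacy.load("en_core_web_sm")
print([token.text for token in nlp("greet a friend")])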
Download dependencies:
bash prepare/download_smpl_files.sh
bash prepare/download_glove.sh
bash prepare/download_t2m_evaluators.sh
Download the model(s) you wish to use, then unzip and place them in ./save/.
python -m sample.generate --model_path ./save/mini/model000600161.pt --text_prompt "greet a friend" --babel_prompt "hug"
You may also define:
- --device id.
- --seed to sample different prompts.
- --motion_length (text-to-motion only) in seconds (maximum is 9.8 [sec]).
- --second_llm
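For example, combining these flags (the values below are illustrative):
python -m sample.generate --model_path ./save/mini/model000600161.pt --text_prompt "greet a friend" --babel_prompt "hug" --seed 10 --motion_length 6 --device 0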
Running those will get you:
- results.npy - a file with the text prompts and xyz positions of the generated animations.
- sample##_rep##.mp4 - a stick figure animation for each generated motion.
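To consume the generated motions programmatically, a sketch like the one below should work. It assumes results.npy is a pickled dict whose keys include 'motion' and 'text' - an assumption, not a guarantee of this repo's format - so print data.keys() to confirm on your file:
import numpy as np

# Path is a placeholder; point it at the results.npy produced above.
# Assumes a pickled dict; if np.load returns a plain array, drop .item().
data = np.load("/path/to/results.npy", allow_pickle=True).item()
print(data.keys())

# Assumed key names; adjust after inspecting data.keys().
motions = data["motion"]  # xyz positions of the generated animations
prompts = data["text"]    # the text prompt used for each sample
print(motions.shape, prompts[0])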
You can stop here, or render the SMPL mesh using the following script.
To create a SMPL mesh per frame, run:
python -m visualize.render_mesh --input_path /path/to/mp4/stick/figure/file
This script outputs:
- sample##_rep##_smpl_params.npy - SMPL parameters (thetas, root translations, vertices and faces).
- sample##_rep##_obj - a mesh per frame in .obj format.
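To inspect the exported SMPL parameters without guessing field names (the list above only tells us that thetas, root translations, vertices and faces are stored), a key-agnostic sketch:
import numpy as np

# Path is a placeholder; use a file produced by visualize.render_mesh.
# Assumes a pickled dict; if np.load returns a plain array, drop .item().
params = np.load("/path/to/sample00_rep00_smpl_params.npy", allow_pickle=True).item()
for key, value in params.items():
    # Report each stored entry's shape (arrays) or type (everything else).
    print(key, getattr(value, "shape", type(value)))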
This code is heavily adapted from: