ETC-TON: Enhancing Temporal Consistency of video-based virtual Try-ON with latent diffusion models

NOTE: Thanks to ladi-vton and DisCo for the inspiration. Related discussions are welcome.

Result Visualization

An improved 3D denoising UNet gives better face reconstruction and temporal consistency. The model is fine-tuned from stable-diffusion-2-inpainting on the VVT dataset.
The dataset is preprocessed with DensePose and human parsing. The processed files will be uploaded soon.
Warped clothing images are produced by a pre-trained, self-developed clothing deformation model and will also be released. The deformation model can be replaced by an existing image-based virtual try-on deformation model.
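
To illustrate how human-parsing output can feed an inpainting-based try-on model, here is a minimal sketch that turns per-frame parsing label maps into a garment mask and blanks that region in each frame of a clip. It is not this repository's preprocessing code. The label IDs (`UPPER_CLOTHES`, `ARMS`) and the function names `agnostic_mask` and `mask_clip` are hypothetical, and real parsing models define their own label maps.

```python
import numpy as np

# Hypothetical label IDs; check the label map of the parsing model you use.
UPPER_CLOTHES = 5
ARMS = (14, 15)


def agnostic_mask(parsing, garment_labels=(UPPER_CLOTHES,) + ARMS):
    """Return a binary mask (1 = region to inpaint) from one parsing label map."""
    parsing = np.asarray(parsing)
    return np.isin(parsing, garment_labels).astype(np.uint8)


def mask_clip(frames, parsings):
    """Blank out the garment region in every frame of a clip.

    frames:   (T, H, W, 3) uint8 video clip
    parsings: (T, H, W) integer label maps, one per frame
    """
    masks = np.stack([agnostic_mask(p) for p in parsings])  # (T, H, W)
    masked = frames * (1 - masks)[..., None]                # zero the garment pixels
    return masked, masks
```

The masked frames and masks would then be passed to the inpainting model together with the warped clothing images. Computing the mask per frame, as here, does nothing for temporal consistency on its own; that is left to the denoising network.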
