PATMAT enables personalization of the Mask-Aware Transformer (MAT) model, given reference images of a face.
Our two-step framework, PAT (Person Aware Tuning) and MAT, builds extensively on the Pivot Tuning (PTI) paper and code and on the MAT paper and code.
- Clone the repository:
```bash
git clone https://github.com/humansensinglab/PATMAT
```
- Install the dependencies:
  - Python 3.7
  - PyTorch 1.7.1
  - CUDA 11.0
  - Other packages:
```bash
pip install -r requirements.txt
```
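For reference, one possible way to set up a matching environment (the `patmat` environment name and the exact PyTorch wheel below are assumptions, not official instructions):

```bash
# Hypothetical environment setup; adjust to your CUDA driver and package manager.
conda create -n patmat python=3.7 -y
conda activate patmat
# PyTorch 1.7.1 built against CUDA 11.0
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 \
    -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
```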
Please download the pretrained models from the following links. These are the auxiliary models needed for the PAT inversion task, including the StyleGAN generator and the pre-trained models used for loss computation.
| Path | Description |
| --- | --- |
| FFHQ StyleGAN | StyleGAN2-ada model trained on FFHQ with 1024x1024 output resolution. |
| Dlib alignment | Dlib alignment model used for image preprocessing. |
| FFHQ e4e encoder | Pretrained e4e encoder, used for StyleCLIP editing. |
Glint360K can be downloaded from this link: https://drive.google.com/file/d/1pRDYnndOUemVrZaFV6ZGpH3eQowQpQlL/view?usp=sharing
The MAT repo provides models trained on CelebA-HQ, FFHQ, and Places365-Standard at 512x512 resolution. Download the models from OneDrive and put them into the `pretrained` directory. Note: the StyleGAN model is used directly from the official stylegan2-ada-pytorch implementation. For StyleCLIP pretrained mappers, please see StyleCLIP's official repository.
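For orientation, the downloads might be arranged along these lines; the file names below are illustrative assumptions, not the exact names of the released checkpoints:

```
pretrained_models/
├── ffhq.pkl               # FFHQ StyleGAN2-ada generator
├── align.dat              # Dlib alignment model
└── e4e_ffhq_encode.pt     # FFHQ e4e encoder
pretrained/
└── CelebA-HQ_512.pkl      # MAT inpainting model (512x512)
```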
By default, we assume that all auxiliary models are downloaded and saved to the `pretrained_models` directory. However, you may use your own paths by changing the necessary values in `configs/paths_config.py`.
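As a rough illustration, the relevant entries in `configs/paths_config.py` might look like the following; the variable names here are assumptions modeled on PTI's config, so check the actual file for the real ones:

```python
# configs/paths_config.py (illustrative sketch; names are assumptions)

## Pretrained models
stylegan2_ada_ffhq = 'pretrained_models/ffhq.pkl'  # FFHQ StyleGAN generator
dlib = 'pretrained_models/align.dat'               # Dlib alignment model
e4e = 'pretrained_models/e4e_ffhq_encode.pt'       # FFHQ e4e encoder

## Input info: aligned and cropped reference images of the identity
input_data_path = 'data/aligned'

## Dirs for output files
checkpoints_dir = 'checkpoints'    # tuned generators
embedding_base_dir = 'embeddings'  # inversion latent codes
```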
The main training script is `PAT/scripts/run_pat.py`. The script reads aligned and cropped images from the paths configured in the "Input info" subsection of `configs/paths_config.py`, where you can also find the expected data structure and file naming convention (an example invocation is shown below). I am hoping to make this step more user friendly, but for now please bear with me.
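Assuming the input paths are configured, training can then be launched directly; the script may accept additional flags not shown in this sketch:

```bash
python PAT/scripts/run_pat.py
```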
Results are saved to the directories listed under "Dirs for output files" in `configs/paths_config.py`. This includes inversion latent codes and tuned generators. The hyperparameters for the inversion task can be found in `configs/hyperparameters.py`.
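To give a sense of what lives there, `configs/hyperparameters.py` typically holds values along these lines; the names and defaults below are illustrative assumptions modeled on PTI, not the repository's actual values:

```python
# configs/hyperparameters.py (illustrative sketch; names/values are assumptions)

## Loss configuration
lpips_type = 'alex'      # backbone for the LPIPS perceptual loss
pt_l2_lambda = 1.0       # pixel-wise L2 loss weight
pt_lpips_lambda = 1.0    # LPIPS loss weight

## Optimization
first_inv_steps = 450    # steps for the initial latent inversion
max_pti_steps = 350      # steps for pivotal tuning of the generator
learning_rate = 3e-4     # tuning learning rate
```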
To inpaint desired images after tuning your network with PAT, you can run:
```bash
python generate_image.py --network model_path --dpath data_path --refpath reference_path --outdir out_path [--mpath mask_path]
```
where `model_path` is the path to PAT's output model and `reference_path` points to a few reference images of the identity you are inpainting (these can be a subset of PAT's training data).
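For example, a concrete invocation might look like this; all paths below are placeholders for your own files:

```bash
# Hypothetical paths; substitute your own model, images, references, and masks.
python generate_image.py \
    --network checkpoints/pat_tuned_model.pkl \
    --dpath data/test_images \
    --refpath data/reference_images \
    --outdir results/inpainted \
    --mpath data/masks
```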
Pivot Tuning (PTI) paper and implementation:
https://github.com/danielroich/PTI
MAT model and implementation:
https://github.com/fenglinglwb/MAT
LPIPS implementation:
https://github.com/richzhang/PerceptualSimilarity
e4e encoder implementation:
https://github.com/omertov/encoder4editing
StyleGAN2-ada model and implementation:
https://github.com/NVlabs/stylegan2-ada-pytorch
Copyright © 2021, NVIDIA Corporation.
NVIDIA Source Code License: https://nvlabs.github.io/stylegan2-ada-pytorch/license.html
This repository structure is based on MAT and Pivot Tuning.
For any inquiries, please contact us at: sam(dot)motamed(at)insait(dot)ai
If you use this code for your research, please cite:
@InProceedings{Motamed_2023_ICCV,
author = {Motamed, Saman and Xu, Jianjin and Wu, Chen Henry and H\"ane, Christian and Bazin, Jean-Charles and De la Torre, Fernando},
title = {PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {22778-22787}
}