Mind the Gap - Official PyTorch Implementation

Peihao Zhu, Rameen Abdal, John Femiani, Peter Wonka

arXiv Open In Colab ICLR Project Page

Abstract: We present a new method for one-shot domain adaptation. The input to our method is a trained GAN that can produce images in domain A and a single reference image I_B from domain B. The proposed algorithm can translate any output of the trained GAN from domain A to domain B. There are two main advantages of our method compared to the current state of the art: First, our solution achieves higher visual quality, e.g. by noticeably reducing overfitting. Second, our solution allows for more degrees of freedom to control the domain gap, i.e. what aspects of image I_B are used to define the domain B. Technically, we realize the new method by building on a pre-trained StyleGAN generator as GAN and a pre-trained CLIP model for representing the domain gap. We propose several new regularizers for controlling the domain gap to optimize the weights of the pre-trained StyleGAN generator to output images in domain B instead of domain A. The regularizers prevent the optimization from taking on too many attributes of the single reference image. Our results show significant visual improvements over the state of the art as well as multiple applications that highlight improved control.

Description

Official PyTorch implementation of "Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks".

Google Colab

We set up a Colab Notebook so you can play with it yourself :) Everything to get started is in it!

Getting Started

Prerequisites

  • Linux or macOS
  • NVIDIA GPU + CUDA + cuDNN
  • Python 3

Installation

  • Clone the repository:
git clone https://github.com/ZPdesu/MindTheGap.git
cd MindTheGap
  • Dependencies: We recommend running this repository using Anaconda. All dependencies for defining the environment are provided in ./environment/environment.yml.
conda env create -f environment/environment.yml

Pretrained Models

If the automatic download doesn't work, please download the pre-trained models from Google Drive.

| Model | Description |
| --- | --- |
| FFHQ | StyleGAN model pretrained on FFHQ with 1024x1024 output resolution. |
| e4e_ffhq_encode | FFHQ e4e encoder. |
| titan_erwin | StyleGAN model finetuned on titan_erwin.png. |
| titan_armin | StyleGAN model finetuned on titan_armin.png. |
| titan_historia | StyleGAN model finetuned on titan_historia.png. |
| pocahontas | StyleGAN model finetuned on pocahontas.png. |
| moana | StyleGAN model finetuned on moana.png. |
| doc_brown | StyleGAN model finetuned on doc_brown.png. |
| brave | StyleGAN model finetuned on brave.png. |
| sketch | StyleGAN model finetuned on sketch.png. |
| jojo | StyleGAN model finetuned on jojo.png. |
| detroit | StyleGAN model finetuned on detroit.png. |
| picasso | StyleGAN model finetuned on picasso.png. |
| anastasia | StyleGAN model finetuned on anastasia.png. |
| room_girl | StyleGAN model finetuned on room_girl.png. |
| speed_paint | StyleGAN model finetuned on speed_paint.png. |
| digital_painting_jing | StyleGAN model finetuned on digital_painting_jing.png. |
| mermaid | StyleGAN model finetuned on mermaid.png. |
| zbrush_girl | StyleGAN model finetuned on zbrush_girl.png. |
| joker | StyleGAN model finetuned on joker.png. |

By default, we assume that all models are downloaded and saved to the directory pretrained_models.
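
Before running the scripts, it can help to confirm which checkpoints are actually present in that directory. The following is a minimal sketch and not part of the repository: the helper name `missing_models` and the `.pt` file extension are assumptions, so adjust `ext` to match the files you downloaded.

```python
from pathlib import Path

# Model names taken from the table above.
BASE_MODELS = ["FFHQ", "e4e_ffhq_encode"]


def missing_models(names, model_dir="pretrained_models", ext=".pt"):
    """Return the names that have no matching file in model_dir.

    The '.pt' extension is an assumption; change `ext` if your files differ.
    """
    root = Path(model_dir)
    return [name for name in names if not (root / f"{name}{ext}").is_file()]


if __name__ == "__main__":
    wanted = BASE_MODELS + ["titan_erwin"]
    for name in missing_models(wanted):
        print(f"missing: {name}")
```

Run it from the repository root; any name it prints still needs to be downloaded.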

Inference

Transfer the pretrained style onto a given image:

python inference.py --input_img Yui.jpg --style_img titan_erwin.png --embedding_method II2S

Put the unprocessed input image (e.g. Yui.jpg) in ./face_images/Unaligned. After the code runs, the aligned input image is saved in ./face_images/Aligned, and the corresponding latent code is saved in ./inversions/II2S. The output results are written to ./output/inference.
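
To check what a run produced, the directory layout described above can be walked with a small helper. This is a sketch under stated assumptions: `find_artifacts` is a hypothetical name, it assumes every saved file name starts with the stem of the input image (e.g. `Yui`), and it assumes the e4e latent codes live under `./inversions/e4e` by analogy with `./inversions/II2S`.

```python
from pathlib import Path


def find_artifacts(input_img, repo_root=".", embedding_method="II2S"):
    """Map each pipeline stage to the files whose name starts with the input stem."""
    stem = Path(input_img).stem
    root = Path(repo_root)
    stages = {
        "unaligned": root / "face_images" / "Unaligned",
        "aligned": root / "face_images" / "Aligned",
        # Assumption: the folder name matches the --embedding_method value.
        "latent": root / "inversions" / embedding_method,
        "output": root / "output" / "inference",
    }
    found = {}
    for stage, folder in stages.items():
        # Match on the stem only, because the saved file extensions are not documented here.
        if folder.is_dir():
            found[stage] = sorted(p.name for p in folder.glob(f"{stem}*"))
        else:
            found[stage] = []
    return found


if __name__ == "__main__":
    for stage, files in find_artifacts("Yui.jpg").items():
        print(stage, files)
```

An empty list for a stage means either that the step has not run yet or that the naming assumption does not hold for that folder.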

To speed up runtime, users can choose to use e4e embeddings at inference time.

python inference.py --input_img Yui.jpg --style_img titan_erwin.png --embedding_method e4e

Remark: Although using e4e can save inference time, its embedding results are sometimes very different from the input image.

Generation

Generate random face images using pretrained styles. Results are saved in the ./output/generate folder.

python generate.py --style_img titan_erwin.png --n_sample 5 --truc 0.5
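
This README does not spell out what `--truc` controls. Assuming it sets the truncation value of the usual StyleGAN truncation trick, the idea can be sketched as an interpolation between a sampled latent code and the average latent code. Plain Python lists stand in for tensors here, and `truncate` is a hypothetical helper rather than the repository's code.

```python
def truncate(w, w_mean, psi):
    """Pull latent w toward w_mean: psi=1 keeps w, psi=0 returns w_mean."""
    return [m + psi * (x - m) for x, m in zip(w, w_mean)]


w_mean = [0.0, 1.0]  # stand-in for the average latent code
w = [2.0, 3.0]       # stand-in for one sampled latent code
print(truncate(w, w_mean, 0.5))  # [1.0, 2.0]
```

Under that assumption, smaller values keep samples closer to the average face and reduce variety, while values near 1 keep more of the sampled variation.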

Train on your own style image

Put your own style image in the ./style_images/Unaligned folder and run

python train.py --style_img moana.jpg

The finetuned generator will be saved in the ./output/train folder. More training options can be found in ./options/MTG_options.py. For example, you can specify the loss weights and the number of training iterations:

python train.py --style_img moana.jpg --clip_across_lambda 1 --ref_clip_lambda 30 --l2_lambda 10 --lpips_lambda 10 --clip_within_lambda 0.5 --iter 600
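
When trying several loss-weight settings, it can be convenient to assemble the command line programmatically. The sketch below only builds and prints the argument list; `build_train_command` is a hypothetical helper, and it uses only the flags shown in the command above.

```python
import shlex
import sys


def build_train_command(style_img, **options):
    """Build an argv list for train.py from keyword options."""
    cmd = [sys.executable, "train.py", "--style_img", style_img]
    for flag, value in options.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


cmd = build_train_command(
    "moana.jpg",
    clip_across_lambda=1,
    ref_clip_lambda=30,
    l2_lambda=10,
    lpips_lambda=10,
    clip_within_lambda=0.5,
    iter=600,
)
print(shlex.join(cmd))
# To launch it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues if a style image name contains spaces.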

[Figure: results at iteration 0 and at iteration 600.]

BibTeX

@misc{zhu2021mind,
    title={Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks},
    author={Peihao Zhu and Rameen Abdal and John Femiani and Peter Wonka},
    year={2021},
    eprint={2110.08398},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Acknowledgments

This code borrows from StyleGAN2 by rosinality and from II2S. Some snippets of the Colab code are taken from StyleGAN-NADA and JoJoGAN.