The implementation of our paper: Bilinear Representation for Language-Based Image Editing using Conditional Generative Adversarial Networks (ICASSP2019)

vtddggg/BilinearGAN_for_LBIE

BilinearGAN_for_LBIE

Implementation of the paper "Bilinear Representation for Language-Based Image Editing using Conditional Generative Adversarial Networks" (ICASSP 2019).

Results

Requirements

  • Python 2
  • PyTorch 0.3.1
  • Torchvision
  • FastText
  • NLTK
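
Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch and not part of the repository: `missing_modules` is a hypothetical helper, and the import names in `REQUIRED` are assumptions. In particular, the fastText Python bindings have been importable as either `fastText` or `fasttext` depending on the installed version, so both are tried.

```python
import importlib


def missing_modules(candidates):
    """Return the requirement names for which no candidate module imports.

    `candidates` maps a requirement name to a list of possible import names.
    """
    missing = []
    for requirement, names in sorted(candidates.items()):
        for name in names:
            try:
                importlib.import_module(name)
                break  # one importable name is enough
            except ImportError:
                continue
        else:
            # No candidate name could be imported.
            missing.append(requirement)
    return missing


# Assumed import names for the requirements listed above.
REQUIRED = {
    "PyTorch": ["torch"],
    "Torchvision": ["torchvision"],
    "FastText": ["fastText", "fasttext"],
    "NLTK": ["nltk"],
}
```

Calling `missing_modules(REQUIRED)` returns an empty list when every requirement can be imported, and the names of the missing ones otherwise.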

FastText Install

 $ git clone https://github.com/facebookresearch/fastText.git
 $ cd fastText
 $ pip install .

Download the pretrained English word vectors, unzip the archive, and move wiki.en.bin to fasttext_models/.

Datasets download

  • Oxford-102 flowers: images and captions
  • Caltech-200 birds: images and captions
  • Fashion Synthesis: download language_original.mat, ind.mat and G2.zip from here

Move all the downloaded files into datasets/ and extract them.
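
A quick check of the directory layout can catch a misplaced download before a long training run. The sketch below is not part of the repository; `missing_paths` is a hypothetical helper, and the path list is taken from the arguments of the training commands in this README. Treat the list as an assumption about the extracted layout: `datasets/FashionGAN_txt`, for example, may only exist after `process_fashion_data.py` has been run.

```python
import os

# Paths referenced by the training commands in this README (assumed layout).
EXPECTED_PATHS = [
    "fasttext_models/wiki.en.bin",
    "datasets/flowers_icml",
    "datasets/CUB_200_2011/images",
    "datasets/cub_icml",
    "datasets/FashionGAN_txt",
]


def missing_paths(root, relative_paths=EXPECTED_PATHS):
    """Return the entries of relative_paths that do not exist under root."""
    return [p for p in relative_paths
            if not os.path.exists(os.path.join(root, p))]
```

Running `missing_paths(".")` from the repository root lists whatever is still absent; an empty list means every path used by the commands was found.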

Train

Oxford-102 flowers

Stage1: train visual-semantic embedding model.

python2 train_text_embedding.py \
    --img_root ./datasets \
    --caption_root ./datasets/flowers_icml \
    --trainclasses_file trainvalclasses.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --save_filename ./models/text_embedding_flowers.pth

Stage2: train BilinearGAN for Language-Based Image Editing (LBIE).

python2 train.py \
    --img_root ./datasets \
    --caption_root ./datasets/flowers_icml \
    --trainclasses_file trainvalclasses.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --text_embedding_model ./models/text_embedding_flowers.pth \
    --save_filename ./models/flowers_res_lowrank_64.pth \
    --use_vgg \
    --fusing_method lowrank_BP

Caltech-200 birds

Stage1: train visual-semantic embedding model.

python2 train_text_embedding.py \
    --img_root ./datasets/CUB_200_2011/images \
    --caption_root ./datasets/cub_icml \
    --trainclasses_file trainvalclasses.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --save_filename ./models/text_embedding_birds.pth

Stage2: train BilinearGAN for Language-Based Image Editing (LBIE).

python2 train.py \
    --img_root ./datasets/CUB_200_2011/images \
    --caption_root ./datasets/cub_icml \
    --trainclasses_file trainvalclasses.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --text_embedding_model ./models/text_embedding_birds.pth \
    --save_filename ./models/birds_res_lowrank_64.pth \
    --use_vgg \
    --fusing_method lowrank_BP

Fashion Synthesis

Stage1: preprocess the training data by running python2 process_fashion_data.py.

Stage2: train visual-semantic embedding model.

python2 train_text_embedding.py \
    --img_root ./datasets \
    --caption_root ./datasets/FashionGAN_txt \
    --trainclasses_file trainclasses.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --save_filename ./models/text_embedding_fashion.pth

Stage3: train BilinearGAN for Language-Based Image Editing (LBIE).

python2 train.py \
    --img_root ./datasets \
    --caption_root ./datasets/FashionGAN_txt \
    --trainclasses_file trainclasses.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --text_embedding_model ./models/text_embedding_fashion.pth \
    --save_filename ./models/fashion_res_lowrank_64.pth \
    --use_vgg \
    --fusing_method lowrank_BP
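
The commands for the three datasets differ only in the image root, caption root, class file, and output names. As a convenience, they could be generated from a small table. This is a sketch under that assumption, not a script shipped with the repository; `DATASETS` and `training_commands` are hypothetical names, while the flags and paths are copied from the commands above.

```python
# Per-dataset arguments copied from the training commands above.
DATASETS = {
    "flowers": {
        "img_root": "./datasets",
        "caption_root": "./datasets/flowers_icml",
        "trainclasses_file": "trainvalclasses.txt",
    },
    "birds": {
        "img_root": "./datasets/CUB_200_2011/images",
        "caption_root": "./datasets/cub_icml",
        "trainclasses_file": "trainvalclasses.txt",
    },
    "fashion": {
        "img_root": "./datasets",
        "caption_root": "./datasets/FashionGAN_txt",
        "trainclasses_file": "trainclasses.txt",
    },
}

FASTTEXT_MODEL = "./fasttext_models/wiki.en.bin"


def training_commands(name):
    """Return (embedding_stage, gan_stage) argument lists for one dataset."""
    cfg = DATASETS[name]
    embedding_path = "./models/text_embedding_{}.pth".format(name)
    # Arguments shared by both training scripts.
    shared = [
        "--img_root", cfg["img_root"],
        "--caption_root", cfg["caption_root"],
        "--trainclasses_file", cfg["trainclasses_file"],
        "--fasttext_model", FASTTEXT_MODEL,
    ]
    embedding_stage = (["python2", "train_text_embedding.py"] + shared
                       + ["--save_filename", embedding_path])
    gan_stage = ["python2", "train.py"] + shared + [
        "--text_embedding_model", embedding_path,
        "--save_filename", "./models/{}_res_lowrank_64.pth".format(name),
        "--use_vgg",
        "--fusing_method", "lowrank_BP",
    ]
    return embedding_stage, gan_stage
```

Each returned list can be passed to `subprocess.check_call` in order. For the fashion dataset, the preprocessing step still has to be run first.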

Other fusing methods

You can set --fusing_method to train the model with a different fusing method: lowrank_BP, FiLM, or concat (the default).
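
The three options correspond to common ways of combining an image feature vector with a text embedding. The sketch below illustrates the general idea of each on plain Python lists. It is not the repository's implementation: the function names, weight matrices, and dimensions are all hypothetical, nonlinearities are omitted, and the real model presumably applies learned layers to PyTorch tensors.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def fuse_concat(image_feat, text_feat):
    """Concatenation: stack both vectors and leave the interaction
    to whatever layers come next."""
    return list(image_feat) + list(text_feat)


def fuse_film(image_feat, text_feat, w_gamma, w_beta):
    """FiLM: the text predicts a per-channel scale (gamma) and shift (beta)
    that modulate the image features."""
    gamma = matvec(w_gamma, text_feat)
    beta = matvec(w_beta, text_feat)
    return [g * x + b for g, x, b in zip(gamma, image_feat, beta)]


def fuse_lowrank_bilinear(image_feat, text_feat, u, v, p):
    """Low-rank bilinear pooling: project both inputs into a shared space,
    multiply element-wise, then project to the output size."""
    joint = [a * b for a, b in zip(matvec(u, image_feat),
                                   matvec(v, text_feat))]
    return matvec(p, joint)


# Toy example with hand-picked (not learned) weights.
image_feat = [1.0, 2.0]
text_feat = [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]

concat_out = fuse_concat(image_feat, text_feat)
film_out = fuse_film(image_feat, text_feat, identity, identity)
bilinear_out = fuse_lowrank_bilinear(image_feat, text_feat,
                                     identity, identity, [[1.0, 1.0]])
```

Concatenation leaves the two inputs side by side, FiLM lets the text rescale and shift each image channel, and the bilinear variant multiplies projected image and text features so that every output depends on pairwise interactions between them.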

Test

  • Oxford-102 flowers

python2 test.py \
    --img_root ./test/flowers \
    --text_file ./test/text_flowers.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --text_embedding_model ./models/text_embedding_flowers.pth \
    --generator_model ./models/flowers_res_lowrank_64.pth \
    --output_root ./test/result_flowers \
    --use_vgg \
    --fusing_method lowrank_BP

  • Caltech-200 birds

python2 test.py \
    --img_root ./test/birds \
    --text_file ./test/text_birds.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --text_embedding_model ./models/text_embedding_birds.pth \
    --generator_model ./models/birds_res_lowrank_64.pth \
    --output_root ./test/result_birds \
    --use_vgg \
    --fusing_method lowrank_BP

  • Fashion Synthesis

python2 test.py \
    --img_root ./test/fashion \
    --text_file ./test/text_fashion.txt \
    --fasttext_model ./fasttext_models/wiki.en.bin \
    --text_embedding_model ./models/text_embedding_fashion.pth \
    --generator_model ./models/fashion_res_lowrank_64.pth \
    --output_root ./test/result_fashion \
    --use_vgg \
    --fusing_method lowrank_BP
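
The three test commands follow one naming pattern, so they can be derived from the dataset name alone. The helper below is a hypothetical convenience, not part of the repository; it only reproduces the flags and paths shown in the test commands of this section and assumes the models were trained with the lowrank_BP fusing method.

```python
def build_test_command(name):
    """Return the test.py argument list for 'flowers', 'birds' or 'fashion'."""
    return [
        "python2", "test.py",
        "--img_root", "./test/{}".format(name),
        "--text_file", "./test/text_{}.txt".format(name),
        "--fasttext_model", "./fasttext_models/wiki.en.bin",
        "--text_embedding_model", "./models/text_embedding_{}.pth".format(name),
        "--generator_model", "./models/{}_res_lowrank_64.pth".format(name),
        "--output_root", "./test/result_{}".format(name),
        "--use_vgg",
        "--fusing_method", "lowrank_BP",
    ]


# Build the commands for all three datasets.
ALL_TEST_COMMANDS = [build_test_command(n)
                     for n in ("flowers", "birds", "fashion")]
```

Each list in `ALL_TEST_COMMANDS` can be handed to `subprocess.check_call` once the corresponding models have been trained.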
