
Wikipedia2Image

Student project for the lecture Deep Vision, summer term 2019.

Examples (left to right), with their conditioning descriptions: "he former president of Zimbabwe Africa young", "he Swiss businessman and actor Europe young", "she American congresswoman from 1967-1998 Canada middle", "he Japanese professor for socio-economics Asia middle".

Problem Description

This project aims at translating text, in the form of single-sentence human-written descriptions from Wikipedia articles, into high-resolution images using a conditional ProGAN (by akanimax, thanks!). We use a self-crawled dataset of (image, text) pairs from the Wikidata Query Service. The training data are only weakly correlated: only very salient features of the persons in the images (such as age, gender, or origin) are mentioned in the texts, while the rest of each text consists of noise that is irrelevant to the image and to a learning model.
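
The sketch below only illustrates the general idea of conditioning a generator on a sentence embedding of the description; it is not this project's actual architecture, and the class name, layer sizes, and embedding dimension are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not this project's actual architecture): a conditional GAN
# generator is typically conditioned on text by concatenating a latent noise
# vector with a sentence embedding of the description.
class ConditionalGenerator(nn.Module):
    def __init__(self, latent_dim=128, text_dim=4096, img_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim + text_dim, 512, 4, 1, 0),  # 1x1 -> 4x4
            nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(512, 256, 4, 2, 1),                    # 4x4 -> 8x8
            nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(256, img_channels, 4, 2, 1),           # 8x8 -> 16x16
            nn.Tanh(),
        )

    def forward(self, z, text_embedding):
        # Concatenate noise and text code, reshape to a 1x1 "image", and upsample.
        x = torch.cat([z, text_embedding], dim=1)[:, :, None, None]
        return self.net(x)

z = torch.randn(4, 128)             # latent noise
txt = torch.randn(4, 4096)          # e.g. an InferSent sentence embedding
print(ConditionalGenerator()(z, txt).shape)  # torch.Size([4, 3, 16, 16])
```

A progressive GAN such as the conditional ProGAN used here grows this generator (and the discriminator) to higher resolutions during training instead of using a fixed stack as in the sketch.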

Data Set

To obtain the data set, please change to the data/ directory. Within the data/ directory we provide the JSON file containing the entities we crawled from Wikipedia.
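
As an illustration of where such entities come from, here is a minimal, hypothetical sketch of querying the Wikidata Query Service for people that have both an image and an English description; the exact query and fields used to build this project's JSON file may differ.

```python
import requests

# Hypothetical sketch: ask the Wikidata Query Service for humans that have an
# image (P18) and an English description. The project's actual crawl query may differ.
SPARQL = """
SELECT ?person ?image ?description WHERE {
  ?person wdt:P31 wd:Q5 ;          # instance of: human
          wdt:P18 ?image ;         # image
          schema:description ?description .
  FILTER(LANG(?description) = "en")
}
LIMIT 100
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": SPARQL, "format": "json"},
    headers={"User-Agent": "Wikipedia2Image-example/0.1 (student project)"},
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["image"]["value"], "|", row["description"]["value"])
```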

How to run the code

Get the pretrained InferSent model and the GloVe word vectors:

!wget https://dl.fbaipublicfiles.com/infersent/infersent2.pkl
!wget http://nlp.stanford.edu/data/glove.840B.300d.zip && unzip glove.840B.300d.zip && rm glove.840B.300d.zip
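
After downloading, the weights and vectors can be loaded roughly as in the facebookresearch/InferSent usage example sketched below; this assumes the models module from that repository is available on the Python path and that the GloVe zip has been unpacked to glove.840B.300d.txt (as the command above does).

```python
import torch
from models import InferSent  # provided by the facebookresearch/InferSent repository

# Sketch: load the downloaded InferSent weights and point them at the GloVe vectors.
params_model = {'bsize': 64, 'word_emb_dim': 300, 'enc_lstm_dim': 2048,
                'pool_type': 'max', 'dpout_model': 0.0, 'version': 2}
infersent = InferSent(params_model)
infersent.load_state_dict(torch.load('infersent2.pkl'))
infersent.set_w2v_path('glove.840B.300d.txt')

# Build the vocabulary from the dataset's descriptions and encode them.
sentences = ['he former president of Zimbabwe Africa young']  # example description
infersent.build_vocab(sentences, tokenize=True)
embeddings = infersent.encode(sentences, tokenize=True)  # shape: (len(sentences), 4096)
```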

Requirements

See requirements.txt

Evaluation

For the evaluation we provide a Jupyter Notebook and pretrained weights within the Evaluator folder. For more details, have a look at the notebook.

Thanks to

The code is partially based on akanimax's ProGAN implementation.