This repository contains the code and resources for the research project "Draw Me Like Your Triples: Leveraging Generative AI for the Completion of Wikidata". The project was conducted by Raia Abu Ahmad, Martin Critelli, Şefika Efeoğlu, Eleonora Mancini, Célian Ringwald and Xinyue Zhang under the supervision of Prof. Albert Merono.
Note: this repository is a work in progress.
Folder Structure:
.
├── dataset ---> Dataset folder
│   ├── image-data
│   │   ├── generated-data ---> Generated images of the fictional characters
│   │   └── ground-truth ---> Original images of the fictional characters
│   ├── prompt-data ---> Prompts generated from raw-data
│   ├── raw-data ---> Datasets created by querying Wikidata and DBpedia for fictional characters
│   └── unavailable_pic_ids ---> Fictional characters that have an image property in the Wikidata KB but whose images are inaccessible
├── scripts ---> Bash scripts
└── src
    ├── data-collection ---> Data collection code for DBpedia and Wikidata, and the ground-truth image downloader
    ├── evaluation ---> Evaluation metrics
    │   ├── evaluation-emotion
    │   └── evaluation-image-semantic
    ├── image-generator ---> Text (prompt) to image generator
    ├── notebooks ---> Visualization code
    ├── prompt-generator ---> Prompt generation code
    └── utils ---> Read, write, download, and data loader functions
The ground-truth image data is available on 🤗 (Hugging Face).
- Clone the repository to your local machine.
$ git clone https://github.com/helemanc/gryffindor.git
- Install the requirements (the requirements list is still a work in progress).
$ pip install -r requirements.txt
- Wikidata Dataset Creation
$ cd src/data-collection
$ python wiki_query_service.py
- DBpedia Abstract Collection
$ cd src/data-collection
$ python get_dbpedia_abstracts.py
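As a rough illustration of what the Wikidata collection step involves, the sketch below builds a SPARQL query for fictional characters that have an image and parses the JSON response. It is not the code in wiki_query_service.py. The function names (`build_query`, `parse_bindings`, `fetch_characters`) and the User-Agent string are hypothetical, and the query assumes the standard Wikidata identifiers Q95074 (fictional character), P31 (instance of) and P18 (image).

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=10):
    """Return a SPARQL query for fictional characters (Q95074) with an image (P18)."""
    return (
        "SELECT ?character ?characterLabel ?image WHERE {\n"
        "  ?character wdt:P31 wd:Q95074 ;\n"
        "             wdt:P18 ?image .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def parse_bindings(payload):
    """Flatten the SPARQL JSON result format into a list of plain dicts."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({
            # The entity URI ends with the Wikidata ID, e.g. .../entity/Q123.
            "id": binding["character"]["value"].rsplit("/", 1)[-1],
            "label": binding.get("characterLabel", {}).get("value", ""),
            "image": binding["image"]["value"],
        })
    return rows


def fetch_characters(limit=10):
    """Run the query against the live endpoint (requires network access)."""
    params = urllib.parse.urlencode({"query": build_query(limit), "format": "json"})
    request = urllib.request.Request(
        WIKIDATA_ENDPOINT + "?" + params,
        headers={"User-Agent": "readme-example/0.1"},  # hypothetical agent string
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_characters(5)` would return up to five records with `id`, `label` and `image` keys. The actual script may select different properties or filter the results differently.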
- Prompt Generation
First download the file best_model.ckpt.tar.gz here, and put it in the folder src/prompt-generator/graph2text/outputs/t5-base_13881.
Then extract the archive with the following commands:
$ cd src/prompt-generator/graph2text/outputs/t5-base_13881
$ tar -xzvf best_model.ckpt.tar.gz
Prepare the second model by joining its split archive parts and extracting the result:
$ cd best_tfmr
$ cat pytorch_model.bin.tar.gz.parta* > pytorch_model.bin.tar.gz
$ tar -xzvf pytorch_model.bin.tar.gz
Finally, the prompt generator is ready to run:
$ cd PATH2THISREPO/src/prompt-generator/
$ python generatePrompt.py
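On systems without `cat` and `tar`, the archive preparation above can be approximated in Python. This is a minimal sketch using only the standard library. The helper names `join_parts` and `extract_archive` are hypothetical and not part of this repository.

```python
import glob
import shutil
import tarfile


def join_parts(prefix, target):
    """Concatenate all files starting with `prefix` (in sorted order) into `target`.

    Mirrors `cat pytorch_model.bin.tar.gz.parta* > pytorch_model.bin.tar.gz`.
    """
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    if not parts:
        raise FileNotFoundError(f"no files match {prefix}*")
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part so large model files are not held in memory.
                shutil.copyfileobj(src, out)
    return parts


def extract_archive(archive, dest="."):
    """Extract a gzip-compressed tar archive, like `tar -xzvf archive`.

    Only use this on archives from a trusted source.
    """
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
```

For example, running `join_parts("pytorch_model.bin.tar.gz.parta", "pytorch_model.bin.tar.gz")` followed by `extract_archive("pytorch_model.bin.tar.gz")` inside best_tfmr would correspond to the two shell commands above.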
- Ground Truth Image Download
$ cd src/data-collection
$ python ground_truth_image_downloader.py
- Image Generation
$ cd src/image-generator
$ python generator.py
- Image Semantic Evaluation
$ cd src/evaluation/evaluation-image-semantic
$ python evaluation_uqi.py
- Emotion Evaluation
$ cd src/evaluation/evaluation-emotion
$ python emotion_prediction_prompt.py
- Analysis of the evaluation results
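The script name evaluation_uqi.py suggests that the image semantic evaluation relies on the Universal Quality Index (UQI). As a reading aid, here is a minimal sketch of the global form of that index, Q = 4·σxy·x̄·ȳ / ((σx² + σy²)(x̄² + ȳ²)), computed over two flat lists of pixel values. It is an assumption that the script uses this metric in this form. Library implementations usually average the index over sliding windows, so their values can differ from this single-window version. The function name `uqi` is hypothetical.

```python
def uqi(x, y):
    """Global Universal Quality Index of two equally sized pixel sequences.

    Returns a value in [-1, 1]; 1 means the two inputs are identical.
    """
    if len(x) != len(y):
        raise ValueError("inputs must have the same number of pixels")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two pixels")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Unbiased sample variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

    denominator = (var_x + var_y) * (mean_x ** 2 + mean_y ** 2)
    if denominator == 0:
        # Happens for constant or all-zero images, where the index is undefined.
        raise ValueError("UQI is undefined for these inputs")
    return 4 * cov_xy * mean_x * mean_y / denominator
```

A ground-truth image compared with itself scores 1, while a generated image that inverts the intensities of the original scores below 0.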
For any inquiries or further information regarding this research project, please feel free to reach out to NAME (EMAIL).
We appreciate your interest in our work and hope that this repository proves useful to the research community.