Image-Text Relation Classification in Tweets

Environment

python
numpy
Pillow
scikit-learn
torch
torchvision
tqdm
transformers
flair (for LSTM+GoogLeNet baseline)

Install the dependencies with: pip install -r requirements.txt

Data

Please refer to this repository for the text-image relationship dataset and this repository for the Twitter100k dataset.

controversial_samples.txt contains the ids of samples with controversial labels and is used in statistic_relabel.py. Each line in this file consists of a head and a list of ids, e.g. "01->11: 3997, 4067, 4299". The head gives the labels before and after relabeling. In this example, "01" is the original label and stands for "text is not represented & image adds", while "11" is our corrected label and stands for "text is represented & image adds".
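As an illustration of the line format described above, here is a minimal parsing sketch. The function names parse_relabel_line and build_relabel_map are hypothetical and are not taken from statistic_relabel.py. The sketch also assumes that every line follows the "old->new: id, id, ..." pattern shown in the example and that ids are integers.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Matches a head such as "01->11" followed by a colon and a comma-separated id list.
# Assumption: labels are always two digits, as in the README example.
LINE_RE = re.compile(r"^\s*(\d{2})\s*->\s*(\d{2})\s*:\s*(.*)$")


def parse_relabel_line(line: str) -> Tuple[str, str, List[int]]:
    """Split one line into (original label, corrected label, sample ids)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    old_label, new_label, id_part = match.groups()
    ids = [int(token) for token in id_part.split(",") if token.strip()]
    return old_label, new_label, ids


def build_relabel_map(lines: Iterable[str]) -> Dict[int, Tuple[str, str]]:
    """Map each sample id to its (original label, corrected label) pair."""
    mapping: Dict[int, Tuple[str, str]] = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        old_label, new_label, ids = parse_relabel_line(line)
        for sample_id in ids:
            mapping[sample_id] = (old_label, new_label)
    return mapping
```

For instance, parse_relabel_line("01->11: 3997, 4067, 4299") yields the two labels "01" and "11" together with the three ids. To process the whole file, pass an open file handle for controversial_samples.txt to build_relabel_map.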

Pretrained Models/Embeddings

Download pretrained BERT-Base from here and put it in this directory.

Download pretrained ResNet-101 from here, rename the binary file to "resnet101.pth" and put it in this directory.

Download pretrained Twitter Word Embedding from here and put it in this directory.

Usage

Training & Testing

Run clustering.py for clustering-based baselines.
Run supervised.py for supervised baselines.
Run unsupervised.py for our ITRp method.

Analysis

Run statistic.py to obtain the average F1 score of each task on the raw/removed/relabeled test set.
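To make the per-task scoring concrete, here is a minimal sketch of how an F1 score could be computed for each task from two-digit labels such as "01" or "11". It is not the code in statistic.py. It assumes that the first digit encodes the text task ("text is represented") and the second digit the image task ("image adds"), that "1" is the positive class, and that the two task scores are averaged with equal weight. The actual script may use a different averaging scheme (for example, a weighted F1 from scikit-learn). The names binary_f1 and per_task_f1 are hypothetical.

```python
from typing import Dict, Sequence


def binary_f1(y_true: Sequence[str], y_pred: Sequence[str], positive: str = "1") -> float:
    """F1 score of the positive class for one binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_task_f1(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """Split two-digit labels into the text and image tasks and score each."""
    # Assumption: digit 0 is the text task, digit 1 is the image task.
    text_f1 = binary_f1([g[0] for g in gold], [p[0] for p in pred])
    image_f1 = binary_f1([g[1] for g in gold], [p[1] for p in pred])
    return {
        "text": text_f1,
        "image": image_f1,
        "average": (text_f1 + image_f1) / 2,
    }
```

To compare the raw and relabeled settings, the same function can be called once with the original gold labels and once with the labels corrected according to controversial_samples.txt.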

SuYindu/ITRp

Folders and files

Latest commit

History

Repository files navigation

Image-Text Relation Classification in Tweets

Environment

Data

Pretrained Models/Embeddings

Usage

Training & Testing

Analysis

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages