In this project, I develop, train, and evaluate models for image captioning, inspired by BLIP's approach. The goal is to create a system that can generate descriptive and accurate captions for images. Additionally, I build a demo web app here to showcase these models in action, providing an interactive platform for users to experience the capabilities of AI-driven image captioning firsthand.
The Flickr30k dataset is divided into training and testing sets with a 70/30 split.
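For reference, here is a minimal sketch of how such a 70/30 split can be reproduced with the Hugging Face `datasets` library (the Hub dataset ID and split name below are assumptions; the repo's data module may load Flickr30k differently):

```python
from datasets import load_dataset

# Hub ID and split name are assumptions; the project's own data module
# may instead read the Flickr30k images and captions from a local folder.
dataset = load_dataset("nlphuji/flickr30k", split="test")

# Reproduce a 70/30 train/test split with a fixed seed for repeatability.
splits = dataset.train_test_split(test_size=0.3, seed=42)
train_set, test_set = splits["train"], splits["test"]
print(f"train: {len(train_set)} images, test: {len(test_set)} images")
```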
| Model | Test WER (%) | Test BLEU@4 (%) | Train WER (%) | Train BLEU@4 (%) | Config | Checkpoint | Report | Paper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BLIP Base | 59.15 | 14.11 | 55.61 | 16.11 | Config | HuggingFace | Wandb | arXiv |
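As a rough illustration of how such metrics can be computed with the Hugging Face `evaluate` library (a sketch only, not necessarily the exact evaluation code used in this repo):

```python
import evaluate

# Hypothetical captions for illustration; these are not the model outputs
# behind the numbers in the table above.
predictions = ["a dog runs across a grassy field"]
references = [["a brown dog is running through the grass"]]

wer = evaluate.load("wer")
bleu = evaluate.load("bleu")

# WER compares each prediction against a single reference string.
wer_score = wer.compute(predictions=predictions, references=[r[0] for r in references])
# BLEU@4 uses n-gram precision up to 4-grams and supports multiple references.
bleu_score = bleu.compute(predictions=predictions, references=references, max_order=4)

print(f"WER: {100 * wer_score:.2f}  BLEU@4: {100 * bleu_score['bleu']:.2f}")
```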
You can use this notebook (Colab) or this demo on HuggingFace for inference. You can also run the Streamlit demo offline with the following command from the root directory:
streamlit run src/app.py
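For programmatic inference, here is a minimal sketch using the `transformers` library (shown with the public `Salesforce/blip-image-captioning-base` weights; the fine-tuned checkpoint from the table above can be loaded the same way):

```python
import requests
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

# Public BLIP base weights; swap in the fine-tuned checkpoint ID if desired.
model_id = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)

# Caption an example image fetched from the web.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```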
# clone project
git clone https://github.com/tanthinhdt/imcap
cd imcap
# [OPTIONAL] create conda environment
conda create -n imcap python=3.11.10
conda activate imcap
# install pytorch according to instructions
# https://pytorch.org/get-started/
# install requirements
pip install -r requirements.txt
# clone project
git clone https://github.com/tanthinhdt/imcap
cd imcap
# create conda environment and install dependencies
conda env create -f environment.yaml -n imcap
# activate conda environment
conda activate imcap
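After installing with either method, you can quickly verify that PyTorch is set up and whether a GPU is visible:

```python
import torch

# Optional sanity check: print the installed version and GPU availability.
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```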
Train the model with the default configuration:
# train on CPU
python src/train.py trainer=cpu
# train on GPU
python src/train.py trainer=gpu
Train the model with an experiment configuration of your choice from configs/experiment/:
python src/train.py experiment=experiment_name.yaml
You can override any parameter from the command line like this:
python src/train.py trainer.max_epochs=20 data.batch_size=64
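These overrides are handled by Hydra, which composes the configs under `configs/` and merges command-line overrides before training starts. A minimal sketch of such an entry point (the config path and name here are assumptions; see `src/train.py` for the actual implementation):

```python
import hydra
from omegaconf import DictConfig, OmegaConf

# Config path and name are assumptions; the real entry point lives in src/train.py.
@hydra.main(version_base="1.3", config_path="../configs", config_name="train")
def main(cfg: DictConfig) -> None:
    # Overrides such as trainer.max_epochs=20 are already merged into cfg here.
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    main()
```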