Image-Captioning-using-Attention-Mechanism

CourseWork Project, Mathematics for Machine Learning

Teamates

Adya Bhat
Aditya G H

It uses a encoder-decoder architecture to caption Images. I used a pretrained ResNet model as the encoder to extract image features, combined with an LSTM to generate captions using the extracted features.

Model Architecture

Dataset

Flickr 8K - It consists of a folder Images with images (.jpg format), and a a text file captions.txt. Each image has five captions each. containing the captions corresponding to the image file names.

The format of the text file is:

<image-filename>,<caption>\n

Results

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Image-Captioning-using-Attention-Mechanism

Files

README.md

Latest commit

History

README.md

File metadata and controls

Image-Captioning-using-Attention-Mechanism