CVND-Image-Captioning-Project

CNN-RNN image captioning project for Udacity's Computer Vision Nanodegree (CVND) program.

Instructions

git clone https://github.com/cocodataset/cocoapi.git

cd cocoapi/PythonAPI  
make  
cd ..

Download some specific data from here: http://cocodataset.org/#download (described below)

Under Annotations, download:
- 2014 Train/Val annotations [241MB] (extract captions_train2014.json and captions_val2014.json, and place at locations cocoapi/annotations/captions_train2014.json and cocoapi/annotations/captions_val2014.json, respectively)
- 2014 Testing Image info [1MB] (extract image_info_test2014.json and place at location cocoapi/annotations/image_info_test2014.json)
Under Images, download:
- 2014 Train images [83K/13GB] (extract the train2014 folder and place at location cocoapi/images/train2014/)
- 2014 Val images [41K/6GB] (extract the val2014 folder and place at location cocoapi/images/val2014/)
- 2014 Test images [41K/6GB] (extract the test2014 folder and place at location cocoapi/images/test2014/)

The project is structured as a series of Jupyter notebooks that are designed to be completed in sequential order (0_Dataset.ipynb, 1_Preliminaries.ipynb, 2_Training.ipynb, 3_Inference.ipynb).

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
images		images
0_Dataset.ipynb		0_Dataset.ipynb
1_Preliminaries.ipynb		1_Preliminaries.ipynb
2_Training.html		2_Training.html
2_Training.ipynb		2_Training.ipynb
3_Inference.html		3_Inference.html
3_Inference.ipynb		3_Inference.ipynb
4_Zip Your Project Files and Submit.ipynb		4_Zip Your Project Files and Submit.ipynb
LICENSE		LICENSE
README.md		README.md
data_loader.py		data_loader.py
filelist.txt		filelist.txt
model.py		model.py
project2.zip		project2.zip
training_log.txt		training_log.txt
vocab.pkl		vocab.pkl
vocabulary.py		vocabulary.py