Neural-Machine-Translation

An implementation of NMT on an IWSLT dataset, using sequence-to-sequence (seq2seq) models in TensorFlow.

Dependencies

To train and evaluate the model, run the following in a Linux terminal:

$ python main_nmt.py

main_nmt.py : This file contains the main program. It runs training or evaluation of the model as directed.
data_preparation.py : This file contains code to extract data from the datasets used (see below) and to preprocess it.
parameters.py : This file defines a function that initialises all the parameters and hyperparameters used.
model_attention.py : This file defines a class, AttentionModel, that defines the architecture of the encoder-decoder NMT model.
train_nmt.py : This file creates train and test/validation instances, then trains and evaluates the model. Checkpoints are created periodically, and only the latest five are kept in the specified output directory at any time. The model is loaded from the latest checkpoint present or, if none is available, a new model is created.
basic_functions.py : This file defines functions for different parts of the neural network model architecture and for evaluation.
additional_functions.py : This file defines functions to load data and to format the output.
calculate_bleu_score.py : This file calculates the BLEU-4 score given two files to be compared. Usage: python calculate_bleu_score.py /path/to/reference_file /path/to/predicted_file
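
For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level BLEU-4 score can be computed. It is not the code in calculate_bleu_score.py; the function names (ngram_counts, corpus_bleu4), the whitespace tokenisation, the single reference per sentence, and the absence of smoothing are all assumptions made for illustration, so scores may differ from those the script reports.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(references, hypotheses):
    """Corpus-level BLEU-4 with one reference per hypothesis and no smoothing.

    Both arguments are lists of token lists, aligned by index.
    Returns a score in [0, 1].
    """
    matches = [0] * 4   # clipped n-gram matches, one slot per order 1..4
    totals = [0] * 4    # hypothesis n-gram counts, one slot per order 1..4
    ref_len = hyp_len = 0

    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, 5):
            ref_ngrams = ngram_counts(ref, n)
            hyp_ngrams = ngram_counts(hyp, n)
            # Counter intersection clips each hypothesis n-gram count
            # by its count in the reference.
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += sum(hyp_ngrams.values())

    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the four modified precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / 4
    # The brevity penalty lowers the score when the output is shorter
    # than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Hypothetical in-memory example; real use would read one sentence
    # per line from the reference and predicted files.
    refs = ["the cat sat on the mat".split()]
    hyps = ["the cat sat on the mat".split()]
    print(corpus_bleu4(refs, hyps))  # identical sentences give 1.0
```

Because no smoothing is applied, a corpus with no matching 4-gram scores exactly zero; that is reasonable for a full test set but harsh on very small samples.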

The dataset used is the English-Vietnamese parallel corpus of TED Talks, provided by the IWSLT Evaluation Campaign. Preprocessed data from the Stanford NLP Group was used to train and test the models.
