Implementing a Text Transliteration system on the NEWS 2012 English-Hindi dataset using TensorFlow. Done as an assignment for the course 'Deep Learning: CS7015'.
- Python (used v2.7.12)
- TensorFlow (used v1.10.1)
- numpy (used v1.13.3)
- matplotlib (used v1.5.3)
- pandas (used v0.22.0)
The dataset used is the NEWS 2012 (Named Entities Workshop) shared task dataset, containing input words of varying lengths. The training set has 13122 datapoints, the validation set has 997, the test set (partial) has 400, and the test set (final) has 1000.
LSTM-based recurrent neural networks are used for both the encoder (over the input sequence) and the decoder (over the output sequence). The input sequence is encoded with TensorFlow's bidirectional_dynamic_rnn. For predicting the output sequence, a custom decoder with an attention mechanism is implemented using basic TensorFlow operations rather than TensorFlow's seq2seq module. The model is trained end-to-end with a cross-entropy loss.
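Below is a minimal sketch of this structure for TensorFlow 1.x: a bidirectional LSTM encoder, plus one step of an additive-attention decoder built from basic ops. Names such as `embedded_inputs`, `input_lengths`, `hidden_size`, and `attention_step` are illustrative placeholders, not the identifiers used in train.py, and the scoring function is standard Bahdanau-style attention, which may differ in detail from the report's equations.

```python
import tensorflow as tf

hidden_size = 256  # assumed size, for illustration only

# Bidirectional LSTM encoder over the embedded input characters.
fw_cell = tf.nn.rnn_cell.LSTMCell(hidden_size)
bw_cell = tf.nn.rnn_cell.LSTMCell(hidden_size)
embedded_inputs = tf.placeholder(tf.float32, [None, None, hidden_size])
input_lengths = tf.placeholder(tf.int32, [None])
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    fw_cell, bw_cell, embedded_inputs,
    sequence_length=input_lengths, dtype=tf.float32)
encoder_states = tf.concat([out_fw, out_bw], axis=-1)  # [batch, T, 2*hidden]

def attention_step(dec_state, encoder_states):
    """One decoder step of additive attention from basic ops.

    dec_state: [batch, hidden] decoder output at the current time step.
    In a real decoding loop, wrap this in a variable scope with reuse
    so all steps share the same attention parameters.
    """
    w_enc = tf.layers.dense(encoder_states, hidden_size, name='w_enc')
    w_dec = tf.layers.dense(dec_state, hidden_size, name='w_dec')
    # score_t = v^T tanh(W1 h_enc_t + W2 h_dec), one score per source position.
    scores = tf.squeeze(tf.layers.dense(
        tf.tanh(w_enc + tf.expand_dims(w_dec, 1)), 1, name='v'), axis=-1)
    alpha = tf.nn.softmax(scores)  # [batch, T] attention weights
    # Context vector: attention-weighted sum of encoder states.
    context = tf.reduce_sum(
        tf.expand_dims(alpha, -1) * encoder_states, axis=1)  # [batch, 2*hidden]
    return context, alpha

dec_state = tf.placeholder(tf.float32, [None, hidden_size])
context, alpha = attention_step(dec_state, encoder_states)
```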
Techniques such as early stopping, dropout, uni-/bi-directional encoders, and stacked/non-stacked decoders have been experimented with; a sketch of the first two is given below. The observations and conclusions of these experiments, along with specific hyperparameter details and equations, can be found in report.pdf.
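As a rough, assumed sketch (not the exact code in train.py) of two of these techniques: dropout is applied by wrapping the LSTM cell, and early stopping monitors validation loss with a patience window. `keep_prob`, `should_stop`, and `patience` are hypothetical names for illustration.

```python
import tensorflow as tf

# Dropout on the LSTM outputs; keep_prob defaults to 1.0 at inference time.
cell = tf.nn.rnn_cell.LSTMCell(256)
keep_prob = tf.placeholder_with_default(1.0, shape=[])
cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

def should_stop(val_losses, patience=5):
    """Early stopping: stop once the last `patience` epochs fail to improve
    on the best validation loss seen before them."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_before
```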
- train.py : Code to train and test the RNN model
- train_uni.py : Code with a unidirectional encoder
- create_vocab.py : Code to create and save the English and Hindi vocabularies
- plot_loss.py : Code to plot loss and accuracy plots
- attention_plots.py : Code to plot the attention weights on the given test set
- report.pdf : Detailed report with all experiments, plots and explanations
- run.sh : Command for running inference with the best hyperparameter configuration
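As a rough illustration of the kind of character vocabulary create_vocab.py's description suggests (the actual script may differ), a vocabulary can be built and saved like this; the special tokens and output path here are assumptions.

```python
import pickle

def build_vocab(words, specials=('<pad>', '<go>', '<eos>', '<unk>')):
    """Map each special token and each character seen in `words` to an id."""
    chars = sorted({c for w in words for c in w})
    return {tok: i for i, tok in enumerate(list(specials) + chars)}

if __name__ == '__main__':
    vocab = build_vocab(['ACORN', 'CAULFIELD'])
    with open('en_vocab.pkl', 'wb') as f:  # hypothetical output path
        pickle.dump(vocab, f)
```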
The attention weights obtained are shown below for different words in the test set. As the plots show, the implemented attention mechanism works well (almost perfectly!), even for relatively long sequences. Most of the attention plots have meaningful character alignments, with only a few characters having wrong alignments (such as 'NI' in the last plot). No non-contiguous alignments were observed. Many of the characters have perfect one-to-one or one-to-many alignments (see 'AU' in CAULFIELD or 'CO' in ACORN). Even when the one-to-many alignments are not perfect, the higher probability is almost always assigned to the correct English character(s).
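A minimal sketch of how such attention heatmaps can be plotted, in the spirit of attention_plots.py (the actual script may differ). `alpha` is assumed to be a [target_len, source_len] array of attention weights for a single word pair, and the function name and output path are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_attention(alpha, src_chars, tgt_chars, out_file='attention.png'):
    """Render attention weights as a heatmap with character tick labels."""
    fig, ax = plt.subplots()
    ax.imshow(alpha, cmap='gray', interpolation='nearest')
    ax.set_xticks(np.arange(len(src_chars)))
    ax.set_xticklabels(src_chars)
    ax.set_yticks(np.arange(len(tgt_chars)))
    ax.set_yticklabels(tgt_chars)
    ax.set_xlabel('Source (English)')
    ax.set_ylabel('Target (Hindi)')
    fig.savefig(out_file, bbox_inches='tight')

# Example with random weights, just to exercise the function.
plot_attention(np.random.rand(4, 5), list('ACORN'), ['t1', 't2', 't3', 't4'])
```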