Note: this repository is a work in progress.
This project focuses on transliterating names from English to Hindi using various sequence-to-sequence (seq2seq) models. The transliteration task involves converting English names into their equivalents in the Devanagari (Hindi) script.
A standard seq2seq model consists of two main components:
• Encoder: Processes the input sequence
• Decoder: Takes the encoder’s representation as input and generates the output sequence
- The decoder models the distribution:
$P(\mathbf{y} \mid \mathbf{x}) = P(y_1, y_2, \dots, y_N \mid \mathbf{x}) = \prod_{i=1}^{N} P(y_i \mid \mathbf{x}, y_{1:i-1})$
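The chain-rule factorization above can be illustrated with a tiny numeric sketch; the per-step probabilities below are made up for illustration, not produced by any trained model:

```python
import math

# Hypothetical per-step conditionals: P(y_1|x), P(y_2|x, y_1), P(y_3|x, y_1:2).
step_probs = [0.9, 0.8, 0.7]

# The probability of the whole output sequence is the product of the
# per-step conditionals, exactly as in the factorization above.
p_sequence = math.prod(step_probs)

# In practice, implementations sum log-probabilities instead of
# multiplying raw probabilities, for numerical stability.
log_p = sum(math.log(p) for p in step_probs)

print(round(p_sequence, 4))       # 0.504
print(round(math.exp(log_p), 4))  # 0.504
```

The two quantities agree; the log-space form simply avoids underflow when sequences are long.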
• Encoder: Identical to the encoder in the standard seq2seq model
• Attention Mechanism: At each decoding step, it computes attention weights over the encoder hidden states
- The attention weights are then applied to the encoder hidden states, producing a context vector $c_j$ for each decoding step. This context vector is a weighted sum of encoder hidden states and dynamically focuses on different parts of the input sequence depending on the output being generated.
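The weighted-sum computation of $c_j$ can be sketched with NumPy. The shapes, the random states, and the dot-product scoring function below are all illustrative assumptions (dot-product attention is one of several common scoring choices):

```python
import numpy as np

# Hypothetical shapes: 4 encoder time steps, hidden size 3.
rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((4, 3))  # h_1 .. h_4
decoder_state = rng.standard_normal(3)        # decoder state at step j

# Dot-product attention scores, one per encoder state.
scores = encoder_states @ decoder_state       # shape (4,)

# Softmax turns scores into attention weights that sum to 1.
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector c_j: weighted sum of encoder hidden states.
c_j = weights @ encoder_states                # shape (3,)

print(c_j.shape)  # (3,)
```

A new set of weights, and hence a new $c_j$, is computed at every decoding step, which is what lets the decoder attend to different input characters as it emits each output character.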
• Decoder: Decoding proceeds step by step; at each step, the decoder combines the context vector with its hidden state and the previously generated tokens to produce the next output token
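The step-by-step generation loop can be sketched as a greedy decoder. The `next_token_distribution` function below is a hypothetical stand-in for a trained decoder step (here it deterministically spells out one target name), so the loop structure, not the scores, is the point:

```python
def next_token_distribution(prefix):
    # Toy stand-in for a trained decoder: emits the characters of a fixed
    # Devanagari target ("अनिल", a transliteration of "Anil") then <eos>.
    target = ["अ", "न", "ि", "ल", "<eos>"]
    next_tok = target[len(prefix)] if len(prefix) < len(target) else "<eos>"
    return {next_tok: 1.0}

def greedy_decode(max_len=10):
    output = []
    for _ in range(max_len):                 # one token per decoding step
        dist = next_token_distribution(output)
        token = max(dist, key=dist.get)      # pick the most probable token
        if token == "<eos>":                 # stop at end-of-sequence
            break
        output.append(token)
    return "".join(output)

print(greedy_decode())  # अनिल
```

Greedy decoding picks the argmax at every step; beam search, which keeps several candidate prefixes alive, is a common drop-in replacement for better outputs.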