```bash
pip install attention
```
Many-to-one attention mechanism for Keras.
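A minimal usage sketch: keep the LSTM's full sequence output (`return_sequences=True`) and let the attention layer reduce it to a single context vector. The `Attention` import and its `units` argument are assumptions about this package's API; the rest is plain Keras.

```python
# Minimal sketch: many-to-one attention on top of an LSTM, trained on random data.
import numpy as np
from tensorflow.keras.layers import Dense, Input, LSTM
from tensorflow.keras.models import Model
from attention import Attention  # assumed import; check the package for the exact API

seq_len, features = 20, 8
inputs = Input(shape=(seq_len, features))
x = LSTM(64, return_sequences=True)(inputs)  # keep the whole sequence for attention
x = Attention(units=32)(x)                   # many-to-one: sequence -> one context vector
outputs = Dense(1, activation='sigmoid')(x)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(np.random.rand(64, seq_len, features), np.random.randint(0, 2, 64), epochs=1)
```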
In this experiment, we demonstrate that using attention yields higher accuracy on the IMDB dataset. We compare two LSTM networks: one with this attention layer and one with a fully connected layer in its place. Both have the same number of parameters (250K) for a fair comparison.
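The comparison can be sketched as below. The layer sizes are illustrative and will not reproduce the exact 250K parameter count, and the `Attention` import and its `units` argument are assumptions about this package's API; everything else is standard Keras.

```python
# Sketch of the IMDB comparison: same backbone, with either the attention layer
# or a fully connected head. Layer sizes are illustrative, not the exact 250K setup.
from tensorflow.keras.datasets import imdb
from tensorflow.keras.layers import Dense, Embedding, Flatten, Input, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.sequence import pad_sequences
from attention import Attention  # assumed import from this package

max_features, maxlen = 20000, 200
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

def build_model(use_attention: bool) -> Model:
    inputs = Input(shape=(maxlen,))
    x = Embedding(max_features, 32)(inputs)
    x = LSTM(64, return_sequences=True)(x)
    if use_attention:
        x = Attention(units=32)(x)          # pools the sequence into one context vector
    else:
        x = Flatten()(x)                    # baseline: fully connected head instead
        x = Dense(32, activation='relu')(x)
    outputs = Dense(1, activation='sigmoid')(x)
    model = Model(inputs, outputs)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

for use_attention in (False, True):
    build_model(use_attention).fit(
        x_train, y_train,
        validation_data=(x_test, y_test),
        epochs=10, batch_size=128,
    )
```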
Here are the results of 10 runs. For every run, we train for 10 epochs and record the maximum accuracy on the test set.
| Measure | No Attention (250K params) | Attention (250K params) |
|---|---|---|
| MAX Accuracy (%) | 88.22 | 88.76 |
| AVG Accuracy (%) | 87.02 | 87.62 |
| STDDEV Accuracy (%) | 0.18 | 0.14 |
As expected, attention gives a boost in accuracy. It also reduces the variability across runs, which is a nice property.
Let's consider the task of adding two numbers that come right after some delimiters (0 in this case):
```
x = [1, 2, 3, 0, 4, 5, 6, 0, 7, 8]. Result is y = 4 + 7 = 11.
```
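One way to generate data for this toy task is sketched below; the repo's own generator may differ in details such as sequence length and value range.

```python
import numpy as np

def generate_example(seq_len=10, high=9):
    """Random sequence with two delimiters (0); the target is the sum of the
    values immediately following each delimiter."""
    x = np.random.randint(1, high + 1, size=seq_len).astype(float)
    # Pick two delimiter positions that are not adjacent and not at the end,
    # so each delimiter is followed by a regular (non-zero) value.
    while True:
        i, j = sorted(np.random.choice(seq_len - 1, size=2, replace=False))
        if j - i > 1:
            break
    x[i] = x[j] = 0.0
    y = x[i + 1] + x[j + 1]
    return x, y

x, y = generate_example()
print(x, y)  # e.g. [1. 2. 3. 0. 4. 5. 6. 0. 7. 8.] 11.0
```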
The attention is expected to be the highest after the delimiters. An overview of the training is shown below, where the top represents the attention map and the bottom the ground truth. As the training progresses, the model learns the task and the attention map converges to the ground truth.
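A model for this task can be sketched as follows, reusing `generate_example` from the snippet above. Hyperparameters are illustrative, and the `Attention` import is again an assumption about this package's API.

```python
# Sketch: regression model for the toy task, with attention pooling the sequence.
import numpy as np
from tensorflow.keras.layers import Dense, Input, LSTM
from tensorflow.keras.models import Model
from attention import Attention  # assumed import

def build_task_model(seq_len=10):
    inputs = Input(shape=(seq_len, 1))
    x = LSTM(64, return_sequences=True)(inputs)
    x = Attention(units=32)(x)   # expected to focus on the steps after the delimiters
    outputs = Dense(1)(x)        # regression: predict the sum of the two flagged values
    model = Model(inputs, outputs)
    model.compile(optimizer='adam', loss='mse')
    return model

# Build a training set from the generator defined above.
xs, ys = zip(*(generate_example() for _ in range(4096)))
x_train = np.array(xs)[..., None]  # shape (N, seq_len, 1)
y_train = np.array(ys)

build_task_model().fit(x_train, y_train, epochs=10, batch_size=64)
```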