Merge pull request #11 from Helsinki-NLP/fix/sanitizing

Sanitization / refactoring

shaoxiongji authored Sep 26, 2023
2 parents 470d466 + 1177ba8 commit 22fea93
Showing 152 changed files with 2,309 additions and 5,605 deletions.
2 changes: 1 addition & 1 deletion build_vocab.py
@@ -1,5 +1,5 @@
#!/usr/bin/env python
-from onmt.bin.build_vocab import main
+from mammoth.bin.build_vocab import main


if __name__ == "__main__":
2 changes: 1 addition & 1 deletion docs/source/CONTRIBUTING.md
@@ -5,7 +5,7 @@ OpenNMT-py is a community developed project and we love developer contributions.
## Guidelines
Before sending a PR, please do this checklist first:

-- Please run `onmt/tests/pull_request_chk.sh` and fix any errors. When adding new functionality, also add tests to this script. Included checks:
+- Please run `mammoth/tests/pull_request_chk.sh` and fix any errors. When adding new functionality, also add tests to this script. Included checks:
1. flake8 check for coding style;
2. unittest;
3. continuous integration tests listed in `.travis.yml`.
37 changes: 0 additions & 37 deletions docs/source/FAQ.md

This file was deleted.

11 changes: 5 additions & 6 deletions docs/source/attention_bridges.md
@@ -1,7 +1,7 @@

# Attention Bridge

-The embeddings are generated through the self-attention mechanism ([Attention Bridge](./onmt/attention_bridge.py)) of the encoder and establish a connection with language-specific decoders that focus their attention on these embeddings. This is why they are referred to as 'bridges'. This architectural element serves to link the encoded information with the decoding process, enhancing the flow of information between different stages of language processing.
+The embeddings are generated through the self-attention mechanism ([Attention Bridge](./mammoth/modules/attention_bridge.py)) of the encoder and establish a connection with language-specific decoders that focus their attention on these embeddings. This is why they are referred to as 'bridges'. This architectural element serves to link the encoded information with the decoding process, enhancing the flow of information between different stages of language processing.

There are five types of attention mechanism implemented:

@@ -61,7 +61,7 @@ The `PerceiverAttentionBridgeLayer` involves a multi-headed dot product self-att

3. **Linear Layer**: After normalization, the data is fed into a linear layer. This linear transformation can be seen as a learned projection of the attention-weighted data into a new space.

4. **ReLU Activation**: The output of the linear layer undergoes the Rectified Linear Unit (ReLU) activation function.

5. **Linear Layer (Second)**: Another linear layer is applied to the ReLU-activated output.
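
As a reading aid, here is a minimal PyTorch sketch of steps 3–5 (the feed-forward part of the layer). The names and sizes are hypothetical, chosen only for illustration; this is not the MAMMOTH implementation:

```python
import torch
import torch.nn as nn

d_model, d_ff = 512, 2048  # hypothetical dimensions

# Steps 3-5: linear projection -> ReLU -> second linear projection.
feed_forward = nn.Sequential(
    nn.Linear(d_model, d_ff),  # 3. learned projection into a new space
    nn.ReLU(),                 # 4. ReLU activation
    nn.Linear(d_ff, d_model),  # 5. second linear layer
)

x = torch.randn(10, d_model)   # e.g. 10 normalized attention outputs
out = feed_forward(x)          # shape: (10, d_model)
```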

@@ -72,11 +72,11 @@ The `PerceiverAttentionBridgeLayer` involves a multi-headed dot product self-att
The process described involves dot product self-attention. The steps are as follows:

1. **Input Transformation**: Given an input matrix $\mathbf{H} \in \mathbb{R}^{n \times d_h}$, two sets of learned weight matrices are used to transform the input. These weight matrices are $\mathbf{W}_1 \in \mathbb{R}^{d_h \times d_a}$ and $\mathbf{W}_2 \in \mathbb{R}^{d_h \times d_a}$. Multiplying $\mathbf{H}$ by $\mathbf{W}_1$ and $\mathbf{W}_2$ produces the matrices $\mathbf{V}$ and $\mathbf{K}$, respectively:

- $\mathbf{V} = \mathbf{H} \mathbf{W}_1$
- $\mathbf{K} = \mathbf{H} \mathbf{W}_2$

2. **Attention Calculation**: The core attention calculation involves three matrices: $\mathbf{Q} \in \mathbb{R}^{n \times d_a}$ and the previously computed $\mathbf{K}$ and $\mathbf{V}$. The dot product of $\mathbf{Q}$ and $\mathbf{K}^\top$ is divided by the square root of the dimensionality of the input features ($\sqrt{d_h}$).
The final attended output is calculated by multiplying the attention weights with the $\mathbf{V}$ matrix: $\mathbf{H}^\prime = \operatorname{Softmax}(\frac{\mathbf{Q}\mathbf{K}^\top}{\sqrt{d_h}})\mathbf{V}$
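
A minimal PyTorch sketch of this attention calculation, with hypothetical shapes (an illustration under the definitions above, not the MAMMOTH code):

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: n positions, hidden size d_h, attention size d_a.
n, d_h, d_a = 10, 512, 64

H = torch.randn(n, d_h)     # input matrix H
W1 = torch.randn(d_h, d_a)  # learned weights producing V
W2 = torch.randn(d_h, d_a)  # learned weights producing K
Q = torch.randn(n, d_a)     # query matrix

V = H @ W1                  # V = H W_1
K = H @ W2                  # K = H W_2

# H' = softmax(Q K^T / sqrt(d_h)) V
attn_weights = F.softmax(Q @ K.T / d_h ** 0.5, dim=-1)
H_prime = attn_weights @ V  # attended output, shape (n, d_a)
```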


@@ -86,5 +86,4 @@ The TransformerEncoderLayer employs multi-headed dot product self-attention (by

## FeedForwardAttentionBridgeLayer

The `FeedForwardAttentionBridgeLayer` module applies a sequence of linear transformations and `ReLU` activations to the input data, followed by an attention bridge normalization, enhancing the connectivity between different parts of the model.
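
For intuition, a minimal PyTorch sketch of such a layer; the class name, the number of linear layers, and the use of `LayerNorm` as the normalization are assumptions for illustration, not the MAMMOTH code:

```python
import torch.nn as nn

class FeedForwardBridgeSketch(nn.Module):
    """Hypothetical sketch: linear/ReLU stack followed by a normalization."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
            nn.ReLU(),
        )
        # Stand-in for the attention bridge normalization.
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        return self.norm(self.ff(x))
```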
10 changes: 5 additions & 5 deletions docs/source/index.rst
@@ -38,8 +38,8 @@ Contents
:caption: API
:maxdepth: 2

-onmt.rst
-onmt.modules.rst
-onmt.translation.rst
-onmt.translate.translation_server.rst
-onmt.inputters.rst
+mammoth.rst
+mammoth.modules.rst
+mammoth.translation.rst
+mammoth.translate.translation_server.rst
+mammoth.inputters.rst
20 changes: 20 additions & 0 deletions docs/source/mammoth.inputters.rst
@@ -0,0 +1,20 @@
Data Loaders
=================

Data Readers
-------------

.. autoexception:: mammoth.inputters.datareader_base.MissingDependencyException

.. autoclass:: mammoth.inputters.DataReaderBase
:members:

.. autoclass:: mammoth.inputters.TextDataReader
:members:


Dataset
--------

.. autoclass:: mammoth.inputters.Dataset
:members:
109 changes: 109 additions & 0 deletions docs/source/mammoth.modules.rst
@@ -0,0 +1,109 @@
Modules
=============

Core Modules
------------

.. autoclass:: mammoth.modules.Embeddings
:members:


Encoders
---------

.. autoclass:: mammoth.encoders.EncoderBase
:members:

.. autoclass:: mammoth.encoders.MeanEncoder
:members:

.. autoclass:: mammoth.encoders.RNNEncoder
:members:


Decoders
---------


.. autoclass:: mammoth.decoders.DecoderBase
:members:

.. autoclass:: mammoth.decoders.decoder.RNNDecoderBase
:members:

.. autoclass:: mammoth.decoders.StdRNNDecoder
:members:

.. autoclass:: mammoth.decoders.InputFeedRNNDecoder
:members:

Attention
----------

.. autoclass:: mammoth.modules.AverageAttention
:members:

.. autoclass:: mammoth.modules.GlobalAttention
:members:



Architecture: Transformer
----------------------------

.. autoclass:: mammoth.modules.PositionalEncoding
:members:

.. autoclass:: mammoth.modules.position_ffn.PositionwiseFeedForward
:members:

.. autoclass:: mammoth.encoders.TransformerEncoder
:members:

.. autoclass:: mammoth.decoders.TransformerDecoder
:members:

.. autoclass:: mammoth.modules.MultiHeadedAttention
:members:
:undoc-members:


Architecture: Conv2Conv
----------------------------

(These methods are from a user contribution
and have not been thoroughly tested.)


.. autoclass:: mammoth.encoders.CNNEncoder
:members:


.. autoclass:: mammoth.decoders.CNNDecoder
:members:

.. autoclass:: mammoth.modules.ConvMultiStepAttention
:members:

.. autoclass:: mammoth.modules.WeightNormConv2d
:members:

Architecture: SRU
----------------------------

.. autoclass:: mammoth.models.sru.SRU
:members:


Copy Attention
--------------

.. autoclass:: mammoth.modules.CopyGenerator
:members:


Structured Attention
-------------------------------------------

.. autoclass:: mammoth.modules.structured_attention.MatrixTree
:members:
32 changes: 32 additions & 0 deletions docs/source/mammoth.rst
@@ -0,0 +1,32 @@
Framework
=================

Model
-----

.. autoclass:: mammoth.models.NMTModel
:members:

Trainer
-------

.. autoclass:: mammoth.Trainer
:members:


.. autoclass:: mammoth.utils.Statistics
:members:

Loss
----


.. autoclass:: mammoth.utils.loss.LossComputeBase
:members:


Optimizer
---------

.. autoclass:: mammoth.utils.Optimizer
:members:
21 changes: 21 additions & 0 deletions docs/source/mammoth.translate.translation_server.rst
@@ -0,0 +1,21 @@
Server
======


Models
-------------

.. autoclass:: mammoth.translate.translation_server.ServerModel
:members:


Core Server
------------

.. autoexception:: mammoth.translate.translation_server.ServerModelError

.. autoclass:: mammoth.translate.translation_server.Timer
:members:

.. autoclass:: mammoth.translate.translation_server.TranslationServer
:members:
39 changes: 39 additions & 0 deletions docs/source/mammoth.translation.rst
@@ -0,0 +1,39 @@
Translation
==================

Translations
-------------

.. autoclass:: mammoth.translate.Translation
:members:

Translator Class
-----------------

.. autoclass:: mammoth.translate.Translator
:members:

.. autoclass:: mammoth.translate.TranslationBuilder
:members:


Decoding Strategies
--------------------
.. autoclass:: mammoth.translate.DecodeStrategy
:members:

.. autoclass:: mammoth.translate.BeamSearch
:members:

.. autofunction:: mammoth.translate.greedy_search.sample_with_temperature

.. autoclass:: mammoth.translate.GreedySearch
:members:

Scoring
--------
.. autoclass:: mammoth.translate.penalties.PenaltyBuilder
:members:

.. autoclass:: mammoth.translate.GNMTGlobalScorer
:members:
20 changes: 0 additions & 20 deletions docs/source/onmt.inputters.rst

This file was deleted.
