forked from pytorch/examples
Apply Transformer model for the word language problem in pytorch/examples (pytorch#555)

* Use append to accelerate the data loading process.
* First transformer model working for the word language model.
* Work for GPU (all of the model and data have to be sent to cuda).
* Transformer model GPU activated: nhead=1, nlayers=1, d_ff=64, test loss 6.55.
* Use lr=5.0, test loss 4.8. Encoder/decoder embeddings normalized by sqrt(d_model): test loss 3.84 at lr=5.0, test loss 4.68 at lr=20.0. Remove print out. Revise main.py file. Load the best training model through epochs. Update README.md file to include the transformer model. Use PositionalEncoding in transformer: test loss 0.30 at lr=5.0.
* Update main.py to have a mask on source sequences. Update generate.py to generate text with the transformer.pt model. Add CUDA function to generate.py when running the transformer model. Add generate_subsequent_mask() in Transformer. Generate transformer model in main.py. Revise generate.py to work for both RNN and Transformer models. Remove decoder_data. Add some changes because of transformer.py.
* No need to provide Transformer args for generating text. Change d_ff to dim_feedforward. Move Embeddings and PositionalEncoder out of transformer.py.
* Replace tabs with spaces.
* Update transformer model in model.py.
* Recycle RNN arguments for the Transformer model.
* Update README.md file.
* Remove model.generator in main.py.
* Update the warnings in the transformer model.
* Fix a small bug in model.py.
* Remove keyword arguments for consistency.
* Create a new function generate_square_subsequent_mask inside the TransformerSeq2Seq.
* Remove unnecessary attributes.
* A minor change.
* Move src_mask and tgt_mask to be members of the module.
* Move transformer check to model.py.
* Move masks inside the forward function.
* Use TransformerEncoder for the word language model.
* Remove Generator module in Transformer.
* Merge RNN and Transformer models in model.py.
* Minor changes.
* Minor changes to address reviewer's comments.
* Remove reset_parameter function.
* Split RNN and Transformer models to keep the code readable.
* Minor changes.
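The recurring pieces in the change list — a square subsequent mask so each position only attends to earlier tokens, embeddings scaled by sqrt(d_model), PositionalEncoding, and a TransformerEncoder-based word language model — can be sketched as below. This is a minimal sketch against the current torch.nn API; the class name `TransformerLM` and the default hyperparameters are illustrative, not the commit's actual model.py.

```python
import math
import torch
import torch.nn as nn


def generate_square_subsequent_mask(sz):
    """Mask where position i may only attend to positions <= i.

    Masked entries are -inf so softmax gives them zero weight.
    """
    return torch.triu(torch.full((sz, sz), float('-inf')), diagonal=1)


class PositionalEncoding(nn.Module):
    """Add fixed sine/cosine position signals to the embeddings."""

    def __init__(self, d_model, max_len=5000):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # shape (max_len, 1, d_model) so it broadcasts over the batch dim
        self.register_buffer('pe', pe.unsqueeze(1))

    def forward(self, x):  # x: (seq_len, batch, d_model)
        return x + self.pe[:x.size(0)]


class TransformerLM(nn.Module):
    """Word language model built on nn.TransformerEncoder."""

    def __init__(self, ntoken, d_model=64, nhead=1, dim_feedforward=64, nlayers=1):
        super().__init__()
        self.d_model = d_model
        self.embed = nn.Embedding(ntoken, d_model)
        self.pos = PositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.decoder = nn.Linear(d_model, ntoken)  # project back to vocabulary

    def forward(self, src, src_mask):  # src: (seq_len, batch) token ids
        x = self.embed(src) * math.sqrt(self.d_model)  # scale by sqrt(d_model)
        x = self.pos(x)
        out = self.encoder(x, src_mask)
        return self.decoder(out)  # (seq_len, batch, ntoken) logits


model = TransformerLM(ntoken=100)
src = torch.randint(0, 100, (35, 4))          # sequence length 35, batch 4
mask = generate_square_subsequent_mask(35)
logits = model(src, mask)                      # (35, 4, 100)
```

The mask is what makes a plain encoder stack usable as a causal language model: without it, every position would see future tokens and the next-word prediction task would be trivial.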
1 parent d587b53 · commit 4581968
Showing 5 changed files with 170 additions and 41 deletions.