Merge pull request #9 from rafajak/fix_readme_dropout_layers
Fix hyperparameter naming in readme, add learning resources
shiffman authored Nov 8, 2019
2 parents d2692d0 + 52fe2ad commit 44f4cb0
Showing 1 changed file (README.md) with 28 additions and 24 deletions.
@@ -4,7 +4,9 @@ Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models

Based on [char-rnn-tensorflow](https://github.com/sherjilozair/char-rnn-tensorflow).

- [Here](https://www.youtube.com/watch?v=xfuVcfwtEyw) is a video to help you get started with training charRNN with [Spell](https://www.spell.run/)
+ - **[Blog post describing how to train and use an LSTM network in ml5.js](https://blog.paperspace.com/training-an-lstm-and-using-the-model-in-ml5-js/)**.
+ - **[Video showing how to train an LSTM network using Spell and ml5.js](https://youtu.be/xfuVcfwtEyw)** to generate text in the style of a particular author.


## Requirements

@@ -78,27 +80,29 @@ That's it!

Given the size of the training dataset, here are some hyperparameters that might work (an example training command follows the lists below):

- * 2 MB:
-   - rnn_size 256 (or 128)
-   - layers 2
-   - seq_length 64
-   - batch_size 32
-   - dropout 0.25
- * 5-8 MB:
-   - rnn_size 512
-   - layers 2 (or 3)
-   - seq_length 128
-   - batch_size 64
-   - dropout 0.25
- * 10-20 MB:
-   - rnn_size 1024
-   - layers 2 (or 3)
-   - seq_length 128 (or 256)
-   - batch_size 128
-   - dropout 0.25
- * 25+ MB:
-   - rnn_size 2048
-   - layers 2 (or 3)
-   - seq_length 256 (or 128)
-   - batch_size 128
+ * 2 MB:
+   - rnn_size 256 (or 128)
+   - num_layers 2
+   - seq_length 64
+   - batch_size 32
+   - output_keep_prob 0.75
+ * 5-8 MB:
+   - rnn_size 512
+   - num_layers 2 (or 3)
+   - seq_length 128
+   - batch_size 64
+   - output_keep_prob 0.75
+ * 10-20 MB:
+   - rnn_size 1024
+   - num_layers 2 (or 3)
+   - seq_length 128 (or 256)
+   - batch_size 128
+   - output_keep_prob 0.75
+ * 25+ MB:
+   - rnn_size 2048
+   - num_layers 2 (or 3)
+   - seq_length 256 (or 128)
+   - batch_size 128
+   - output_keep_prob 0.75

+ Note: output_keep_prob 0.75 is equivalent to a dropout probability of 0.25 (keep_prob = 1 - dropout).
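
For reference, here is a minimal sketch of how the first set of values might be passed to the trainer. The flag names (`--data_path`, `--rnn_size`, `--num_layers`, `--seq_length`, `--batch_size`, `--output_keep_prob`) are assumptions based on the char-rnn-tensorflow-style CLI this project is derived from; check `python train.py --help` for the exact names in your copy of the repo.

```bash
# Minimal sketch for a ~2 MB dataset, using the values from the first
# bullet above. Flag names are assumed from a char-rnn-tensorflow-style
# CLI; confirm them with `python train.py --help` before running.
python train.py \
  --data_path=./data \
  --rnn_size 256 \
  --num_layers 2 \
  --seq_length 64 \
  --batch_size 32 \
  --output_keep_prob 0.75
```

For the larger dataset tiers, scale rnn_size, seq_length, and batch_size as listed above and keep output_keep_prob at 0.75.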
