Swap PTB for Wikitext-2 (which is open access)
adamlerer authored and soumith committed Nov 27, 2017
1 parent 62d5ca5 commit cf74c81
Showing 10 changed files with 44,848 additions and 49,208 deletions.
10 changes: 5 additions & 5 deletions word_language_model/README.md
@@ -1,13 +1,13 @@
# Word-level language modeling RNN

This example trains a multi-layer RNN (Elman, GRU, or LSTM) on a language modeling task.
-By default, the training script uses the PTB dataset, provided.
+By default, the training script uses the Wikitext-2 dataset, provided.
The trained model can then be used by the generate script to generate new text.

```bash
-python main.py --cuda --epochs 6 # Train a LSTM on PTB with CUDA, reaching perplexity of 117.61
-python main.py --cuda --epochs 6 --tied # Train a tied LSTM on PTB with CUDA, reaching perplexity of 110.44
-python main.py --cuda --tied # Train a tied LSTM on PTB with CUDA for 40 epochs, reaching perplexity of 87.17
+python main.py --cuda --epochs 6 # Train a LSTM on Wikitext-2 with CUDA, reaching perplexity of 117.61
+python main.py --cuda --epochs 6 --tied # Train a tied LSTM on Wikitext-2 with CUDA, reaching perplexity of 110.44
+python main.py --cuda --tied # Train a tied LSTM on Wikitext-2 with CUDA for 40 epochs, reaching perplexity of 87.17
python generate.py # Generate samples from the trained LSTM model.
```

@@ -51,6 +51,6 @@ python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40
python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40 --tied # Test perplexity of 72.30
```

-These perplexities are equal or better than
+Perplexities on PTB are equal or better than
[Recurrent Neural Network Regularization (Zaremba et al. 2014)](https://arxiv.org/pdf/1409.2329.pdf)
and are similar to [Using the Output Embedding to Improve Language Models (Press & Wolf 2016)](https://arxiv.org/abs/1608.05859) and [Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling (Inan et al. 2016)](https://arxiv.org/pdf/1611.01462.pdf), though both of these papers have improved perplexities by using a form of recurrent dropout [(variational dropout)](http://papers.nips.cc/paper/6241-a-theoretically-grounded-application-of-dropout-in-recurrent-neural-networks).
3,761 changes: 0 additions & 3,761 deletions word_language_model/data/penn/test.txt

This file was deleted.

42,068 changes: 0 additions & 42,068 deletions word_language_model/data/penn/train.txt

This file was deleted.

3,370 changes: 0 additions & 3,370 deletions word_language_model/data/penn/valid.txt

This file was deleted.

3 changes: 3 additions & 0 deletions word_language_model/data/wikitext-2/README
@@ -0,0 +1,3 @@
+This is raw data from the wikitext-2 dataset.
+
+See https://www.salesforce.com/products/einstein/ai-research/the-wikitext-dependency-language-modeling-dataset/
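The example's data loader (the repo's data.py, not part of this diff) consumes these raw Wikitext-2 files word by word. A minimal sketch of that tokenization scheme, under the assumption that lines are split on whitespace and an `<eos>` marker is appended per line as in the upstream example:

```python
# Sketch of word-level tokenization as assumed from the example's
# data.py (not shown in this diff): split each raw line on whitespace,
# append <eos>, and map each word to an integer id.

def tokenize_line(line, word2idx, idx2word):
    """Convert one raw text line into a list of word ids, growing the vocab."""
    ids = []
    for word in line.split() + ['<eos>']:
        if word not in word2idx:
            word2idx[word] = len(idx2word)
            idx2word.append(word)
        ids.append(word2idx[word])
    return ids

word2idx, idx2word = {}, []
sample = "Valkyria Chronicles III is a tactical role @-@ playing game"
ids = tokenize_line(sample, word2idx, idx2word)
print(len(ids))  # word count of the line plus one for <eos>
```

Note that Wikitext-2, unlike PTB, keeps case, punctuation, and rare words, so the vocabulary built this way is larger than PTB's fixed 10k set.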
4,358 changes: 4,358 additions & 0 deletions word_language_model/data/wikitext-2/test.txt


36,718 changes: 36,718 additions & 0 deletions word_language_model/data/wikitext-2/train.txt


3,760 changes: 3,760 additions & 0 deletions word_language_model/data/wikitext-2/valid.txt


4 changes: 2 additions & 2 deletions word_language_model/generate.py
@@ -12,10 +12,10 @@

import data

-parser = argparse.ArgumentParser(description='PyTorch PTB Language Model')
+parser = argparse.ArgumentParser(description='PyTorch Wikitext-2 Language Model')

# Model parameters.
-parser.add_argument('--data', type=str, default='./data/penn',
+parser.add_argument('--data', type=str, default='./data/wikitext-2',
help='location of the data corpus')
parser.add_argument('--checkpoint', type=str, default='./model.pt',
help='model checkpoint to use')
4 changes: 2 additions & 2 deletions word_language_model/main.py
@@ -9,8 +9,8 @@
import data
import model

-parser = argparse.ArgumentParser(description='PyTorch PennTreeBank RNN/LSTM Language Model')
-parser.add_argument('--data', type=str, default='./data/penn',
+parser = argparse.ArgumentParser(description='PyTorch Wikitext-2 RNN/LSTM Language Model')
+parser.add_argument('--data', type=str, default='./data/wikitext-2',
help='location of the data corpus')
parser.add_argument('--model', type=str, default='LSTM',
help='type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)')
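The renamed parser and its new `--data` default can be exercised in isolation. A minimal sketch reconstructing just the arguments visible in this hunk (the real main.py defines many more flags):

```python
import argparse

# Reconstruction of the arguments shown in this hunk only; the full
# main.py in the repo adds flags for emsize, nhid, dropout, tied, etc.
parser = argparse.ArgumentParser(description='PyTorch Wikitext-2 RNN/LSTM Language Model')
parser.add_argument('--data', type=str, default='./data/wikitext-2',
                    help='location of the data corpus')
parser.add_argument('--model', type=str, default='LSTM',
                    help='type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)')

args = parser.parse_args([])  # empty argv: every flag takes its default
print(args.data)   # ./data/wikitext-2
print(args.model)  # LSTM
```

Because only the default changed, existing invocations that pass `--data` explicitly (e.g. pointing at a local PTB copy) continue to work unchanged.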
