forked from openai/gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

gpt-2

Code and samples from the paper "Language Models are Unsupervised Multitask Learners".

For now, we have only released a smaller (117M parameter) version of GPT-2.

See more details in our blog post.

Installation

Download the model data:

sh download_model.sh 117M

The remaining steps can optionally be done in a virtual environment using tools such as virtualenv or conda.

Install TensorFlow 1.12 (with GPU support, if you have a GPU and want everything to run faster):

pip3 install tensorflow==1.12.0

or

pip3 install tensorflow-gpu==1.12.0

Install the other Python packages:

pip3 install -r requirements.txt

Unconditional sample generation

WARNING: Samples are unfiltered and may contain offensive content.

To generate unconditional samples from the small model:

python3 src/generate_unconditional_samples.py | tee samples

There are various flags for controlling the samples:

python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee samples
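As a rough illustration of what these two flags usually mean, here is a minimal sketch in plain Python. It is not this repository's code: the function names (`top_k_filter`, `softmax`, `sample_token`) are hypothetical, and treating `top_k=0` as "no truncation" is an assumption of the sketch. Temperature divides the logits before the softmax, so values below 1 sharpen the distribution. Top-k truncation discards every token outside the k most likely before sampling.

```python
import math
import random


def top_k_filter(logits, k):
    """Keep the k largest logits and set the rest to -inf.

    Assumption of this sketch: k <= 0 means "no truncation".
    Ties at the cutoff value are all kept, so slightly more than
    k entries can survive.
    """
    if k <= 0 or k >= len(logits):
        return list(logits)
    cutoff = sorted(logits, reverse=True)[k - 1]
    return [x if x >= cutoff else -math.inf for x in logits]


def softmax(logits):
    """Convert logits to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, top_k=0, rng=random):
    """Sample one token index after temperature scaling and top-k truncation."""
    scaled = [x / temperature for x in logits]
    probs = softmax(top_k_filter(scaled, top_k))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.1, -1.0]  # made-up scores for a 4-token vocabulary
    print(sample_token(toy_logits, temperature=0.7, top_k=40))
```

With `top_k=1` the sketch always picks the single most likely token, while a lower temperature concentrates probability on the highest-scoring tokens without removing the others.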

Conditional sample generation

To give the model custom prompts, you can use:

python3 src/interactive_conditional_samples.py --top_k 40

GPT-2 samples

While we have not yet released the full GPT-2 model, you can see some samples from it in the gpt-2-samples folder. Unconditional samples are shown with three settings: the defaults (temperature 1 and no truncation), temperature 0.7, and truncation with top_k 40. Conditional samples, with contexts drawn from WebText's test set, are shown with the same three settings.

Future work

We may release code for evaluating the models on various benchmarks.

We are still considering release of the larger models.
