GPT Language Model From Scratch

This repo contains code to build a character-level GPT (Generative Pretrained Transformer) model from scratch, following Andrej Karpathy's Zero To Hero series on GPT. The model can be trained on different texts, for example Shakespeare, Goethe's "Faust", "The Lord of the Rings", or novels by Jane Austen, and then generates new text in the style of the text it was trained on.

The repo contains three interactive Jupyter notebooks, each in a 'student' and a 'solution' version. Work through the student notebooks in the following order:

1_Bigram_Language_Model__student.ipynb

This notebook constructs a bigram language model from scratch. The model is trained on a text file containing names and will be able to generate new names based on what it has learned.
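As a rough illustration of the idea, a bigram model can be built by counting how often each character follows another and then sampling from those counts. The sketch below is a minimal pure-Python version and not the notebook's code; the function names `train_bigram` and `sample_name`, the `.` start/end marker, and the four example names are assumptions chosen for this example.

```python
import random
from collections import defaultdict

def train_bigram(names):
    """Count how often each character follows another.

    '.' is used here as an assumed marker for both start and end of a name.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        chars = ["."] + list(name) + ["."]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_name(counts, rng, max_len=20):
    """Generate one name by repeatedly sampling the next character."""
    out = []
    prev = "."
    while len(out) < max_len:
        followers = counts[prev]
        # Sample the next character in proportion to its observed count.
        nxt = rng.choices(list(followers.keys()),
                          weights=list(followers.values()), k=1)[0]
        if nxt == ".":  # end marker reached
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)

# Tiny hypothetical training set; the notebook uses a text file of names.
names = ["emma", "olivia", "ava", "mia"]
counts = train_bigram(names)
rng = random.Random(0)
samples = [sample_name(counts, rng) for _ in range(5)]
print(samples)
```

Because each character depends only on the one before it, the generated names tend to look only loosely name-like, which is what motivates the larger context used in the next notebook.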

2_MLP_Language_Model__student.ipynb

This notebook extends the previous bigram model to a multi-layer perceptron to improve the name generation results.
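One way to picture the step from bigram to MLP: instead of a single preceding character, the network sees a fixed window of several preceding characters. The sketch below shows how such (context, target) training pairs could be built. It assumes the notebook follows the fixed-context approach from Karpathy's series; `build_dataset`, `block_size`, and the `.` marker with index 0 are illustrative names, not taken from the notebook.

```python
def build_dataset(words, block_size=3):
    """Turn words into (context, target) index pairs.

    Each context holds the indices of the `block_size` preceding characters;
    index 0 stands for the assumed '.' padding/end marker.
    """
    chars = sorted(set("".join(words)))
    stoi = {ch: i + 1 for i, ch in enumerate(chars)}
    stoi["."] = 0
    X, Y = [], []
    for w in words:
        context = [0] * block_size  # start with an all-padding context
        for ch in w + ".":
            ix = stoi[ch]
            X.append(context)
            Y.append(ix)
            context = context[1:] + [ix]  # slide the window by one character
    return X, Y, stoi

X, Y, stoi = build_dataset(["emma"], block_size=3)
for context, target in zip(X, Y):
    print(context, "->", target)
```

In an MLP language model, each index in a context would typically be looked up in an embedding table, and the concatenated embeddings fed through one or more hidden layers to predict the target index.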

3_Character_Level_GPT__student.ipynb

Finally, the full GPT is implemented from scratch, trained on different texts, and used to generate new text.
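The core building block of a GPT is causal self-attention: each position computes a weighted average over itself and earlier positions only. The following is a minimal pure-Python sketch of masked scaled dot-product attention for a single head, not the notebook's implementation. The names `softmax` and `causal_attention` are illustrative, and the learned query/key/value projections of a real attention head are omitted, so the input vectors are used directly as queries, keys, and values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of T vectors (lists of floats). Position t attends
    only to positions 0..t. Returns the outputs and the attention weights.
    """
    d = len(k[0])
    outputs, all_weights = [], []
    for t in range(len(q)):
        # Scores against positions 0..t only; later positions are masked out.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output is the weighted average of the visible value vectors.
        out = [
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy 2-dimensional token vectors (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = causal_attention(x, x, x)
print(weights)
```

The first position can only attend to itself, so its output equals its own value vector. A full GPT block would typically combine several such heads with learned projections, a feed-forward layer, skip connections, and layer normalization.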

Dependencies

See the Pipfile for an overview of the required Python packages. For PyTorch with GPU support, download the wheel file that matches your system from https://download.pytorch.org/whl/torch/ and place it in your project folder. I am using torch-2.5.1+cu121-cp312-cp312-win_amd64.whl here, but you may need a different wheel file depending on your OS, Python version, and CUDA version. Adapt the Pipfile accordingly, then run pip install pipenv followed by pipenv install.

Author

This is an extended version of Andrej Karpathy's notebook that accompanies his Zero To Hero video on GPT.

Adapted by:

Prof. Dr.-Ing. Antje Muntzinger, University of Applied Sciences Stuttgart

[email protected]
