This repo contains code to construct a character-level GPT (Generative Pre-trained Transformer) model from scratch, following Andrej Karpathy's Zero To Hero series on GPT. The model is trained on different texts, such as Shakespeare, Goethe's "Faust", "The Lord of the Rings", or novels by Jane Austen, and can then generate new text in the style of the text it was trained on.
The repo contains 3 interactive Jupyter notebooks, each in a 'student' and 'solution' version. Work on the starter files in the following order:
This notebook constructs a bigram language model from scratch. The model is trained on a text file containing names and will be able to generate new names based on what it has learned.
This notebook extends the previous bigram model to a multi-layer perceptron to improve the name generation results.
Finally, the full GPT is implemented from scratch, trained on different texts, and used to generate new text.
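To give a feel for the idea behind the first notebook, here is a minimal sketch of a character-level bigram model. It is not the notebook's code: it uses plain Python counts instead of PyTorch tensors, a tiny made-up list of names instead of the names text file, and the function names `train_bigram` and `sample_name` are hypothetical.

```python
import random
from collections import defaultdict


def train_bigram(names):
    """Count how often each character follows another.

    The '.' character is used as an assumed start/end marker.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        chars = ["."] + list(name) + ["."]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_name(counts, rng, max_len=20):
    """Sample one name, character by character, from the bigram counts."""
    out = []
    prev = "."
    while len(out) < max_len:
        followers = counts[prev]
        chars = list(followers.keys())
        weights = list(followers.values())
        # Pick the next character in proportion to its observed frequency.
        nxt = rng.choices(chars, weights=weights, k=1)[0]
        if nxt == ".":  # end marker reached
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Toy data for illustration only; the notebook trains on a file of names.
names = ["emma", "olivia", "ava", "mia", "amelia"]
counts = train_bigram(names)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [sample_name(counts, rng) for _ in range(5)]
print(samples)
```

Because each character depends only on the one before it, the generated names tend to look locally plausible but lack longer-range structure, which is the limitation the later notebooks address.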
See the Pipfile for an overview of the required Python packages. For PyTorch with GPU support, download the matching wheel file from https://download.pytorch.org/whl/torch/ and place it in your project folder. I am using torch-2.5.1+cu121-cp312-cp312-win_amd64.whl, but you may need a different wheel file depending on the OS, Python version, and CUDA version of your system. Adapt the Pipfile accordingly, run pip install pipenv, then pipenv install.
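After installation, a quick check like the following can confirm whether PyTorch sees a GPU. This is a small sketch, not part of the repo; the helper name `describe_torch_device` is hypothetical, and it only relies on `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def describe_torch_device():
    """Return a short description of the device PyTorch would use."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        # Index 0 refers to the first visible CUDA device.
        return f"CUDA GPU available: {torch.cuda.get_device_name(0)}"
    return f"PyTorch {torch.__version__} is installed, but only the CPU is available"


print(describe_torch_device())
```

If the CPU-only message appears even though the machine has a CUDA-capable GPU, the installed wheel probably does not match the system's CUDA version.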
This is an extended version of the notebook that accompanies Andrej Karpathy's Zero To Hero video on GPT.
Adapted by:
Prof. Dr.-Ing. Antje Muntzinger, University of Applied Sciences Stuttgart