
Transformer Implementations Repository

Overview

This repository hosts various implementations of the Transformer model, introduced in the landmark paper "Attention Is All You Need" by Vaswani et al. The Transformer is built for sequence-to-sequence tasks and uses self-attention to model dependencies between all positions in a sequence without recurrence. The repository serves as a collective resource for different flavors and adaptations of the Transformer, facilitating exploration and innovation in neural network architectures.
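
To make the self-attention mechanism concrete, here is a minimal sketch of the scaled dot-product attention described in the paper. This is an illustrative PyTorch snippet, not an implementation taken from this repository; the function name and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Illustrative sketch; q, k, v have shape (batch, heads, seq_len, d_k)."""
    d_k = q.size(-1)
    # Compare every query against every key, scaled by sqrt(d_k) for stability
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        # Masked positions (e.g., future tokens in a decoder) get -inf
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Softmax over the key dimension turns scores into attention weights
    weights = F.softmax(scores, dim=-1)
    # Each output position is a weighted sum of the value vectors
    return weights @ v

# Example: batch of 2, 4 heads, sequence length 8, head dimension 16
q = k = v = torch.randn(2, 4, 8, 16)
out = scaled_dot_product_attention(q, k, v)  # shape (2, 4, 8, 16)
```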

Improvements

The following library can be used to enhance the functionality and performance of the Transformer implementations; a brief usage sketch follows the feature list:

Hugging Face's transformers library, which provides:

- Pre-trained models
- Tokenization
- Optimization
- Learning rate scheduling
- Evaluation
- Inference
- Generation
- Fine-tuning
- Model saving/loading
- Model sharing
- Model serving
- Model conversion
- Model quantization
- Model compression
- Model distillation
- Model pruning
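
As a hedged illustration of the first few items above (pre-trained models, tokenization, generation, and saving/loading), the sketch below uses the transformers library's Auto classes. The gpt2 checkpoint and the output directory are placeholder choices, not part of this repository.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pre-trained model + tokenizer: download a GPT-2 checkpoint from the Hub
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Tokenization + generation: encode a prompt and generate a continuation
inputs = tokenizer("The Transformer architecture", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Saving/loading: persist the model and tokenizer locally
model.save_pretrained("./transformer-checkpoint")  # placeholder path
tokenizer.save_pretrained("./transformer-checkpoint")
```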
