This repo contains our experiments in researching and implementing alternatives to the attention mechanism, i.e. Mamba and xLSTM.
Note: running training and inference requires a CUDA installation (nvcc and other dependencies).
The steps to run this project are:

The project uses Anaconda to create the programming environment:

```
conda create --name <env> --file requirements.txt
```
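Concretely, the setup might look like the following (the environment name `attn-alt` is illustrative; pick any name you like):

```shell
# Create the environment from the pinned requirements, then activate it.
conda create --name attn-alt --file requirements.txt
conda activate attn-alt
```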
- Models supported: `attention`, `mamba`, `xlstm`
- Context can be any string

```
python demo.py --model <model_name> -c "Shakespeare likes attention"
```
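A minimal sketch of how a CLI like `demo.py`'s could be parsed with `argparse` (the parser below is an illustration, not the repo's actual implementation; `SUPPORTED_MODELS` and `build_parser` are assumed names):

```python
import argparse

# The three model names accepted by the demo, per the list above.
SUPPORTED_MODELS = ["attention", "mamba", "xlstm"]

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring demo.py's --model and -c flags."""
    parser = argparse.ArgumentParser(
        description="Generate text with a chosen sequence model"
    )
    parser.add_argument("--model", choices=SUPPORTED_MODELS, required=True,
                        help="which architecture to run")
    parser.add_argument("-c", "--context", default="",
                        help="prompt string to condition generation on")
    return parser

# Example invocation matching the command above.
args = build_parser().parse_args(["--model", "mamba",
                                  "-c", "Shakespeare likes attention"])
print(args.model, "|", args.context)  # → mamba | Shakespeare likes attention
```

Using `choices=SUPPORTED_MODELS` makes `argparse` reject any model name outside the supported set with a clear error message.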
We use the Weights & Biases (W&B) library for tracking training metrics (quickstart). To use W&B, set the `WANDB_API_KEY` environment variable:

```
export WANDB_API_KEY=<Your WandB api key>
```
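Before launching a long training run, it can help to fail fast if the key was never exported. A small sketch (the `wandb_key_configured` helper is hypothetical, not part of the repo):

```python
import os

def wandb_key_configured() -> bool:
    """Return True if WANDB_API_KEY is set and non-empty in the environment."""
    return bool(os.environ.get("WANDB_API_KEY"))

# Stand-in for the key you exported in your shell; not a real key.
os.environ["WANDB_API_KEY"] = "dummy-key-for-demo"

if not wandb_key_configured():
    raise RuntimeError("Set WANDB_API_KEY before starting training")
print(wandb_key_configured())  # → True
```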
The testing files for each model are: `gpt_test.py`, `mamba_test.py`, `xlstm_test.py`.