SpikeGPT is a lightweight generative language model with pure binary, event-driven spiking activation units.
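To illustrate what "pure binary, event-driven spiking activation" means, here is a minimal sketch of a leaky integrate-and-fire style spike step in NumPy. The function name, decay constant, and hard-reset rule are illustrative assumptions, not SpikeGPT's actual implementation:

```python
import numpy as np

def spike_step(membrane, inputs, threshold=1.0, decay=0.5):
    """Integrate inputs, emit binary spikes, and reset where a spike fired.

    Illustrative sketch only; SpikeGPT's real spiking units differ in detail.
    """
    membrane = decay * membrane + inputs                  # leaky integration
    spikes = (membrane >= threshold).astype(np.float32)   # binary {0, 1} events
    membrane = membrane * (1.0 - spikes)                  # hard reset after a spike
    return spikes, membrane

mem = np.zeros(4, dtype=np.float32)
spikes, mem = spike_step(mem, np.array([0.2, 1.5, 0.9, 2.0], dtype=np.float32))
```

Only neurons whose membrane potential crosses the threshold emit a 1; everything else stays silent, which is what makes the activations event-driven.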
If you are interested in SpikeGPT, feel free to join our Discord via this link!
This repo is inspired by RWKV-LM.
- Download the enwik8 dataset.
- Run `train.py`.
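Training on enwik8 is character-level language modeling: the raw text is mapped to integer ids and cut into fixed-length context windows. A minimal sketch of that preparation step, with an illustrative toy string and `ctx_len` (the real values live in the training script):

```python
# Toy character-level dataset preparation, as used for enwik8-style training.
text = "hello world"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = [stoi[ch] for ch in text]

ctx_len = 4  # illustrative; the model's real context length is larger
# (input, target) pairs: the target is the input shifted by one character
samples = [(data[i:i + ctx_len], data[i + 1:i + 1 + ctx_len])
           for i in range(len(data) - ctx_len)]
```

Each training sample teaches the model to predict the next character at every position in the window.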
You can run inference with either your own customized model or our pre-trained model. Our BookCorpus pre-trained model is available here; note that it was trained on only 900M tokens of BookCorpus.
- Modify the network hyperparameters, which can be found at lines 36-38 of `run.py`:

```python
# For the BookCorpus pre-trained model; change these if you trained your own model.
n_layer = 18
n_embd = 512
ctx_len = 1024
```
- Download our BookCorpus pre-trained model and put it in the root directory of this repo.
- Modify the `context` variable in `run.py` to your custom prompt.
- Run `run.py`.
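Conceptually, the steps above amount to autoregressive generation: encode the custom prompt, then repeatedly pick the next token from the model's output distribution. The sketch below uses a hypothetical `generate` helper and a toy stand-in model; SpikeGPT's real forward pass lives in `run.py`:

```python
import numpy as np

def generate(model, context_ids, n_tokens):
    """Greedy autoregressive generation from a context prompt (illustrative)."""
    ids = list(context_ids)
    for _ in range(n_tokens):
        logits = model(ids)                 # scores over the vocabulary
        ids.append(int(np.argmax(logits)))  # greedy pick of the next token
    return ids

# Toy model: always prefers the token after the last one, modulo vocab size.
toy = lambda ids, vocab=10: np.eye(vocab)[(ids[-1] + 1) % vocab]
out = generate(toy, [3], 4)
```

Real runs would sample with a temperature rather than take the argmax, but the loop structure is the same.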
If you find SpikeGPT useful in your work, please cite the following source:
@article{zhu2023spikegpt,
title = {SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks},
author = {Zhu, Rui-Jie and Zhao, Qihang and Eshraghian, Jason K.},
journal = {arXiv preprint arXiv:2302.13939},
year = {2023}
}