SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks

SpikeGPT is a lightweight generative language model with pure binary, event-driven spiking activation units.

If you are interested in SpikeGPT, feel free to join our Discord via this link!

This repo is inspired by RWKV-LM.

Training on Enwik8

  1. Download the enwik8 dataset.
  2. Run train.py
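
The download step can be sketched in Python as follows. This is a minimal sketch, not part of the repo: the URL is the commonly used enwik8 mirror and is an assumption here, as are the helper names `fetch_enwik8` and `extract_enwik8` and the choice of output directory. Check train.py for the path where it actually expects the dataset.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: a commonly used enwik8 mirror; the README's own link may differ.
ENWIK8_URL = "http://mattmahoney.net/dc/enwik8.zip"


def extract_enwik8(zip_path, out_dir="."):
    """Extract the `enwik8` member from a zip archive and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extract("enwik8", path=out_dir)
    return out_dir / "enwik8"


def fetch_enwik8(out_dir="."):
    """Download the archive (if not already present) and extract it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / "enwik8.zip"
    if not zip_path.exists():
        urlretrieve(ENWIK8_URL, zip_path)
    return extract_enwik8(zip_path, out_dir)


# Usage (hypothetical location; adjust to wherever train.py reads the data):
#   path = fetch_enwik8(".")
```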

Inference with Prompt

You can run inference with either your own customized model or our pre-trained model. Our model pre-trained on BookCorpus is available here. This model was trained on only 900M tokens of BookCorpus.

  1. Modify the network hyper-parameters, which are found on lines 36-38 of run.py:
# Values for the BookCorpus pre-trained model; change them if you trained your own model.
n_layer = 18
n_embd = 512
ctx_len = 1024
  2. Download our BookCorpus pre-trained model and put it in the root directory of this repo.
  3. Modify the 'context' variable in run.py to your custom prompt.
  4. Run run.py.
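
Before running inference, it can help to confirm that the hyper-parameters in run.py match the checkpoint you downloaded. The following is a minimal sketch of such a check, assuming the values appear in run.py as plain integer assignments like those shown above; the helper names `read_int_assignments` and `mismatches` are hypothetical and not part of the repo.

```python
import re

# Values listed above for the BookCorpus pre-trained model.
EXPECTED = {"n_layer": 18, "n_embd": 512, "ctx_len": 1024}


def read_int_assignments(source, names):
    """Return {name: value} for the first `name = <int>` line found per name."""
    found = {}
    for name in names:
        pattern = rf"^\s*{re.escape(name)}\s*=\s*(\d+)"
        match = re.search(pattern, source, flags=re.MULTILINE)
        if match:
            found[name] = int(match.group(1))
    return found


def mismatches(source, expected=EXPECTED):
    """Return {name: (found_value_or_None, expected_value)} for each mismatch."""
    found = read_int_assignments(source, expected)
    return {
        name: (found.get(name), value)
        for name, value in expected.items()
        if found.get(name) != value
    }


# Usage (assumes run.py is in the current directory):
#   from pathlib import Path
#   print(mismatches(Path("run.py").read_text()))
```

An empty result means the three values agree with the pre-trained configuration. If you trained your own model, pass your own dictionary as `expected`.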

Citation

If you find SpikeGPT useful in your work, please cite the following source:

@article{zhu2023spikegpt,
        title = {SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks},
        author = {Zhu, Rui-Jie and Zhao, Qihang and Eshraghian, Jason K.},
        journal = {arXiv preprint arXiv:2302.13939},
        year    = {2023}
}