forked from mathcom/molgpt

Viraj Bagal, Rishal Aggarwal, P. K. Vinod, and U. Deva Priyakumar. Journal of Chemical Information and Modeling 2022, 62 (9), 2064-2076. DOI: 10.1021/acs.jcim.1c00600


yukisoya/molgpt

 
 


MolGPT

In this work, we train a small custom GPT on the Moses and Guacamol datasets with a next-token prediction task. The model is then used for unconditional and conditional molecular generation. We compare our model with previous approaches on the Moses and Guacamol datasets. Saliency maps for interpretability are obtained with the Ecco library.
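As a rough illustration of the next-token prediction task on molecules, the sketch below splits a SMILES string into tokens and builds the (context, target) pairs a decoder-only model would learn from. The regular expression and the function names `tokenize` and `next_token_pairs` are assumptions made for this example; the tokenizer and data pipeline in this repository may differ.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration, not the
# repository's own): bracket atoms, two-letter halogens, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_PATTERN.findall(smiles)


def next_token_pairs(tokens):
    """Return (context, target) pairs: each prefix predicts the token after it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


tokens = tokenize("CC(=O)Cl")  # acetyl chloride
pairs = next_token_pairs(tokens)

for context, target in pairs:
    print("".join(context), "->", target)
```

In actual training, all of these pairs are typically handled in a single forward pass with a causal attention mask rather than enumerated one by one; the explicit list here only makes the objective visible.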

  • The processed Guacamol and MOSES datasets in csv format can be downloaded from this link:

https://drive.google.com/drive/folders/1LrtGru7Srj_62WMR4Zcfs7xJ3GZr9N4E?usp=sharing

  • Original Guacamol dataset can be found here:

https://github.com/BenevolentAI/guacamol

  • Original Moses dataset can be found here:

https://github.com/molecularsets/moses

  • All trained weights can be found here:

https://www.kaggle.com/virajbagal/ligflow-final-weights

To train the model, make sure the dataset csv files are in the same directory as the code files.
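A quick way to confirm the csv files are in place before launching a training script is sketched below. The helper `missing_datasets` and the file names `moses.csv` and `guacamol.csv` are hypothetical; substitute the names of the files you actually downloaded.

```python
from pathlib import Path


def missing_datasets(directory, filenames):
    """Return the names in `filenames` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in filenames if not (base / name).is_file()]


if __name__ == "__main__":
    # Hypothetical file names; replace with the csv files you downloaded.
    expected = ["moses.csv", "guacamol.csv"]
    missing = missing_datasets(".", expected)
    if missing:
        print("Missing dataset files:", ", ".join(missing))
    else:
        print("All dataset files found.")
```

Run it from the directory that contains the code files, since that is where the training scripts are expected to find the data.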

Training

./train_moses.sh
./train_guacamol.sh

Generation

./generate_guacamol_prop.sh
./generate_moses_prop_scaf.sh

If you find this work useful, please cite:

Bagal, Viraj; Aggarwal, Rishal; Vinod, P. K.; Priyakumar, U. Deva (2021): MolGPT: Molecular Generation using a Transformer-Decoder Model. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.14561901.v1
