MarcusOlivecrona released this 08 May 08:38
Code and data to reproduce the results from the article.
- "saved_models.tar.gz" contains the pretrained Priors (canonical and reduced)
- "data.tar.gz" contains:
  - clf.pkl - The SVM activity model for DRD2
  - prior_trainingset_MolData - Object used to sample training data for the Prior
  - prior_trainingset_Voc - Object used to encode/decode between SMILES and one-hot representations. It has to be consistent between the Prior and the Agents trained from that Prior. This one can also be used for the reduced Priors.
  - prior trainingsets - SMILES files containing the structures used to train the canonical and reduced Priors. Run "python data_structs.py [SMILES file location]" to construct MolData and Voc objects from these files in order to train Priors from scratch. N.B. Large SMILES files require a large amount of memory; if this is a problem, modify data_structs.py to construct the MolData object gradually.
  - DRD2 train, validation, and test sets, including cluster IDs
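
The note about memory use with large SMILES files suggests processing the file gradually. The following is a minimal sketch of that idea, not the code in data_structs.py: it streams SMILES line by line to collect a token vocabulary and to yield fixed-size batches, so the whole file never has to sit in memory. The function names (`tokenize`, `iter_smiles`, `build_vocabulary`, `iter_batches`) and the simplified regex tokenizer are assumptions made for illustration; the tokenization used by the released Voc object may differ.

```python
import re
from typing import Iterable, Iterator, List, Set

# Hypothetical, simplified tokenizer (an assumption, not the repository's):
# bracket atoms and the two-letter halogens Br/Cl are kept as single tokens,
# everything else is split character by character.
_TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles: str) -> List[str]:
    """Split one SMILES string into tokens."""
    return _TOKEN_RE.findall(smiles)


def iter_smiles(lines: Iterable[str]) -> Iterator[str]:
    """Yield the first whitespace-separated field of each non-empty line."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]


def build_vocabulary(lines: Iterable[str]) -> List[str]:
    """Collect the sorted set of tokens while reading one line at a time."""
    tokens: Set[str] = set()
    for smiles in iter_smiles(lines):
        tokens.update(tokenize(smiles))
    return sorted(tokens)


def iter_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of at most batch_size SMILES without loading the full file."""
    batch: List[str] = []
    for smiles in iter_smiles(lines):
        batch.append(smiles)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Because an open file object is itself an iterable of lines, both functions can be given the result of `open(path)` directly, for example `build_vocabulary(open("smiles_file.smi"))`, where the file name is a placeholder.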