This is a benchmarking study of current 3D structure-based generative models that focuses on the chemical plausibility of the generated structures. New metrics are proposed to support further development in this area.
A conda environment file is provided. Create and activate the environment with:
conda env create -f environment.yml
conda activate cheminfo
The versions of the key packages used in this study are listed below:
Package | Version |
---|---|
Python | 3.11.9 |
rdkit | 2023.9.6 |
useful-rdkit-utils | 0.56 |
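
To confirm that an activated environment matches the versions in the table, a check along the following lines may help. This is a minimal sketch, not part of the repository: the helper names (`installed_version`, `version_mismatches`, `python_matches`) are hypothetical, and it assumes the packages expose distribution metadata under the names shown in the table, which may not hold for every conda build. The comparison is an exact string match, so an equivalent but differently formatted version string would be reported as a mismatch.

```python
import sys
from importlib import metadata

# Versions copied from the table above; the distribution names are an
# assumption and may differ from how a given conda build registers them.
PINNED = {"rdkit": "2023.9.6", "useful-rdkit-utils": "0.56"}


def installed_version(dist_name):
    """Return the installed version string, or None if no metadata is found."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(pinned):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = installed_version(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


def python_matches(expected="3.11.9"):
    """Compare the running interpreter's major.minor.micro to the pinned one."""
    return ".".join(str(part) for part in sys.version_info[:3]) == expected


if __name__ == "__main__":
    print("Python matches:", python_matches())
    print("Package mismatches:", version_mismatches(PINNED))
```

An empty mismatch dictionary indicates that the installed packages report the same version strings as the table.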
All datasets used by the metrics are provided. The ring system frequencies of ZINC20 and ZINC22 drug-like molecules are in the data folder. The Bemis-Murcko (BM) scaffold files extracted from ZINC20 and ZINC22 drug-like molecules can be downloaded from Google Drive.
The molecules generated by each algorithm are saved in folders named after the respective algorithms.
All metrics and known medicinal chemistry filters are applied to each set of molecules in the Ring_system.ipynb file located within each folder.
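
As an illustration of how a ring system frequency table can be used to flag unusual molecules, the sketch below looks up each ring system of a molecule in a reference table and reports molecules whose rarest ring system falls below a threshold. It is not the code from the notebooks: the reference counts, the threshold of 100, the molecule names, and the function names are all hypothetical, and the ring systems are assumed to have been extracted as SMILES strings beforehand.

```python
# Hypothetical reference table: ring-system SMILES -> number of occurrences
# in a reference collection. The counts are made up for illustration.
REFERENCE_COUNTS = {
    "c1ccccc1": 1_000_000,
    "c1ccncc1": 250_000,
    "C1CCOC1": 40_000,
}


def min_ring_frequency(ring_systems, reference_counts):
    """Return (ring_system, count) for the rarest ring system.

    Ring systems missing from the table are treated as count 0.
    Returns None for molecules without any ring system.
    """
    if not ring_systems:
        return None
    scored = [(rs, reference_counts.get(rs, 0)) for rs in ring_systems]
    return min(scored, key=lambda pair: pair[1])


def flag_rare_molecules(molecules, reference_counts, threshold=100):
    """List molecules whose rarest ring system occurs fewer than `threshold` times."""
    flagged = []
    for name, ring_systems in molecules.items():
        rarest = min_ring_frequency(ring_systems, reference_counts)
        if rarest is not None and rarest[1] < threshold:
            flagged.append(name)
    return flagged


# Hypothetical generated molecules, given as pre-extracted ring systems.
generated = {
    "mol_a": ["c1ccccc1", "c1ccncc1"],
    "mol_b": ["c1ccccc1", "C1=CC2CC2C=C1"],  # second ring system is not in the table
    "mol_c": [],  # acyclic molecule, nothing to look up
}

if __name__ == "__main__":
    print(flag_rare_molecules(generated, REFERENCE_COUNTS))
```

Treating an unseen ring system as a count of zero is one possible design choice; it makes molecules with ring systems absent from the reference set the first to be flagged.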
All figures presented in the manuscript were generated using the Picture_drawing.ipynb file.
Feel free to create an issue or email Bo Yang (yang2531@purdue.edu) if you have any questions!