Open Quantum Data Commons
git clone git@github.com:OpenDrugDiscovery/openQDC.git
cd openQDC
# use mamba/conda
mamba env create -n openqdc -f env.yml
# activate the new environment so the package installs into it
mamba activate openqdc
pip install -e .
You can run tests locally with:
pytest
You can build the documentation locally with:
mkdocs serve
A command line interface is available to download datasets or list the available ones. For more information, run openqdc --help.
# Display the available datasets
openqdc datasets
# Display the help message for the download command
openqdc download --help
# Download the Spice and QMugs datasets
openqdc download Spice QMugs
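The commands above fail with a "command not found" error if the openqdc environment is not active. A minimal sketch of a guard around them follows. The helper names `have_cmd` and `fetch_datasets` are hypothetical and not part of openQDC; only the `openqdc datasets` and `openqdc download` subcommands shown above are used.

```shell
#!/bin/sh
# Hypothetical helper (not part of openQDC): succeeds if a command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: list the available datasets, then download
# the ones named as arguments.
fetch_datasets() {
    if ! have_cmd openqdc; then
        echo "openqdc not found on PATH; activate the openqdc environment first" >&2
        return 1
    fi
    openqdc datasets
    openqdc download "$@"
}
```

After sourcing the snippet, `fetch_datasets Spice QMugs` would run the same download as above.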
We provide support for the following publicly available QM Potential Energy Datasets.
Dataset | # Molecules | # Conformers | Average Conformers per Molecule | Force Labels | Atom Types | QM Level of Theory | Off-Equilibrium Conformations |
---|---|---|---|---|---|---|---|
ANI | 57,462 | 20,000,000 | 348 | No | 4 | ωB97x/6-31G(d) | Yes |
GEOM | 450,000 | 37,000,000 | 82 | No | 18 | GFN2-xTB | No |
Molecule3D | 3,899,647 | 3,899,647 | 1 | No | 5 | B3LYP/6-31G* | No |
NablaDFT | 1,000,000 | 5,000,000 | 5 | No | 6 | ωB97X-D/def2-SVP | |
OrbNet Denali | 212,905 | 2,300,000 | 11 | No | 16 | GFN1-xTB | Yes |
PCQM_PM6 | | | 1 | No | | PM6 | No |
PCQM_B3LYP | 85,938,443 | 85,938,443 | 1 | No | | B3LYP/6-31G* | No |
QMugs | 665,000 | 2,000,000 | 3 | No | 10 | GFN2-xTB, ωB97X-D/def2-SVP | No |
QM7X | 6,950 | 4,195,237 | 603 | Yes | 7 | PBE0+MBD | Yes |
SN2RXN | 39 | 452,709 | 11,600 | Yes | 6 | DSD-BLYP-D3(BJ)/def2-TZVP | |
SolvatedPeptides | | 2,731,180 | | Yes | | revPBE-D3(BJ)/def2-TZVP | |
Spice | 19,238 | 1,132,808 | 59 | Yes | 15 | ωB97M-D3(BJ)/def2-TZVPPD | Yes |
tmQM | 86,665 | 86,665 | 1 | No | | TPSSh-D3BJ/def2-SVP | |
Transition1X | | 9,654,813 | | Yes | | ωB97x/6-31G(d) | Yes |
WaterClusters | 1 | 4,464,740 | | No | 2 | TTM2.1-F | Yes |
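
The "Average Conformers per Molecule" column is the conformer count divided by the molecule count. The sketch below reproduces it for the ANI, GEOM and Spice rows, assuming rounding to the nearest integer; a few rows in the table appear to be truncated or approximated instead, so treat the rounding rule as an assumption. The helper name `avg_conformers` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: integer division rounded to the nearest whole number.
# Usage: avg_conformers <n_conformers> <n_molecules>
avg_conformers() {
    echo $(( ($1 + $2 / 2) / $2 ))
}

avg_conformers 20000000 57462    # ANI   -> 348
avg_conformers 37000000 450000   # GEOM  -> 82
avg_conformers 1132808 19238     # Spice -> 59
```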
We also provide support for the following publicly available QM Noncovalent Interaction Energy Datasets.
Dataset |
---|
DES370K |
DES5M |
Metcalf |
DESS66 |
DESS66x8 |
Splinter |
X40 |
L7 |