How to train a model on QM9 dataset? #914

Open
Jevon-Du opened this issue Nov 15, 2024 · 1 comment
Comments

Jevon-Du commented Nov 15, 2024

Hi guys,

I want to train a model to predict HOMO energy using the QM9 dataset but I'm having trouble finding relevant documentation. Can you guide me on how to proceed?

Collaborator

misko commented Nov 19, 2024

Hi!
I don't have a detailed answer for you, but I think a rough approach might be as follows:

  1. Convert QM9 into a usable data format (ASEdb or ASELMDB)
  2. Add an output for HOMO energy or, more simply, reuse the already existing energy scalar to represent HOMO energy
  3. Run training

For (1), there are two existing issues in the repo that we have not solved yet but that might provide some insight: #788 and #787. See also https://fair-chem.github.io/core/ase_dataset_creation.html

For (3), I think this might be a good start. Depending on which data format you use, you can ignore the parts relating to LMDB. To use the ASEdb format, change the config slightly to specify format: ase_db (as is done in the linked example).
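As a rough illustration of that config change, the sketch below builds the dataset section as a Python dict and prints it in YAML-like form. Only `format: ase_db` comes from this thread; the nesting and the other keys (`train`, `src`, `a2g_args`, `r_energy`, `r_forces`) and the file name are my assumptions about the layout and should be checked against the linked example. `to_yaml_lines` is a hypothetical helper written only to avoid a YAML dependency.

```python
# Assumed layout; only "format: ase_db" is confirmed in this thread.
dataset_cfg = {
    "dataset": {
        "train": {
            "format": "ase_db",
            "src": "qm9_homo.db",  # placeholder path to your ASE database
            "a2g_args": {
                "r_energy": True,   # read the energy slot (HOMO here)
                "r_forces": False,  # QM9 has no forces to read
            },
        }
    }
}


def to_yaml_lines(node, indent=0):
    """Render a nested dict of scalars as YAML-style lines."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        elif isinstance(value, bool):
            lines.append(f"{pad}{key}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


print("\n".join(to_yaml_lines(dataset_cfg)))
```

The printed text can be pasted into the training config and adjusted once the real key names are confirmed.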

Hope this helps! If you make some progress and get stuck, please reach out here; I would be happy to help. If you are interested, we could also use your approach in the tutorial and save other people with the same question some time :)
