
Kliff DNN torch trainer #185

Merged
4 commits merged into openkim:v1 on Jul 6, 2024
Conversation

@ipcamit commented Jul 5, 2024

Summary

Added a trainer and tests for training generic dense neural networks using libdescriptor and plain PyTorch (as opposed to Lightning).

  • Batched descriptor evaluation
  • DNN model using TorchML driver
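To illustrate the idea, here is a minimal, hypothetical sketch of training a generic dense NN on per-atom descriptors with plain PyTorch (no Lightning). The model name, layer sizes, and descriptor shapes are illustrative assumptions, not KLIFF's actual classes:

```python
import torch
import torch.nn as nn

class DenseModel(nn.Module):
    """Hypothetical dense NN mapping per-atom descriptors to an energy."""

    def __init__(self, descriptor_width: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(descriptor_width, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),  # per-atom energy contribution
        )

    def forward(self, descriptors: torch.Tensor) -> torch.Tensor:
        # Sum per-atom contributions to get the configuration energy
        return self.net(descriptors).sum()

torch.manual_seed(0)
model = DenseModel(descriptor_width=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on random data: 10 atoms, 8 components
descriptors = torch.randn(10, 8)
target_energy = torch.tensor(-5.0)

optimizer.zero_grad()
loss = (model(descriptors) - target_energy) ** 2
loss.backward()
optimizer.step()
```

In a real trainer this step would loop over batches produced by the collate function discussed below.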

TODO (if any)

  • Need to add the capability to write the trained model as a valid DUNN driver model.

Checklist

Before a pull request can be merged, the following items must be checked:

  • Make sure your code is properly formatted; isort and black are used for this purpose. The simplest way is to use pre-commit (see the instructions in the repository).
  • Docstrings have been added to your code in the Google docstring format.
  • Type annotations are highly encouraged. Run mypy to type check your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.

Note that the CI system will run all of the above checks, but it is much more efficient to fix most errors before submitting the PR.

@mjwen mjwen self-requested a review July 6, 2024 01:37
@mjwen mjwen self-assigned this Jul 6, 2024
@@ -82,33 +185,22 @@ def collate(self, batch: Any) -> dict:
"""
# get fingerprint and consistent properties
config_0, property_dict_0 = batch[0]
device = config_0.device
ptr = torch.tensor([0], dtype=torch.int64, device=device)
ptr = np.array([0], dtype=np.intc)
Collaborator:

I think the collated data will be provided as the input to a torch NN model. Any reason to use np.array here but torch.tensor for all other variables?

Collaborator:

Update: NVM, now I understand that the batched data are passed to the descriptor in def _descriptor_eval_batch(self, batch) -> torch.Tensor, which requires numpy array.
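The pattern discussed above can be sketched roughly as follows. This is a hedged illustration of why `ptr` stays a numpy array: the descriptor library expects C-style integer offsets, while model inputs remain torch tensors. The function names and the stubbed descriptor evaluation are assumptions; libdescriptor's real API may differ:

```python
import numpy as np
import torch

def collate_ptr(num_atoms_per_config):
    """Build cumulative atom offsets as np.intc, e.g. [3, 2] -> [0, 3, 5]."""
    ptr = np.zeros(len(num_atoms_per_config) + 1, dtype=np.intc)
    ptr[1:] = np.cumsum(num_atoms_per_config)
    return ptr

def descriptor_eval_batch(coords: np.ndarray, ptr: np.ndarray) -> torch.Tensor:
    # Placeholder for the real per-batch descriptor computation, which
    # consumes numpy arrays; the result is converted back to torch for the NN.
    return torch.from_numpy(coords.astype(np.float64))

ptr = collate_ptr([3, 2])
print(ptr)  # [0 3 5]
```

The key point is the boundary: numpy (with C int offsets) on the descriptor side, torch tensors everywhere the NN model is involved.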

)
self.torchscript_file = None
self.train_dataloader = None
self.validation_dataloader = None
Collaborator:

Do we want to add test_dataloader and test the model at the end of training?

Author:

Yes, I will add it uniformly across all trainers after finalizing this release. The idea is to have a base method test() that can run checks on the test dataset: it will do simple energy and forces tests by default, but will not be limited to them, and could also leverage OpenKIM tests if the user requests.
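A minimal sketch of what such a base method might look like, assuming a dataloader yielding dicts with "descriptors" and "energy" keys (the batch layout and names are hypothetical, not KLIFF's actual API; the function is named run_test here to avoid clashing with Python's test conventions):

```python
import torch

class SumModel(torch.nn.Module):
    """Toy stand-in model: the 'energy' is just the descriptor sum."""

    def forward(self, descriptors: torch.Tensor) -> torch.Tensor:
        return descriptors.sum()

def run_test(model: torch.nn.Module, test_dataloader) -> float:
    """Mean absolute energy error over a test dataloader (hypothetical)."""
    total_err, n_configs = 0.0, 0
    model.eval()
    with torch.no_grad():
        for batch in test_dataloader:
            pred = model(batch["descriptors"])
            total_err += (pred - batch["energy"]).abs().item()
            n_configs += 1
    return total_err / max(n_configs, 1)

# One fake "configuration": predicted energy 3.0 vs reference 2.0
fake_loader = [{"descriptors": torch.ones(3), "energy": torch.tensor(2.0)}]
mae = run_test(SumModel(), fake_loader)
print(mae)  # 1.0
```

A forces check would follow the same loop, comparing per-atom gradients of the predicted energy against reference forces.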

@mjwen (Collaborator) commented Jul 6, 2024

Looks great! Merged.

@mjwen mjwen merged commit 697d54f into openkim:v1 Jul 6, 2024
1 of 4 checks passed