Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface to metatensor to use arbitrary machine learning models as collective variables #1082

Merged
merged 32 commits into from
Jun 27, 2024

Conversation

Luthaf
Copy link
Contributor

@Luthaf Luthaf commented May 23, 2024

Description

This PR add an interface using metatensor atomistic models to execute arbitrary machine learning models, with the intended use case of using these models as collective variables in PLUMED.

The core idea of metatensor is to facilitate data sharing between different machine learning ecosystems by annotating data with metadata which describe what exactly is being passed around. This is done through the TensorMap class, which contains one of more TensorBlock, each block containing the actual values, and Labels describing the rows and columns of the values.

On top of this data format, we defined an interface for exporting trained ML models, and then loading and executing such models from a variety of simulation engines. The initial objective was to share models that would compute the energy and forces of a system and use these as ML force fields, but the same API can also be used for arbitrary quantities, including collective variables. These models are based on PyTorch for three reasons:

  • ability to define models in Python and execute them in C++ without a Python interpreter, through TorchScript;
  • ability to compute gradients with automatic differentiation;
  • ability to run calculations on GPU/CPU/... with almost the same code.

There is some documentation on the workflow to load and execute a metatensor model here for reference: https://docs.metatensor.org/latest/atomistic/overview.html#data-flow-between-the-model-and-engine. The main bits of data a model needs from PLUMED are the types for all particles (for atoms, this can be the atomic number), the positions of all particles, the simulation cell/box and (potentially many) neighbor lists for given cutoffs.

Difference with the existing PyTorch module

Both this module and the existing PyTorch module in PLUMED are based on PyTorch (for the same reasons!), the main difference as far as I know is that the PyTorch module is intended to act as transformation on top of CV computed by other PLUMED actions; which the metatensor module is intended to start with positions/atom types/... and compute a new CV from scratch. This enables using metatensor to compute some representation of the system (such as SOAP) and send it to PLUMED for further processing.

Unresolved questions/issues

The main question remaining for me is what to do regarding the neighbor lists calculation. I currently vendor code from https://github.com/Luthaf/vesin in the metatensor module, and it works pretty well. However, the code is just copy-pasted from another repository, and might need to be updated in the future; but does not 100% conform to the code style of the plumed repository. I can see a couple of solutions here, ordered from favorite to least favorite:

  • Disable style & code checks for this code in the repository. I managed to disable astyle, but I could not find a way to disable codecheck or headers.sh. Is there a mechanism to mark some files/directories as "external" code that should be ignored by such linters?
  • I could include the code as a .tar.gz archive which gets extracted in configure. This would make the process of updating the code a lot easier (just update the archive), however I don't know if this will help with removing the code form astyle/codecheck/... unless these scripts respect .gitignore.
  • I could also re-implement the neighbor list in terms of the PLUMED version. I was not aware of this code at the time, hence why I imported vesin instead. I would have to check that the behavior of the PLUMED NL matches what metatensor need and write some code adapting the data from one format to the other.

What do you think? Any other idea on what to do regarding "external" code in the PLUMED repository?


The other questions I have concerns the unit cell storage as returned by this->getPbc().getBox();:

  • for non-periodic systems, what's returned by this functions? A matrix of zeros, or the bounding box of the system? Does it depend from one simulation engine to another?
  • in all cases, the storage format seems to be using rows of the matrix to store the three box vectors, is this correct?

Type of contribution

  • changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
  • changes to a module not authored by you
  • new module contribution or edit of a module authored by you

Copyright

  • I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.

  • the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.

Tests

  • I added a new regtest or modified an existing regtest to validate my changes.
  • I verified that all regtests are passed successfully on GitHub Actions.
    - As discussed with @gtribello, the regtest run in a separate repository (currently cloning plumed from the branch of this PR, I'll update to pull from the main repo once the code is merged): https://github.com/lab-cosmo/plumed-metatensor-tests

@GiovanniBussi
Copy link
Member

Thanks @Luthaf !!! A few comments below:

  • could you please regenerate the configure file using autoconf 2.69? installing it should be straightforward (see here).
  • I noticed that there are tests that will likely be skipped on GitHub Actions (because metatensor is not configured there). You should mark this test as skippable as we did here, otherwise the test suite will complain

Then let's wait for the other checks that are running now. Thanks again!

@Luthaf Luthaf force-pushed the action-metatensor branch from fe1f51b to ab62e1e Compare May 27, 2024 12:00
//+ENDPLUMEDOC

/*INDENT-OFF*/
#if !defined(__PLUMED_HAS_LIBTORCH) || !defined(__PLUMED_HAS_METATENSOR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Luthaf this triggers a bug in our script that does not detect the __PLUMED_HAS_METATENSOR if it's in the same line as another __PLUMED_HAS_* string, could you please split the line? Maybe just reversing the two strings will stop triggering an error:

#if !defined(__PLUMED_HAS_METATENSOR) || !defined(__PLUMED_HAS_LIBTORCH)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try reversing

/*INDENT-OFF*/
#if !defined(__PLUMED_HAS_LIBTORCH) || !defined(__PLUMED_HAS_METATENSOR)

namespace PLMD { namespace metatensor {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Luthaf I think that you should place the namespace on a separate line for our script to detect it.

Notice that you can run very quickly the check on your computer with cd src ; make plumedcheck. It just needs gawk to be installed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure!

#ifndef VESIN_H
#define VESIN_H

#include <stddef.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Luthaf do you really need a C compatible header here? If possible I would remote the C case and replace these with includes from the C++ std library (cstddef and cstdint)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a C compatible header since the library only exposes a C API.

As discussed above (in the "Unresolved questions/issues" section), this code is included as a library and not developed specially for plumed. Ideally I'd like to exclude it from the code checks in plumed, is this possible? I suggested three potential solutions in the initial message, is there one you prefer?

If nothing else works I can change the code here, but any future update to the library (for bug fixes or better performance, …) will require re-doing the same set of changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing to this, I forgot this note in the original pull request.

Notice that you can exclude the codecheck adding metatensor/ similarly to other external packages here

However, we cannot accept the vesin namespace as is, because it adds symbols that are not in the PLMD namespace. This could create clashes if e.g. you combine PLUMED with another code using another version of the vesin library. If you look at other "external" tools (e.g. molfile, blas, lapack, xdfile, etc) they all provide symbols hidden in the PLMD namespace.

Given this, I would suggest:

  • writing a script that fixes namespace when importing the vesin library. You might want to get inspiration from src/*/import.sh scripts.
  • if possible, write such script so that all the issues with codecheck are fixed. if not, we can disable codecheck as explained above
  • run the ./header script on metatensor and check if it breaks something

This would be very close to what we do with other external packages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another idea that should handle the symbol clash as well: basically move the code to external/vesin-single-build.cpp and then add a file neighbors.cpp which looks like

namespace PLMD {
namespace metatensor {

#include "./external/vesin.h"
#include "./external/vesin-single-build.cpp"

}}

I can then add the metatensor/external folder to the excluded folder list, and this seems to pass all checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be the only source code file located in a subdirectory so I would avoid it if possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I finally had time to come back to this. The code is now using an import.sh file to import code from vesin

@GiovanniBussi
Copy link
Member

for non-periodic systems, what's returned by this functions? A matrix of zeros, or the bounding box of the system? Does it depend from one simulation engine to another?

It is expected to be a matrix of zeros. We only use the box to apply PBCs. But certainly it is possible that some code incorrectly passes a box even when there are no PBCs applied. In any case, within PLUMED it is correct to assume that having non zero numbers here imply PBCs.

in all cases, the storage format seems to be using rows of the matrix to store the three box vectors, is this correct?

Yes (in C order). So box[2][0] is the first component of the third vector.

@Luthaf Luthaf force-pushed the action-metatensor branch from dd85ec8 to 3ba18d7 Compare June 12, 2024 13:34
@Luthaf Luthaf requested a review from GiovanniBussi June 12, 2024 13:47
@GiovanniBussi
Copy link
Member

@Luthaf sorry for not merging this yet. Could you please confirm that I can do it? Thanks!!!

@Luthaf
Copy link
Contributor Author

Luthaf commented Jun 27, 2024

If you are happy with the code, please do merge! I'll send further PR to improve it/update dependencies as needed.

@GiovanniBussi GiovanniBussi merged commit 120447e into plumed:master Jun 27, 2024
22 checks passed
@GiovanniBussi
Copy link
Member

Thanks for your contribution!

@Luthaf Luthaf deleted the action-metatensor branch June 27, 2024 14:34
@Luthaf
Copy link
Contributor Author

Luthaf commented Jun 27, 2024

And thanks for your review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants