enable use of CVs defined by PyTorch neural network models #570

zwpku · 2023-08-28T14:19:49Z

This branch implements a class called torchANN, which allows CV components to be defined by loading pretrained PyTorch neural network models.

Installation Steps

Download LibTorch. This package is required to enable the torchann class. First, download the pre-built archive and unzip it.
```
     wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-cxx11-abi-shared-with-deps-latest.zip
     unzip libtorch-cxx11-abi-shared-with-deps-latest.zip
```
This unpacks the library into the current directory. Let's say it is located at /path/to/libtorch.
Patch MD engine. This step is done as usual using the script update-colvars-code.sh. Enter the Colvars source directory and run:
```
     ./update-colvars-code.sh /path/to/md-engine        
```

Compilation. This step depends on the engine to be compiled.

NAMD: add "--with-colvars-torch --torch-prefix /path/to/libtorch" to the arguments of ./config

Assuming that the packages required to build NAMD (e.g. charm, tcl/tcl-threaded) are already prepared,
one can compile NAMD with the following commands:

    ./config Linux-x86_64-g++ --charm-arch multicore-linux-x86_64 --with-colvars-torch    \
          --torch-prefix /path/to/libtorch  --with-fftw3 --fftw-prefix /path/to/fftw
    cd Linux-x86_64-g++
    make

GROMACS: add "-DTorch_DIR=/path/to/libtorch/share/cmake/Torch" when running cmake

An example of the command is:

    cmake .. -DCMAKE_INSTALL_PREFIX=/home/username/local/gromacs  \
                    -DFFTWF_LIBRARY=/home/username/mambaforge/lib/libfftw3f.so  \
                    -DFFTWF_INCLUDE_DIR=/home/username/mambaforge/include \
                    -DTorch_DIR=/path/to/libtorch/share/cmake/Torch/  \
                    -DCMAKE_CXX_COMPILER=/usr/bin/mpicxx \
                    -DOpenMP_gomp_LIBRARY=/home/username/mambaforge/lib/libgomp.so

LAMMPS: only installation with cmake is supported. In the LAMMPS source directory, run

     mkdir build && cd build
     cmake ../cmake -D PKG_COLVARS=yes -D COLVARS_TORCH=yes

and set the variable Torch_DIR in the file CMakeCache.txt. When a CPU version of the libtorch library is used, it may
also be necessary to set the MKL include path to empty:

     MKL_INCLUDE_DIR:PATH=

Alternatively, one could combine these steps in one command:

     cmake ../cmake -D PKG_COLVARS=yes -D COLVARS_TORCH=yes      \
         -D Torch_DIR=/path/to/libtorch/share/cmake/Torch -D MKL_INCLUDE_DIR=

After that, run make and make install to compile and install the package.
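
Before configuring, it can help to confirm that the directory passed as Torch_DIR is the one CMake will look in. The following is a minimal sketch, assuming the usual layout of the pre-built LibTorch archive (TorchConfig.cmake under share/cmake/Torch). The helper name `torch_cmake_dir` is hypothetical and is not part of Colvars or LibTorch.

```python
from pathlib import Path


def torch_cmake_dir(libtorch_root):
    """Return the directory to pass as Torch_DIR, or raise if it looks wrong.

    Hypothetical helper: assumes the layout of the pre-built LibTorch zip,
    i.e. <root>/share/cmake/Torch/TorchConfig.cmake.
    """
    cmake_dir = Path(libtorch_root) / "share" / "cmake" / "Torch"
    if not (cmake_dir / "TorchConfig.cmake").is_file():
        raise FileNotFoundError(f"TorchConfig.cmake not found under {cmake_dir}")
    return cmake_dir
```

For example, `torch_cmake_dir("/path/to/libtorch")` returns the path that would go after `-DTorch_DIR=` (GROMACS) or `-D Torch_DIR=` (LAMMPS), and fails early if the archive was unpacked somewhere else.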

So far, the class has only been tested with simple neural network models (i.e. an autoencoder on alanine dipeptide) under the NAMD and GROMACS engines. Feedback is welcome!

A (trivial) example

Create a PyTorch model

import torch

class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
    def forward(self, x):
        return x

model = MyModel()
scripted_cv_filename = './identity.pt'
torch.jit.script(model).save(scripted_cv_filename)

This Python script simply creates a model that is the identity map and saves it to a file named identity.pt.
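
One way to sanity-check a saved model before using it in a simulation is to load it back in Python and inspect its output and gradient. The sketch below only exercises the TorchScript file from Python and says nothing about how Colvars calls the model internally. The input values, the 1-D input shape, and the use of a temporary directory are assumptions made for illustration.

```python
import os
import tempfile

import torch


class MyModel(torch.nn.Module):
    """Identity map, mirroring the model defined above."""

    def forward(self, x):
        return x


# Save and reload the scripted model in a throwaway directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "identity.pt")
    torch.jit.script(MyModel()).save(path)
    loaded = torch.jit.load(path)

# Two dihedral angles in degrees (made-up values, assumed 1-D input).
x = torch.tensor([-60.0, 45.0], dtype=torch.float64, requires_grad=True)
y = loaded(x)

# Gradient of output component 0 with respect to both inputs.
(grad0,) = torch.autograd.grad(y[0], x)
print(y.tolist(), grad0.tolist())
```

For the identity map, the output equals the input and the gradient of component 0 is nonzero only for the first input, which makes this model a convenient baseline when testing the interface.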

Define the COLVARS config file

This file defines two CVs using the torchann class, which takes other CV components (here dihedral angles) as inputs.

colvarsTrajFrequency    10000
colvarsRestartFrequency 10000

colvar {
  name nn_0
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {
    modelFile identity.pt
    m_output_index 0
    period 360

    dihedral {
      group1 { 
	atomnumbers 5
      }
      group2 { 
	atomnumbers 7
      }
      group3 { 
	atomnumbers 9
      }
      group4 { 
	atomnumbers 15
      }
    }

    dihedral {
      group1 { 
	atomnumbers 7
      }
      group2 { 
	atomnumbers 9
      }
      group3 { 
	atomnumbers 15
      }
      group4 { 
	atomnumbers 17
      }
    }

  }
}

colvar {
  name nn_1
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {
    modelFile identity.pt
    m_output_index 1
    period 360

    dihedral {
      group1 { 
	atomnumbers 5
      }
      group2 { 
	atomnumbers 7
      }
      group3 { 
	atomnumbers 9
      }
      group4 { 
	atomnumbers 15
      }
    }

    dihedral {
      group1 { 
	atomnumbers 7
      }
      group2 { 
	atomnumbers 9
      }
      group3 { 
	atomnumbers 15
      }
      group4 { 
	atomnumbers 17
      }
    }
  }
}

abf {
  colvars nn_0 nn_1
  fullSamples	200
}
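
The two colvar blocks above differ only in their name and in m_output_index, so for models with more outputs it may be convenient to generate the file. The following is a rough sketch that reproduces the same keywords and values as the config above. The function names `dihedral_block` and `build_config` are hypothetical helpers and are not part of Colvars.

```python
COLVAR_TEMPLATE = """\
colvar {{
  name nn_{index}
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {{
    modelFile {model_file}
    m_output_index {index}
    period 360

{components}
  }}
}}
"""


def dihedral_block(atoms):
    """Format one dihedral component from four atom numbers."""
    lines = ["    dihedral {"]
    for i, atom in enumerate(atoms, start=1):
        lines += [f"      group{i} {{", f"        atomnumbers {atom}", "      }"]
    lines.append("    }")
    return "\n".join(lines)


def build_config(model_file, dihedrals, n_outputs):
    """Build one colvar per model output, all sharing the same input components."""
    components = "\n\n".join(dihedral_block(atoms) for atoms in dihedrals)
    colvars = [
        COLVAR_TEMPLATE.format(index=i, model_file=model_file, components=components)
        for i in range(n_outputs)
    ]
    names = " ".join(f"nn_{i}" for i in range(n_outputs))
    header = "colvarsTrajFrequency    10000\ncolvarsRestartFrequency 10000\n"
    abf = f"abf {{\n  colvars {names}\n  fullSamples 200\n}}\n"
    return "\n".join([header, *colvars, abf])


config_text = build_config(
    "identity.pt", [(5, 7, 9, 15), (7, 9, 15, 17)], n_outputs=2
)
print(config_text)
```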

giacomofiorin · 2024-10-19T20:03:44Z

The latest GROMACS test error is unrelated to Colvars: https://gitlab.com/gromacs/gromacs/-/issues/5204

giacomofiorin · 2024-11-08T21:48:27Z

Hi there! GROMACS 2025 runs the torchann input from this branch without errors, but the results differ from the reference files generated by @zwpku and updated by @jhenin.

See the outputs here:
https://github.com/Colvars/colvars/actions/runs/11747796165/artifacts/2164700677

zwpku · 2024-11-12T20:43:45Z

@giacomofiorin thanks for the work!
Any hint on what the reason might be? I don't have a good understanding of the regression tests... Could it be that the files under gromacs/tests/library/000_torchann/AutoDiff are outdated?

giacomofiorin · 2024-11-12T21:25:58Z

@zwpku Yes, the reference files currently in that folder came from another build, with a different version of libTorch. Would you expect this kind of difference? It is small, but it did exceed our threshold (1.0e-6 relative error).
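
To make the threshold concrete, here is a minimal sketch of a relative-error check against reference values. It is not the actual Colvars regression-test script: the definition of relative error used here is one common choice, and the function names and numbers are made up for illustration.

```python
def relative_error(value, reference):
    """Relative deviation of value from reference (absolute if reference is 0)."""
    if reference == 0.0:
        return abs(value)
    return abs(value - reference) / abs(reference)


def compare_columns(values, references, threshold=1.0e-6):
    """Return (index, value, reference, error) for entries above the threshold."""
    return [
        (i, v, r, relative_error(v, r))
        for i, (v, r) in enumerate(zip(values, references))
        if relative_error(v, r) > threshold
    ]


# Made-up numbers for illustration only.
reference = [-62.5000000, 41.2500000]
computed = [-62.5000001, 41.2501000]
failures = compare_columns(computed, reference)
for index, value, ref, err in failures:
    print(f"entry {index}: {value} vs {ref} (relative error {err:.2e})")
```

With these numbers, only the second entry is reported: a deviation in the fourth decimal place of a value of order 10 is already above 1.0e-6 in relative terms.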

zwpku · 2024-11-12T22:01:21Z

@giacomofiorin If I see correctly, the torch model in that test is simply the identity map and the CV is a dihedral angle. So I expect there should be little difference due to different versions of libTorch. Could it also be caused by some changes in the source code or in the config files of that test? I can try to build and examine the test on my local machine.

giacomofiorin · 2024-11-13T17:41:46Z

@zwpku Yes, if you could please check as soon as possible that would be very helpful! We may have a chance to convince the GROMACS people to include this in the 2025 release, but timing is very tight.


giacomofiorin · 2024-11-14T01:51:43Z

Still getting deviations from the reference files that you just uploaded in the CI tests. It is probably due to differences in libTorch versions. The one in the container is 2.4:

colvars/devel-tools/containers/CentOS9-devel.def

Lines 65 to 68 in 4651a59

    
    # Download pre-built libTorch
    rm -fr /opt/torch
    curl -o /tmp/libtorch.zip https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.4.1%2Bcpu.zip
    unzip /tmp/libtorch.zip -d /opt

Is the .pt file sensitive to the version? If so, I strongly recommend that it be regenerated using a matching version of libTorch.

zwpku · 2024-11-14T13:56:58Z

@giacomofiorin I guess the original deviation you encountered was (partially) due to the random seed. The seed was fixed in gromacs/tests/library/Common/test.mdp in a previous commit by @jhenin, but somehow that change was reverted, possibly due to a merge with master.
The deviation from yesterday was due to the reference files I uploaded, which were generated on my local machine using GROMACS in single precision (I realized that GROMACS is compiled in double precision for the CI tests). After switching to double precision and regenerating the reference files, the test passes.

I tried libtorch 2.0.1, 2.3.0, and 2.4.1, and the results are the same. The .pt files generated using PyTorch 1.13.1 and 2.3.1 are different (in size). But when loaded in gromacs/colvars, the results (input/output/gradient) are the same.
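
As a toy illustration of why single- and double-precision runs can drift apart by more than 1.0e-6 in relative terms, the sketch below accumulates the same increment many times in each precision. This is not the GROMACS integrator: single precision is emulated by rounding through `struct`, and the increment and step count are arbitrary choices.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(increment, n_steps, single_precision):
    """Sum the increment n_steps times, rounding to float32 after each step if requested."""
    if single_precision:
        increment = to_float32(increment)
    total = 0.0
    for _ in range(n_steps):
        total += increment
        if single_precision:
            total = to_float32(total)
    return total


double_sum = accumulate(0.1, 10_000, single_precision=False)
single_sum = accumulate(0.1, 10_000, single_precision=True)
rel_diff = abs(single_sum - double_sum) / abs(double_sum)
print(f"double: {double_sum!r}  single: {single_sum!r}  relative difference: {rel_diff:.2e}")
```

Each individual rounding is tiny, but the errors accumulate over many steps, so reference files are best generated with the same precision as the build that is being tested.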

zwpku · 2024-11-14T14:10:53Z

@giacomofiorin besides, I saw some differences (e.g. in colvar.cpp, colvarmodule.cpp, colvarmodule_refs.h) when I ran `git diff master..torchann -- src`. They seem to be unrelated to the torchann class. Should these files be updated?

giacomofiorin · 2024-11-14T15:43:55Z

Yes. I just did that.

If the tests pass (thank you for addressing the GROMACS precision issue!) we can proceed to merge this PR into master. Given the many conflicts accumulated in this branch, we should use a squash merge but link the PR in the commit.

giacomofiorin · 2024-11-14T16:03:20Z

PR merged! Thanks so much @zwpku and everyone who helped get this PR done!

giacomofiorin · 2024-11-15T17:53:21Z

FYI Lukas Mullender from the GROMACS team raised a couple of comments on the code regarding the use of GPU models and precision:
https://gitlab.com/gromacs/gromacs/-/merge_requests/4780#note_2213094685

@HubLot

This MR includes small fixes and improvements to the copy of the Colvars library in `src/external`, as well as one feature (the `torchANN` collective variable type). The Torch-Colvars interface was previously not included in !4611 because of failures in the Colvars CI runners. We have since confirmed with the main author of the feature that the culprit was the precision of the GROMACS build and he confirmed that the numerical results are consistent across libTorch versions (see Colvars/colvars#570 (comment)). Matches [this commit in the Colvars repo](Colvars/colvars@3023d8e). CC @HubLot @jhenin

zwpku and others added 30 commits May 13, 2022 22:49

In the process of implementing torchann colvar

a6f8e84

Merge branch 'master' of https://github.com/zwpku/colvars

12c4ac7

Implementing torchann

6336c4b

Implemented calc_gradients and apply_forces

0d6c2c6

Start working

80a293e

Handling periodicity; test torchANN with phi angle

6275dfb

Add output_index to specify the component of output layer that is used

2d06095

update with upstream

eb5cfe5

up

70ef673

Update gromacs-2021.6 patch

c4aa08b

modify patch files for namd to support LibTorch

3363f39

Merge branch 'Colvars:master' into torchann

56ee5d1

updated with upstream:master

3711f0d

Add test for torchann

3f5eb82

Merge pull request #1 from zwpku/torchann1

45064d6

Torchann1

minor changes

2edb0ff

Modified test for torchann component

5a781d8

Add figures for AutoEncoder CV

4776e5e

Convert unit of atom coordinates to angstrom before feeding to torch …

41f5618

…model

Bug fixed in gradient of torchann

66b69b4

Add Torch in lammps' cmake

53bbf6a

Added test files for torchann in NAMD

c049f84

Merge branch 'Colvars:master' into torchann

c68df86

Minor changes in torchann component

2c5ae86

Updated patches for Gromacs 2020.x and 2022.x

1286028

Updated torchann in header file; Removed test for torchann component

78789ed

Add tests for Gromacs and NAMD

13753ff

Add tests for Gromacs and NAMD

0ae580c

Updated torchann-namd test and references

47f4a87

Reimplement torchann as a derived class of LinearCombination

b59830c

Zhang and others added 5 commits October 11, 2024 17:51

minor fix

9f7cd27

Add torch header file to dependence file of backends

c471ae8

fixed bug when torch is not available

d500a88

Merge branch 'master'

28d80b4

Enable Torch in GROMACS builds when available from CI container

871cfa0

HanatoK mentioned this pull request Oct 16, 2024

Derivatives of RMSD plumed/plumed2#1138

Closed

Do not duplicate libTorch C++ flags

3c08493

giacomofiorin added 4 commits October 22, 2024 11:02

Document apptainer push

7830857

Let GROMACS use the newer CMake when available

fbcbd0d

Squash-merge of master branch

1b6f9ad

Squash-merge of master branch

b76d60e

Set damping=0 in gromacs regtest for torchann class.

1ade0dc

Updated reference files accordingly.

Zhang added 2 commits November 14, 2024 13:53

Updated references files generated by Gromacs with double precision

f75efd1

Minor corrections in reference files.

e7597c0

Squashed merge of master branch

1c4bf10

giacomofiorin merged commit f718e65 into master Nov 14, 2024
15 checks passed

giacomofiorin deleted the torchann branch November 14, 2024 16:02
