
About D3 patch #151

Open
Taeyoung-kim-korea opened this issue Dec 25, 2024 · 5 comments

@Taeyoung-kim-korea

I tried applying the D3 patch.

```
(sevn) (My name)@WS-GPU:~$ sevenn_patch_lammps ./lammps_sevenn --d3
Patching LAMMPS with the following settings:

  • LAMMPS source directory: /home/(My name)/lammps_sevenn
  • D3 support enabled

Seems like the given LAMMPS is already patched.
Try again after removing src/pair_e3gnn.cpp to force the patch

Example build commands, under LAMMPS root:
mkdir build; cd build
cmake ../cmake -DCMAKE_PREFIX_PATH=/home/(My name)/anaconda3/envs/sevn/lib/python3.9/site-packages/torch/share/cmake
make -j 4
```

It seems like the patch was successful.
However, in my src directory, there are no D3 patch files (such as pair_d3.cpp).

So, I removed pair_e3gnn.cpp to force the patch.

```
(sevn) (My name)@WS-GPU:~$ sevenn_patch_lammps ./lammps_sevenn --d3
Patching LAMMPS with the following settings:

  • LAMMPS source directory: /home/(My name)/lammps_sevenn
  • D3 support enabled

This system's OpenMPI is not 'CUDA aware', parallel performance is not optimal
Changes made:
  • Original LAMMPS files (src/comm_brick.*, cmake/CMakeList.txt) are in {lammps_root}/_backups
  • Copied contents of pair_e3gnn to /home/(My name)/lammps_sevenn/src/
  • Patched CMakeLists.txt: include LibTorch, CXX_STANDARD 17
  • Copied contents of pair_d3 to /home/(My name)/lammps_sevenn/src/
  • Patched CMakeLists.txt: include CUDA

Example build commands, under LAMMPS root:
mkdir build; cd build
cmake ../cmake -DCMAKE_PREFIX_PATH=/home/(My name)/anaconda3/envs/sevn/lib/python3.9/site-packages/torch/share/cmake
make -j 4
```

However, the problem is not solved: the D3 build still fails.
Without D3, everything seems to work fine.

What should I do?

@Taeyoung-kim-korea
Author

I solved this issue!

@YutackPark
Member

Sorry for the delay and the inconvenience. If you have time, could you share the cause? If it was related to a lack of documentation, I would like to fix it for other users.

@kartiksau89

By the way, I found this error. I used

```
module purge
module load gcc/10.3.0 openmpi_nvhpc/4.1.2 compiler-rt cuda/11.2 tbb/2021.12 mkl/2024.1

cmake ../cmake -D PKG_MC=ON -D PKG_MOLECULE=ON -D BUILD_GPU=ON -D GPU_API=cuda -D GPU_ARCH=sm_70 -D CMAKE_C_COMPILER=$(which gcc) -D CMAKE_CXX_COMPILER=$(which g++) -D CMAKE_CUDA_COMPILER=$(which nvcc) -D CMAKE_PREFIX_PATH="$(python -c 'import torch; print(torch.utils.cmake_prefix_path)')"
```

and the build failed with:

```
src/pair_d3.cu(1081): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (double *, float)
```

Please help to solve this problem.

@YutackPark
Member

@dambi3613

@dambi3613
Contributor

dambi3613 commented Jan 21, 2025

> By the way, I found this error. I used module purge module load gcc/10.3.0 openmpi_nvhpc/4.1.2 compiler-rt cuda/11.2 tbb/2021.12 mkl/2024.1
>
> cmake ../cmake -D PKG_MC=ON -D PKG_MOLECULE=ON -D BUILD_GPU=ON -D GPU_API=cuda -D GPU_ARCH=sm_70 -D CMAKE_C_COMPILER=$(which gcc) -D CMAKE_CXX_COMPILER=$(which g++) -D CMAKE_CUDA_COMPILER=$(which nvcc) -D CMAKE_PREFIX_PATH="$(python -c 'import torch; print(torch.utils.cmake_prefix_path)')"
>
> "src/pair_d3.cu(1081): error: no instance of overloaded function "atomicAdd" matches the argument list argument types are: (double *, float)"
>
> Please help to solve this problem.

This error typically occurs when compiling for CUDA devices with a compute capability below 6.0. Although I don’t have direct experience using the BUILD_GPU option when building LAMMPS, there should ordinarily be no issues when targeting the sm_70 architecture. However, when compiling with SevenNet, the architecture settings might be overridden by Torch, so it’s important to check whether a lower architecture, such as below sm_60, has been selected.
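One way to check which architectures the Torch side will target is sketched below. This is a hedged suggestion, not part of SevenNet's documented workflow: `torch.cuda.get_arch_list()` reports the architectures your Torch build supports, and `TORCH_CUDA_ARCH_LIST` is PyTorch's standard environment variable for overriding the target list; the exact output depends on your install.

```shell
# Print the CUDA architectures this Torch build was compiled for
# (e.g. ['sm_60', 'sm_70', ...]); fall back gracefully if torch
# is not importable in the current environment.
python -c "import torch; print(torch.cuda.get_arch_list())" 2>/dev/null \
  || echo "torch not importable"

# Pin the extension build to sm_70 only, then re-run cmake and make
# from a clean build directory so no sub-sm_60 target sneaks in.
export TORCH_CUDA_ARCH_LIST="7.0"
echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
```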

If a lower architecture has been selected, you should ensure that architectures below sm_60 are excluded (this is the default behavior for sevenn_patch_lammps, but there could be cases where it doesn’t function as expected). Alternatively, you might consider implementing atomicAdd using atomicCAS to handle double arguments properly.
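For reference, the atomicCAS workaround mentioned above is the standard pattern from the CUDA C Programming Guide, sketched below. The name `atomicAddDouble` is arbitrary (on sm_60+ the native `atomicAdd(double*, double)` overload already exists, so a custom version should use a different name or be guarded by `__CUDA_ARCH__ < 600`). Note also that the error's `(double *, float)` signature suggests the call site passes a `float`; casting that value to `double` at the call site may be needed regardless.

```cuda
// Sketch: double-precision atomicAdd via atomicCAS, for devices below
// compute capability 6.0 (adapted from the CUDA C Programming Guide).
__device__ double atomicAddDouble(double* address, double val) {
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        // Reinterpret the double bits as a 64-bit integer and swap only if
        // no other thread has modified the value in the meantime.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
        // Integer comparison avoids an infinite loop on NaN (NaN != NaN).
    } while (assumed != old);
    return __longlong_as_double(old);
}
```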


4 participants