NEP for organic system #808

Leo-678 · 2024-11-29T05:19:20Z

Dear Prof. Fan,

I am training a NEP model for FAPbI3. The RMSE during training stays within a normal range, but I run into a problem in molecular dynamics simulations: the FA molecules tend to cluster together, which does not happen in AIMD simulations. I have tried changing the NEP training parameters, but the issue persists.

What parameters should I pay attention to when training models involving organic molecules like FA (formamidinium) in the structure?

Best Regards!
Leo

zhyan0603 · 2024-11-29T07:03:51Z

Dear Leo,

Could you provide more details? For example, is your training set entirely from AIMD sampling? You can also try using this script plt_nep_train_results.py to plot the training results. Also, sharing the hyperparameters from your nep.in file would be helpful.

Finally, it might be better to move this issue to the "Discussions" board, so others can chime in too.

Best,
Zihan

Leo-678 · 2024-12-02T02:00:01Z

Dear Zihan:
Thank you for your reply. To build the training set, I first pretrained an MTP model and then selected structures with the D-optimal method using that pretrained potential. Finally, I trained the NEP model on the selected structures. I have tried two sets of parameters: one that I modified and the default one. However, after running for some time in NEP, an error occurs. In LAMMPS, no error is reported, but the FA molecules aggregate.

train-1
type 5 C Pb I N H
version 4
cutoff 9 5
n_max 8 6
l_max 4 2
neuron 50
batch 100
generation 2000000

train-2

type 5 C Pb I N H
version 4 # default
cutoff 8 4 # default
n_max 4 4 # default
basis_size 8 8 # default
l_max 4 2 0 # default
neuron 30 # default
lambda_e 1.0 # default
lambda_f 1.0 # default
lambda_v 0.1 # default
batch 1000 # default
population 50 # default
generation 1000000 # default

Best Regards! Leo

zhyan0603 · 2024-12-02T07:36:39Z

Dear Leo,

Thank you for sharing the details of your setup. I have a few suggestions that might help with the FA molecule aggregation issue:

Your current cutoff (e.g., cutoff 8 4) might be too large: it may include too many neighbors, some of which could be periodic image atoms. This can impact model stability. Could you share the size of your system? That would help determine whether the cutoff needs adjustment. Additionally, the log output from the training process provides useful information, such as the Maximum number of neighbors for one atom for the radial and angular descriptors. If this number is close to, or even exceeds, the number of atoms in the system, reducing the cutoff or adding larger structures to the training set may help.
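As a rough illustration of the neighbor-count check above, the sketch below estimates how many atoms fall inside a cutoff sphere, assuming a uniform number density. The function name and the 100-atom, 12 Å cubic cell are hypothetical values chosen for illustration, not taken from this thread. The authoritative number is the one reported in the training log.

```python
import math

def estimate_neighbors(n_atoms, volume, cutoff):
    """Average number of atoms inside a sphere of radius `cutoff`,
    assuming a uniform number density n_atoms / volume."""
    density = n_atoms / volume                   # atoms per cubic angstrom
    sphere = 4.0 / 3.0 * math.pi * cutoff ** 3   # sphere volume in cubic angstrom
    return density * sphere

# Hypothetical cell: 100 atoms in a 12 x 12 x 12 angstrom cubic box.
n_atoms = 100
volume = 12.0 ** 3
for rc in (8.0, 6.0, 4.0):
    n = estimate_neighbors(n_atoms, volume, rc)
    status = "at or above" if n >= n_atoms else "below"
    print(f"cutoff {rc:.1f} A: about {n:.0f} neighbors, {status} the {n_atoms} atoms in the cell")
```

If the estimate for the radial cutoff is comparable to the number of atoms in the cell, the descriptor is largely counting periodic images of the same atoms.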

Also, I suggest trying active learning directly with NEP to add some new structures, which might be helpful.

Best,
Zihan

Leo-678 · 2024-12-02T08:32:42Z

Dear Zihan:

Thanks for your reply. Yes, that's exactly the issue. My AIMD simulation cell contains only 144 atoms in total, so the cutoff was indeed too large. I will report back after testing different cutoffs. However, since I'm studying phase transitions, I believe the choice of cutoff should have a significant impact on the phase transition behavior. Do you have any suggestions for selecting a cutoff value? Or does it need to be adjusted adaptively based on how the molecular dynamics simulation behaves?

Best Regards.
Leo

zhyan0603 · 2024-12-02T08:53:00Z

Hi Leo,

For choosing the cutoff, it is best to keep it within half of the box size to avoid periodic image effects. If this is unacceptable, adding some supercell structures to the training set can help the model learn more diverse environments.
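A minimal sketch of the half-box rule, assuming "box size" means the perpendicular distance between opposite faces of the cell (which also covers triclinic cells). The helper names and the example cell vectors are hypothetical.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(u, u))

def box_thicknesses(a, b, c):
    """Perpendicular distances between opposite faces of the cell (a, b, c)."""
    volume = abs(dot(a, cross(b, c)))
    return (volume / norm(cross(b, c)),
            volume / norm(cross(c, a)),
            volume / norm(cross(a, b)))

def max_safe_cutoff(a, b, c):
    """Largest cutoff that stays within half of the thinnest box dimension."""
    return 0.5 * min(box_thicknesses(a, b, c))

# Hypothetical orthorhombic cell, lengths in angstrom.
cell = ((12.0, 0.0, 0.0), (0.0, 12.0, 0.0), (0.0, 0.0, 13.0))
limit = max_safe_cutoff(*cell)
for rc in (8.0, 6.0):
    verdict = "within" if rc <= limit else "larger than"
    print(f"cutoff {rc} A is {verdict} half of the thinnest box dimension ({limit} A)")
```

Running this over every structure in the training set would show which cells are too small for a given radial cutoff.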

You can also test how the accuracy of the model changes with a smaller cutoff. If the RMSE is still acceptable, lowering the cutoff can speed up the simulation and generally has no adverse effects. For your system, you might try cutoff 6 4 and check whether it can correctly describe the phase transition behavior. For some phase transitions, it may be sufficient.
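To compare models trained with different cutoffs, one option is to compute the force RMSE directly from the training output. The sketch below assumes each row holds six numbers, three predicted force components followed by three reference components, which is my understanding of the force_train.out layout; please check it against the GPUMD documentation. The function name and the sample rows are made up for illustration.

```python
import math

def force_rmse(lines):
    """Component-wise force RMSE from rows of
    'fx fy fz fx_ref fy_ref fz_ref' (assumed column layout)."""
    sq_sum = 0.0
    count = 0
    for line in lines:
        values = [float(x) for x in line.split()]
        if len(values) != 6:
            continue  # skip blank rows or rows with an unexpected column count
        predicted, reference = values[:3], values[3:]
        sq_sum += sum((p - r) ** 2 for p, r in zip(predicted, reference))
        count += 3
    if count == 0:
        raise ValueError("no force rows found")
    return math.sqrt(sq_sum / count)

# Made-up rows standing in for the contents of a force output file.
rows = [
    "0.1 0.0 0.0 0.0 0.0 0.0",
    "0.0 0.0 0.0 0.0 0.0 0.1",
]
print(f"force RMSE: {force_rmse(rows):.4f}")
```

For a real run, the same function can be fed an open file handle, for example `with open("force_train.out") as f: force_rmse(f)`, once for each cutoff being compared.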

Best,
Zihan

Leo-678 · 2024-12-12T00:46:19Z

Dear Zihan

Thank you very much for your reply. After testing many parameters, I found that both whether the simulation runs successfully and how well the model performs are indeed closely related to the training parameters. Finally, I would like to ask if there are any recommended active learning examples. On the official website, I only found the related commands. If there are examples available, they would help me get started more quickly.

Best
Leo

zhyan0603 · 2024-12-12T03:22:50Z

Dear Leo,

Regarding your request for active learning examples, I would like to inform you that as of now, the official documentation does not yet include specific examples for active learning with NEP. However, we have assigned a dedicated team member to maintain and expand our examples, which will eventually include active learning examples.

In the meantime, I can suggest two approaches that might help you:

MD Simulations and Sampling:
Start by performing MD simulations with the current NEP model. Then extract some structures from the simulation trajectories, run DFT calculations on them, and use the results to verify the reliability of the current model (set prediction 1 in your nep.in file). For the sampling method, you can use random sampling, uniform sampling, or the descriptor-based farthest point sampling implemented in pynep. If the prediction error is very close to the training error, the NEP model is able to handle the current simulation conditions. You can then increase the simulation time (from ps to ns), temperature, pressure, or other conditions of interest so that NEP can learn more complex local atomic environments. If the predictions are not satisfactory, add the sampled structures to the training set for further training. The nep.restart file allows you to refine the force field incrementally. Repeat this sampling and prediction cycle until the force field meets your goals, such as accurately predicting phase transitions.

By the way, moderately reducing the batch size (e.g., batch 200) can speed up training without compromising model accuracy too much.

On-the-fly active learning:
See the active command for more details.
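For the first approach, the idea behind descriptor-based farthest point sampling can be sketched in a few lines. This is a generic greedy version written for illustration, not pynep's API. It assumes one descriptor vector per structure has already been computed, and the function name and sample points are hypothetical.

```python
import math

def farthest_point_sampling(descriptors, n_select, start=0):
    """Greedily pick indices of points that are far apart in descriptor space."""
    n = len(descriptors)
    if n == 0 or n_select <= 0:
        return []
    n_select = min(n_select, n)
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(d, descriptors[start]) for d in descriptors]
    while len(selected) < n_select:
        # The next pick is the point farthest from everything selected so far.
        nxt = max(range(n), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(descriptors[i], descriptors[nxt]))
    return selected

# Made-up 2D "descriptors": two tight clusters plus one outlier.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.9, 0.1), (0.5, 5.0)]
print(farthest_point_sampling(points, 3))
```

Compared with random or uniform sampling, this favors structures that differ most from those already chosen, so fewer DFT calculations are spent on near-duplicates.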

In addition, our team members are implementing improved sampling methods similar to those used in the MTP active learning strategy. This feature is expected to debut in GPUMD 4.0, which will provide better tools for evaluating whether a structure should be included in the training set.

We appreciate your understanding and look forward to offering more comprehensive support and resources in the near future.

Best,
Zihan

Leo-678 closed this as completed Jan 7, 2025
