-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance stability of minimization #557
Conversation
See also choderalab/yank#1267 |
FYI I just realized that I accidentally named this branch stable_equilibration instead of stable_minimization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ijpulidos : This is a great step forward, but there are a few risky things in here we should address before pulling it into a bugfix release. I'll look into this in more detail.
thermodynamic_state.pressure = None | ||
|
||
# Use Gradient Descent first for numerical stability | ||
integrator_grad = GradientDescentMinimizationIntegrator() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial step size here is not guaranteed to be robust. I wonder if there's a way we can automatically select a more robust initial step size, or if this is not a critical issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some of my other work with gromacs, I use emstep 0.0001 (nanometers) which is 10 times smaller than the openmmtools default of 0.01 Angstroms. gromacs also has an adaptive step size, and I have some plots showing that starting with 0.001 Angstroms, the adaptive step size increases for the first ~25 iterations before stabilizing. In other words, you're only 'wasting' at most ~25 iterations, which isn't even a rounding error in the total runtime. Of course, those iterations are well spent if they can prevent minimization failures.
|
||
# Use Gradient Descent first for numerical stability | ||
integrator_grad = GradientDescentMinimizationIntegrator() | ||
context_grad = thermodynamic_state.create_context(integrator_grad) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good that we're creating the context and then cleaning it up again, since we don't repeatedly re-use the minimization context, but it would be useful if we also specified platform properties to ensure we use mixed precision if needed for the propagation or energy computation context caches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I'm not sure why this is failing CI. |
Are there any thoughts on this PR? Again, disabling the barostat is really the only thing that is critical, which is why I originally separated it out into it's own 3-line commit. Perhaps it was a mistake to add additional enhancements. Can we at least merge the pressure bug in the first commit? |
@jfennick Thanks for your contributions. I'm tagging @jchodera for him to take another look into these, sine we need to see if the previously raised concerns are solved. I will also review this in detail, since this can potentially change many things in our calculations. For now I'm marking this to be released on our 0.22 milestone. |
Resolves #668 |
Description
I have been experiencing some instabilities which, after much difficulty, I was able to track down to the minimizer. The observed behavior is that the minimizer will finish 'successfully', but some time later, during equilibration and/or propagating replicas, the simulation may or may not randomly crash with NaNs or even throw CUDA_ERROR_ILLEGAL_ADDRESS (700).
Nearly identical behavior can be found here openmm/openmm#3414. In that ticket, the issue was that the box vectors were not being updated correctly. For stability, during minimization the box vectors should not be allowed to change at all. Subsequent equilibration at NPT should be used to adjust the box vectors.
In my case(s), the box vectors were changing only at the fourth significant figure, and yet that was sufficient to (usually) crash the simulations. Inserting Gradient Descent before FIRE seems to greatly ameliorate but not completely eliminate the instabilities. Replacing FIRE with L-BFGS (with or without Gradient Descent) also seems to fix the instabilities.
This PR adds three changes to improve the stability of minimization:
Todos
Status