-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reproducibility and Stability of ABFE Calculations #18
Comments
Thanks for the details, this is really helpful. Hopefully we can discuss most of this in the meeting tomorrow. Regarding the flexible lambda schedules, this is actually something that I've been meaning to talk to you about. I have a local feature branch that allows the user to explicitly specify the lambda windows. This means that you can run windows independently, or fill in gaps if you find that more sampling is required in certain regions. This works fine, but there are issues combining the data to form a consistent energy trajectory. (It works if all windows are run the first time around, but not if you later add additional windows.) |
Following up on our discussion today: As Christopher suggested, I attempted to rerun these simulations with SOMD2 to check for instabilities. Initially, I found that everything was crashing during minimisation. This was due to the mapping failing in
Showed that most of the atoms were not being matched (
I got the expected mapping with all 32 atoms matched, showing that the mapping failure was due to me changing the charge, LJ, element, or amber_type properties of the end state molecule in
For benzene (as in original example in this issue), Obviously, this is not a problem for me since it is easy to manually create the mapping, but it would be interesting to know why this is failing. The input is: gh_issue_mapping.tar.gz. To reproduce, run Thanks. |
I've tested and the mapping works if you don't modify the |
For benzene, perhaps the MCS can correctly be mapped from the bonding alone. |
I've done some more investigations into the crashes at lambda =1. It turned out that the integrator wasn't being passed though to the underlying sire dynamics before. For each set of parameters, I use 10 replicates, and make modifications with respect to the config:
I am using sire and biosimspace 2024.1.0.dev, and all tests are for the free benzene in the inputs above. I am using two lambda windows to concentrate sampling at lambda =1, while keeping the sanity check of lambda = 0 .
I also got crashes for some quick NoseHoover and Andersen runs. I also tried reproducing the earlier results with and without restraints for the bound leg, with more repeats: Bound leg, default config with restraints: No crashes, 10 runs. I tried a Nose Hoover integrator on the whole free system, with a view to implementing @jmichel80's suggestion (OpenBioSim/sire#100 (comment)). I realised that non-stochastic integrators aren't suitable for the decoupled ligand, though - it just floats in a straight line according to the initial velocity. So the crashes are fully prevented by "stiff" coupling to the protein with intermolecular restraints, but not by strongly damping the velocities. I tried to see if leaving one carbon with full charges and LJ terms sorted the crashes, but I still got 4/10 runs crashing with NaN. To see if I could see high bond energies in the ligand at lambda = 1, I tried printing out the energy components every 10 ps for a ns, with:
But I was confused by the output, which did not show any bonded component, but seemed to show substantial intramolecular coulomb and LJ terms which were similar at lambda =0 and 1: I assume I'm doing something wrong when printing the system energies, and I need to tell it which state it's at? I think the next thing to try is checking that the expected terms are in the lambda =1 Hamiltonian, and trying Chrisotopher's decouple/ annihilate code, as he suggested before. |
The To expose the perturbed state, you will probably want to do something like the following in your loop when the lambda index corresponds to the lambda = 1 state: perturbed = sr.morph.link_to_perturbed(self._system)
lig_energies = perturbed[0].energy().components() (For reference, the various sire perturbation functions are found here.) I'm not sure about the missing bonded components. I'll look at some merged systems locally to see what I get. |
Thanks for the quick response and links. |
I've had a quick shot at running ABFEs with SOMD2. The overall experience was a massive step up from SOMD! The recent sire documentation was also super clear and comprehensive. However, I've had a couple of issues:
Overall Set-Up
To Reproduce
Input files and scripts: reproduce_somd2_abfe.tar.gz
feature_abfe
branch of SOMD2 (tiny modifications to allow a Boresch restraint to be passed in)cd reproduce_somd2_abfe
python run_somd2.py free input/pert_system_free.bss
, or bound withpython run_somd2.py bound input/pert_system_bound.bss
Optionally, recreate the perturbed systems from the SOMD1 input with:
Because I ran my tests before the issue with installing the 2024 conda packages was fixed, I'm using:
Running on Ubuntu 22.04.
SOMD1 Results
Experimental free energy: -5.19 +/- 0.16 kcal mol-1 (https://pubs.acs.org/doi/epdf/10.1021/ct060037v)
All uncertainties are 95 % t-based CIs from 5 replicate runs.
Used default lambda values shown here
Used these SOMD settings. 12 A cutoff with RF.
Analysed with SOMD "native" MBAR script, no subsampling.
Free energy of releasing restraint to standard state volume: -7.08 kcal mol-1
Symmetry correction for restraint: - kTln2 = -0.41 kcal mol-1 (as rotation around axis of 6-fold symmetry occurs as the restraints are turned on, but flipping of the ring is not sampled).
200 ps per window, no equilibration
Free: -5.07 +/- 0.08 kcal mol-1
Bound: 6.86 +/-0.30 kcal mol-1
Overall = free - bound - restraint_corr + symmetry_corr = -5.07 - 6.86 + 7.08 - 0.41 = -5.26 +/- 0.35 kcal mol-1
In good agreement with experiment.
30 ns per window, 10 ns of which discarded to equilibration
Free: -5.18 +/- 0.02 kcal mol-1
Bound: 5.52 +/- 0.80 kcal mol-1
Overall: -4.03 +/- 0.80 kcal mol-1
Larger error and drift to more positive free energies due to water starting to move into the binding site for some of the runs towards lambda = 1 in the bound vanish leg.
SOMD2 Results
Bound Leg
SOMD2 defaults other than RF with 12 A cutoff and 100 ps runs
Repeat 1: 6.2 +/- 0.3 kcal mol-1
Repeat 2: 6.9 +/- 0.4 kcal mol-1
Repeat 3: 7.1 +/- 0.4 kcal mol-1
In good agreement with the SOMD1 results for 200 ps, at least within the large errors.
The overlap is more than enough:
and the ligand clearly stays in the binding site when fully decoupled.
As above but without restraints
-1.8 +/- 0.2 kcal mol-1
The with_restraints - without_restraints difference is in the expected order of magnitude.
SOMD2 defaults other than RF with 12 A cutoff, 5 ns runs, and 10 ps energy frequency
Repeat 1: 5.5 +/- 0.2 kcal mol-1
Repeat 2: 5.3 +/- 0.2 kcal mol-1
Repeat 3: 5.3 +/- 0.2 kcal mol-1
A slight decrease compared to the shorter runs, as expected, and in good agreement with the SOMD1 30 ns runs.
Free Leg
SOMD2 defaults other than RF with 12 A cutoff and 100 ps runs
Repeat 1: -9.1 +/- 0.6 kcal mol-1
Repeat 2: -7.8 +/- 0.6 kcal mol-1
Repeat 3: -8.3 +/- 0.6 kcal mol-1
Substantially more negative than for SOMD1, producing a substantially more negative free energy of binding compared to experiment. Also, substantially greater variability in results than expected. As expected from the bound leg, the overlap was good:
SOMD2 defaults other than RF with 12 A cutoff, 5 ns runs, and 10 ps energy frequency
Repeat 1: -8.8 kcal mol-1
Attempted 5 more repeats, but all failed at lambda = 1.0 with:
or
Tried:
Investigating lambda = 1 crashes with bound leg
So we get crashes when the ligand is non-interacting which are prevented by coupling the non-interacting ligand to interacting particles - seems like energy is building up in the decoupled ligand. To try and dissipate this energy, I tried running tried free leg 5 ns runs with a Brownian integrator - this seemed to reduce the frequency of crashes - as 4 / 6 runs completed.
Questions and Comments on Software
A few questions and things I ran into while setting up and running the simulations, not all of which are related to SOMD2:
BoreschRestraints
object in sire which allows you to input the equilibrium values, rather than measuring them from the atoms suppliedis_decoupled
rather thanis_perturbed
and sets the charges, LJ terms, and atom types as end-state 1 properties in the ligand. However, as mentioned above, SOMD2 only checks foris_perturbed
and I got a sire error (not all required properties present) when I just set theis_perturbed
property in the decoupled ligand returned by the standard BSS workflow for ABFE. Given that the ABFE and RBFE workflows are very similar in BSS (handled by the same class) maybe it would be be better to have the properties set in a more similar way (e.g. haveis_perturbed
for the ABFE ligand).Thanks!
The text was updated successfully, but these errors were encountered: