Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Restraints #203

Closed
lohedges opened this issue Apr 19, 2021 · 26 comments · Fixed by #206
Closed

[FEATURE] Restraints #203

lohedges opened this issue Apr 19, 2021 · 26 comments · Fixed by #206
Assignees

Comments

@lohedges
Copy link
Member

This issue thread is intended to discuss improved restraint handling during equlibration in BioSimSpace. We currently only support backbone restraints using the restraint_backbone keyword, which relies on correct atom naming for formats supported by the various MD engines in order to work correctly. This makes it hard to take an AMBER format file and run a restrained equlibration in GROMACS, or vice-versa. It is clear that we need to support a wider range of restraint options and provide greater flexibility to the user.

A possible solution is to use a restraint option that could take the values None, "backbone", "heavy", "non-water", etc, as well as a list of atom indices if the user wants to restrain specific atoms, or a combination of the above. The latter would require the ability to get the required indices using our built in search functionality in a consistent and reliable way.

The following describes the way in which the supported engines implement restraints:

  • AMBER: This uses restraintmask. Currently this uses names to match backbone atoms, but we can pass explicit atom indices (or ranges of indices) if we want greater flexibility.

  • GROMACS: We use gmx genrestr to generate posre*.itp files for the atoms in each molecule that are restrained. This requires the correct naming and is limited to the options (although there are lots) supported by genrestr. We could write these files by hand (not much more work) if we could reliably get the atom indices.

  • OpenMM: We have a custom implementation that uses zero-mass dummy atoms to apply restraints to atoms. We already apply restraints by atom index, so this be easy to adapt.

  • NAMD: This requires a PDB restraint file, where the occupancy column specifies which atoms are restrained. Easy generalise to any atom index, rather than just backbone atoms.

  • SOMD: We don't support general restraints with SOMD.

Suggestions for how to search for atoms matching a restraint:

  • "heavy"
system.search("(atoms in not water) and not element H")
# Would also need to do some filtering for ions.
  • "non-water"
system.search("atoms in not water")
  • "backbone"
    Is there an easy way to do this without relying on atoms names and without looking at bonding? If there's another tool that can do this, perhaps we could use it.
@lohedges
Copy link
Member Author

Looking at other tools, e.g. MDTraj, ParmEd, MDAnalysis, they all select backbone atoms by name, mapped to a particular set of residues, also matched by name.

@jmichel80
Copy link
Contributor

A few comments

  • OpenMM : setting masses to zero freezes atoms during the integration. This is algorithmically quite different from applying a position restraint, although it may work for the purposes of equilibration.
  • SOMD : Positional restraints can be implemented with the keywords ''use restraints'' , ''restraint force constant'', "heavy mass restraint", ''unrestrained residues''. This may not give enough flexibility to restraint e.g. only backbone atoms.

I don't think it is possible to do ''backbone'' without using dictionaries of atom names.

@lohedges
Copy link
Member Author

Thanks for the info. Sorry, what I mean about SOMD it wouldn't work easily for the options we propose, since it doesn't have per-atom flexibility. As you say, I don't think you could restraint backbone atoms so I disable restraints with SOMD when equilibrating at present. Since we now support OpenMM, I don't think there's a particular need to use SOMD for non-FEP simulations anyway so it's not too much of an issue.

I agree with your comments in the other thread and will focus on GROMACS and OpenMM as a start.

@lohedges
Copy link
Member Author

OpenMM : setting masses to zero freezes atoms during the integration. This is algorithmically quite different from applying a position restraint, although it may work for the purposes of equilibration.

Yes, I agree it's a little different, but it is what was suggested by the OpenMM folks. It's, apparently, the only way to get the OpenMM restraint system to play nicely when working in the NPT ensemble. See here for some details.

@lohedges
Copy link
Member Author

Ironically it sounds like backbone restraints are the hardest thing here. Perhaps we should support that option but warn the user that it might be problematic if we convert between formats?

@jmichel80
Copy link
Contributor

jmichel80 commented Apr 19, 2021 via email

@lohedges
Copy link
Member Author

I just want to check that we shouldn't restrain ions (at least free ions) in any case. Perhaps backbone, heavy, and all are better names, with clear descriptions of what they refer to, i.e. non-water, non-ion, etc.

@jmichel80
Copy link
Contributor

i would treat ions similarly to water, so if restraining all protein atoms letting ions and water move.

@lohedges
Copy link
Member Author

I've made good progress with this and I've now updated the code to handle the new restraint system for all of the MD engines that we support. The user can now choose from keyword restraint types: "backbone", "heavy", or "all", or explicitly pass a list of atom indices.

The most difficult part was making sure that we can correctly obtain absolute or relative atom indices (absolute to the system, or relative to a molecule) in an efficient way. (We already did some of this for generating PLUMED files, but needed more flexibility.) There were also complications for GROMACS, since you apply restraints for each GROMACS molecule type, so we needed a way of mapping molecules in the system to their respective types. This was possible using the Sire.IO.GroTop parser directly with a little massaging.

One limitation with GROMACS is that, since the restraints apply to molecule types, not molecules, there isn't a way to apply restraints to one molecule of a particular type, i.e. they are applied to all molecules of that type. I'm not sure that this will be a problem in practice since we're only really considering protein-ligand complexes, and solvent / ions aren't considered for the restraints.

As mentioned in the other thread, GROMACS also complains when you turn on restraints for a system that was loaded from non-GROMACS files, e.g.

'gmx grompp' reported the following warnings:
WARNING 1 [file gromacs.top, line 256]:
  1890 non-matching atom names
  atom names from /home/lester/Code/BioSimSpace/demo/tmp/gromacs.top will be
  used
  atom names from /home/lester/Code/BioSimSpace/demo/tmp/gromacs.gro will
  be ignored

Since this is just a warning (which are elevated to errors by default) I simply detect it, and re-run gmx grompp with --maxwarn 1.

Since SOMD doesn't fit with the new restraint system (at least without work), I just raise an exception if the user tries to run a restrained equilibration protocol which suggests that they try a different engine.

Unrelated, but also in this feature branch: I've added support for running metadynamics and steered molecular dynamics simulations using AMBER.

I'll run a some trial simulations tomorrow morning and will make a pull-request when I'm happy.

@lohedges
Copy link
Member Author

Oh, and the restrain_backbone option is flagged as being deprecated. It still works for now, just being converted to restraint="backbone" behind the scenes. If both the restraint and restrain_backbone parameters are set, then restraint takes precedence and a warning is raised.

@lohedges lohedges linked a pull request Apr 21, 2021 that will close this issue
@lohedges
Copy link
Member Author

Just to note that I've exposed some of the restraint functionality to the user to try to make things as flexible as possible. Hopefully this will facilitate things like:

equilibration with restraints on backbone atoms and ligand

This could be done with:

# Here system is a protein-ligand complex

# Get the indices of backbone atoms in the protein.
backbone = system.getRestraintAtoms("backbone")

# Now get the indices of all ligand indices. (Here the ligand is molecule index 1 in the system.)
ligand = system.getRestraintAtoms("all", 1)

# Create the equilbration protocol.
protocol = BSS.Protocol.Equilibration(restraint=backbone+ligand)

@lohedges
Copy link
Member Author

lohedges commented Apr 26, 2021

Unfortunately it turns out that AMBER has a limit of 256 characters for the restraint mask, so we can't flexibly restrain atoms by index. (Even though I'd written some nice code to find contiguous blocks, etc.) This means that we can only really perform restraints for large proteins using a name mask. This isn't too much of an issue for backbone restraints, since those match a name template anyway, but it would mean that we can't reliably handle an arbitrary list of atom indices, or combine matches like shown above. It also means that the code won't work if converting from a different format, where the atoms don't follow the AMBER naming conventions.

@lohedges
Copy link
Member Author

Some discussions and possible kludges in this thread.

@kexul
Copy link
Contributor

kexul commented Sep 28, 2021

# Here system is a protein-ligand complex
# Get the indices of backbone atoms in the protein.
backbone = system.getRestraintAtoms("backbone")
# Now get the indices of all ligand indices. (Here the ligand is molecule index 1 in the system.)
ligand = system.getRestraintAtoms("all", 1)
# Create the equilbration protocol.
protocol = BSS.Protocol.Equilibration(restraint=backbone+ligand)

Hi @lohedges , I've used some similar code with yours but an error occurred.

Here is the code:

import BioSimSpace as BSS
import logging 

def run_process(system, protocol, work_dir):
    """
    Run system with protocol, get the system afterwards
    Parameters
    ----------
    system: sire system
    protocol: BioSimSpace Protocol

    Returns
    -------
    sire system
    """
    process = BSS.Process.Gromacs(system, protocol, work_dir=work_dir, ignore_warnings=True)
    process.addArgs({'-ntmpi':1, '-ntomp':'30', '-gpu_id':0})

    process.start()
    process.wait()
    logging.info(f'Potential energy: {process.getCurrentPotentialEnergy()}')
    system = process.getSystem()
    return system


mols = []
for ligand in ['C1=CC=CC=C1',  'CC1=CC=CC=C1']:
    process = BSS.Parameters.gaff(ligand)
    parametrised_molecule = process.getMolecule()
    mols.append(parametrised_molecule)

mapping = BSS.Align.matchAtoms(mols[0], mols[1])
mols[0] = BSS.Align.rmsdAlign(mols[0], mols[1], mapping)
merged = BSS.Align.merge(mols[0], mols[1], mapping)

solvated_system = BSS.Solvent.tip3p(merged, box=3*[5*BSS.Units.Length.nanometer], work_dir='solvation')

minimise_protocol = BSS.Protocol.Minimisation(250)
minimised_sys = run_process(solvated_system, minimise_protocol, work_dir='minimize')

restraint_heavy = minimised_sys.getRestraintAtoms('heavy')
pert_mol = minimised_sys.getPerturbableMolecules()[0]
restraint_ligand = minimised_sys.getRestraintAtoms('all', minimised_sys.getIndex(pert_mol))

nvt_protocol = BSS.Protocol.Equilibration(runtime=5*BSS.Units.Time.picosecond, 
                                       temperature=300*BSS.Units.Temperature.kelvin, 
                                       restraint=restraint_heavy)

equilibrated_sys = run_process(minimised_sys, nvt_protocol, work_dir='nvt')

Output:

Traceback (most recent call last):
  File "/root/group_ceph/FEP/prepare/test_case/git/test.py", line 56, in <module>
    equilibrated_sys = run_process(minimised_sys, nvt_protocol, work_dir='nvt')
  File "/root/group_ceph/FEP/prepare/test_case/git/test.py", line 24, in run_process
    process = BSS.Process.Gromacs(system, protocol, work_dir=work_dir, ignore_warnings=True)
  File "/root/group_ceph/FEP/prepare/BioSimSpace/Process/_gromacs.py", line 156, in __init__
    self._setup()
  File "/root/group_ceph/FEP/prepare/BioSimSpace/Process/_gromacs.py", line 203, in _setup
    self._generate_config()
  File "/root/group_ceph/FEP/prepare/BioSimSpace/Process/_gromacs.py", line 375, in _generate_config
    self._add_position_restraints(config)
  File "/root/group_ceph/FEP/prepare/BioSimSpace/Process/_gromacs.py", line 2166, in _add_position_restraints
    raise ValueError(msg) from None
ValueError: Unable to find restrained atom in the system?

@lohedges
Copy link
Member Author

Thanks, I'll reproduce here later. I imagine it's something to do with the system being perturbable and the naming for elements (which is used for the search) being different in perturbable and non-perturbable molecules. We used to extract the lambda = 0 state for GROMACS so all molecules would have a property called element. Now we can simulate the perturbable system directly, so the perturbable molecules have a property called element0 instead.

@kexul
Copy link
Contributor

kexul commented Sep 28, 2021

Thanks! 😄

I would like to point out that passing an str type of restraint worked perfectly for a perturbable system.

nvt_protocol = BSS.Protocol.Equilibration(runtime=5*BSS.Units.Time.picosecond, 
                                       temperature=300*BSS.Units.Temperature.kelvin, 
                                       restraint='heavy')

@kexul
Copy link
Contributor

kexul commented Sep 28, 2021

I found that invoke self._reset_mappings() in the beginning of _getRelativeIndices do the trick for me.

def _getRelativeIndices(self, abs_index):

@lohedges
Copy link
Member Author

lohedges commented Sep 28, 2021

Yes, that's the main issue. (It should be done in the constructor, but isn't for some reason. I must have deleted it by accident at some point.) However, it still doesn't work properly due to the presence of perturbable and non-perturbable molecules in the same system. I'm just trying to fix this now.

@kexul
Copy link
Contributor

kexul commented Sep 28, 2021

Yes, that's the main issue. However, it still doesn't work properly due to the presence of perturbable and non-perturbable molecules in the same system. I'm just trying to fix this now.

Get it√ Thanks for your efforts! Much appreciated!

@lohedges
Copy link
Member Author

Okay, I think this is fixed. Let me know if you come across any more issues.

Cheers.

@kexul
Copy link
Contributor

kexul commented Sep 28, 2021

Okay, I think this is fixed. Let me know if you come across any more issues.

Thanks! I'll test it and report.

@kexul
Copy link
Contributor

kexul commented Nov 14, 2021

Hi @lohedges , would the dummy atoms be restrained when using position restraints?

In the following picture, the initial positions of the atoms outlined by two red boxes are near the real atoms, but they behave differently after equilibration with position restraints:
99b12d81d5043223ee8fe4fb3acc114

@lohedges
Copy link
Member Author

Hmmm, good question. The restraint handling is by element, so dummies won't be restrained if you, say, restrain heavy atoms. They should be if done by index, though. It also depends on how the dummies are handled for non-FEP simulations by the various engines, since they aren't dummy atoms in the normal sense. I'll check during the week.

Cheers.

@kexul
Copy link
Contributor

kexul commented Nov 14, 2021

The restraint handling is by element, so dummies won't be restrained if you, say, restrain heavy atoms. They should be if done by index, though.

Thanks, I did use restraint=heavy in NPT equilibration, I'll add an index-based restraint and try again.

@lohedges
Copy link
Member Author

lohedges commented Nov 14, 2021

No problem. You can combine restraints so should be able to do something like.

idxs = system.getRestraintAtoms("backbone") + [system.getIndex(atom) for atom in system.search("element XX", property_map={"element" : "element0"})]

(Here I'm assuming the lambda=0 state.)

@kexul
Copy link
Contributor

kexul commented Nov 15, 2021

OK, index-based restraints solved my problem, now the two molecules after equilibration overlap fairly well.
c27ad69acd0c72f32d65aa9b324e3dd

idxs = system.getRestraintAtoms("backbone") + [system.getIndex(atom) for atom in system.search("element XX", property_map={"element" : "element0"})]

Thanks for the code, new trick learned 😄

I used the following code since BioSimSpace kindly removes duplicated indexes internally.

sys.getRestraintAtoms('all', sys.getIndex(sys.getPerturbableMolecules()[0])) + sys.getRestraintAtoms('heavy', allow_zero_matches=True)

annamherz pushed a commit that referenced this issue Mar 5, 2024
Internal updates due to Sire API fixes
annamherz pushed a commit that referenced this issue Mar 5, 2024
annamherz pushed a commit that referenced this issue Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants