-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEP tutorial comments #1
Comments
Hi @jmichel80 ,
|
Hi @JenkeScheen ,
|
Hi both, Here are a few comments:
Yes, we'll have standalone nodes for parameterisation and solvation. For the purposes of the tutorial I am assuming that you'll just have the pre-prepared systems so you would just talk about those stages and what the input/output would be, rather than actually running them. I'm don't think we will be able to use nodes for minimisation or equilibration though, since those would lose the concept of a system containing a merged molecule when writing the output to file. (We could just used the coordinates from the minimised / equilibrated systems and run
You can run minimisation / equilibration on a system containing a merged molecule. It simply creates the actual system at lambda = 0, then copies the coordinates back into the merged system when you call
If we want to run lambda windows in parallel then we are going to need to do this outside of BioSImSpace, i.e. just use it for the setup of the FEP simulation, as is done by The I'm still a little confused as to whether we will be assuming that the user will be setting up the FEP job on the system, or whether they are doing this locally and copying directories across. If the latter, then I don't think we're going to be able to reconstruct the |
Just reading this bit again...
Yes, that would work, i.e. equilibrating the complex prior to performing the merge. (This is what happens in your diagram, I think.) I imagine that this might confuse some people, i.e. solvating, equilibrating, extracting the molecules, merging, then solvating again. However, it's probably the easiest way if we want to decouple the stages. |
Hi, I have sketched the execution model here We can solve this in slum using a mixture of job, job arrays, job dependencies and a bit of bash scripting. We can build in some degree of resilience by having each script do some basic error checking (is the expected input present?). Ideally the pipeline is structured such that it could be executed again, not repeating steps that completed successfully. To keep it compatible with BSS philosophy it should be possible to execute the pipeline with different FEP engines (SOMD or Gromacs at it stands). I think this is possible as the user will specify the chosen executing engine in the planFEP notebook. I have uploaded a skeleton SLURM bash script that implements the execution model on plato. I have verified that each step is executed with correct dependencies using dummy python scripts. Next steps)
If the above can be implemented, the next step would be for @JenkeScheen to implement the required calls in the four different BSS execution scripts. The setup notebook would also have to be modified to generate as output the files 'ligands.dat', 'network.dat' , 'protocol.dat' and write the slurm bash scripts. |
Hmm, there are a lot of things here that aren't currently supported and would need to be implemented / tested if we want to do this in a clean way.
|
To support multiple restraint types it would also make far more sense to add a restraint option that can take matching keywords, e.g. |
Let me know what you think. |
Yes, the first three bullets were effectively what I was thinking, adding some logic to go from the SLURM indexes to the specific legs. I agree that restraints are important and it would be good to do this properly. I would like a more robust approach similar to what I suggested in my previous post. I already have the logic to translate backbone restraints for all engines that support them, and I could start working on doing the same for other restraints too. (If we agree on what we want to add.) I don't think this would be too bad, although I think some refactoring of the code would be needed. I would also much prefer a more flexible Let me know if you want me to start playing around with additional restraints and I'll start that ASAP. |
Ok let’s focus on the restraints. I will flesh out the run and analysis scripts tonight with SOMD specific commands. Will need some help with the Gromacs commands, presumably they can be extracted from what BSS would have written. We also need a good way to locate this BSS dependency. SOMD is easier since we can assume it may be found in $BSSHOME. You can follow the equilibration protocol of FESetup for the sequence of restraints if you need more specific details. Can post links to relevant parts of the code tonight if you need pointers. |
No problem. BioSImSpace will autodetect the path to GROMACS so it can write it out for us when the user specifies the engine. We can get the path using |
If possible, it would be good to agree on what type of restaints we want to support. In your example, I see things like the following:
I'm not sure that something this flexible would be easy to support across all engines, especially if just using keyword options for the restraint. Ideally we could use the search functionality to generate the list of atom indices involved in the restraint and pass those as an option instead. (Still allowing less flexible keywords too. So the Just to point out how fiddly all this is. Since we preserve atom naming by default (one of our core principles) it makes writing general restraint files a real pain. For example, if you take an AMBER format system and use
It doesn't complain when not adding restraints, or when passing a regular GROMACS system and using restraints. This makes me think that we'll have to handle all the atom index finding / masking, etc. manually for all packages and do everything explicitly by index. This would mean that we would also need to come up with robust ways of using the search functionality for detecting all types of atoms that we would want to restrain. |
I've created an issue thread for the discussion of equilibration restraints here. |
The ligand restraint is useful in case the initial pose has steric clashes and we don't want an early equilibration to push the ligand away from its starting binding mode. Basically we aim to relax a bit the protein and solvent around the starting pose in the hope that it will remain in place once restraints are released. If we cannot easily implement restraints that select ligand atoms we could skip this for now. From your post for #203 it looks like it could be some work to support flexible restraints for all supported MD engines. |
One downside of using a shell script for BSSanalyse.sh that just calls directly gmx mbar or analyse_freenrg is that the pipeline is not modular at this stage. We could have wanted to run an FEP with SOMD or GROMACS, and then use the same analysis tool for processing the previous outputs consistently to estimate a free energy. I will proceed with that route for now but we should make note of this issue. |
The script could detect the format of the output in the directories and select the appropriate analysis tool. I guess this would provide the flexibility that we want in the short-term, i.e. being able to run different bits with different engines and combine results. (Apologies if I've misunderstood the problem.) |
I've now added some functionality that you can do static analysis of an existing simulation using I've also added a |
I've added another update when so (N.B. The Sire macOS build from this morning timed out so all BioSimSpace macOS builds will fail until the re-run of the Sire build completes.) |
Hi @JenkeScheen ,
A few comments/questions after going through your setup notebook:
The notebook assumes the user has access to parameterised ligands (in vacuum) that have coordinates matching those of the parameterise protein input. We need separate BSS nodes to generate this type of input starting from 'sane' PDB/mol2 files.
The inputs haven't been energy minimised/equilibrated before starting the production runs. This may cause SOMD to crash. It would be better to equlibrate the solvated ligand0:complex, and then use a snapshot after equilbration to generate the merged molecule.
The slurm submission script will run each lambda value serially. It would be better to run the lambda windows in parallel to decrease time to answer.
The text was updated successfully, but these errors were encountered: