Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge-traj makes un-reimagable xtcs #117

Closed
lgsmith opened this issue Feb 21, 2024 · 7 comments
Closed

merge-traj makes un-reimagable xtcs #117

lgsmith opened this issue Feb 21, 2024 · 7 comments

Comments

@lgsmith
Copy link
Contributor

lgsmith commented Feb 21, 2024

Describe the bug

I had a dimer protein from OpenMM output with triclinic imaging that was broken up across many small trajectories. I wasn't expecting merge-traj to reimage this--just concatenate all the files together in a way that I could 're-run' rather than figuring out how to append in some clever fashion on my own. The input data had the classic issue where half of the dimer sometimes landed in a different periodic image--annoying for me because I wanted to follow up with fundamentally aperiodic calculations on the sampled conformations. So I tried all the reimaging tricks I know to unify the halves of the complex, but they were failing.

On a whim I grabbed one of the small input XTCs I'd applied merge-traj to previously. I watched it to verify that its imaging was goofy. But then I used cpptraj with autoimage on this fragment, and it unified the dimer with no issue, and also redistributed solvent around the image to make it look like a truncated octrahedron (instead of the triclinic cell openmm returns). In retrospect cpptraj and mdtraj weren't able to figure out what was up with merge-traj's output since both seemed to reimage the triclinic as some kind of cube.

To Reproduce

  1. Run several sequential openmm simulations of a truncated octahedron set up by tleap. Make sure you're telling openmm to keep the system imaged (I believe this is default behavior...)
  2. Concatenate them using merge-traj.
  3. Visualize merge to make sure it looks like the fragmented inputs
  4. Reimage fragments using cpptraj (autoimage and use the prmtop you used in step 1) and visualize to see the solvent box looking less like a whacky cube and more like a truncated octahedron.
  5. Apply same autoimage script to the merged traj.
  6. Visualize the 'reimaged' merge.

Expected behavior

I expect that the fragments and the merge would have all the same metadata (except frame count), and would therefore respond to autoimage the same.

LOOS version and platform

This is with 4.04 from conda on linux.

Additional context

Happy to provide trajectories if needed. It would be...interesting, if you couldn't replicate this with like, a water box.

Copy link
Member

Can you share the command line you used and the model? I'm confused because if you didn't turn on imaging it shouldn't have touched anything. If you did, an unlucky choice can put you in a bad spot that's hard to fix.

@lgsmith
Copy link
Contributor Author

lgsmith commented Feb 21, 2024

No reimaging requested. These are openmm simulations of a truncated octahedron, which it does by putting everything in the triclinic funky box I'm sure we've all seen before, so imaging with loos tools is not currently possible (#104 ).

I'd prefer a less public channel to share the model I was using, if possible. Perhaps I can simply email you with a DL link of some kind?

The command line looked like:

merge-traj $model $out $targets

Where:

  • model was a PDB that'd been stripped such that one could use it to read the trajectories (by loos, from a solvated prmtop, if that matters).
  • out was a path to a file called merged.xtc which was in a subdirectory of the working dir.
  • targets=$(find $raw_data_dir/ -name positions.xtc | sort -V) and where $raw_data_dir is just a folding at home project's data dir. This is thousands of small trajs, (an excellent application for this tool!) so copypasting the output from loos's repeat command line might not actually be what you want here.

@lgsmith
Copy link
Contributor Author

lgsmith commented Feb 21, 2024

Also here's the PDB making script I used to process the solvated inpcrd and prmtop:

import loos
from loos import pyloos
from pathlib import Path
import argparse
from sys import argv

p = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
p.add_argument('prmtop', type=Path,
               help='Coordinate free system specifying file that you would like to add coords to.')
p.add_argument('coords', type=Path,
               help='File with coordinates to match the atoms in "prmtop". Can be single structure or traj.')
p.add_argument('outfile', type=Path,
               help='Path where output PDB representing system should be written.')
p.add_argument('-s', '--selection', type=str, default='(all)',
               help='LOOS selection string to apply to the prmtop after it is matched to coordinates.')
p.add_argument('--index-to-use', '-i', type=int, default=0,
               help='Index into trajectory file loaded to parse coordinates.')

args = p.parse_args()
print('Working from:', args.prmtop)
model = loos.createSystem(str(args.prmtop))
coord_trj = pyloos.Trajectory(str(args.coords), model, subset=args.selection)
coord_ag = coord_trj.readFrame(args.index_to_use)
bb = loos.selectAtoms(coord_ag, 'backbone')
last_res = bb[len(bb) -1].resid()
chains_ag = loos.selectAtoms(coord_ag, f'resid <= {last_res}')
chains_vec = chains_ag.splitByMolecule()
for i, chain in enumerate(chains_vec):
    chain_label = chr(ord('@') + i)
    for at in chain:
        at.chainId(chain_label)
pdb = loos.PDB.fromAtomicGroup(coord_ag)
pdb.remarks().add(' '.join(argv))
args.outfile.write_text(str(pdb))

@lgsmith
Copy link
Contributor Author

lgsmith commented Feb 26, 2024

Hey folks, is there anything missing here? Were you able to view the files I sent?

@agrossfield
Copy link
Member

The problem is the triclinic cell. At this point, we do not support anything other than 90 degree cell angles, to the point that I'm not actually sure that we even consistently bother to read in the box angles.

I think at one point we discussed preserving the angles, but a) I think they're handled very differently by different file formats (I have a vague recollection that some store 3 vectors while others store 3 amplitudes and 3 angles), so "preserving" isn't as easy as it sounds, and b) until we actually handle imaging operations correctly with non-orthogonal angles, keeping the angles could create a false sense of security.

Handling other boxes is on the long-term todo list, but it's a pretty major revision: we'll have to add infrastructure to store the box type and have those classes do the imaging (and then make sure that there aren't any codes that do their imaging "manually" rather than using AtomicGroup.reImage(). This is doable, but it's a fair amount of work deep in the library.

@lgsmith
Copy link
Contributor Author

lgsmith commented Feb 29, 2024

So to be clear, we aren't even willing to copy whatever is there. Because that's all I wanted or needed at this point.

@agrossfield
Copy link
Member

I don't think so, because it won't be robust. If, as I recall, one format stores angles and one stores vectors, we can't passively copy through without understanding -- we'd end up ok if you did dcd to dcd but not dcd to xtc.

The only safe thing to do is to say we don't support non-orthorhombic boxes and you're in danger if you try to use loos with them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants