-
Notifications
You must be signed in to change notification settings - Fork 25
Working with trajectories in python
When working with trajectories in python, you have 2 basic options. You can use the C++ Trajectory class, or the more pythonic pyloos.Trajectory and its derivatives (more on them later).
There's nothing wrong with using the C++ Trajectory in your python code. It's fast and convenient, but the code isn't especially pythonic. For example, a typical loop over frames in a trajectory would look like:
// assume you've already defined an AtomicGroup model and Trajectory traj
while (traj.readFrame()) {
traj.updateGroupCoords(model);
// do something with the coordinates in model
}
The readFrame
call tries to read the next set of coordinates in from the file (returning false if it hits the end), then updateGroupCoords
copies the coordinates into the associated AtomicGroup. If you supply an integer to readFrame, it will read that specific frame, so things like skipping, striding, and random access are easily implemented by the tool writer. It's worth noting that skip and stride are trivially supported in the C++ if you use loos' program options functionality.
The goal of the pyloos.Trajectory class is to remove some of the manual nature of this and make a trajectory act more like an iterator. The equivalent python code would look like:
for frame in traj:
// do something with model
Note: in this case, frame is also an AtomicGroup
. In both the C++ and python examples, if you've already made additional AtomicGroup
s before the loops (e.g. by selecting from model
), you don't have to manually update them. Updating the root AtomicGroup
is sufficient.
The real advantage to working with pyloos.Trajectory
comes from the additional functionality available when you first instantiate it. Specifically, you can do something like:
// model is an AtomicGroup, traj_file is the name of the trajectory file, and skip_val and stride_val are input integers
traj = loos.pyloos.Trajectory(traj_file, model,
skip=skip_val, stride=stride_val)
for frame in traj:
// do something useful
This is far more convenient than manually tracking which frame to read next. In addition, you can specify an arbitrary list of frames to read:
frame_list = [0, 5, 7, -2]
traj = loos.pyloos.Trajectory(traj_file, model, iterator=iterator(frame_list))
for frame in traj:
// this will read frames 0, 5, 7, and the 2nd frame from the end
The are a few additional classes with more functionality available:
-pyloos.VirtualTrajectory
allows you to treat multiple pyloos.Trajectory
s as if they were one, removing the need to loop over trajectories and then frames.
-pyloos.AlignedVirtualTrajectory
goes one step further and pre-aligns the structures for you so that your code doesn't have to do so explicitly. It can use either a reference structure or iteratively align the trajectories to their average. The latter is computationally more expensive but will often produce more meaningful RMSDs.
The two VirtualTrajectory
classes support skip
and stride
, applied to the whole unified trajectory: specifying skip=5
will skip the first 5 frames of the combined trajectory, not of each individual trajectory. Similarly, if the number of frames in the first trajectory isn't a multiple of stride
, you will not start reading the next trajectory at its 0th frame. If that's not the behavior you want -- if for example you want to skip the first 5 frames of each trajectory -- you would specify that skip when instantiating the individual pyloos.Trajectory
objects.