Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proof of concept - ParallelAnalysisBase and extensible backend support #4269

Closed
wants to merge 5 commits into from

Conversation

p-j-smith
Copy link
Member

@p-j-smith p-j-smith commented Aug 29, 2023

Related to #4158, #4162

This is a proof of concept of how to implement ParallelAnalysisBase and allow downstream developers and end users to easily add their own backends.

I've opened this pr only to help with discussion in #4158 - I do not intend to go forwards with it and will close it and continue the discussion in #4158.

Changes compared to #4162:

I've also:

  • updated RMSD analysis to use ParallelAnalysisBase and added a single test to illustrate that the multiprocessing backend can used with this implementation

Edit:To reduce the amount of duplicated code between AnalysisBase and ParallelAnalysisBase, the _compute method has been moved into AnalysisBaseand the _setup_computation_groups method had been simplified to remove code duplicated in _setup_frames

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4269.org.readthedocs.build/en/4269/

@github-actions
Copy link

github-actions bot commented Aug 29, 2023

Linter Bot Results:

Hi @p-j-smith! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ✅ Passed

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/6026902586/job/16350879245


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

@codecov
Copy link

codecov bot commented Aug 29, 2023

Codecov Report

Patch coverage: 60.55% and project coverage change: -0.15% ⚠️

Comparison is base (957430b) 93.40% compared to head (f3927b7) 93.26%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4269      +/-   ##
===========================================
- Coverage    93.40%   93.26%   -0.15%     
===========================================
  Files          169      184      +15     
  Lines        22204    23416    +1212     
  Branches      4064     4088      +24     
===========================================
+ Hits         20740    21839    +1099     
- Misses         948     1057     +109     
- Partials       516      520       +4     
Files Changed Coverage Δ
package/MDAnalysis/analysis/__init__.py 100.00% <ø> (ø)
package/MDAnalysis/analysis/base.py 82.77% <58.97%> (-14.18%) ⬇️
package/MDAnalysis/analysis/backends.py 59.25% <59.25%> (ø)
package/MDAnalysis/analysis/rms.py 93.10% <100.00%> (+0.08%) ⬆️

... and 14 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful to have code for discussion.

I am going to block the PR just to be able to enforce your stated intentions:

I've opened this pr only to help with discussion in #4158 - I do not intend to go forwards with it and will close it and continue the discussion in #4158.

Comment on lines 759 to 770
if frames is None:
start, stop, step = self._trajectory.check_slice_indices(start, stop, step)
used_frames = np.arange(start, stop, step)
elif not all(opt is None for opt in [start, stop, step]):
raise ValueError("start/stop/step cannot be combined with frames")
else:
used_frames = frames

if all((isinstance(obj, bool) for obj in used_frames)):
arange = np.arange(len(used_frames))
used_frames = arange[used_frames]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like code that's duplicated in AnalysisBase.run() and would eventually be refactored.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I hadn't spotted that duplication. I've updated ParallelAnalysisBase to remove this duplication as well as that in the inner loop the in ParallelAnalysisBase._compute and AnalysisBase.run

Comment on lines 819 to 827
for idx, ts in enumerate(ProgressBar(trajectory, verbose=verbose, **progressbar_kwargs)):
i = frame_indices[idx]
self._frame_index = i
self._ts = ts
self.frames[i] = ts.frame
self.times[i] = ts.time
self._single_frame()
logger.info("Finishing up")
return self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we refactor the inner loop so that the same code is shared between AnalysisBase and ParallelAnalysisBase?

I am trying to get a sense if we can mitigate the class proliferation problem by keeping as much code as possible shared.

Comment on lines +854 to +855
if backend is None:
return super().run(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that.

)

# Start preparing the run
super()._setup_frames(trajectory=self._trajectory, start=start, stop=stop, step=step, frames=frames)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be self._setup_frames() I think (?).

@p-j-smith
Copy link
Member Author

closing as this pr won't be developed any further, but I think comments should still be possible once closed

@p-j-smith p-j-smith closed this Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants