Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added tqdm bar for: pca.transform() #4531

Merged
merged 11 commits into from
Mar 28, 2024

Conversation

SampurnaM
Copy link
Contributor

@SampurnaM SampurnaM commented Mar 25, 2024

Fixes #3773

Changes made in this Pull Request:

  • Added an tqdm function over pca.transform() iterations, so that verbose = True shows a progress bar like pca.PCA.run() ( As per comments in the issue)
  • I made a separate class called ProgressDual for ProgressBar with two iterables instead of one, but feels like an overkill so haven't added/pushed to this commit.

( I am very much a beginner, please let me know where I have messed up/ skipped something obvious!)

PR Checklist

Developers certificate of origin

@pep8speaks
Copy link

pep8speaks commented Mar 25, 2024

Hello @SampurnaM! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 427:80: E501 line too long (80 > 79 characters)

Comment last updated at 2024-03-28 19:08:38 UTC

Copy link

github-actions bot commented Mar 25, 2024

Linter Bot Results:

Hi @SampurnaM! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ✅ Passed

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/8472460092/job/23214620251


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

@orbeckst
Copy link
Member

Thank you for your contribution. I edited your comments so that the PR is properly linked to issue #3773 and use standard notation for the check boxes.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution.

Please address my inline comments and

  • add yourself to AUTHORS
  • add an entry to CHANGELOG

I am not 100% sure if we test progress bars so there may be additional requests for tests. However, to demonstrate that it works for you, please run your code yourself and attach a screenshot to this issue in which you show what it looks like with the new progress bar.

Comment on lines 420 to 423
if verbose == True:
self.disabled = False
else:
self.disabled = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to store the value in the new attribute self.disabled?

I think you'll only ever use it in this function.

So unless there are really important reasons (please explain!), remove self.disabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially going to check if there were any other open issues for adding ProgressBar/verbose option in PCA, so I saved it so we can reuse it if needed, but you are right, it doesn't seem so. Removed now.

package/MDAnalysis/analysis/pca.py Outdated Show resolved Hide resolved
Comment on lines 427 to 428
# print("Iteration", i, "Timestep",ts)
# verbose=self._verbose, desc="Transform progress"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented out code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

package/MDAnalysis/analysis/pca.py Outdated Show resolved Hide resolved

from ..lib import util
from ..due import due, Doi
from .base import AnalysisBase
from tqdm.auto import tqdm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move imports of other packages above the MDA imports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -382,6 +383,7 @@ def transform(self, atomgroup, n_components=None, start=None, stop=None,
Include every `step` frames in the PCA transform. If set to
``None`` (the default) then every frame is analyzed (i.e., same as
``step=1``).
verbose: Boolean,optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a .. versionadded:: 2.8.0 to this new kwarg (see other doc strings for how to do it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the detailed guidance!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added now :)

self.disabled = True

# for i, ts in tqdm(enumerate(traj[start:stop:step]), desc="Transform progress"):
for i, ts in tqdm(enumerate(traj[start:stop:step]), disable=self.disabled, desc="Transform progress"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just write

disable=not verbose

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check PEP8: this line almost certainly needs to be broken

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies. Done now.

@@ -561,7 +571,7 @@ def project_single_frame(self, components=None, group=None, anchor=None):
for res in group.residues:
# n_common is the number of pca atoms in a residue
n_common = pca_res_counts[np.where(
pca_res_indices == res.resindex)][0]
pca_res_indices == res.resindex)][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't format other code — it makes the diff noisy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo this one, autopep8 is less clear than the original.

package/MDAnalysis/analysis/pca.py Show resolved Hide resolved
@@ -398,7 +400,7 @@ def transform(self, atomgroup, n_components=None, start=None, stop=None,
if isinstance(atomgroup, Universe):
atomgroup = atomgroup.atoms

if(self._n_atoms != atomgroup.n_atoms):
if (self._n_atoms != atomgroup.n_atoms):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting change is ok as it was really, really ugly ;-).

Also remove the parentheses.

Suggested change
if (self._n_atoms != atomgroup.n_atoms):
if self._n_atoms != atomgroup.n_atoms:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link

codecov bot commented Mar 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.63%. Comparing base (f5335b4) to head (92d5541).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4531      +/-   ##
===========================================
- Coverage    93.65%   93.63%   -0.03%     
===========================================
  Files          168      180      +12     
  Lines        21215    22295    +1080     
  Branches      3908     3908              
===========================================
+ Hits         19869    20876    +1007     
- Misses         888      961      +73     
  Partials       458      458              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just a few fixes (see comments) and please post a screen shot showing that it works locally.

Comment on lines 17 to 27
03/26/24 orbeckst, SampurnaM

* 2.8.0

Fixes:
* Added a tqdm progress bar for ``MDAnalysis.analysis.pca.PCA.transform()``
It can be displayed or disabled using ``verbose`` parameter
``True = display | False = display off``
(PR #4531)


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add everything to the existing 2.8.0 block: It should look like


??/??/?? IAlibay, HeetVekariya, marinegor, lilyminium, RMeli,
         ljwoods2, aditya292002, pstaerk, PicoCentauri, BFedder.
         SampurnaM

 * 2.8.0

and then add under Enhancements

* Added a tqdm progress bar for `MDAnalysis.analysis.pca.PCA.transform()`
  (PR #4531)

(Minimal detail is sufficient, and break lines a 79 characters)

package/MDAnalysis/analysis/pca.py Show resolved Hide resolved
self.results.p_components = self._p_components[:, :n]
self._n_components = n

def transform(self, atomgroup, n_components=None, start=None, stop=None,
step=None):
step=None, verbose=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set the default to False because it's not clear what None will do from the docs. If it behaves like False then be clear and set it to False in the first place.

@@ -391,14 +397,17 @@ def transform(self, atomgroup, n_components=None, start=None, stop=None,
.. versionchanged:: 0.19.0
Transform now requires that :meth:`run` has been called before,
otherwise a :exc:`ValueError` is raised.
.. versionadded:: 2.8.0
Transform now has shows a tqdm progressbar, which can be toggled
on with verbose = True, or off with verbose = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. When adding to the changes at the bottom of the class, this must be .. versionchanged:: 2.8.0 because there's only one time that the whole class was added. (Alternatively one can add versionchanged/added for individual blocks – see docs on numpy docs/sphinx). The way you're doing is her is fine, just change to versionchanged.
  2. you can use markup
    on with ``verbose = True``, or off with ``verbose = False``

@@ -561,7 +571,7 @@ def project_single_frame(self, components=None, group=None, anchor=None):
for res in group.residues:
# n_common is the number of pca atoms in a residue
n_common = pca_res_counts[np.where(
pca_res_indices == res.resindex)][0]
pca_res_indices == res.resindex)][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo this one, autopep8 is less clear than the original.

@orbeckst orbeckst self-assigned this Mar 26, 2024
@SampurnaM
Copy link
Contributor Author

Also adding the screenshots here in case somebody else wants to have look in the future:

image
screebshot_jupyterlab

@SampurnaM
Copy link
Contributor Author

Oh no.
I mistakenly added the package/MDAnalysis/lib/log.py to the last commit, and pushed the commit before running darker. Sorry.

What I did to reverse the commit:

git reset HEAD~
Unstaged changes after reset:
M package/CHANGELOG
M package/MDAnalysis/analysis/pca.py
M package/MDAnalysis/lib/log.py
git rebase -i HEAD~1
error: cannot rebase: You have unstaged changes.
error: Please commit or stash them.
git add package/MDAnalysis/analysis/pca.py
git add package/CHANGELOG
git commit -m "Updated fixes for #4531"
[pca_transform_progress 7d63aa8] Updated fixes for #4531
Committer: sampurna <sampurna@JoyRamkrishna.>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly:

git config --global user.name "Your Name"
git config --global user.email [email protected]

After doing this, you may fix the identity used for this commit with:

git commit --amend --reset-author

2 files changed, 11 insertions(+), 19 deletions(-)
git rebase -i 1fe9e8cfc~1
error: cannot rebase: You have unstaged changes.
error: Please commit or stash them.
git push origin pca_transform_progress
To https://github.com/SampurnaM/mdanalysis.git
! [rejected] pca_transform_progress -> pca_transform_progress (non-fast-forward)
error: failed to push some refs to 'https://github.com/SampurnaM/mdanalysis.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. If you want to integrate the remote changes,
hint: use 'git pull' before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

##Issue on autopep8:
I tried to reformat line 574 on pca.py using Notepad to reverse the auto formatting.
darker --diff -r upstream/develop package/MDAnalysis -L flake8

package/MDAnalysis/analysis/pca.py:576:13: E101 indentation contains mixed spaces and tabs [flake8]

package/MDAnalysis/authors.py:1:1: E265 block comment should start with '# ' [flake8]

package/MDAnalysis/core/PR_prep_driver_code_sm_24_3_24.py:1:1: E265 block comment should start with '# ' [flake8]

And now there is a conflict in package/AUTHORS, so I can't push the new commit :(

package/MDAnalysis/core/PR_prep_driver_code_sm_24_3_24.py:3:1: F401 'MDAnalysis.tests.datafiles.NCDF' imported but unused [flake8]
package/MDAnalysis/core/PR_prep_driver_code_sm_24_3_24.py:3:89: E501 line too long (89 > 88 characters) [flake8]

@orbeckst
Copy link
Member

On GitHub there's a "Resolve conflicts" button that is a quick way to resolve simple conflicts. Try using it , see https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-on-github for help. For CHANGELOG, just rearrange things so that your new entry is at the top of the subsection and that your user handle is at the end of the contributors list for the release.

@orbeckst
Copy link
Member

Don't worry about keeping the history of the PR pretty, just push commits as they come. We will squash-merge everything into a single commit on merge.

@orbeckst
Copy link
Member

Please ping me with @orbeckst when you think the PR is ready for a possibly final review.

@SampurnaM
Copy link
Contributor Author

@orbeckst : I think I am ready for the final review. Please let me know :)

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of quick comments - edit: whomever reviews next please feel free to discard my review!

package/CHANGELOG Outdated Show resolved Hide resolved
package/CHANGELOG Outdated Show resolved Hide resolved
@@ -561,7 +573,7 @@ def project_single_frame(self, components=None, group=None, anchor=None):
for res in group.residues:
# n_common is the number of pca atoms in a residue
n_common = pca_res_counts[np.where(
pca_res_indices == res.resindex)][0]
pca_res_indices == res.resindex)][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line contains a tab space, could you change this to only be standard spaces please?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you notice?!

But yes, no TABS!!!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you notice?!

The honest answer is that the formatting was bothering me (I usually newline at the [ and then do a 4 space identation, etc...), so I was dragging along to count the number of spaces and my mouse jumped..

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Nice work!

Please address @IAlibay 's comments.

package/CHANGELOG Outdated Show resolved Hide resolved
@@ -561,7 +573,7 @@ def project_single_frame(self, components=None, group=None, anchor=None):
for res in group.residues:
# n_common is the number of pca atoms in a residue
n_common = pca_res_counts[np.where(
pca_res_indices == res.resindex)][0]
pca_res_indices == res.resindex)][0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you notice?!

But yes, no TABS!!!

@IAlibay IAlibay merged commit dfcd06f into MDAnalysis:develop Mar 28, 2024
21 of 23 checks passed
@orbeckst
Copy link
Member

Congratulations @SampurnaM on your first merged PR! Thank you for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PCA not showing progress bar on mean structure calculation
4 participants