-
Notifications
You must be signed in to change notification settings - Fork 681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend lammpsdump to accept arbitrary columns #3608
Conversation
Hello @pstaerk! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2024-02-29 13:52:10 UTC |
Codecov Report
@@ Coverage Diff @@
## develop #3608 +/- ##
===========================================
+ Coverage 93.04% 94.12% +1.07%
===========================================
Files 172 190 +18
Lines 22731 24675 +1944
Branches 3308 3327 +19
===========================================
+ Hits 21150 23225 +2075
- Misses 1028 1388 +360
+ Partials 553 62 -491
Continue to review full report at Codecov.
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #3608 +/- ##
===========================================
- Coverage 93.38% 93.38% -0.01%
===========================================
Files 171 183 +12
Lines 21744 22843 +1099
Branches 4014 4023 +9
===========================================
+ Hits 20305 21331 +1026
- Misses 952 1025 +73
Partials 487 487 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really good start, see my comments. Main thing is I would add an __init__
kwarg and force people to specify what additional columns they want to save memory.
You will also need to add tests in the testsuite, feel free to add an additional small file if you need.
Also @pstaerk just keep an eye on PEP8 |
Hello! I have been following this set of issues and PRs that is looking to update the lammps dump readers. I really like this general solution to reading in arbitrary data, and I appreciate that it can auto-detect the columns in the dump file, so that all of them can be read in if needed. Just wanted to drop in and say thanks for your work on this :) |
@pstaerk just ping me when you want a review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work @pstaerk,
A few things to change but its looking good overall. See my comments for details.
Additionally,
- You will need to address the PEP8 issues and formatting
- You will need to add some docs.
- You will need a CHANGELOG entry
- Don't forget to add yourself to AUTHORS also if you are not already on there.
:)
@hmacdope I hope that with this, I finally have all the things done that are required for the PR :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Few queries and changes suggested.
Would you also be able to fix conflicts? There were changes made to the parser in #3844 and you will need to work in with those. :)
Could you please also introduce yourself on the mailing list as merging this PR will make you part of the MDAnalysis community. :)
@@ -490,7 +525,7 @@ class DumpReader(base.ReaderBase): | |||
|
|||
@store_init_arguments | |||
def __init__(self, filename, lammps_coordinate_convention="auto", | |||
**kwargs): | |||
additional_columns=False, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say None
is more idiomatic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I concur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should also be solved with @hejamu work
# Create the data arrays for additional attributes which will be saved | ||
# under ts.data | ||
additional_keys = [] | ||
if len(attrs) > 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this check for >3?
@@ -425,6 +427,7 @@ def u(self, tmpdir, request): | |||
# no conversion needed | |||
f = LAMMPSDUMP | |||
else: | |||
# Select if one wants to use the additional column format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this comment is here for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you address this?
def u_additional_columns(self): | ||
f = LAMMPSDUMP_additional_columns | ||
top = LAMMPSdata_additional_columns | ||
yield (mda.Universe(top, f, format='LAMMPSDUMP', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just return the tuple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution. I haven't been able to do a full review but have a few comments/questions.
name of the data column. For instance, if you have time-dependent charges | ||
saved in a LAMMPS dump such as | ||
|
||
.. code-block:: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use python formatting for this block, just use something generic
additional_columns=['q', 'l']) | ||
|
||
The additional data is then available for each time step via | ||
(as the value of the `data` dictionary, sorted by the ids of the atoms). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use reST role for the data attr
@@ -490,7 +525,7 @@ class DumpReader(base.ReaderBase): | |||
|
|||
@store_init_arguments | |||
def __init__(self, filename, lammps_coordinate_convention="auto", | |||
**kwargs): | |||
additional_columns=False, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I concur.
Linter Bot Results:Hi @pstaerk! Thanks for making this PR. We linted your code and found the following: Some issues were found with the formatting of your code.
Please have a look at the Please note: The |
28dca81
to
afa886e
Compare
afa886e
to
27cb7be
Compare
Sorry for being so slow @pstaerk, i'll have a look ASAP |
Sorry, won’t have time to review over the next few weeks.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fantastic work @pstaerk, will be a big quality of life improvement for LAMMPS users, given how much they use arbitrary columnar data. Sorry it has taken me so long to get to. Just a few improvements to make and we should be able to go ahead.
@@ -425,6 +427,7 @@ def u(self, tmpdir, request): | |||
# no conversion needed | |||
f = LAMMPSDUMP | |||
else: | |||
# Select if one wants to use the additional column format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you address this?
Co-authored-by: Hugo MacDermott-Opeskin <[email protected]>
Ok, I hope to have addressed all the requested points @hmacdope :) . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -156,6 +156,8 @@ | |||
"LAMMPSdata_deletedatoms", # with deleted atoms | |||
"LAMMPSdata_triclinic", # lammpsdata file to test triclinic dimension parsing, albite with most atoms deleted | |||
"LAMMPSdata_PairIJ", # lammps datafile with a PairIJ Coeffs section | |||
# structure for the additional column lammpstrj |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments after data
@@ -166,6 +168,8 @@ | |||
"LAMMPSDUMP_chain1", # Lammps dump file with chain reader | |||
"LAMMPSDUMP_chain2", # Lammps dump file with chain reader | |||
"LAMMPS_chain", # Lammps data file with chain reader | |||
# lammpsdump file with additional data (an additional charge column) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments after data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work all! I am happy for this to go ahead. 🥇
Kicking CI |
CI is not cooperating, trying again. |
Fixes #
Changes made in this Pull Request:
This handles issues #3504 and addresses PR #3448.
PR Checklist