Extend lammpsdump to accept arbitrary columns #3608

pstaerk · 2022-04-06T14:01:31Z

Fixes #

Changes made in this Pull Request:

This updates the lammpsdump parser so that it can parse arbitrary data columns, such as charge.

This handles issue #3504 and addresses PR #3448.

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

pep8speaks · 2022-04-06T14:01:34Z

Hello @pstaerk! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file testsuite/MDAnalysisTests/datafiles.py:

Line 159:80: E501 line too long (85 > 79 characters)
Line 170:80: E501 line too long (106 > 79 characters)
Line 557:80: E501 line too long (89 > 79 characters)
Line 558:80: E501 line too long (94 > 79 characters)

Comment last updated at 2024-02-29 13:52:10 UTC

codecov · 2022-04-06T14:20:12Z

Codecov Report

Merging #3608 (d5e0310) into develop (95791dd) will increase coverage by 1.07%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #3608      +/-   ##
===========================================
+ Coverage    93.04%   94.12%   +1.07%     
===========================================
  Files          172      190      +18     
  Lines        22731    24675    +1944     
  Branches      3308     3327      +19     
===========================================
+ Hits         21150    23225    +2075     
- Misses        1028     1388     +360     
+ Partials       553       62     -491

Impacted Files	Coverage Δ
package/MDAnalysis/coordinates/LAMMPS.py	`96.29% <100.00%> (+6.61%)`	⬆️
package/MDAnalysis/lib/NeighborSearch.py	`96.42% <0.00%> (-3.58%)`	⬇️
package/MDAnalysis/topology/tpr/obj.py	`96.96% <0.00%> (-3.04%)`	⬇️
...onality_reduction/DimensionalityReductionMethod.py	`97.05% <0.00%> (-2.95%)`	⬇️
...sis/analysis/encore/clustering/ClusteringMethod.py	`95.45% <0.00%> (-1.43%)`	⬇️
package/MDAnalysis/converters/RDKitParser.py	`96.21% <0.00%> (-0.57%)`	⬇️
package/MDAnalysis/topology/guessers.py	`99.21% <0.00%> (-0.02%)`	⬇️
package/MDAnalysis/units.py	`100.00% <0.00%> (ø)`
package/MDAnalysis/lib/_cutil.pyx	`100.00% <0.00%> (ø)`
package/MDAnalysis/lib/mdamath.py	`100.00% <0.00%> (ø)`
... and 122 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

codecov · 2022-04-06T14:29:50Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.38%. Comparing base (2c1aa4b) to head (bac1f4c).

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3608      +/-   ##
===========================================
- Coverage    93.38%   93.38%   -0.01%     
===========================================
  Files          171      183      +12     
  Lines        21744    22843    +1099     
  Branches      4014     4023       +9     
===========================================
+ Hits         20305    21331    +1026     
- Misses         952     1025      +73     
  Partials       487      487

hmacdope

Really good start, see my comments. The main thing is that I would add an __init__ kwarg and force people to specify which additional columns they want, in order to save memory.

You will also need to add tests in the testsuite, feel free to add an additional small file if you need.

package/MDAnalysis/coordinates/LAMMPS.py

hmacdope · 2022-04-07T09:39:48Z

Also @pstaerk just keep an eye on PEP8

aliehlen · 2022-05-02T23:18:48Z

Hello! I have been following this set of issues and PRs that is looking to update the lammps dump readers. I really like this general solution to reading in arbitrary data, and I appreciate that it can auto-detect the columns in the dump file, so that all of them can be read in if needed. Just wanted to drop in and say thanks for your work on this :)
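As a rough illustration of the column auto-detection mentioned above: a LAMMPS dump frame announces its per-atom columns on a line of the form `ITEM: ATOMS id type x y z q`. The sketch below is not the code from this PR; the helper names `detect_columns` and `extra_columns`, and the set of "standard" columns, are assumptions made for the example.

```python
def detect_columns(header_line):
    """Return the column names listed on an 'ITEM: ATOMS' header line."""
    prefix = "ITEM: ATOMS"
    if not header_line.startswith(prefix):
        raise ValueError(f"not an ATOMS header: {header_line!r}")
    # Everything after the prefix is a whitespace-separated list of names.
    return header_line[len(prefix):].split()


def extra_columns(columns, known=("id", "type", "x", "y", "z")):
    """Return the columns that are not in the (assumed) standard set."""
    return [name for name in columns if name not in known]


columns = detect_columns("ITEM: ATOMS id type x y z q")
additional = extra_columns(columns)
```

With a header like the one above, `additional` would contain only the charge column name, which is the kind of data the PR exposes per time step.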

testsuite/MDAnalysisTests/coordinates/test_lammps.py

hmacdope · 2022-05-16T11:33:49Z

@pstaerk just ping me when you want a review.

package/MDAnalysis/coordinates/LAMMPS.py

hmacdope

Thanks for the great work @pstaerk,

A few things to change but it's looking good overall. See my comments for details.
Additionally,

You will need to address the PEP8 issues and formatting
You will need to add some docs.
You will need a CHANGELOG entry
Don't forget to add yourself to AUTHORS also if you are not already on there.

:)

package/MDAnalysis/coordinates/LAMMPS.py

testsuite/MDAnalysisTests/coordinates/test_lammps.py

package/MDAnalysis/coordinates/LAMMPS.py

pstaerk · 2022-11-08T13:35:19Z

@hmacdope I hope that with this, I finally have all the things done that are required for the PR :).

hmacdope

Looking good! Few queries and changes suggested.

Would you also be able to fix conflicts? There were changes made to the parser in #3844 and you will need to work in with those. :)

Could you please also introduce yourself on the mailing list as merging this PR will make you part of the MDAnalysis community. :)

package/MDAnalysis/coordinates/LAMMPS.py

hmacdope · 2022-11-09T11:12:49Z

package/MDAnalysis/coordinates/LAMMPS.py

@@ -490,7 +525,7 @@ class DumpReader(base.ReaderBase):

    @store_init_arguments
    def __init__(self, filename, lammps_coordinate_convention="auto",
-                 **kwargs):
+                 additional_columns=False, **kwargs):


I would say None is more idiomatic here.

This should also be solved by @hejamu's work.

hmacdope · 2022-11-09T11:14:19Z

package/MDAnalysis/coordinates/LAMMPS.py

+        # Create the data arrays for additional attributes which will be saved
+        # under ts.data
+        additional_keys = []
+        if len(attrs) > 3:


Why is this check for >3?

hmacdope · 2022-11-09T11:15:56Z

testsuite/MDAnalysisTests/coordinates/test_lammps.py

@@ -425,6 +427,7 @@ def u(self, tmpdir, request):
            # no conversion needed
            f = LAMMPSDUMP
        else:
+            # Select if one wants to use the additional column format


Not sure what this comment is here for?

Can you address this?

hmacdope · 2022-11-09T11:16:45Z

testsuite/MDAnalysisTests/coordinates/test_lammps.py

+    def u_additional_columns(self):
+        f = LAMMPSDUMP_additional_columns
+        top = LAMMPSdata_additional_columns
+        yield (mda.Universe(top, f, format='LAMMPSDUMP',


I would just return the tuple?

testsuite/MDAnalysisTests/coordinates/test_lammps.py

package/AUTHORS

orbeckst

Thank you for the contribution. I haven't been able to do a full review but have a few comments/questions.

package/CHANGELOG

package/MDAnalysis/coordinates/LAMMPS.py

orbeckst · 2022-11-11T22:22:23Z

package/MDAnalysis/coordinates/LAMMPS.py

+    name of the data column. For instance, if you have time-dependent charges
+    saved in a LAMMPS dump such as
+
+    .. code-block:: python


don't use python formatting for this block, just use something generic

package/MDAnalysis/coordinates/LAMMPS.py

orbeckst · 2022-11-11T22:24:49Z

package/MDAnalysis/coordinates/LAMMPS.py

+                         additional_columns=['q', 'l'])
+
+    The additional data is then available for each time step via
+    (as the value of the `data` dictionary, sorted by the ids of the atoms).


use reST role for the data attr
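The docstring under review says the additional data is stored "sorted by the ids of the atoms". A minimal sketch of what that ordering means, assuming a frame lists its atoms in arbitrary order; `sort_by_id` is a hypothetical helper for illustration, not a function from the reader:

```python
def sort_by_id(ids, values):
    """Reorder per-atom values so that they follow ascending atom id."""
    # Indices of the atoms, ordered by their id in this frame.
    order = sorted(range(len(ids)), key=ids.__getitem__)
    return [values[i] for i in order]


# A frame whose atoms were written in the order 3, 1, 2.
frame_ids = [3, 1, 2]
frame_charges = [0.3, 0.1, 0.2]
charges_by_id = sort_by_id(frame_ids, frame_charges)
```

Sorting each frame this way keeps the value at a given index tied to the same atom across time steps, even if the dump file changes the output order between frames.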

package/MDAnalysis/coordinates/LAMMPS.py

orbeckst · 2022-11-11T22:26:41Z

package/MDAnalysis/coordinates/LAMMPS.py

@@ -490,7 +525,7 @@ class DumpReader(base.ReaderBase):

    @store_init_arguments
    def __init__(self, filename, lammps_coordinate_convention="auto",
-                 **kwargs):
+                 additional_columns=False, **kwargs):


package/MDAnalysis/coordinates/LAMMPS.py

github-actions · 2023-09-29T17:26:16Z

Linter Bot Results:

Hi @pstaerk! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location	Outcome
main package	⚠️ Possible failure
testsuite	⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/8097120276/job/22127421709

Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

hejamu · 2023-09-29T17:47:43Z

@orbeckst I implemented the changes discussed offline. None is now the default for additional_columns, and True parses all columns that are not already parsed by default.

@hmacdope maybe you can take another look also.

Docs still need tuning.
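The semantics described in this comment (None reads nothing extra, True reads every column not already handled, a list reads the named columns) could be sketched roughly as follows. This is an illustration under assumptions, not the merged implementation: the function name `select_additional_columns`, the `DEFAULT_COLUMNS` set, the treatment of False, and the exact warning and error messages are all hypothetical.

```python
import warnings

# Assumption: columns the reader already handles without being asked.
DEFAULT_COLUMNS = {"id", "type", "x", "y", "z"}


def select_additional_columns(file_columns, additional_columns=None):
    """Decide which extra columns to read from a dump file."""
    if additional_columns is None or additional_columns is False:
        return []  # nothing extra requested
    if additional_columns is True:
        # Read everything that is not handled by default.
        return [c for c in file_columns if c not in DEFAULT_COLUMNS]
    if isinstance(additional_columns, (list, tuple)):
        missing = [c for c in additional_columns if c not in file_columns]
        if missing:
            # Warn rather than fail when a requested key is absent.
            warnings.warn(f"Columns not found in dump file: {missing}")
        return [c for c in additional_columns if c in file_columns]
    raise ValueError(
        "additional_columns must be None, True, or a list/tuple of names"
    )
```

For a file with columns `id type x y z q l`, passing `True` would select `q` and `l`, while passing `['q', 'foo']` would select `q` and emit a warning about `foo`.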

hejamu · 2024-01-30T16:50:41Z

Pinging @hmacdope @orbeckst :)

hmacdope · 2024-01-30T16:52:27Z

Sorry for being so slow @pstaerk, I'll have a look ASAP.

orbeckst · 2024-01-30T17:39:51Z

Sorry, won’t have time to review over the next few weeks.

hmacdope

This is fantastic work @pstaerk, and it will be a big quality-of-life improvement for LAMMPS users, given how much they use arbitrary columnar data. Sorry it has taken me so long to get to it. Just a few improvements to make and we should be able to go ahead.

package/MDAnalysis/coordinates/LAMMPS.py

hmacdope · 2024-02-12T20:26:59Z

testsuite/MDAnalysisTests/coordinates/test_lammps.py

@@ -425,6 +427,7 @@ def u(self, tmpdir, request):
            # no conversion needed
            f = LAMMPSDUMP
        else:
+            # Select if one wants to use the additional column format


Can you address this?

testsuite/MDAnalysisTests/coordinates/test_lammps.py

package/MDAnalysis/coordinates/LAMMPS.py

Co-authored-by: Hugo MacDermott-Opeskin <[email protected]>

pstaerk · 2024-02-26T14:40:56Z

Ok, I hope to have addressed all the requested points @hmacdope :) .

hmacdope

Two tiny nitpicks on formatting, but other than that this is good to go. Amazing amazing work @pstaerk, and @hejamu this will be such a fantastic boost for LAMMPS users.

hmacdope · 2024-02-27T22:31:57Z

testsuite/MDAnalysisTests/datafiles.py

@@ -156,6 +156,8 @@
    "LAMMPSdata_deletedatoms",  # with deleted atoms
    "LAMMPSdata_triclinic",  # lammpsdata file to test triclinic dimension parsing, albite with most atoms deleted
    "LAMMPSdata_PairIJ",  # lammps datafile with a PairIJ Coeffs section
+    # structure for the additional column lammpstrj


Comments after data

hmacdope · 2024-02-27T22:32:05Z

testsuite/MDAnalysisTests/datafiles.py

@@ -166,6 +168,8 @@
    "LAMMPSDUMP_chain1", # Lammps dump file with chain reader
    "LAMMPSDUMP_chain2", # Lammps dump file with chain reader
    "LAMMPS_chain", # Lammps data file with chain reader
+    # lammpsdump file with additional data (an additional charge column)


Comments after data.

testsuite/MDAnalysisTests/datafiles.py

hmacdope

Great work all! I am happy for this to go ahead. 🥇

hmacdope · 2024-02-27T23:40:22Z

Kicking CI

hmacdope · 2024-03-01T00:09:30Z

CI is not cooperating, trying again.

github-actions bot added the Component-Readers label Apr 6, 2022

hmacdope self-assigned this Apr 6, 2022

schlaicha mentioned this pull request Apr 7, 2022

WIP: Lammpsdump velocities (and other fields) #3448

Closed

3 tasks

hmacdope requested changes Apr 7, 2022

aliehlen mentioned this pull request May 3, 2022

Allow LammpsDumpParser Topology reader more flexibility in coordinate column data. #3449

Open

hmacdope reviewed May 8, 2022

testsuite/MDAnalysisTests/coordinates/test_lammps.py (outdated)

pstaerk requested a review from hmacdope May 17, 2022 08:10

philippmisof reviewed May 19, 2022

package/MDAnalysis/coordinates/LAMMPS.py (outdated)

hmacdope requested changes May 24, 2022

RMeli reviewed Jun 28, 2022

package/MDAnalysis/coordinates/LAMMPS.py (outdated)

IAlibay mentioned this pull request Aug 19, 2022

Favour Aux over ts.data or possibly unify the two? #3778

Open

hmacdope requested changes Nov 9, 2022

orbeckst reviewed Nov 11, 2022

Philipp Stärk added 10 commits September 29, 2023 13:52

Extended mdanalysis to accept other attributes as well.

89c7363

Able to parse arbitrary columns now.

99b7875

Tried to fix most of the pep8 problems.

36e9bb6

First try at testing the additional column part.

3406e5b

Testing multi read columns as well.

87ed9ce

Fix the no additional columns case.

efca264

Implemented requested changes to docs.

2f02395

Implemented the requested changes to the tests.

c8f61db

Incorporated the PEP8 comments.

140b062

Third round of PEP...

0365404

hejamu force-pushed the extend_lammpsdump branch from 28dca81 to afa886e Compare September 29, 2023 17:37

refine tests

27cb7be

hejamu force-pushed the extend_lammpsdump branch from afa886e to 27cb7be Compare September 29, 2023 17:38

hmacdope requested changes Feb 12, 2024

pstaerk and others added 6 commits February 26, 2024 14:42

Small typo

1e8557b

Co-authored-by: Hugo MacDermott-Opeskin <[email protected]>

Removed comment

07d0c18

Addressed hmacdope's comments regarding issue link

5f36ff4

Addressed hmacdope's comments regarding file paths.

e45e764

Added warning if keys are not in lammpsdump file.

546dd43

Tested the formatting error of additional_columns.

adec794

hejamu added 5 commits February 27, 2024 22:27

Make changed lines comply with pep8

57b895d

80 is indeed longer than 79...

d054946

Merge remote-tracking branch 'origin/develop' into extend_lammpsdump

bb568b3

Fix test

9f14a8a

Fix test

bcb044f

hmacdope requested changes Feb 27, 2024

Don't format datafiles.py

8ef649c

hmacdope approved these changes Feb 27, 2024

Added test of warning.

bac1f4c

hmacdope mentioned this pull request Mar 1, 2024

Test timeout on multiprocessing Pool XTC test in CI #4475

Closed

hmacdope merged commit 3c83b8f into MDAnalysis:develop Mar 4, 2024
23 of 24 checks passed

hejamu mentioned this pull request Mar 4, 2024

Make other properties time dependent (e.g. charges) #3504

Closed
