Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spectrum refactoring, Spectrum2DCollection #305

Merged
merged 37 commits into from
Sep 26, 2024
Merged

Conversation

ajjackson
Copy link
Collaborator

@ajjackson ajjackson commented Jul 12, 2024

As part of a major refactor of Abins, we would like to use a Spectrum2DCollection object which behaves similarly to Spectrum1DCollection. This allows the code to largely treat 1D and 2D data interchangably, simplifying the logic and providing access to select/groupby "table-like" operations on the data.

This is implemented by factoring out the "collection"-specific methods to a "mixin" class. This gives a reasonable class heirarchy in which Spectrum1D, Spectrum2D, Spectrum1DCollection and Spectrum2DCollection are all children of Spectrum but not of each other.

In the process, we also found some limitations and inefficiencies in the existing Spectrum objects:

  • iteration over Spectrum1DCollection is very slow due to redundant copying, checks and unit conversion
    • to allow cheaper iterations over metadata without creating new objects, this PR adds an iter_metadata method that iterates over metadata lines, merging the top-level and "line-data" parts on the fly. This is also a good building block to simplify some of the other metadata-handling methods.
  • select() and group_by() do not properly handle metadata that is common to all rows (and so in the top-level of the metadata dict). This is made more robust in reimplementations of those methods.

@ajjackson ajjackson changed the title Spectrum refactoring, Spectrum2d collection Spectrum refactoring, Spectrum2DCollection Jul 12, 2024
Copy link
Contributor

github-actions bot commented Jul 12, 2024

Test Results

    22 files  ± 0      22 suites  ±0   30m 6s ⏱️ +50s
 1 054 tests + 6   1 048 ✅ + 6   6 💤 ±0  0 ❌ ±0 
10 480 runs  +60  10 417 ✅ +60  63 💤 ±0  0 ❌ ±0 

Results for commit 0f7ec81. ± Comparison against base commit 8268881.

♻️ This comment has been updated with latest results.

@ajjackson ajjackson added this to the v1.4.0 milestone Jul 22, 2024
@ajjackson ajjackson added the enhancement New feature or request label Jul 22, 2024
@ajjackson ajjackson assigned ajjackson and unassigned ajjackson Jul 22, 2024
@ajjackson ajjackson force-pushed the spectrum-2d-collection branch 2 times, most recently from 53d8b77 to 92b3af4 Compare July 31, 2024 14:56
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 87.12871% with 13 lines in your changes missing coverage. Please review.

Project coverage is 95.47%. Comparing base (7608b14) to head (a50fc98).
Report is 20 commits behind head on master.

Files Patch % Lines
euphonic/spectra.py 87.12% 11 Missing and 2 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #305      +/-   ##
==========================================
- Coverage   95.85%   95.47%   -0.39%     
==========================================
  Files          28       28              
  Lines        3982     4021      +39     
  Branches      803      816      +13     
==========================================
+ Hits         3817     3839      +22     
- Misses         97      109      +12     
- Partials       68       73       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ajjackson ajjackson force-pushed the spectrum-2d-collection branch 2 times, most recently from 8a5a1d0 to 58306c3 Compare September 18, 2024 10:41
- This version of select() should be more robust in dealing with
  parameters that exist in "top level" of metadata dict
- I hope it is also easier to understand
Some of this was done in another branch, and this one rebased
Pylint doesn't like the call to ._set_spectrum_data on another class
instance because it doesn't understand that this instance was just
created with the current class. I think that means the warning is ok
to suppress.
This adds toolz as a dependency to the main package. We don't
anticipate that causing a lot of problems; it is a small, stable,
pure-python library also available on conda-forge.
Linters hate the named lambdas. I think they are quite nice because
they are compact and immediately draw attention to the "one-liner"
they attach a name to...

but maybe the more explicit form with type hints will make it easier
for someone to understand in future.
This structure seems a bit more legible and should reduce redundancy
in Spectrum2DCollection
- Refactor re-used data import to use pytest fixture
- Remove initial from_spectra test; new one covers it all
- Test from_spectra with inconsistent input
- Test mixin-supplied methods
@ajjackson ajjackson force-pushed the spectrum-2d-collection branch from 58306c3 to 56dfcb9 Compare September 18, 2024 13:06
@ajjackson ajjackson marked this pull request as ready for review September 18, 2024 13:06
@ajjackson ajjackson requested a review from oerc0122 September 18, 2024 13:07
Copy link
Collaborator

@oerc0122 oerc0122 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few code-style questions. Functionally, I'm trusting it's good.

euphonic/spectra.py Outdated Show resolved Hide resolved
euphonic/spectra.py Show resolved Hide resolved
euphonic/spectra.py Show resolved Hide resolved
euphonic/spectra.py Outdated Show resolved Hide resolved
euphonic/spectra.py Show resolved Hide resolved
euphonic/spectra.py Show resolved Hide resolved
Make things a little cleaner and more idiomatic

Co-authored-by: Jacob Wilkins <[email protected]>
euphonic/spectra.py Show resolved Hide resolved
euphonic/spectra.py Outdated Show resolved Hide resolved
Comment on lines 925 to 928
select_key_values = dict(
(key, (value,)) if isinstance(value, (int, str)) else (key, value)
for key, value in select_key_values.items()
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you've got toolz does this mean we could do this with a mapvalues?

- This avoids some of the repetition in dict comprehensions to remove
  an element
- Here we also slightly rework _combine_metadata so it is clearer what
  each variable represents.
list comprehension is little clunky but avoids the 1-length special
case: cleaner overall
Via discussion / pair-programming with @oerc0122

- Use native dict comprehension over keyfilter in iter_metadata: it's
  a bit ugly but no more complicated, and should be easier to read
  "casually"

- Clearer comment re: value-pair combination

- Replace a lambda with named partial function and toolz complement
@ajjackson ajjackson requested a review from oerc0122 September 26, 2024 10:54
euphonic/spectra.py Outdated Show resolved Hide resolved
@ajjackson ajjackson enabled auto-merge (squash) September 26, 2024 13:56
@ajjackson ajjackson merged commit ada9e25 into master Sep 26, 2024
9 checks passed
@ajjackson ajjackson mentioned this pull request Sep 26, 2024
@ajjackson ajjackson deleted the spectrum-2d-collection branch December 12, 2024 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants