-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add SpectrumLookup example #471
base: master
Are you sure you want to change the base?
Conversation
WalkthroughThe documentation for mass spectra filtering has been updated. The changes expand on the efficient use of Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant API
participant SpectrumLookup
User->>API: Call readSpectra(inp, "scan=(?<SCAN>\\d+)")
Note over API: Load spectra efficiently without unwanted data
API->>SpectrumLookup: Invoke findByScanNumber(v_scan_nrs)
SpectrumLookup-->>API: Return filtered spectra
API-->>User: Deliver filtered MSExperiment data
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (2)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/source/user_guide/ms_data.rst (2)
656-657
: Spelling and Clarity in PeakFileOptions ExplanationThe added text clarifies that unwanted data is not loaded into memory when using :py:class:
~.PeakFileOptions
. Please correct “chose” to “choose” to improve grammatical accuracy and consider a slight rephrase for clarity.
697-717
: New SpectrumLookup Example ImplementationThis example effectively demonstrates how to use the :py:class:
~.SpectrumLookup
class to extract spectra based on vendor scan numbers extracted from native IDs. A few points for improvement:
- Regex Pattern Verification: The code uses the pattern
"scan=(?<SCAN>\d+)"
. Note that in standard Python regex syntax, named groups are usually defined with(?P<SCAN>\d+)
. If the SpectrumLookup API specifically requires the<SCAN>
syntax, it would be helpful to include a comment or reference to the documentation confirming this.- Variable Naming: In the loop, the variable
v_scan_nrs
represents a single scan number. Renaming it (e.g., tov_scan_nr
) could improve clarity.- Example Consistency: The comment notes that the test.mzML file contains 4 spectra starting at scan=19, yet the example only processes scan numbers
[19, 20]
. Make sure that this example accurately represents the expected data set or add a clarifying comment if the intent is to filter a subset.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/source/user_guide/ms_data.rst
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: build-test
- GitHub Check: build-test
🔇 Additional comments (2)
docs/source/user_guide/ms_data.rst (2)
691-693
: Clarification Note for Scan NumbersThe note explaining that the scan numbers correspond to the indices in the mzML file rather than the vendor scan numbers is useful. It provides important context, especially in cases where the data might have been previously sliced or filtered.
694-696
: Section Title for SpectrumLookup ExampleThe new section title “Advanced Filtering of NativeID via SpectrumLookup” is clearly set off with the appropriate underline style and clearly signals the introduction of a new advanced filtering technique.
# Bruker may have: | ||
# <spectrum index="0" id="scan=19" defaultArrayLength="15"> | ||
# thus we can use (this would also work for Thermo native IDs) | ||
lookup.readSpectra(inp, "scan=(?<SCAN>\d+)") ## required: creates an internal look-up table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot but don't we autodetect between multiple regexes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do that in some downstream tools only by calling getRegexFromNativeId.
And we also have a default value for the regex in readSpectra for which the overload is missing in the pxd
fixes #7526
Summary by CodeRabbit