Add minimal terms for ion mobility frames #365
Seems reasonable, except what instrument has profile IM values? Waters has 200 bins and their masses are profile; 200 doesn't seem enough to count as profile. Bruker has 1000ish bins, but the masses are explicitly centroided at acquisition. I'll be honest, I didn't follow much of the linked issue discussion.
@chambm What we want to express is whether the IM dimension is expected to contain a series of points for an analyte, or whether it has been processed such that each point refers to a distinct analyte (or two analytes so close we can't tell them apart). This is the same idea for Put another way, "Can I treat this spectrum as a peak list, or do I need to do something to it before I can go off and use the points?". The bin spacing is a valid point for IM peak resolution, but I don't think that's something we can get from the vendor software, and its definition will vary by unit and/or vendor.
For centroid spectra it's still extremely common to have multiple peaks for a single analyte, e.g. isotopes and charge states. Only a deconvolved and deisotoped centroided spectrum would (potentially?) be one peak per analyte (even then it depends how you define "analyte" I think). But if I understand what you're saying, when coming straight from ProteoWizard, IM representation would always be
Apologies, I should have said ion, not analyte. But yes, you have the idea, I think. I'm not aware of any filters in MSConvert that collapse the IM dimension, except maybe
ScanSumming just drops it entirely. I'm still not sure about this "centroid/profile" distinction in the IM dimension. I think it needs more consideration from other IM-interested folks.
This seems fine to me, but I am not very close to ion mobility data and software, so I am uncertain what the needs really are.
I admit I worked primarily on larger molecules with fairly wide mobilograms/arrival time distributions using Waters Cyclic and Bruker timsTOF data, where frames/cycles do form well-defined peaks, and my approach to both was to gather everything into isotopic-pattern-fitted groups and look for structure over the mobility and RT dimensions. The example file in the linked issue contains single-frame IM feature extractions at a single RT. The same idea is applied to timsTOF data in IonQuant, where m/z-IM-RT abundances are extracted from contiguous observations, but they use identification seeds instead of traditional feature detection techniques. I think this article states that the same idea is used in MaxQuant for timsTOF data.
Hey there! I'm new to this whole process, so here are my 2 cents (as someone who currently writes software for timsTOF data). I think the distinction between the two representations makes a lot of sense. I have certainly struggled with what to call intermediate representations in data processing where the ion-mobility dimension is ... 'centroided' ... without being discarded, and I also see the process as analogous to centroiding in the m/z dimension. (graphical illustration to make sure we are talking about the same thing, disregard the cluster index in the right panel) This is essentially what I implemented here: lazear/sage#166, if anyone is interested ... Right now it all happens in memory and is not exported, but I can imagine wanting to export it to disk at some point. What issues do I see?
Regarding this point:
I certainly feel like it would have the same utility as the profile-centroid distinction in the m/z dimension. Is there any reason why you feel that might not be the case?
The 3D spectrum / ion mobility frame concept in general doesn't. That behavior is up to the reader to understand from the instrument metadata. That's why ProteoWizard defines "raw" frames differently depending upon what it infers from the TDF file contents, which it does by reading the distinct SQL tables: https://github.com/ProteoWizard/pwiz/blob/8a73de245643eeddb423505f11dd3e841bb78ec4/pwiz_aux/msrc/utility/vendor_api/Bruker/TimsData.cpp#L297-L471. The timsrust library does something similar. For Waters, depending upon the combination of configurations with cyclic IMS, SONAR, and other tricks, you get different topologies, but only the trivial MS1/low energy configuration matches the timsTOF configuration.
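As a rough illustration of "inferring the frame layout from which SQL tables are populated", here is a minimal Python sketch. It relies on a TDF file being an SQLite database; the table names `Frames`, `PasefFrameMsMsInfo`, and `DiaFrameMsMsInfo` and the decision rules are assumptions made for illustration, and the helper names are hypothetical. It is not the logic in the linked TimsData.cpp or in timsrust.

```python
import sqlite3


def populated_tables(conn):
    """Return the names of all tables that contain at least one row."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    populated = set()
    for name in names:
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()
        if count > 0:
            populated.add(name)
    return populated


def guess_acquisition(tables):
    """Guess the acquisition scheme from populated table names.

    The table names checked here are assumptions for this sketch.
    """
    if "DiaFrameMsMsInfo" in tables:
        return "DIA-PASEF"
    if "PasefFrameMsMsInfo" in tables:
        return "DDA-PASEF"
    if "Frames" in tables:
        return "MS1-only or other"
    return "unknown"


# Stand-in for an analysis.tdf file: an in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Frames (Id INTEGER)")
conn.execute("CREATE TABLE PasefFrameMsMsInfo (Frame INTEGER)")
conn.execute("CREATE TABLE DiaFrameMsMsInfo (Frame INTEGER)")
conn.execute("INSERT INTO Frames VALUES (1)")
conn.execute("INSERT INTO PasefFrameMsMsInfo VALUES (1)")

print(guess_acquisition(populated_tables(conn)))
```

With a real file, the in-memory connection would be replaced by `sqlite3.connect(path_to_tdf)`; whatever is inferred this way still says nothing about profile vs. centroid in the IM dimension, which is the point being made above.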
@jspaezp I forgot to respond to this point. Are you referring to #361 (comment) 's |
I have adapted some examples from Waters cIMS HDMSe: The same done with timsTOF DDA-PASEF data on a similar sample: I don't have a DIA-PASEF example handy yet. @chambm what evidence would you need to see for the profile/centroid concept to make sense for ion mobility data?
Are any of those plots showing "centroid IMS" data?
No, none of these show centroids in the ion mobility dimension, since I had interpreted your previous argument to be that you didn't think the IM dimension fit the definition of being in profile mode because it was discretized. Centroid ion mobility would be singular points in the ion mobility dimension. The "idealized" scenario there is that you've done all the work you would want in the ion mobility dimension to collapse them down to a single point: Reduces to That is the use case @timosachsenberg describes in the original issue, I think.
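To make "collapse them down to a single point" concrete, here is a minimal Python sketch. It assumes the points belonging to one ion have already been isolated, and it uses an intensity-weighted mean as the centroid, which is only one of several possible choices (apex picking or peak fitting would be others). The function name is hypothetical.

```python
from typing import Sequence, Tuple


def centroid_mobilogram(
    im_values: Sequence[float], intensities: Sequence[float]
) -> Tuple[float, float]:
    """Collapse one ion's IM profile to a single (IM, intensity) point.

    The IM coordinate is the intensity-weighted mean of the profile and
    the intensity is the summed signal across the profile.
    """
    if len(im_values) != len(intensities):
        raise ValueError("im_values and intensities must have equal length")
    total = float(sum(intensities))
    if total <= 0.0:
        raise ValueError("profile has no signal")
    weighted = sum(im * inten for im, inten in zip(im_values, intensities))
    return weighted / total, total


# A small symmetric profile; the values are made up for illustration.
im_centroid, summed = centroid_mobilogram([1.0, 1.1, 1.2], [10.0, 20.0, 10.0])
print(im_centroid, summed)
```

A frame where every ion has been reduced this way would be "centroid" in the IM dimension in the sense discussed here, while the discretized vendor output would remain "profile".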
Thanks, that's a useful illustration. Then you would (for some intermediate storage reason probably) want to write that back out to mzML?
Right. It could then be read back in like a checkpoint or used for a step implemented elsewhere that doesn't know how to deal with IM profiles, since IM introduces another layer of signal processing. I struggled with justification because we're in a phase of all-in-one tools that read straight from vendor files and then don't write the data out for "public consumption" until it's in a nice, well-behaved CSV.
Yes, or if one wants to align more closely with the concept of profile vs. centroid, I would suggest using a single point per isotopic peak. In all-in-one tools, this may not be strongly justified. However, given the significant data reduction, such a format could be ideal for modular tools that want to use open standards. Maybe something we can also discuss at HUPO-PSI if some of you folks are there.
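A rough sketch of what "a single point per isotopic peak" could look like at the frame level, in Python. It assumes a frame is a flat list of (m/z, IM, intensity) points, groups points whose sorted m/z values lie within a fixed tolerance of their neighbor, and reduces each group to one intensity-weighted point. The function name and the `mz_tol` parameter are hypothetical, and the grouping is deliberately naive: it would merge ions that share an m/z but are separated in mobility, which a real implementation would need to handle.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (m/z, ion mobility, intensity)


def collapse_frame(points: List[Point], mz_tol: float = 0.01) -> List[Point]:
    """Reduce a frame to one point per m/z group (naive sketch)."""
    groups: List[List[Point]] = []
    for point in sorted(points):
        # Extend the current group if this m/z is close to the previous one.
        if groups and point[0] - groups[-1][-1][0] <= mz_tol:
            groups[-1].append(point)
        else:
            groups.append([point])

    collapsed: List[Point] = []
    for group in groups:
        total = sum(p[2] for p in group)
        if total <= 0.0:
            continue  # no signal in this group, nothing to report
        mz = sum(p[0] * p[2] for p in group) / total
        im = sum(p[1] * p[2] for p in group) / total
        collapsed.append((mz, im, total))
    return collapsed


# Made-up frame: two IM bins for one peak, one bin for the next isotope.
frame = [(500.00, 0.90, 10.0), (500.00, 0.92, 30.0), (500.50, 0.91, 5.0)]
reduced = collapse_frame(frame)
print(len(frame), "->", len(reduced), "points")
```

The data reduction mentioned above comes from each group of IM bins shrinking to a single point per peak.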
Closes #361
This adds a minimal set of terms that were important for #361, omitting special cases and obsolete representations.
Open questions remaining:
scan window lower limit + upper for ion mobility?