Issues converting Waters raw to mzML #1922

Closed
jonaheaton opened this issue Dec 17, 2021 · 52 comments

Comments

@jonaheaton

jonaheaton commented Dec 17, 2021

I am trying to convert Waters raw data from a XEVO-G2XSQTOF instrument to mzML. When I use the msconvert Windows GUI app, it crashes without any error message, and from the command line I get the following message: "caught unknown exception" please report this error

I am using vendor peak picking ("peakPicking true 1-"). Looking in the raw data folder, I do see that a centroid.raw file is created, and there are data files inside with what I suspect is the properly centroided data. However, msconvert appears to run into an error before this data is written to the mzML file.
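For anyone reproducing this from a script, here is a minimal sketch that builds the msconvert command lines for the three cases worth comparing (vendor peak picking, CWT peak picking, no peak picking). It only assembles and prints the arguments; nothing is executed. The helper name `build_msconvert_args` and the paths `sample.raw` and `out` are hypothetical, and `peakPicking vendor msLevel=1-` is the newer spelling of the same filter scope as "peakPicking true 1-".

```python
from pathlib import Path


def build_msconvert_args(raw_dir, out_dir, peak_picking="vendor", exe="msconvert"):
    """Return the argument list for one msconvert run (nothing is executed)."""
    if peak_picking not in ("vendor", "cwt", None):
        raise ValueError(f"unsupported peak picking mode: {peak_picking!r}")
    args = [exe, str(Path(raw_dir)), "--mzML", "-o", str(Path(out_dir))]
    if peak_picking is not None:
        # Pick peaks on all MS levels (MS1 and above).
        args += ["--filter", f"peakPicking {peak_picking} msLevel=1-"]
    return args


# Hypothetical input and output locations, for illustration only.
vendor_cmd = build_msconvert_args("sample.raw", "out", peak_picking="vendor")
cwt_cmd = build_msconvert_args("sample.raw", "out", peak_picking="cwt")
plain_cmd = build_msconvert_args("sample.raw", "out", peak_picking=None)

for cmd in (vendor_cmd, cwt_cmd, plain_cmd):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` on a machine where msconvert is on the PATH; comparing the three runs shows whether a crash is specific to vendor peak picking.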
command_output.txt
I don't think I am the first one with this issue:
Example 1: #775
Example 2: https://sourceforge.net/p/proteowizard/mailman/message/36714564/

Both "_HEADER.TXT" and "_extern.inf" list three functions, although there appear to be .DAT and .IDX files for a fourth function in the raw data. Deleting the files associated with the fourth function doesn't help.

Thanks for any help you can give!
Best,
Jonah

@chambm
Member

chambm commented Dec 17, 2021

Hi Jonah,

Are you using the latest pwiz version? Do you get the crash without vendor peak picking? I'll probably need to see the data to reproduce this myself and report it to Waters. Are you able to share it?

@jonaheaton
Author

Hi thank you for the super fast response!
Yes I am using the latest version:
The version I ran with a docker image: ProteoWizard release: 3.0.21334 (00c4b77)
The version I ran with the windows GUI: ProteoWizard 3.0.21350 64-bit

It does NOT crash when I remove the Vendor peak picking. I need to get permission before sharing, but I suspect that won't be a problem. I'll get back to you when I get the "ok"

@jonaheaton
Author

I got permission to share one of the data files with you. What is the best way to share a ~1.3 GB file?

@chambm
Member

chambm commented Dec 17, 2021

Check my profile to send me an email and I'll send you a place to upload to.

@chambm
Member

chambm commented Dec 20, 2021

I was able to get your file converted with peakPicking by deleting/moving Func003 instead of Func004. I don't understand why. That function's data files are very small though, so I doubt much is lost. What kind of instrument method (configuration of functions) was used to collect these data?
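To spot a function whose data files are unusually small, the per-function file sizes in a .raw folder can be tallied. This is a minimal sketch that assumes the function files follow the `_FUNCnnn.*` naming mentioned in this thread; the helper name `function_sizes` and the demo folder are hypothetical.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Matches names such as _FUNC003.DAT or _FUNC003.IDX (naming assumed from this thread).
FUNC_FILE = re.compile(r"^_FUNC(\d+)\.", re.IGNORECASE)


def function_sizes(raw_dir):
    """Map function number -> total bytes of its _FUNCnnn.* files in raw_dir."""
    totals = defaultdict(int)
    for path in Path(raw_dir).iterdir():
        match = FUNC_FILE.match(path.name)
        if match and path.is_file():
            totals[int(match.group(1))] += path.stat().st_size
    return dict(sorted(totals.items()))


# Demo on a throwaway folder with fake files (not real Waters data).
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "demo.raw"
    raw.mkdir()
    (raw / "_FUNC001.DAT").write_bytes(b"x" * 1000)
    (raw / "_FUNC001.IDX").write_bytes(b"x" * 100)
    (raw / "_FUNC003.DAT").write_bytes(b"x" * 10)
    (raw / "_HEADER.TXT").write_text("header")
    demo_sizes = function_sizes(raw)

print(demo_sizes)
```

A function that is orders of magnitude smaller than the others is a candidate to set aside for a test conversion.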

@jonaheaton
Author

jonaheaton commented Dec 20, 2021

Huh, that is weird. Thank you for looking into that! I was told that the method of acquisition is MSE, but beyond that I don't have any other details about the method and instrument. Looking at the header file, it looks like the instrument is Xevo G2-XS QTof. I'll have to ask those who performed the experiments and get back to you more details.

@SivanXW

SivanXW commented Nov 8, 2022

Hello, has this issue been solved yet? I ran into exactly the same problem when I tried to convert Waters raw data to mzML using vendor peak picking.
Thanks for your kind help!

@chambm
Member

chambm commented Nov 8, 2022

No real fixes for this issue AFAIK. Does it crash without vendor peak picking? Are there any small func00*.* files you can move to a subdirectory to get it working?

@SivanXW

SivanXW commented Nov 9, 2022

Thanks for your response. It works well without vendor peak picking. I haven't found any func00* files like the ones you mentioned.
But, oddly, when I restarted my computer and tried to convert individual files instead of a batch, it suddenly worked.

@Sanchezillana

Sanchezillana commented Dec 8, 2022

Hi! I have a similar vendor peak picking issue converting Waters .raw DDA files. Same LC and MS methods, same everything, and sometimes the files are converted and sometimes not. 6 functions (1xMS1, 3xMS2, 1xMS1 (lockmass), 1xUV). I use vendor peak picking and subscan filtering 1-4, and it works only for some files. Also, if I copy the files again from the PC connected to the LC-MS to my PC, then different files work! What is going on with Waters??
With CWT it works fine; I am considering using it...

@chambm
Member

chambm commented Dec 8, 2022

Waters is working on an SDK update that will provide a special DDA processing mode for Waters data. Should be merged in the next week or so. I can't say for sure it'll be more stable than the current SDK but they are engaged on it. But hearing that the same data sometimes crashes and sometimes doesn't is somewhat disturbing. It suggests some undefined behavior in the SDK where it may or may not crash depending on what's in uninitialized memory.

@Sanchezillana

Thank you for your answer and good news. Crossing fingers with that SDK, I am very interested in processing DDA data from Waters. If you need my files for testing or something please let me know. Any advice for working with this undefined behavior meanwhile?

@chambm
Member

chambm commented Feb 7, 2023

@Sanchezillana OK, MSConvertGUI is updated with a Waters DDA processing filter that will enable their newly added DDA processing mode. However it seems to do everything EXCEPT peak picking. I have to do CWT peak picking, then the results are good. Still waiting on Waters to tell me if the lack of peak picking is a bug.

@Sanchezillana

Hi @chambm! Thank you for your answer. I have now tried the Waters DDA filter with msconvert version 3.0.23039-e4357b8 from SourceForge and I have the same issue with the precursor ion in my file (among others). The only difference is that the lockmass scans are removed. My files were acquired in fastDDA mode + PDA scan (UV scan) with lockmass correction applied during the acquisition.

I can share a file with you: https://universitatdevalencia-my.sharepoint.com/:u:/g/personal/angel_illana_uv_es/EWWIb-FBuTpPsyd8qdC_8jgB4t1PAJBC-oSre_u_kW1o7A?e=CdmPxs

I checked:

  1. conversion of raw file with CWT peakPicking filter only = strange file with the UV scans + precursor ion issue
  2. conversion of raw file with CWT peakPicking filter + Waters DDA filter = FAILS
  3. manual deletion of the UV function (_FUNC006.DAT and _FUNC006.IDX) + CWT peakPicking filter = works, with the precursor ion issue, and includes the lockmass scans
  4. manual deletion of the UV function (_FUNC006.DAT and _FUNC006.IDX) + CWT peakPicking filter + Waters DDA filter = works, with the precursor ion issue, but removes the lockmass scans
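The manual step in points 3 and 4 can be made reversible by moving the function's files into a subfolder instead of deleting them. A minimal sketch, assuming the `_FUNCnnn.*` naming shown above; the helper name `set_aside_function` and the `set_aside` folder name are hypothetical, and the demo runs on fake files.

```python
import re
import shutil
import tempfile
from pathlib import Path


def set_aside_function(raw_dir, function_number, subdir="set_aside"):
    """Move the _FUNCnnn.* files of one function into raw_dir/subdir."""
    raw_dir = Path(raw_dir)
    pattern = re.compile(rf"^_FUNC0*{function_number}\.", re.IGNORECASE)
    target = raw_dir / subdir
    moved = []
    # sorted() takes a snapshot, so creating the subfolder mid-loop is safe.
    for path in sorted(raw_dir.iterdir()):
        if path.is_file() and pattern.match(path.name):
            target.mkdir(exist_ok=True)
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved


# Demo on a throwaway folder with fake files (not real Waters data).
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "demo.raw"
    raw.mkdir()
    for name in ("_FUNC001.DAT", "_FUNC006.DAT", "_FUNC006.IDX"):
        (raw / name).write_bytes(b"x")
    demo_moved = set_aside_function(raw, 6)
    demo_left = sorted(p.name for p in raw.iterdir() if p.is_file())

print(demo_moved, demo_left)
```

Moving the files back restores the original acquisition, which is safer than `del` when the UV trace is still needed later.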

Maybe I am using the wrong version?

@chambm
Member

chambm commented Feb 9, 2023

I can repro the DDA processing failure with the UV functions. I'll pass that along to Waters because it looks like it's in their code.

When I test your file after removing the UV function manually, I do see differences between the precursor m/z with DDA processing on vs. off. So what do you mean "precursor ion issue"? And is excluding the lockmass scans a problem? You want to keep those?

@Sanchezillana

I mean that when I open the converted files (point 4 in my comment) with MZmine, I see that the precursor ion is still not the correct one. It is good that the lockmass scans were removed, no problem with that!
[image]

@chambm

This comment was marked as resolved.

@Sanchezillana

I mean that the mass in the field "precursor ion" of MS2 scan is not the exact mass of the ion picked in MS1 for fragmentation. When I converted files for Thermo, for example, this doesn't happen.

@elnurgar

With Thermo files it doesn't happen. It happens with Waters and Bruker data, so I prepared a script that can solve this problem.

@elnurgar

The MS1 and MS2 scans are well calibrated, only the precursor ion value is not calibrated https://github.com/elnurgar/mzxml-precursor-corrector

@chambm
Member

chambm commented Feb 10, 2023

OK yes I see the issue now. The DDA processing does seem to change the precursor m/z, but it doesn't seem to be fully lockmass correcting it. Is it supposed to do so @pete-reay-waters ?

@chambm
Member

chambm commented Feb 10, 2023

> With thermo files it doesn't happen. It happens with Waters and Bruker data. Therefore I prepared a script that can solve this problem.

In the 2000s it was a pretty common problem with Thermo FT/Orbi such that there are two different filters in msconvert to deal with it: precursorRecalculator and precursorRefiner! But I don't think those were ever extended to work with other vendors and TOFs. Waters really should have this fixed in their SDK.

@pete-reay-waters
Contributor

Thanks for the info @chambm I've added it to our backlog. I'm out of office next week, but we'll investigate.

@Sanchezillana

> OK yes I see the issue now. The DDA processing does seem to change the precursor m/z, but it doesn't seem to be fully lockmass correcting it. Is it supposed to do so @pete-reay-waters ?

Thank you @elnurgar @chambm @pete-reay-waters. I thought that this wrong precursor ion m/z was the center of the quadrupole isolation window instead of the correct MS1 exact mass, but... is it something related to the lockmass? What a mess. I will try your script with my data, @elnurgar, and give you feedback.

@elnurgar

@Sanchezillana, I don't think it is related to the lock mass. I don't know what the issue is with Waters, since it has a lock mass and calibrates spectra during the analysis. On the Bruker Impact II, the calibration solution is injected at the beginning of each run, and Bruker calibrates the spectra after acquisition of each chromatogram. Therefore, the precursor values recorded for fragmentation during the analysis can drift. After acquisition, when we visualize spectra in the vendor software we see the correct precursor m/z, but when we export to mzML or mzXML this value is not calibrated. However, the error is not as high as with Waters data: with Bruker data we see 50-80 ppm drift, whereas with Waters data it is almost 1000 ppm.
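To put those ppm figures in absolute terms, the usual definition is ppm = (observed - reference) / reference × 10^6. A minimal sketch; the m/z value 500 is an arbitrary example, not taken from any file in this thread.

```python
def ppm_error(observed_mz, reference_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def mz_shift(reference_mz, ppm):
    """Absolute m/z shift that corresponds to a given ppm error."""
    return reference_mz * ppm * 1e-6


# Arbitrary example precursor at m/z 500.
for ppm in (50, 80, 1000):
    print(f"{ppm} ppm at m/z 500 is a shift of about {mz_shift(500.0, ppm):.3f}")
```

At m/z 500, a 1000 ppm error is roughly half an m/z unit, which is far larger than a typical precursor matching tolerance in downstream tools.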

@Sanchezillana

Sanchezillana commented Feb 11, 2023

So, does anyone know the real reason? I found this answer from the Waters support team months ago: https://support.waters.com/KB_Inf/MassLynx/WKB28313_Why_doesnt_the_MSMS_set_mass_in_the_header_of_the_spectrum_window_match_the_peak_mass?mt-learningpath=dda_qanda

My understanding was that it is a problem related to the mass calibration and the way MassLynx saves the precursor ion, since it saves the quadrupole isolation window instead of the exact precursor mass measured in the survey MS1 scan.
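One way a post-hoc fix could work is to replace the stored set mass with the most intense peak of the preceding MS1 survey scan that lies close to it. This is a minimal sketch of that idea only; it is not necessarily what the linked mzxml-precursor-corrector script or msconvert's precursorRefiner does. The function name, the 0.5 m/z half window, and the peak list are all hypothetical.

```python
def refine_precursor(set_mass, ms1_peaks, half_window=0.5):
    """Return the m/z of the most intense MS1 peak within half_window of set_mass.

    ms1_peaks is an iterable of (mz, intensity) pairs from the survey scan.
    Falls back to the set mass if no peak lies inside the window.
    """
    candidates = [
        (mz, intensity)
        for mz, intensity in ms1_peaks
        if abs(mz - set_mass) <= half_window
    ]
    if not candidates:
        return set_mass
    return max(candidates, key=lambda peak: peak[1])[0]


# Made-up survey scan: the quadrupole was set to 455.3, the ion sits at 455.2897.
survey = [(455.1, 200.0), (455.2897, 9000.0), (456.293, 1500.0)]
refined = refine_precursor(455.3, survey)
print(refined)
```

Picking the most intense peak is a deliberate simplification; co-isolated ions or isotope peaks inside the window would need extra handling.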

@Sanchezillana

@elnurgar I ran your program after removing the UV scans (del /s command in cmd) and converting to mzML with msconvert (CWT peak picking and Waters DDA filter). Apparently, I have nice .mzML files when I open them in MZmine 3. I'll give you more feedback if I find something during data processing.

@elnurgar

@Sanchezillana thank you for your first feedback. A colleague who works on a Bruker Impact II told me recently that the error on his machine was also significant, at about 120 ppm. This problem can also occur on Orbitrap, but rarely. No information about Shimadzu or Sciex.

@pete-reay-waters
Contributor

Hi @chambm regarding the two issues above, we have logged them and plan to fix them in the next version, which should be available by the summer.

  • INFDAAP-113: DDA processing fails when non-MS functions are present
  • INFDAAP-112: Wrong lockmass correction of precursor m/z

We believe the cause of the second one is that the spectra have been lockmass corrected at source (before being processed by msconvert). The logic thus disables lockmass correction for the dataset. However, that also impacts the precursor m/z. We have plans to change the logic in the MassLynx SDK to allow correction of the precursor, even when the spectra are already corrected.
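The control flow described here can be pictured with a toy model. This is a sketch of the decision logic only, under the loud assumption that lockmass correction is a single multiplicative factor applied to every m/z; the real MassLynx SDK correction is not shown in this thread and is surely more involved. All names and numbers are hypothetical.

```python
def lockmass_factor(reference_mz, observed_lock_mz):
    """Single-point multiplicative correction factor (a simplifying assumption)."""
    return reference_mz / observed_lock_mz


def apply_lockmass(peaks_mz, set_mass, factor, spectra_corrected_at_source):
    """Return (peaks, set mass) after the planned 'partial' correction."""
    if spectra_corrected_at_source:
        # Peaks were already corrected during acquisition: leave them untouched.
        corrected_peaks = list(peaks_mz)
    else:
        corrected_peaks = [mz * factor for mz in peaks_mz]
    # The set mass is corrected in both cases. Skipping it whenever the
    # spectra are already corrected is the behaviour reported as INFDAAP-112.
    return corrected_peaks, set_mass * factor


# Made-up numbers: lock mass expected at 556.2771 but observed at 556.2500.
factor = lockmass_factor(556.2771, 556.2500)
demo_peaks, demo_set_mass = apply_lockmass([100.0, 455.29], 455.2, factor, True)
print(demo_peaks, demo_set_mass)
```

In this model the spectra pass through unchanged while the precursor value still moves, which is the combination the planned SDK change is meant to allow.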

@Sanchezillana

> @elnurgar I ran your program after removing the UV scans (del /s command in cmd) and converting to mzML with msconvert (CWT peak picking and Waters DDA filter). Apparently, I have nice .mzML files when I open them in MZmine 3. I'll give you more feedback if I find something during data processing.

I also tried the same on files centroided with MassLynx (since vendor peak picking doesn't work well in msconvert for Waters data with DDA correction). The masses I obtained are not exactly the same as when the peaks are centroided with msconvert CWT; they are more accurate when the files are centroided with MassLynx first.

@tsufz

tsufz commented Feb 27, 2023

Hi @pete-reay-waters, thank you so much for the information. This action would be very appreciated by your user community. I talked to many people working in nontargeted analysis. They will be delighted, if Waters' mass spectra will be available for processing in their workflows.

Yours
Tobias

@GSH-09

GSH-09 commented Apr 29, 2023

> @Sanchezillana OK, MSConvertGUI is updated with a Waters DDA processing filter that will enable their newly added DDA processing mode. However it seems to do everything EXCEPT peak picking. I have to do CWT peak picking, then the results are good. Still waiting on Waters to tell me if the lack of peak picking is a bug.

Are there any special settings to correctly convert profile full scan data + DDA data with lockmass correction?

I exported .RAW from Unifi, used MSConvertGUI to create the mzML (downloaded yesterday; used "generic" and I didn't add any filters), and reviewed data using MzMine3. The full scan MS1 data was there, but the DDA MS2 data was not there. It was also unclear if MSConvert used lockmass data to correct for mass accuracy.

edit: looks like the MS2 data is present, but the scans are all being attributed to one incorrect precursor m/z (537.5, negative ESI), at least when using MzMine3.

edit2: I tried Waters DDA Processing, but got an error:
Failed - System.Exception: [pwiz::CLI::msdata::ReaderList::read] Unhandled exception: Incorrect acquisition type
at pwiz.CLI.msdata.ReaderList.read(String filename, MSDataList results, ReaderConfig config)
at MSConvertGUI.MainLogic.processFile(String filename, Config config, ReaderList readers, Map2 usedOutputFilenames)
at MSConvertGUI.MainLogic.Go(Config config, Map2 usedOutputFilenames)

@greencodefairy

Hi @pete-reay-waters, has there been some update regarding the Waters' issue(s) described in this thread?
Yours,
Laura

@pete-reay-waters
Contributor

Hi Laura @greencodefairy

We're currently investigating the issue with vendor peak picking crashing on DDA data; this is on our Kanban board now. Regarding the other two issues, I'll provide an update early next week. We haven't addressed them yet, but they definitely haven't been forgotten; I just need to check with the relevant person on Tuesday whether we have any more information.

Kind regards
Pete

@pete-reay-waters
Contributor

Hi, another brief update on this. We are now actively working on the crashes in centroiding of DDA data.

Re the other two mentioned above,

  • INFDAAP-113: DDA processing fails when non-MS functions are present. We feel this one will be a simple fix and we will address it in due course

  • INFDAAP-112: Wrong lockmass correction of precursor m/z. The fix for this one is more involved. I understand that the issue occurs when you have data which has been lockmass corrected at acquisition time ("realtime" lockmass correction), then the set mass is not corrected later; the complication is then to apply 'partial' lockmass processing in msconvert. As a workaround, we can suggest not using realtime lockmass correction, in which case the lockmass processing in msconvert will correct both the spectra and set mass.

We'd welcome feedback on the second issue - is the workaround acceptable, or is this an essential fix for anyone?

@septermus

Hi. I have just spent several months collecting a bunch of Synapt data for metabolomics in what would seem to be the worst of all combinations (MS1, MS2, centroid with on-the-fly lockmass correction, plus UV), and then found all of the above issues! Do we have a fix from Waters for the problem?

@pete-reay-waters
Contributor

Hi @septermus we have fixes for several issues including those two above - the fixes will be coming in November

@septermus

That will be great! My birthday is 3rd of November so I'll take it as a much welcome present to be able to get all my data analyzed properly! Will you be announcing it on this thread?

@pete-reay-waters
Contributor

Yes, once we've pushed the changes to pwiz's main branch we'll mention it in this thread.

Our expectation is the changes might be later in November than the 3rd, but hopefully you can still celebrate soon.

@septermus

Great. In the meantime, I don't suppose you would know why MassLynx 4.2 (which is for Win 10) won't plot TICs and chromatograms when files are loaded? I can see the spectra fine and do all manipulations, but no line-based plots are visible; the plot area is completely blank. This occurs for both centroid and profile data. The data is from Synapt GS and GSi, and it occurs for any file type (MS, MS2, MSe). This happens on every Windows 10 PC I have tried (4 now!). I tried different screen resolutions but the result is always the same. The data looks fine in MassLynx 4.1 on Win 7 PCs. Any advice greatly appreciated.
Regards
Simon

@pete-reay-waters
Contributor

@septermus sorry I have not worked on the MassLynx application so I'll have to direct you to the normal support channels for MassLynx - hopefully they'll be able to get you sorted. Let me know if you need help finding that.

@Sanchezillana

Sanchezillana commented Oct 26, 2023

Do you know this program https://microapps.on-demand.waters.com/home/showmarkdown/data-as-a-product ? They claim that it manages to convert Waters .raw to mzML and fixes the lockmass precursor issue.

I haven't checked it, but maybe you can use their approach for correcting the issues.

Ángel SI

@Sanchezillana

> Hi @septermus we have fixes for several issues including those two above - the fixes will be coming in November

This sounds great! Thank you for your hard work on this!

@pete-reay-waters
Contributor

> Do you know this program https://microapps.on-demand.waters.com/home/showmarkdown/data-as-a-product ? They claim that it manages to convert Waters .raw to mzML and fixes the lockmass precursor issue.
>
> I haven't checked it, but maybe you can use their approach for correcting the issues.
>
> Ángel SI

Thanks for sharing this link. The data-as-a-product app is made by the team I work on.

The data-as-a-product app actually uses msconvert internally to produce the mzML.

As a further layer, internally msconvert uses the Waters MassLynx SDK to read the Raw data.

In fact the changes we are talking about bringing into msconvert in November (in this thread) are already included in the data-as-a-product application, which has a private build of msconvert and the MassLynx SDK included. We are currently working on creating a new release of the MassLynx SDK, and will be submitting a PR to msconvert so that all users of msconvert get the benefits of these fixes.

I'd also mention some other value that the data-as-a-product app adds to the core functionality msconvert provides:

  • You can specify a quad isolation window to include in the mzML for DDA data, which some 3rd party software requires (it is down to the user to enter the lower and upper offset they wish to include in the file).

  • Additional UI

    • Wizard for queueing existing injections
    • Show status of export queue
    • Supports sample list settings for acquire and process workflows in MassLynx
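For readers unfamiliar with how a quad isolation window is expressed in mzML, here is a minimal sketch. mzML stores a target m/z plus a lower and an upper offset (CV terms MS:1000827, MS:1000828, and MS:1000829), and the window bounds follow from those three numbers. The class name and the example values are hypothetical; the offsets are whatever the user chooses to enter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsolationWindow:
    target_mz: float     # MS:1000827, isolation window target m/z
    lower_offset: float  # MS:1000828, extent below the target (positive number)
    upper_offset: float  # MS:1000829, extent above the target (positive number)

    @property
    def bounds(self):
        """Return (low, high) m/z limits of the window."""
        return (self.target_mz - self.lower_offset,
                self.target_mz + self.upper_offset)

    def contains(self, mz):
        low, high = self.bounds
        return low <= mz <= high


# Made-up asymmetric window around a precursor at m/z 455.29.
window = IsolationWindow(target_mz=455.29, lower_offset=0.5, upper_offset=1.5)
print(window.bounds, window.contains(456.0))
```

The offsets may be asymmetric, which is why two separate values are needed rather than a single width.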

@greencodefairy

Hey @pete-reay-waters, thanks for sharing. I signed up and tried out Waters DataConnect by importing existing files and successfully converted .raw to .mzML. Comparing the file in MZmine 3 with the same file converted with another script, Waters2mzML, it still does not assign MS2 precursor ions correctly (the same m/z 625.00 for all). Is this something that you are currently addressing?

Regards,
Laura

@greencodefairy

Hi @pete-reay-waters, converting existing .raw files works with Waters DataConnect mentioned above. However, when I compare this with Waters2mzML V1.2.0, the same result is achieved. I was hoping that the precursor ion issue would be solved with this, but with both tools every MS2 scan gets the same ion assigned (m/z 625). Is this something you are currently working on?

Regards,
Laura

@chambm
Member

chambm commented Dec 4, 2023

It sounds like you might be looking at MSe data or possibly targeted MS2 data (i.e. inclusion list). Doesn't sound like DDA if all the precursors are round numbers like 625.00. Can you share an example file?

@greencodefairy

That is true, I am using MSe data. DDA is not working for my experimental approach.

@chambm
Member

chambm commented Dec 4, 2023

Then you can basically ignore the precursor m/z. It's meaningless because the entire scan range was fragmented (for a high energy scan).
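A quick way to check for this situation in a converted file is to see whether every MS2 scan reports the same precursor m/z. A minimal heuristic sketch; the function name and the example values are hypothetical, and a targeted method with a single inclusion mass would trigger it as well.

```python
def looks_like_fixed_precursor(precursor_mzs, decimals=2):
    """True if more than one MS2 scan exists and all share one precursor m/z."""
    distinct = {round(mz, decimals) for mz in precursor_mzs}
    return len(precursor_mzs) > 1 and len(distinct) == 1


# Made-up precursor lists as they might be read from two mzML files.
mse_like = [625.0, 625.0, 625.0, 625.0]
dda_like = [455.29, 611.36, 302.12]
print(looks_like_fixed_precursor(mse_like), looks_like_fixed_precursor(dda_like))
```

If the check returns True, the precursor field is most likely a placeholder for the fragmented range rather than a selected ion.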

@greencodefairy

But will it work now with MS2 data? I also have one such data set, so I could test it.

@chambm
Member

chambm commented Dec 4, 2023

The set mass won't work yet with lockmass correction. That's going to be fixed by #2770 .

@chambm
Member

chambm commented Dec 5, 2023

OK, you should now be able to get lockmass corrected precursor mass either with or without DDA processing mode. That fix and the others Pete mentioned are available in the current ProteoWizard download.

@chambm chambm closed this as completed Dec 5, 2023