Free spectra database formats (.msp and .msl) #3

Open
Stortebecker opened this issue Mar 22, 2017 · 10 comments

Comments

@Stortebecker
Owner

.msp seems to be the commonly used open format (text based) for metabolomics spectra. Free databases like the Fiehn database and the Golm database are provided in this format. The commercial NIST database is provided in a closed format (NIST db file), but can be converted to .msp using the "NIST MS search" software (freeware).

Yesterday we saw that each database uses a different syntax for their .msp files. All formats seem to contain exactly the same information, but the text file looks a bit different. Obvious differences we saw immediately were:

  • columns separated by ; or a tab character, depending on the database
  • usage of ( and ) in one of the .msp formats
  • differences in the usage of spaces
  • one of the formats uses spaces extensively, so files containing the very same data are about 1.5 times larger than in another .msp format
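The differences listed above are mostly about delimiters and whitespace, so a tolerant reader can probably handle all variants with one pattern. The following is a minimal sketch, not an existing converter: the function names `parse_peak_line` and `parse_msp` are hypothetical, and it assumes that each record consists of `Key: value` header lines, a `Num Peaks` line, and then peak lines holding m/z and intensity pairs, with records separated by blank lines.

```python
import re

# One "m/z intensity" pair. The separator between the two numbers may be
# spaces, a tab, a comma, or a colon (assumed variants). Anything between
# pairs, such as ";" or "(" and ")", is simply skipped by findall.
PAIR_RE = re.compile(r"(\d+(?:\.\d+)?)[ \t,:]+(\d+(?:\.\d+)?)")


def parse_peak_line(line):
    """Return all (mz, intensity) tuples found on one peak line."""
    return [(float(mz), float(inten)) for mz, inten in PAIR_RE.findall(line)]


def parse_msp(text):
    """Parse .msp-like text into a list of {'fields': {...}, 'peaks': [...]}."""
    records = []
    current = None
    in_peaks = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line is assumed to end the current record.
            current = None
            in_peaks = False
            continue
        if current is None:
            current = {"fields": {}, "peaks": []}
            records.append(current)
        if not in_peaks and ":" in line:
            # Header line; keys are lower-cased so "NAME" and "Name" match.
            key, value = line.split(":", 1)
            key = key.strip().lower()
            current["fields"][key] = value.strip()
            if key == "num peaks":
                in_peaks = True
        else:
            current["peaks"].extend(parse_peak_line(line))
    return records
```

Lower-casing the header keys and ignoring everything between number pairs is a deliberate choice here: it trades strict validation for the ability to read the semicolon, tab, and parenthesis styles with the same code.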

Manuel, can you please provide examples for each .msp (one spectrum will be sufficient for now - if possible exactly the same spectrum in the three formats)?

@korseby Do converters exist for .msp file formats? Are there international conventions for the format?

@korseby @blankclemens If no converter already exists, please coordinate who is going to write one.
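If a converter does have to be written, one possible approach is to rewrite every variant into a single layout. A minimal sketch of the writing side, assuming a spectrum is already available as a dictionary of header fields plus a list of (m/z, intensity) pairs; the function name `format_msp_record` and the chosen output layout (one space-separated peak per line) are assumptions for illustration, not an agreed convention.

```python
def _num(x):
    """Format a number without a trailing '.0' and without exponent notation
    for whole values, so large intensities stay readable."""
    x = float(x)
    return str(int(x)) if x.is_integer() else repr(x)


def format_msp_record(fields, peaks):
    """Return one record as text in a single, fixed .msp-like layout.

    fields: dict of header names to values, written in insertion order.
    peaks:  iterable of (mz, intensity) pairs.
    """
    peaks = list(peaks)
    lines = [f"{key}: {value}" for key, value in fields.items()]
    lines.append(f"Num Peaks: {len(peaks)}")
    for mz, intensity in peaks:
        lines.append(f"{_num(mz)} {_num(intensity)}")
    # A blank line terminates the record.
    return "\n".join(lines) + "\n\n"
```

Whether downstream tools accept this particular layout would still need to be checked against each tool.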

@Stortebecker
Owner Author

Also, there is a second free (?) format .msl. Do converters exist for .msp to .msl and vice-versa?

@korseby @blankclemens Which format is superior? Which format is more compatible with open source / closed source software? Please consider the .msl format if you start writing a converter.

@Stortebecker changed the title from ".msp is not equal to .msp" to "Free spectra database formats (.msp and .msl)" on Mar 22, 2017
@Stortebecker
Owner Author

Stortebecker commented Mar 22, 2017

@korseby Could you please help fill in my list of spectra databases?

@blankclemens Could you please check why this file is not displayed correctly?

@korseby
Collaborator

korseby commented Mar 22, 2017

@Stortebecker The MONA guys are using the format. AFAIK it is a plain-text format. See: http://mona.fiehnlab.ucdavis.edu/downloads under Download Spectra.

@blankclemens
Collaborator

blankclemens commented Mar 22, 2017

@Stortebecker Done. 8f005d6

@MSchlprt
Contributor

@Stortebecker Unfortunately, we do not have one particular spectrum in different .msp formats. However, we have the same database in both .msl and .msp (see Golm database).

Golm Database in .msp and .msl format
databases.zip

fiehnlib database (the best open-source one) in .msp, generated by the free lib2nist conversion tool: http://chemdata.nist.gov/mass-spc/ms-search/Library_conversion_tool.html
fiehn_alk_simplenames_nist.zip

In-house database built and exported with NIST MSSearch in .msp format (Flo mentioned that we can also export the commercial NIST mainlib to .msp; it should be in the same .msp variant, since it is also exported by NIST MSSearch)
CFM-Standards.zip

@MSchlprt
Contributor

@Stortebecker @blankclemens
Additionally, there is an updated fiehnlib database for GC-MS (MONA_export_GC-MS) in .msp, which has yet another .msp file structure.
MoNA-export-GC-MS-msp.zip

@Stortebecker
Owner Author

@korseby Do converters exist for .msp file formats? Are there international conventions for the format?

@Stortebecker
Owner Author

International conventions should be here, if they already exist: http://metabolomicssociety.org/

@Stortebecker
Owner Author

@korseby told me that W4M probably has a converter for the different formats, because they built up an internal database using Golm and other free databases.

@blankclemens Can you check if you can find this converter and maybe even a Galaxy wrapper for it? If it is not freely available, maybe the W4Ms are willing to share it? Also MONA must have some way of converting the data into their own format, maybe they are more cooperative?

@MSchlprt
Contributor

@Stortebecker @blankclemens @korseby

We tried different things with the metaMS.runGC tool (Galaxy - Freiburg VS1.1).
First: using subgroups to split the dataset into treated and untreated works. Upload and processing to obtain a peakspectra.msp (suitable for GOLM metabolomics annotation) was successful. However, the GOLM database is not that good for our application.

Second: the database option of metaMS.runGC.
We now know what the .msp file should look like in order to use the database option of metaMS.runGC.
Here is the layout of the example .msp database (unfortunately as .txt, since GitHub does not accept .msp uploads):

threeStdsDB.txt

Running metaMS.runGC with that database works. However, the database is only an example, so we get no annotations. We added an entry for glutamine manually, but we were still not able to annotate glutamine in the sample set.

Third: using the xcms.xcmsSet tool (Galaxy Version 2.1.0) for preprocessing prior to metaMS.runGC leads to an error.

I also shared my history with Björn, Clemens and Flo if you want to have a more detailed view.


4 participants