Interpreting MSFragger Output
By default, MSFragger generates a pepXML file (<run name>.pepXML) for every spectral file searched. For open searches and mass offset searches, a tab-separated file (<run name>.tsv) of the search hits is also generated by default. To generate this TSV file in other situations, select the 'TSV_PEPXML' output format in FragPipe (MSFragger tab, Advanced Output Options) or, if running from the command line, set output_format = tsv_pepXML in the fragger.params file.
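For command-line runs, the parameter change described above can be scripted. The following is a minimal sketch, assuming fragger.params stores one `name = value` pair per line, optionally followed by a `#` comment; the helper name `set_output_format` is hypothetical and not part of MSFragger.

```python
import re

# Assumption: each parameter sits on its own line as "name = value  # comment".
_OUTPUT_FORMAT = re.compile(r"^([ \t]*output_format[ \t]*=[ \t]*)(\S*)(.*)$", re.MULTILINE)


def set_output_format(params_text: str, value: str = "tsv_pepXML") -> str:
    """Return the params text with output_format set to `value`.

    The first existing output_format line is rewritten in place (any trailing
    comment is kept); if no such line exists, one is appended.
    """
    if _OUTPUT_FORMAT.search(params_text):
        return _OUTPUT_FORMAT.sub(
            lambda m: f"{m.group(1)}{value}{m.group(3)}", params_text, count=1
        )
    # Key not present: append it, adding a newline separator only if needed.
    sep = "" if (not params_text or params_text.endswith("\n")) else "\n"
    return f"{params_text}{sep}output_format = {value}\n"


# Illustrative input only; a real fragger.params contains many more settings.
example = "num_threads = 0\noutput_format = pepXML  # file format\n"
updated = set_output_format(example)
```

To apply it to a file, read the text, pass it through the function, and write the result back, keeping a backup of the original parameters file.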
The pepXML outputs can be used directly for downstream processing (e.g., FDR control, protein inference) with PeptideProphet in TPP. To view results, or to convert them to other peptide identification formats for pipelines and tools that do not support pepXML, we recommend first converting to the mzIdentML format with idconvert, which is part of the ProteoWizard package.
Please note: The pepXML files produced by MSFragger may have additional attributes (e.g., uncalibrated_precursor_neutral_mass and ion_mobility) not in the original schema. According to our tests, both PeptideProphet and Philosopher can process those additional attributes.
The output fields of the TSV file (if enabled) produced by MSFragger are listed below:
scannum
scan number of the MS/MS spectrum within the spectral file
precursor_neutral_mass
neutral mass of the identified peptide ion as measured (in Da)
retention_time
MS/MS spectrum retention time (in minutes)
charge
charge state of the identified peptide ion
hit_rank
position of the identification within all matches to the spectrum (1=highest scoring match)
peptide
stripped amino acid sequence of the identified peptide
peptide_prev_aa
amino acid directly preceding the identified peptide within the mapped protein sequence
peptide_next_aa
amino acid directly following the identified peptide within the mapped protein sequence
protein
complete FASTA header of the originating protein sequence
num_matched_ions
count of fragment ions matching the identified peptide sequence (includes mass-shifted ions from localization-aware matching)
tot_num_ions
total count of theoretical fragment ions from the peptide
calc_neutral_pep_mass
theoretical mass of the identified peptide ion (in Da)
massdiff
difference between measured and theoretical precursor neutral mass
num_tol_term
number of enzymatic termini (2=fully enzymatic, 1=semi-enzymatic, 0=non-enzymatic)
num_missed_cleavages
number of missed enzymatic cleavage sites in the identified peptide sequence
modification_info
position, identity, and mass of each identified modification specified as fixed or variable in the search (does not include mass differences from open or mass offset searches); multiple modifications are comma-separated
hyperscore
similarity score between observed and theoretical spectra, higher values indicate greater similarity
nextscore
similarity score (hyperscore) of second-highest scoring match for the spectrum
expectscore
expectation score of the peptide-spectrum match, lower values indicate higher likelihood
best_locs
peptide sequence with most probable delta mass (massdiff) locations indicated with lowercase letters
score_without_delta_mass
similarity score (hyperscore) if no delta mass is included on the peptide
best_score_with_delta_mass
similarity score (hyperscore) if the indicated delta mass (massdiff) is included on the peptide
second_best_score_with_delta_mass
similarity score (hyperscore) of the second-highest scoring match if the indicated delta mass (massdiff) is included on the peptide
delta_score
similarity score difference between best_score_with_delta_mass and second_best_score_with_delta_mass
alternative_proteins
FASTA headers of any additional proteins the identified peptide maps to, list separated by @@
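As an illustration of how these fields might be consumed, here is a minimal Python sketch that keeps only the top-ranked hit per spectrum, splits alternative_proteins on @@, and reads the candidate delta mass sites from the lowercase letters in best_locs. It assumes the TSV has a header row whose column names match the field names listed above; the function name `read_top_hits` and the sample rows are made up for the example.

```python
import csv
import io


def read_top_hits(handle):
    """Yield rank-1 rows from an MSFragger-style TSV as dicts.

    Assumes a header row with the column names documented above.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if int(row["hit_rank"]) != 1:
            continue  # keep only the highest scoring match per spectrum
        # Additional protein headers are joined with "@@".
        alt = row.get("alternative_proteins") or ""
        row["alternative_proteins"] = [p for p in alt.split("@@") if p]
        # Lowercase letters in best_locs mark likely delta mass sites;
        # positions are reported 1-based along the peptide sequence.
        best_locs = row.get("best_locs") or ""
        row["delta_mass_sites"] = [
            i for i, aa in enumerate(best_locs, start=1) if aa.islower()
        ]
        row["massdiff"] = float(row["massdiff"])
        row["hyperscore"] = float(row["hyperscore"])
        yield row


# Synthetic rows for illustration only (not real search results).
demo = (
    "scannum\thit_rank\tpeptide\tprotein\tmassdiff\thyperscore\tbest_locs\talternative_proteins\n"
    "101\t1\tPEPTIDEK\tprotA\t79.9663\t25.1\tPEPtIDEK\tprotB@@protC\n"
    "101\t2\tPEPTLDEK\tprotD\t79.9663\t12.0\t\t\n"
)
hits = list(read_top_hits(io.StringIO(demo)))
```

For a real file, open it with `open(path, newline="")` and pass the handle to the function in place of the in-memory demo text.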