-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove code for ignoring ambiguous spectra from MSFReader.cpp #3111
Remove code for ignoring ambiguous spectra from MSFReader.cpp #3111
Conversation
…k/20240617_MsfAmbiguousSpectra
…her than keeping the best-scoring ambiguous spectrum. Add tiny-msf-keep.check for the expected results when "-K" flag is specified to keep ambiguous spectra.
…k/20240617_MsfAmbiguousSpectra
…ithub.com/ProteoWizard/pwiz into Skyline/work/20240617_MsfAmbiguousSpectra
…k/20240617_MsfAmbiguousSpectra
…ithub.com/ProteoWizard/pwiz into Skyline/work/20240617_MsfAmbiguousSpectra
…k/20240617_MsfAmbiguousSpectra
…k/20240617_MsfAmbiguousSpectra
Could this be a performance issue with large pdResult files? Perhaps this internal way of eliminating ambiguous spectra is faster than relying on BiblioSpec's generic method? |
I don't think it was done for performance reasons. There are a couple of problems with this code as it is:
I am, however, thinking that the correct fix for this behavior is not to actually remove this code like in this pull request. I think the way I have things in this pull request might make the behavior worse in the cases where there are some PSMs that clearly scored worse compared to the best PSMs. I will investigate that. |
Actually, I think that the code changes that I have in this pull request are what we should go with. |
Yes. That sounds like the entire point of BlibFilter and the redundancy is
expected before it is run. We even name the file .redundant.blib
…On Sun, Sep 8, 2024 at 7:45 PM nickshulman ***@***.***> wrote:
I am, however, thinking that the correct fix for this behavior is not to
actually remove this code like in this pull request. Instead, maybe the
code should keep all PSMs that score as well as the best PSM and discard
all PSMs that score worse. In this way, the scores that are tied for best
place will be kept if "-K" was specified and discarded if "-K" was not
specified.
I think the way I have things in this pull request might make the behavior
worse in the cases where there are some PSMs that clearly scored worse
compared to the best PSMs. I will investigate that.
Actually, I think that the code changes that I have in this pull request
are what we should go with.
This other idea that I was proposing here where we only keep the best
scoring PSM among multiple PSMs that scored better than the cutoff would be
very different behavior from what we do for other types of search results.
—
Reply to this email directly, view it on GitHub
<#3111 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACYBWUAPATC4OPXFACGH5RDZVUDVHAVCNFSM6AAAAABMHKJWZ2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMZXGAYDGNBYGU>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
tiny-msf-keep seems unreliable. Any idea why? |
It looks like the filename is getting mangled somehow, or is the weird filename part of the test?
|
Unless marked as unsupported, we test BiblioSpec for Unicode support by mangling the filenames internally. |
Oops actually it's opt-in, not opt-out. Looks like you just copied the existing msf test which was opted in. I can't think why your changes would have anything to do with Unicode support though. |
I created PR #3342 which adds some logging code in MSFReader::openFile. I imagine "tiny-msf-keep" is failing intermittently because it uses the same input files as the test that comes right before it ("tiny-msf") and something is not getting copied correctly as part of the test setup. |
Or they're both trying to run in parallel and it's a race condition. Let me add a dependency and see if that solves it. |
A user had a problem that spectra which matched more than one peptide in a .pdresult file were not being put into the .blib even if "Keep ambiguous spectra" was checked.
https://skyline.ms/announcements/home/support/thread.view?rowId=64796
MSFReader has its own code for preventing ambiguous spectra from being included in the .blib which does not pay attention to the "-K" command-line parameter.
As far as I can tell, there is no reason for that code to exist, so I figured that we should probably just remove it.