Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a problem with ProteinMetadataManager hanging while trying to resolve protein metadata for peptide lists… #3218

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

bspratt
Copy link
Member

@bspratt bspratt commented Nov 14, 2024

…, which often aren't resolvable because they aren't named as actual proteins (e.g. "Library Peptides")

…olve protein metadata for peptide lists, which often aren't resolvable because they aren't named as actual proteins (e.g. "Library Peptides")
@bspratt bspratt requested review from chambm and brendanx67 November 14, 2024 19:49
Copy link
Contributor

@brendanx67 brendanx67 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you define a virtual function for getting the type of name you are looking for and use polymorphism?

{
var name = nodePepGroup.IsPeptideList ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ick. Can you possibly use polymorphism? Give this a base-class implementation and then have the FastaSequence version do what it needs to do to get the name you are truly looking for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not - it doesn't make any sense to ask for the i'th name in a peptide list, it only has one name because it doesn't have a FastaSequenceList. (For a peptide list that loop count is always 1.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not going to let this one go through until it has received more thought. Seems like clearly poor design to me. If the information is in a parallel list of names to the ProteinMetadataList, then you should instead be returning a list of structs with names and protein metadata instances, so that you don't need to rely on these parallel lists that you have to know so much to decipher. Improve the design, don't go digging like this based on knowledge you shouldn't be required to have from outside the PeptideGroupDocNode class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all about the protein association work, which I wasn't part of. Sounds like you want to kick this back to the original author?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure which part you're objecting to Brendan. To me it seems like he's just refactoring it a bit to fix a bug: checking for IsPeptideList before casting to FastaSequence, which is what the existing code does. What you suggest would seem to be a considerably bigger refactoring in order to avoid this logic:
var name = nodePepGroup.IsPeptideList ? nodePepGroup.Name : (nodePepGroup.PeptideGroup as FastaSequence)?.FastaSequenceList[i].Name; ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A peptide list may have a name that's actually a protein identifier for which we can get metadata.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to get this off my plate. Can you guys @chambm @brendanx67 reach a conclusion on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does something get marked a peptide list? Do we actually want it to get metadata if it happens to match a protein id or name, even if it has peptides that aren't related to the protein?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import Transition List and Spectral Library Add, for instance - though Import Transition List is the more likely case for a bunch of peptides sharing a group name which is recognizable by web services like UniProt.

By adding in plausible metadata we aren't making any assertions about the appropriateness of the group membership that weren't already implied by using an obviously recognizable protein name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that often as not, we aren't even hitting the internet - just parsing the metadata in names that are canonically structured strings)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants