Improve pVACvector graph building algorithm #1163
base: staging
TODO:
- I like the approach. Searching for all problematic junctions avoids getting caught in local minima, where we've got a bunch of good junctions but get stuck on a handful that just don't like to be connected, for whatever reason. I suspect that this will do more work than is necessary in some cases (if there is only one problematic junction, and a simple spacer or clip fixes it, you wouldn't need to check the rest of the graph). That said, trying to fix them one by one and backtracking if you hit a dead end means the traversal algorithm gets complex quickly. As long as this runs in a reasonable amount of time, I think it's the right approach.
- When clipping, are we somehow ensuring that we don't clip out key parts of the core epitope? (e.g. if the mutation is at amino acid 2 of the gene, we probably can't be clipping from the left). It's a rare case, and maybe just a todo item: look into handling that in the future. I'm reasonably sure it would require some additional information being passed into pVACvector that isn't there now.
- I also wonder if we should go back and run it on a few real cases to see whether this successfully improves the speed and number of successful vectors. I think it should!
Looks good
Previously, pVACvector would clip “problematic” peptides (i.e. peptides without incoming or outgoing good junctions in the graph, where a good junction is a connection to another peptide that creates no novel junctional neoantigens). It would then attempt to build a whole new graph with the updated set of peptides. As a result, not all possible combinations of clipped and non-clipped peptides were tested. Additionally, building a whole new graph after clipping is non-ideal, since valid junctions that were previously discovered are ignored.
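For illustration, the notion of a “problematic” peptide described above can be sketched as follows. This is a minimal sketch, not pVACvector's actual code: the function name `find_problematic_peptides`, the edge-list representation, and the peptide labels are all hypothetical, and it assumes a peptide counts as problematic when it lacks either an incoming or an outgoing good junction.

```python
def find_problematic_peptides(peptides, good_junctions):
    """Return peptides that lack an incoming or an outgoing good junction.

    `good_junctions` is a list of (left, right) pairs, meaning that `left`
    can be placed directly before `right` without creating a novel
    junctional neoantigen. Both names are hypothetical, for illustration.
    """
    has_outgoing = set()
    has_incoming = set()
    for left, right in good_junctions:
        has_outgoing.add(left)
        has_incoming.add(right)
    # Assumption: missing either direction makes the peptide problematic.
    return [p for p in peptides
            if p not in has_outgoing or p not in has_incoming]


# Toy example: D can follow A, but nothing can follow D.
peptides = ["A", "B", "C", "D"]
good_junctions = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
problematic = find_problematic_peptides(peptides, good_junctions)
print(problematic)
```

In this toy graph only `D` is flagged, so only `D` would be a candidate for clipping or spacers, while the good junctions among `A`, `B`, and `C` remain available for reuse.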
This PR updates the algorithm to work roughly as follows (biggest updates bolded):
As a result, pVACvector should have a higher likelihood of finding a result sooner or finding a result at all.
Closes #1087