selectSolution not selecting the best solution #46
Having now run TITAN for > 600 tumor samples I am finding that |
Hi @fpbarthel Thanks for reporting back on your experiences with TITAN. The Here are some alternative things to try:
Hope this helps, |
Thanks @gavinha, these are all excellent and I will try them out! I am not sure I understand what you are suggesting with (3). My current cohort consists of about 25% whole genomes and 75% exomes and currently I'm setting both
Either way, I will try out (1) and (2) first and let you know how this pans out; it may not be necessary to go that route.
Floris
UPDATE: I figured I would share another interesting case. A very common pattern in GBM is an amplification of chr7 in combination with a loss of chr10, often with deep amplifications of EGFR (chr7) and deep deletions of CDKN2A (chr9). I have a sample which underwent both WXS and WGS. Interestingly, for the WXS sample the ploidy 2 (likely correct) solution is chosen, but for the WGS sample a ploidy 3 solution is chosen:
WXS: Solution chosen by
WGS: Solution chosen by
WGS: Ploidy 2 / cluster 2 solution
WGS: Ploidy 2 / cluster 3 solution
|
The On a side note, the |
The copy number segments are usually nice to look at, but it is more difficult to assess the solutions from them. The plots that I like to use are the
When determining whether a sample is very likely genome doubled (ploidy3 solutions in your plots), this is what I look for:
Scenario 1: Copy neutral (HET and LOH) segments are both present
If you can find examples of BOTH of these, then it is very likely this solution is correct, as long as you don't have large homozygous deletions (see next scenario).
Scenario 2: Large homozygous deletions
There are large homozygous deletions spanning tens to hundreds of Mb. Then, this solution is likely incorrect and a higher ploidy solution should be considered. This scenario is handled in the |
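As a rough illustration of the Scenario 2 check described above, the sketch below flags homozygous deletions (total copy number 0) that exceed a length cutoff. It is not part of TITAN: the `Segment` class, the function name, and the 10 Mb default cutoff (taken from the low end of "tens to hundreds of Mb") are assumptions made for this example, and the segment coordinates are made-up toy values.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical segment record; not TITAN's own output format."""
    chrom: str
    start: int        # start position in bp
    end: int          # end position in bp
    copy_number: int  # total integer copy number


def large_homozygous_deletions(segments, min_length_bp=10_000_000):
    """Return segments with copy number 0 that are at least min_length_bp long.

    The 10 Mb default is an assumed cutoff, not a value defined by TITAN.
    """
    return [
        s for s in segments
        if s.copy_number == 0 and (s.end - s.start) >= min_length_bp
    ]


# Toy segments for illustration only.
segments = [
    Segment("chr9", 21_000_000, 22_500_000, 0),  # focal deletion, 1.5 Mb
    Segment("chr10", 1, 60_000_000, 0),          # ~60 Mb deletion, suspicious
    Segment("chr7", 1, 159_000_000, 3),          # gain, not a deletion
]

flagged = large_homozygous_deletions(segments)
for s in flagged:
    print(f"{s.chrom}:{s.start}-{s.end} is a large homozygous deletion")
```

A solution with any flagged segment would be a candidate for manual review against a higher ploidy solution, in line with the guideline above.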
Thank you, very helpful! |
Hi @gavinha, I am working alongside @fpbarthel on the same brain tumor data set and I tried out your advice to select the minimum S_Dbw Validity Index out of all runs across all ploidy and cluster initializations. It did seem to reduce the number of 4n solutions to what is more in line with previously published data. Thanks for your suggestion!
Nevertheless, whether we used the selectSolution choice or took the minimum S_Dbw validity index, we ran into a separate issue: we found ploidy differences > 1 in up to 33% of the samples for which we had both whole genome sequencing and whole exome sequencing. There did not seem to be a consistent trend of the WGS displaying a higher or lower ploidy than the WXS. Having both whole genome and whole exome data is a peculiar feature of our dataset, but it made us wonder whether you might have any thoughts on these discordances and/or ideas about combining two different data types? Ideally, we would generate solutions that have closer to 90% ploidy concordance.
Below is one discordant example where the WGS and WXS data were generated from the same DNA extraction/aliquot:
test-sample whole genome sequencing
test-sample exome sequencing
Updated with matching LOH plots. |
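A minimal sketch of the selection rule mentioned above (take the run with the lowest S_Dbw validity index across all ploidy and cluster initializations). The record layout, the field names, and the index values are hypothetical and only illustrate the idea; they are not TITAN output.

```python
def select_min_sdbw(runs):
    """Return the run with the smallest S_Dbw validity index.

    Each run is a dict with hypothetical keys:
    'ploidy_init', 'num_clusters', and 'sdbw'.
    """
    if not runs:
        raise ValueError("no runs to choose from")
    return min(runs, key=lambda run: run["sdbw"])


# Illustrative values only, not real results.
runs = [
    {"ploidy_init": 2, "num_clusters": 1, "sdbw": 0.52},
    {"ploidy_init": 2, "num_clusters": 2, "sdbw": 0.31},
    {"ploidy_init": 3, "num_clusters": 1, "sdbw": 0.47},
    {"ploidy_init": 4, "num_clusters": 3, "sdbw": 0.38},
]

best = select_min_sdbw(runs)
print(best["ploidy_init"], best["num_clusters"])
```

This pools every run into one ranking rather than picking a cluster count within each ploidy first, which is the behaviour the comment above describes.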
Hi @Kcjohnson and @fpbarthel
Thanks for sharing your experiences with TITAN and bringing up the concern regarding ploidy. Frankly, selecting the correct ploidy solution is a very challenging problem that I still encounter. Honestly, 33% of samples showing discordance is more or less what I would expect, considering 66% of samples fared better?
Looking at your plots... Depending on the tumor type, I generally begin by leaning towards diploid (ploidy2) solutions, unless there is some very obvious evidence that genome doubling has occurred. See the previous message for my guidelines (I should probably put this into the Wiki). Next, I consider the clonal cluster solutions. In this example, I would say that the WGS results (diploid) look more believable.
Of course, for some tumor types where genome doubling is more frequent, we can begin with different expectations. Usually for these frequently doubled tumors, I do notice the evidence for doubling. So ultimately, like you are already doing, manual inspection of solutions and results is recommended. I'm sorry I can't be more helpful.
Best,
Edit: I should also add that I would tend to believe the WGS results more because TITAN was designed for WGS. There are many WES-based tools available and so you can try to use an alternative method to see how the ploidy matches up. My guess is that you'll probably see the same or worse discordance. |
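For tracking the WGS/WXS discordance discussed above, here is a small sketch that computes the fraction of paired samples whose ploidy estimates differ by more than 1. The function name, the input layout, and the sample values are assumptions made for illustration.

```python
def ploidy_discordance(pairs, max_diff=1.0):
    """Return (fraction discordant, list of discordant sample names).

    pairs maps a sample name to (wgs_ploidy, wxs_ploidy).
    A pair is discordant when the absolute ploidy difference exceeds max_diff.
    """
    if not pairs:
        raise ValueError("no paired samples given")
    discordant = [
        name for name, (wgs, wxs) in pairs.items()
        if abs(wgs - wxs) > max_diff
    ]
    return len(discordant) / len(pairs), discordant


# Made-up ploidy estimates for three hypothetical paired samples.
pairs = {
    "sample_a": (2.1, 2.0),
    "sample_b": (3.9, 2.1),
    "sample_c": (2.0, 3.2),
}

fraction, names = ploidy_discordance(pairs)
print(f"{fraction:.0%} discordant: {names}")
```

Listing the discordant sample names alongside the fraction makes it easier to pull up exactly those cases for the manual inspection recommended above.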
I updated all previous posts to include these plots instead.
I've noticed the
So far I've been using the recommended values for
On the contrary, your suggestions have been extremely helpful to us in learning TITAN and in tweaking parameters to optimize results. We are happy to contribute examples to the community if it helps.
Floris |
Not a bug, I previously uploaded mismatching CNA and LOH plots. Sorry for the confusion! The new plots reflect the same sample. |
Could you tell me how to create this plot? I'm a beginner in R. |
In a test sample I'm finding that selectSolution is not selecting what should, biologically speaking, be the best solution. For some reason, it prefers this ploidy=4, clusters=3 solution:
Over a more meaningful ploidy = 2 solution like this one:
What parameters are used to select the optimal variant and how can we adjust this? LOH of chromosome arms 1p/19q (as in the second, non-selected solution) is a well recognized marker of this cancer type.
UPDATE: I guess it looks at the log-likelihood. Why do you think the first solution was scored higher than the second solution?