Sub-pooling suggestions #33
Comments
Hi Alexis @alexis-sedg,

interesting approach! Do you have a reference or link to that sub-pooling procedure? Why is that the recommendation for that variant caller?

Yes, grenedalf can do that, using the

Hope that helps, so long
Hi Lucas,

Yeah, happily! The paper it's from is "A statistical method for the detection of variants from next-generation resequencing of DNA pools". My understanding is that they compare multiple replicate pools of the same population to distinguish sequencing errors from rare alleles. The explanation they provided was:

Excellent, I'll give the merge function a go! Read quality and quantity are variable across my data, even between samples from the same groups. Do you have any additional considerations or recommendations for dealing with that variability, or is it alright to run the merge function as is?

Thanks for your time,
Hi Alexis,

thanks for the details! Hm, interesting approach. However, I am not quite sure that the following is true in general:

That seems to be highly dependent on the sequencing technology being used, in the sense that (as far as I am aware as a non-wet-lab person) not all instruments have error profiles that depend on local sequence context. But well, if that method makes sense for your data, it seems reasonable :-)

We did in fact also work out the math for accounting for base qualities in the grenedalf statistical supplement, see the first of the two supplemental PDFs here. This is however currently not implemented, but at least in theory, it can be done. Just wanted to point it out to you in case that helps.

Also, CRISP seems to produce VCFs, but you said you are using BAMs? What is the process there?

As for the merge option: There is a caveat if you are using the

You also asked about considerations with respect to the variability. What do you mean there?

Hope that helps, so long
Hello,
I'm interested in using this pipeline for pooled data. However, when we designed the study, we used the sub-pooling method recommended by CRISP (a variant caller). So instead of having a single BAM file for a given population, I have multiple. I assume I can use the downstream VCF or mpileup as part of your pipeline, but I'd prefer to use the BAMs as inputs. Is there a way to use multiple BAMs for a given sample population in the grenedalf pipeline?
Thank you,
Alexis
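
For readers following this thread, here is a rough sketch of what merging sub-pools of one population could look like conceptually. It is not grenedalf's code or command line interface. It assumes that merging means summing the per-position base counts of the sub-pools, and all names (`merge_subpools`, `allele_frequency`) and numbers are made up for illustration.

```python
from collections import defaultdict

# Order of the base counts in each tuple (an assumption of this sketch).
BASES = ("A", "C", "G", "T")


def merge_subpools(subpools):
    """Sum base counts position by position across sub-pools.

    Each sub-pool is a dict mapping (chromosome, position) to a tuple of
    counts in the order given by BASES.
    """
    merged = defaultdict(lambda: [0, 0, 0, 0])
    for counts_by_pos in subpools:
        for pos, counts in counts_by_pos.items():
            for i, count in enumerate(counts):
                merged[pos][i] += count
    return {pos: tuple(counts) for pos, counts in merged.items()}


def allele_frequency(counts, base):
    """Frequency of one base among all counted bases at a position."""
    total = sum(counts)
    return counts[BASES.index(base)] / total if total else float("nan")


# Two hypothetical sub-pools of the same population with unequal read depth.
subpool_1 = {("chr1", 100): (18, 2, 0, 0)}   # depth 20, C frequency 0.10
subpool_2 = {("chr1", 100): (60, 20, 0, 0)}  # depth 80, C frequency 0.25

merged = merge_subpools([subpool_1, subpool_2])
freq_c = allele_frequency(merged[("chr1", 100)], "C")
```

Under this assumption, the merged C frequency is 22/100 = 0.22 rather than the unweighted mean of 0.175, so a sub-pool with higher read depth contributes more to the merged estimate. That may be relevant when depth varies a lot between sub-pools of the same population.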