Questions on data bundle and SV genotyping #46

Open
Han-Cao opened this issue Nov 5, 2022 · 1 comment

Comments

@Han-Cao

Han-Cao commented Nov 5, 2022

Hi @jonassibbesen,

Thanks for providing this great tool.

I am trying to download the data bundle from the link you provide, but the download always fails after about 1 GB, no matter which tool I use. For example, wget keeps raising the error "Connection closed at byte 1073725440. Retrying."

Would you consider providing an alternative download link? Otherwise, could you confirm whether it is OK to generate the GRCh38 reference data myself as follows:

  1. Reference genome: put chr1-22, X, Y, and the random contigs in canon.fa; put the chrUn and decoy contigs in decoy.fa; skip the alt contigs and HLA
  2. Variant prior VCF: a sequence-resolved, site-only VCF
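
The split described in item 1 could be sketched roughly as below. This is only an illustration of the proposal in this comment, not a procedure confirmed by the BayesTyper maintainers. It assumes UCSC-style GRCh38 contig names (for example chr1_KI270706v1_random, chrUn_KI270302v1, chr1_KI270762v1_alt, HLA-A*01:01:01:01); the function names classify_contig and split_fasta are hypothetical. Contigs the list does not mention, such as chrM or chrEBV, are deliberately left unassigned.

```python
import re

# Assumption: UCSC-style GRCh38 contig names. Primary chromosomes are
# chr1..chr22, chrX and chrY.
PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y)$")


def classify_contig(name):
    """Return 'canon', 'decoy', 'skip' or 'unassigned' for a contig name."""
    if name.startswith("HLA") or name.endswith("_alt"):
        return "skip"
    if name.endswith("_decoy") or name.startswith("chrUn_"):
        return "decoy"
    if PRIMARY.match(name) or name.endswith("_random"):
        return "canon"
    # Not covered by the list in this comment (e.g. chrM, chrEBV).
    return "unassigned"


def split_fasta(lines):
    """Route FASTA lines into (canon_lines, decoy_lines) by contig name."""
    canon, decoy = [], []
    target = None  # list receiving the current record, or None to drop it
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            group = classify_contig(parts[0] if parts else "")
            if group == "canon":
                target = canon
            elif group == "decoy":
                target = decoy
            else:
                target = None
        if target is not None:
            target.append(line)
    return canon, decoy
```

With an open reference FASTA, the two returned lists could be written to canon.fa and decoy.fa with writelines; for a full genome it may be preferable to write each line as it is read instead of collecting lists in memory.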

By the way, if I only want to genotype large SVs detected in a long-read sequencing callset, can I skip the variant calling step and estimate SV genotypes for short-read sequencing samples using the SV callset plus the SNV/indel prior file?

Thanks,
Han

@jonassibbesen
Contributor

Hi Han,

Thanks for writing. I just tried to download the GRCh38 data bundle and was able to do so without a problem. Could you try again now? If you still have problems, I have also put the GRCh38 bundle on Google Drive: https://drive.google.com/file/d/1ioTjLFkfmvOMsXubJS5_rwpfajPv5G1Q/view?usp=sharing

Regarding the SVs called from long reads and the prior: this is not something I have tried myself. I have seen from other studies that BayesTyper generally does better on SVs predicted using short reads than on SVs predicted using long reads. This is likely because breakpoints from short reads are more accurate, which is important for BayesTyper.

Best,

Jonas
