Issue running Hatchet2 with hg38 reference_version #223

Open
wchukwu opened this issue Jul 17, 2024 · 5 comments

@wchukwu

wchukwu commented Jul 17, 2024

I get a "No snps found in normal!" error when I attempt to run Hatchet2 with reference_version=hg38. The reference FASTA files are hg38. Interestingly, when I run the relevant bcftools commands from the genotype_snps function locally, I obtain non-empty SNP files, so the failure only occurs as part of the hatchet run command. Also, if I erroneously set reference_version=hg19, the hatchet run generates non-empty SNP files.
I have included my hatchet.ini file here.
hatchet_gitissue.txt

@balabanmetin
Member

Hi,

Did you try chr_notation = False?

If the reference chromosomes have the prefix "chr", set this to True. Otherwise, set it to False.
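
As a quick way to decide which value applies, here is a minimal sketch that reads FASTA header lines and reports whether the record ids carry the "chr" prefix. The function names `fasta_record_ids` and `suggest_chr_notation` are hypothetical helpers written for this thread, not part of HATCHet, and the in-memory FASTA is toy data standing in for the real reference.

```python
import io


def fasta_record_ids(handle):
    """Yield the record id (first token after '>') of every FASTA header line."""
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def suggest_chr_notation(ids):
    """Return True if every id starts with 'chr', False if none do, None otherwise."""
    ids = list(ids)
    if not ids:
        return None  # nothing to judge from
    flags = [name.startswith("chr") for name in ids]
    if all(flags):
        return True
    if not any(flags):
        return False
    return None  # mixed notation, needs a manual look


# Toy FASTA; for a real file, pass open("reference.fa") instead (path is an assumption).
demo = io.StringIO(">chr1 some description\nACGT\n>chr2\nGGCC\n")
demo_ids = list(fasta_record_ids(demo))
demo_setting = suggest_chr_notation(demo_ids)
print(demo_ids, "-> chr_notation =", demo_setting)
```

A result of None means the ids are empty or mixed, in which case the headers are worth inspecting by hand.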

@balabanmetin
Member

[Image: Screenshot_20240821-225608]

There might be another reason for this issue, and I think it is the more likely one. Make sure your SNP panel uses the same chromosome notation (with or without the "chr" prefix) as the reference. The screenshot shows my setup for grch38. See "chrnotation" in the SNP filename.
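
One way to check this is to compare the chromosome names in the SNP panel with the record ids of the reference. The sketch below assumes the panel is a VCF whose first tab-separated column is CHROM; `vcf_chromosomes` is a hypothetical helper, and the panel and reference ids are toy data chosen to show a mismatch.

```python
import io


def vcf_chromosomes(handle):
    """Collect the distinct CHROM values from the data lines of a VCF."""
    chroms = set()
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        chroms.add(line.split("\t", 1)[0])
    return chroms


# Toy panel without the "chr" prefix and toy reference ids with it.
panel = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tA\tG\t.\t.\t.\n"
    "2\t200\t.\tC\tT\t.\t.\t.\n"
)
reference_ids = {"chr1", "chr2"}

panel_chroms = vcf_chromosomes(panel)
shared = panel_chroms & reference_ids
if shared:
    print("Panel and reference share", len(shared), "chromosome names")
else:
    print("No shared chromosome names: the notations differ")
```

If the two sets share no names, no panel position can match the reference, which would be consistent with an empty SNP call.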

@wchukwu
Author

wchukwu commented Aug 28, 2024

Hi Metin,

Unfortunately, this did not fix the issue. To troubleshoot, I ran multiple instances of hatchet to exhaust the combinations of the reference_version and chr_notation:

| Reference version | chr_notation | ini_file | stdout/stderr log |
|---|---|---|---|
| "" | True | norefT_git.txt | norefT.log |
| "" | False | norefF_git.txt | norefF.log |
| hg19 | True | hg19T_git.txt | hg19T.log |
| hg19 | False | hg19F_git.txt | hg19F.log |
| hg38 | True | hg38T_git.txt | hg38T.log |
| hg38 | False | hg38F_git.txt | hg38F.log |
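
For anyone who wants to reproduce this grid, the six combinations can be enumerated programmatically. This is only a sketch: it writes just the two keys discussed in this thread under a [run] section (the section name is taken from the <run.reference_version> notation quoted in this thread), so each output is a fragment, not a complete hatchet.ini. The helper `render_ini` and the label scheme are assumptions for illustration.

```python
import configparser
import io
import itertools

reference_versions = ["", "hg19", "hg38"]
chr_notations = [True, False]


def render_ini(reference_version, chr_notation):
    """Render a [run] fragment containing only the two settings under test."""
    config = configparser.ConfigParser()
    config["run"] = {
        "reference_version": reference_version,
        "chr_notation": str(chr_notation),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Labels follow the naming used in the table: noref/hg19/hg38 plus T or F.
variants = {
    f"{version or 'noref'}{'T' if chr_flag else 'F'}": render_ini(version, chr_flag)
    for version, chr_flag in itertools.product(reference_versions, chr_notations)
}

for label, text in variants.items():
    print(f"--- {label} ---")
    print(text)
```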

Both the hg38T and hg38F instances fail without calling any SNPs. Of all the runs, hg19T gets the farthest and fails at phasing, but again, hg19 is not the reference version I need.
I also don't provide a SNP file myself but rely on the default described in this comment: "# If unspecified, HATCHet selects a list of known germline SNPs based on <run.reference_version> and <run.chr_notation>".
Thanks for your help so far!

@balabanmetin
Member

Can you share:

  1. the BAM header for the matched normal
  2. the BAM header for the tumor
  3. the reference genome FASTA record ids (grep ">" on the reference you are using)
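
Once those three pieces are collected, they can be compared directly. The sketch below assumes the header text was obtained separately (for example with `samtools view -H`) and pulls the SN: names out of the @SQ lines; `sam_header_contigs` and `fasta_ids` are hypothetical helpers, and the headers and FASTA shown are toy data.

```python
def sam_header_contigs(header_text):
    """Return the SN: names from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def fasta_ids(fasta_text):
    """Return the first token of every '>' header line in FASTA text."""
    return [
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].split()
    ]


# Toy inputs: the normal matches the reference, the tumor lacks the "chr" prefix.
normal_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:900\n"
tumor_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:900\n"
reference_fasta = ">chr1 toy\nACGT\n>chr2 toy\nGGCC\n"

reference = set(fasta_ids(reference_fasta))
report = {
    "normal": set(sam_header_contigs(normal_header)) - reference,
    "tumor": set(sam_header_contigs(tumor_header)) - reference,
}
for sample, missing in report.items():
    # An empty set means every contig in that header exists in the reference.
    print(sample, "contigs absent from reference:", sorted(missing))
```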

@wchukwu
Author

wchukwu commented Aug 29, 2024
