Error when no reads are recorded in tumor.bed before/after centromere location #198

Open
ronkesm opened this issue Nov 14, 2023 · 2 comments

Comments


ronkesm commented Nov 14, 2023

Hi,

In a number of my samples (which I am currently running through HATCHet2 individually), I am experiencing a bug that appears to occur when either pre- or post-centromeric reads on specific chromosomes aren't recorded in the baf/tumor.1bed file.

Here's an example of an error message:

Traceback (most recent call last):
  File "/data/home/hmz251/.conda/envs/HATCHet2/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/data/home/hmz251/.conda/envs/HATCHet2/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/data/BCI-WrenchLab/globus/WGS/BAM/BASE_RECALIBRATED/hatchet-2.0.0/src/hatchet/utils/count_reads.py", line 539, in run_chromosome_wrapper
    run_chromosome(*param)
  File "/data/BCI-WrenchLab/globus/WGS/BAM/BASE_RECALIBRATED/hatchet-2.0.0/src/hatchet/utils/count_reads.py", line 535, in run_chromosome
    raise e
  File "/data/BCI-WrenchLab/globus/WGS/BAM/BASE_RECALIBRATED/hatchet-2.0.0/src/hatchet/utils/count_reads.py", line 509, in run_chromosome
    last_idx_p = np.argwhere(thresholds > centromere_start)[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0

In this sample, the error affects chromosome 5:

Error in chromosome chr5: index 0 is out of bounds for axis 0 with size 0

When I inspect the baf/tumor.1bed file, I see that the last read on chr5 is
chr5 34292229 PD44709a 29 35

This position is before the hg38 coordinates for the start of the centromere on chr5, and there are no reads past the centromere. It's a strange bug, but it occurs in about 1/3 of the samples I run through HATCHet2 (so around 10-12).
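For anyone trying to reproduce this, here is a minimal sketch of why the line in the traceback fails, assuming `thresholds` is a NumPy array of positions that all fall before `centromere_start`. The values below are placeholders (the centromere coordinate is not the real hg38 value), and `first_index_above` is a hypothetical helper, not part of HATCHet2. It only illustrates the empty-result case and one possible guard; it is not a proposed fix for `count_reads.py`.

```python
import numpy as np


def first_index_above(thresholds, position):
    """Return the index of the first threshold greater than `position`.

    Returns None instead of raising when no threshold exceeds `position`.
    Hypothetical helper for illustration only.
    """
    hits = np.argwhere(thresholds > position)
    if hits.size == 0:
        # No threshold lies past `position`, e.g. no SNPs after the centromere.
        return None
    return int(hits[0][0])


# Placeholder thresholds, all before the assumed centromere start.
thresholds = np.array([10_000_000, 20_000_000, 34_292_229])
centromere_start = 46_000_000  # placeholder, not the real hg38 coordinate

try:
    # Same expression as in the traceback: argwhere returns an empty array,
    # so indexing its first row raises IndexError.
    np.argwhere(thresholds > centromere_start)[0][0]
except IndexError as err:
    print("IndexError:", err)

# The guarded version reports the missing arm instead of crashing.
print(first_index_above(thresholds, centromere_start))
```

Whether returning `None` (or skipping the arm) is safe depends on how the later steps use the index, so this may just move the failure downstream.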

Haven't had any issues calling CNs with any other callers on this sample (or the others where this issue appears). Sequenza output below:

[Image: Sequenza output]

Happy to provide a subset of the sample to reproduce.

@istvankleijn

Did you manage to resolve this? I am encountering the same issue on three chromosomes in the sample I am investigating.

I also get many warnings "[W::tbx_parse1] Coordinate <= 0 detected. Did you forget to use the -0 option?" shortly before the error, but these do not seem to be specific to the three chromosomes with SNPs called only in part of one arm.


ronkesm commented Aug 15, 2024

Hi @istvankleijn,

No, unfortunately not... this was a while ago, but I remember trying to hack together a couple of workarounds (like manually imputing post-centromeric reads into the bed file), and it ended up crashing downstream of that step anyway. It doesn't seem to be strictly related to sample quality either.

I got HATCHet2 to work in samples where I already had prior confirmation of WGD or suspected WGD (e.g. from cytogenetics reports or VAF distributions), so I left it at that and ran another copy-number caller on the remaining samples I had.
