head_vs_ht_ratio = head_vs_ht_ratio_list[each_read] IndexError: list index out of range #228

Open
huangxin0221 opened this issue Sep 29, 2024 · 4 comments

Comments

@huangxin0221

My command is:

simulator.py genome --ref_g $file \
    --model_prefix ${Model_Output} \
    --output ${output_dir1} \
    --number 300 \
    --max_len ${max_len} \
    --fastq --num_threads 1

But the following error occurs (latest version of NanoSim):

Traceback (most recent call last):
  File ".conda/envs/NanoSim/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File ".conda/envs/NanoSim/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File ".conda/envs/NanoSim/bin/simulator.py", line 1349, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range
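
The exception itself is ordinary Python behaviour: indexing a list at a position equal to or beyond its length raises IndexError. The toy sketch below is not NanoSim's code. The function name and values are hypothetical, and the assumption that the loop runs more iterations than the list has entries is only one possible way to reach this error.

```python
def simulate_reads(ratio_list, num_iterations):
    """Toy loop that looks up one pre-sampled ratio per iteration.

    Hypothetical illustration only; it does not reproduce NanoSim's logic.
    """
    picked = []
    for each_read in range(num_iterations):
        # Raises IndexError as soon as each_read >= len(ratio_list).
        picked.append(ratio_list[each_read])
    return picked


if __name__ == "__main__":
    ratios = [0.2, 0.5, 0.8]  # hypothetical pre-sampled values
    print(simulate_reads(ratios, 3))  # works: one ratio per iteration
    try:
        simulate_reads(ratios, 4)  # one iteration more than available ratios
    except IndexError as exc:
        print("IndexError:", exc)
```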

@lcoombe
Member

lcoombe commented Oct 1, 2024

Hi @huangxin0221,

Could you please provide us with more information, including:

  • Version of NanoSim you are using (is it v3.2.1?)
  • What model you are using (pre-trained or trained yourself - details of model training if the latter)
  • Your full standard output and error
  • Do you see the same error if you increase the specified number of threads?

Thank you for your interest in NanoSim!
Lauren

@RunpengLuo

RunpengLuo commented Oct 6, 2024

Hi @lcoombe,

I have the same issue. I'm using NanoSim version 3.2.1 (installed via Conda on Linux 64-bit), and this is the command I used:

simulator.py genome -n 40000 -min 10000 --seed 0 --fastq -t 8 \
    -rg ${CL0_CHR1A} -c ${PRETRAIN_MODEL} -o ${OUTDIR}

My genome is about 373 MB; it is a simulated genome derived from the GRCh38 reference chr1. I used the pre-trained model human_giab_hg002_sub1M_kitv14_dorado_v3.2.1. I've attached the output below. I tried both 8 and 16 threads, and both give the same error.

Thanks for your help!
John

running the code with following parameters:

ref_g ~/sim_grch38_chr1_simple/fasta/clone0.paternal.fa
model_prefix ~/NanoSim/pre-trained_models/human_giab_hg002_sub1M_kitv14_dorado_v3.2.1/training
out ~/sim_grch38_chr1_simple/fastq/round1/sim_error_clone0A_10x/error_clone0A_10x
number [40000]
perfect False
homopolymer False
dna_type linear
strandness None
sd_len None
median_len None
max_len inf
min_len 10000
fastq True
chimeric False
num_threads 8
2024-10-07 05:47:31: ~/miniconda3/envs/nanosim/bin/simulator.py genome -n 40000 -min 10000 --seed 0 --fastq -t 8 -rg ~/sim_grch38_chr1_simple/fasta/clone0.paternal.fa -c ~/NanoSim/pre-trained_models/human_giab_hg002_sub1M_kitv14_dorado_v3.2.1/training -o ~/sim_grch38_chr1_simple/fastq/round1/sim_error_clone0A_10x/error_clone0A_10x
2024-10-07 05:47:31: Read in reference 
2024-10-07 05:47:32: Read error profile
2024-10-07 05:47:32: Read KDF of unaligned reads
~/miniconda3/envs/nanosim/lib/python3.7/site-packages/sklearn/base.py:318: UserWarning: Trying to unpickle estimator KernelDensity from version 0.23.2 when using version 0.22.1. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
2024-10-07 05:47:32: Read KDF of aligned reads
2024-10-07 05:47:32: Read chimeric simulation information
2024-10-07 05:47:32: Start simulation of aligned reads
Process Process-7:0: Number of reads simulated >> 30001
Traceback (most recent call last):
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "~/miniconda3/envs/nanosim/bin/simulator.py", line 1349, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range
Process Process-8:
Traceback (most recent call last):
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "~/miniconda3/envs/nanosim/bin/simulator.py", line 1349, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range

2024-10-07 05:49:53: Start simulation of random reads

2024-10-07 05:49:55: Finished!
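
Note that the log above still ends with "Finished!" even though two worker processes exited with the IndexError, so the output may contain fewer reads than requested. Below is a minimal sketch for checking this. It assumes plain four-line FASTQ records and that the requested number covers all output FASTQ files combined. The function names are hypothetical helpers, not part of NanoSim, and the file paths must be supplied by the user.

```python
def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly four lines per record."""
    with open(path) as handle:
        num_lines = sum(1 for _ in handle)
    if num_lines % 4 != 0:
        raise ValueError(f"{path}: {num_lines} lines is not a multiple of 4")
    return num_lines // 4


def total_records(paths):
    """Sum the record counts over several FASTQ files."""
    return sum(count_fastq_records(p) for p in paths)


def is_short(paths, requested):
    """Return True when the files together hold fewer records than requested."""
    return total_records(paths) < requested
```

For the run above, `is_short([...], 40000)` would be called with the paths of the FASTQ files written under the `-o` prefix.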

@lcoombe
Member

lcoombe commented Oct 9, 2024

Hi @RunpengLuo,
Thanks for the detailed information and log. It was very helpful for tracing the issue!

I have a tentative fix in #233, which will hopefully be merged into the master branch later today. I will also post an update when the fix is included in a new release, but feel free to test that code in the meantime to see if it fixes your error!

@lcoombe
Member

lcoombe commented Oct 9, 2024

The fix has been included in the newly released v3.2.2.

4 participants
@lcoombe @RunpengLuo @huangxin0221 and others