
How many threads and how much memory are required at the training stage? #236

Open
yaoxkkkkk opened this issue Oct 20, 2024 · 1 comment

Comments

@yaoxkkkkk

Thank you for developing this tool. I am using NanoSim to simulate ONT data. I ran the training stage with 32 threads and 256 GB of memory, but it reported an out-of-memory error. The command is:

	read_analysis.py genome \
		-i ZJYY_ont_filter.fq.gz \
		-rg nd.asm.fasta \
		-o ${home_dir}/01-data/ONT/${species}_training \
		--fastq \
		-t 32

The statistics of the ZJYY_ont_filter.fq.gz dataset are:

file                   format  type   num_seqs         sum_len  min_len   avg_len  max_len
ZJYY_ont_filter.fq.gz  FASTQ   DNA   1,544,988  43,308,647,713    2,000  28,031.7  246,468

When I run the command without the --fastq parameter, the training step finishes successfully.

@lcoombe
Member

lcoombe commented Oct 21, 2024

Hi @yaoxkkkkk,

The amount of memory required will really depend on the dataset that you are training on.
On my end, training using --fastq with the HG002 ONT dataset used for the latest pre-trained models required around 263 GB of RAM - so that could be why you are seeing those errors.
If you want to use --fastq, other options are to use one of our pre-trained models, or to train on a subset of your reads (see the sketch below).
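
If you do go the subsetting route, one option (just a sketch; it assumes seqkit is installed, and any read sampler would work equally well) is to randomly sample a fraction of the reads and point the same training command at the subset:

	# Randomly keep ~25% of reads (fixed seed for reproducibility); adjust -p to fit your memory budget
	seqkit sample -p 0.25 -s 11 ZJYY_ont_filter.fq.gz -o ZJYY_ont_subset.fq.gz

	# Re-run training on the subset with the same options as before
	read_analysis.py genome \
		-i ZJYY_ont_subset.fq.gz \
		-rg nd.asm.fasta \
		-o ${home_dir}/01-data/ONT/${species}_training \
		--fastq \
		-t 32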

Thank you for your interest in NanoSim!
Lauren
