ntm profiler crashed when we run large fastq files. #22

Open
yujun2017 opened this issue Nov 30, 2022 · 3 comments

Comments

@yujun2017

bash-5.1# more test1.errlog.txt

ntm-profiler error report

  • OS: linux
  • ntm-profiler version: 0.2.0
  • pathogen-profiler version: 2.0.0
  • Program call:
{'no_clean': False, 'read1': 'SRR315266_1.fastq.gz', 'read2': 'SRR315266_2.fastq.gz', 'bam': None, 'fasta': None, 'vcf': None, 'platform': 'illumina', 'resistance_db': None, 'external_resistance_db': None, 'species_db': 'ntmdb', 'external_species_db': None, 'prefix': 'test1', 'dir': 'NTM_results', 'csv': False, 'txt': True, 'add_columns': None, 'add_mutation_metadata': False, 'call_whole_genome': False, 'mapper': 'bwa', 'caller': 'freebayes', 'calling_params': None, 'min_depth': 10, 'af': 0.1, 'reporting_af': 0.1, 'coverage_fraction_threshold': 0, 'missing_cov_threshold': None, 'species_only': False, 'no_trim': False, 'no_flagstat': False, 'no_clip': True, 'no_delly': False, 'no_species': False, 'no_mash': False, 'output_kmer_counts': False, 'add_variant_annotations': False, 'threads': 1, 'verbose': 0, 'no_cleanup': False, 'delly_vcf': None, 'func': <function main_profile at 0x7f656f009b80>, 'software_name': 'ntm-profiler', 'tmp_prefix': 'e487d483-7948-4a04-860a-8e0e114ac317', 'files_prefix': 'NTM_results/e487d483-7948-4a04-860a-8e0e114ac317'}

Traceback:

  File "/opt/conda/envs/ntm-profiler/bin/ntm-profiler", line 322, in <module>
    args.func(args)
  File "/opt/conda/envs/ntm-profiler/bin/ntm-profiler", line 89, in main_profile
    species_prediction = pp.speciate(args)
  File "/opt/conda/envs/ntm-profiler/lib/python3.9/site-packages/pathogenprofiler/cli.py", line 64, in speciate
    kmer_dump = fastq_class.get_kmer_counts(args.files_prefix,threads=args.threads)
  File "/opt/conda/envs/ntm-profiler/lib/python3.9/site-packages/pathogenprofiler/fastq.py", line 128, in get_kmer_counts
    run_cmd(f"kmc {bins} -t{threads} -sf{threads} -sp{threads} -sr{threads} -k{klen} @{tmp_file_list} {tmp_prefix} {tmp_prefix}")
  File "/opt/conda/envs/ntm-profiler/lib/python3.9/site-packages/pathogenprofiler/utils.py", line 391, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,stderr.decode()))

Value:

Command Failed:
set -u pipefail; kmc  -t1 -sf1 -sp1 -sr1 -k31 @73dd2415-1693-41ed-8bbe-2b6cf16e4806.list 73dd2415-1693-41ed-8bbe-2b6cf16e4806 73dd2415-1693-41ed-8bbe-2b6cf16e4806
stderr:
*****************
Stage 1: 100%
Stage 2: 100%
@yujun2017
Author

After more testing, I found that the issue comes from the memory setting. The kmc default is 12 GB, which is too large for a web server. Could ntm-profiler add a parameter to set the kmc memory limit?

@yujun2017
Author

Jody, happy new year. Do you have any update on NTM Profiler? We are still waiting for the new release. Thanks!

John

@jodyphelan
Owner

Hi John, happy new year to you too!

Sorry, I was a bit delayed in releasing a new version before the holidays, but I have started the process of getting it into bioconda. I added a --ram setting to ntm-profiler and tb-profiler to set a maximum memory limit. I also noticed that kmc sometimes hangs, so I've added another k-mer counting option: dsk.
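
For anyone hitting the same crash, here is a minimal sketch of how a RAM cap could be passed through to kmc. It mirrors the command shown in the traceback and adds kmc's `-m` option (maximum RAM in GB, default 12). The function name `build_kmc_command` and the `ram_gb` parameter are hypothetical and only for illustration; this is not the actual pathogen-profiler code, and the real `--ram` implementation may differ.

```python
def build_kmc_command(file_list, prefix, threads=1, klen=31, ram_gb=None):
    """Assemble a kmc call like the one in the traceback, optionally capping RAM.

    Hypothetical helper: ram_gb is an assumed parameter name, not the
    library's real API. kmc accepts -m<size> with size in GB (1 to 1024).
    """
    if ram_gb is not None and not 1 <= ram_gb <= 1024:
        raise ValueError("kmc -m expects a value between 1 and 1024 (GB)")

    parts = ["kmc"]
    if ram_gb is not None:
        # Without -m, kmc falls back to its 12 GB default.
        parts.append(f"-m{ram_gb}")
    parts += [
        f"-t{threads}",
        f"-sf{threads}",
        f"-sp{threads}",
        f"-sr{threads}",
        f"-k{klen}",
        f"@{file_list}",  # file containing the list of input FASTQ paths
        prefix,           # output database prefix
        prefix,           # working directory, as in the original command
    ]
    return " ".join(parts)


if __name__ == "__main__":
    # Example: cap kmc at 4 GB on a small web server.
    print(build_kmc_command("reads.list", "tmp_prefix", ram_gb=4))
```

Leaving `ram_gb` as `None` reproduces the original command unchanged, so the cap stays opt-in.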
