Running Trans-ABySS in threaded mode, but ABYSS seems to be running single-threaded #33

elissasoroj · 2025-01-12T14:56:24Z

Hello,

I am trying to process a large number of assemblies under a bit of a time crunch. I am running Trans-ABySS with the following command:

transabyss --pe krakennp_SRR20074402_out_1.fq.gz krakennp_SRR20074402_out_2.fq.gz krakennp_SRR20074403_out_1.fq.gz krakennp_SRR20074403_out_2.fq.gz krakennp_SRR20074404_out_1.fq.gz krakennp_SRR20074404_out_2.fq.gz krakennp_SRR29324688_out_1.fq.gz krakennp_SRR29324688_out_2.fq.gz krakennp_SRR29324689_out_1.fq.gz krakennp_SRR29324689_out_2.fq.gz krakennp_SRR29324700_out_1.fq.gz krakennp_SRR29324700_out_2.fq.gz krakennp_SRR29324701_out_1.fq.gz krakennp_SRR29324701_out_2.fq.gz -k 32 --name crichardii_ncbiCrHAM_transabyss_k32_out.fa --threads 18

Trans-ABySS seems to initialize fine:

Found Trans-ABySS directory at: /home/elissa/miniconda3/envs/abyss
Found Trans-ABySS `bin` directory at: /home/elissa/miniconda3/envs/abyss/bin
Found script at: /home/elissa/miniconda3/envs/abyss/bin/skip_psl_self.awk
Found script at: /home/elissa/miniconda3/envs/abyss/bin/skip_psl_self_ss.awk
Found `abyss-pe' at /home/elissa/miniconda3/envs/abyss/bin/abyss-pe
Found `MergeContigs' at /home/elissa/miniconda3/envs/abyss/bin/MergeContigs
Found `abyss-filtergraph' at /home/elissa/miniconda3/envs/abyss/bin/abyss-filtergraph
Found `abyss-junction' at /home/elissa/miniconda3/envs/abyss/bin/abyss-junction
Found `blat' at /home/elissa/miniconda3/envs/abyss/bin/blat
Found `abyss-map' at /home/elissa/miniconda3/envs/abyss/bin/abyss-map
# CPU(s) available:     80
# thread(s) requested:  18
# thread(s) to use:     18

But then it takes about 6 hours to read in one fq file at a time and discard reads (seems to be using these settings: ABYSS -k32 -q3 -e2 -E0 -c2 --coverage-hist=coverage.hist ...).

This seems like a parallelizable step to me, or is this just standard behavior?

I am getting this error at the very beginning of the run. I thought it was not that important since it did not seem to interfere with the process for others (e.g. #26). However, I see the parameter j=18 up above the error, so perhaps it is related?

CMD: bash -euo pipefail -c 'abyss-pe graph=adj --directory=/mnt/pinky/elissa/1n2n/transabyss/crichardii k=32 name=crichardii_ncbiCrHAM_transabyss_k32_out.fa E=0 e=2 c=2 j=18 crichardii_ncbiCrHAM_transabyss_k32_out.fa-1.fa crichardii_ncbiCrHAM_transabyss_k32_out.fa-1.adj q=3 se="/mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074402_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074402_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074403_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074403_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074404_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074404_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324688_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324688_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324689_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324689_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324700_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324700_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324701_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324701_out_2.fq.gz"'
make: Entering directory '/mnt/pinky/elissa/1n2n/transabyss/crichardii'
dirname: missing operand
Try 'dirname --help' for more information.
ABYSS -k32 -q3 -e2 -E0 -c2    --coverage-hist=coverage.hist -s crichardii_ncbiCrHAM_transabyss_k32_out.fa-bubbles.fa  -o crichardii_ncbiCrHAM_transabyss_k32_out.fa-1.fa  /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074402_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074402_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074403_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074403_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074404_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR20074404_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324688_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324688_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324689_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324689_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324700_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324700_out_2.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324701_out_1.fq.gz /mnt/pinky/elissa/1n2n/kraken/crichardii/krakennp_SRR29324701_out_2.fq.gz

Any help is greatly appreciated. Sorry if I'm missing something obvious.

~Elissa


kmnip · 2025-01-12T20:17:10Z

Hi @elissasoroj ,

If I remember correctly, ABySS (without the Bloom filter de Bruijn graph) can only read multiple read files at the same time when it is run with MPI. Trans-ABySS doesn't run ABySS with MPI enabled.
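
For reference, standalone abyss-pe takes an `np` parameter for the number of MPI processes. The sketch below only prints a command modelled on the stage-1 parameters in the log above, with `np` added; it does not run anything. It assumes an ABySS build with MPI support, and whether the later Trans-ABySS stages would accept output produced this way is not established in this thread. The function name `build_mpi_cmd` and the file names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: print (do not run) an abyss-pe command that mirrors the
# stage-1 parameters shown in the log, with np=<N> added for MPI.
# Assumption: ABySS was built with MPI support.
# build_mpi_cmd is a hypothetical helper, not part of ABySS or Trans-ABySS.
build_mpi_cmd() {
    local np="$1" name="$2"
    shift 2
    # The remaining arguments are the read files, joined with single spaces.
    local reads="$*"
    # j is set to the same value as np here purely for illustration.
    printf 'abyss-pe graph=adj np=%s j=%s k=32 name=%s E=0 e=2 c=2 q=3 %s-1.fa %s-1.adj se="%s"\n' \
        "$np" "$np" "$name" "$name" "$name" "$reads"
}

# Example with placeholder file names.
build_mpi_cmd 18 example_k32 reads_1.fq.gz reads_2.fq.gz
```

Printing the command first makes it easy to compare against the `CMD:` line that Trans-ABySS logs before deciding whether to run it by hand.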

The dirname: missing operand error is indeed the same issue as #26. The solution is in my comment here:
#26 (comment)

j=18 tells abyss-pe to use 18 threads in its workflow. I don't think that is related to this issue.
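
To check what the read-loading step is actually doing, one option on Linux is to read the thread count of the running ABYSS process from /proc. This is a minimal sketch, assuming a Linux /proc filesystem; the function name `thread_count` is hypothetical.

```bash
#!/usr/bin/env bash
# Print the number of threads of a process, read from the "Threads:" field
# of /proc/<pid>/status (Linux only).
# thread_count is a hypothetical helper, not part of ABySS or Trans-ABySS.
thread_count() {
    local pid="$1"
    awk '/^Threads:/ { print $2 }' "/proc/${pid}/status"
}

# Example: thread count of the current shell.
thread_count "$$"
```

For a running assembly, `thread_count "$(pgrep -x ABYSS | head -n 1)"` would report the figure for the first matching process, assuming the process is named exactly as in the log above.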

Do you have to use Trans-ABySS in your work?
If not, you can try RNA-Bloom: https://github.com/bcgsc/RNA-Bloom
I developed it for reference-free transcriptome assembly. It should work well for your time crunch.

Ka Ming

elissasoroj · 2025-01-12T20:38:04Z

Hi Ka Ming,

Thanks for the quick reply! I am currently testing different approaches, so I will give RNA-Bloom a try!

I'd still like to try out Trans-ABySS if possible. Is there a setting that will allow me to run ABySS in parallel? For example, is there a way to run it with the Bloom filter de Bruijn graph?

Thanks again,
~Elissa

kmnip · 2025-01-12T21:42:13Z

I tried the Bloom filter DBG approach in ABySS a long time ago, but it produced a worse transcriptome assembly at the time. I decided to stick with the original DBG approach. So, I wouldn't recommend switching to the Bloom filter DBG.
Sorry, I don't think there is a solution to the issue.

elissasoroj · 2025-01-12T22:06:48Z

Alright, thank you so much! I appreciate it!

kmnip self-assigned this Jan 12, 2025

kmnip added the question label Jan 12, 2025
