When running diamond blastp in multiprocessing mode, some processes hang or segfault non deterministically. #801

beazerj · 2024-03-29T18:42:50Z

I'm running diamond blastp in multiprocessing mode on multiple machines (gcloud c2-cpu-standard-60 machine, 60 CPUs, 260GB memory). Here is the specific command for the blastp search:

diamond blastp -q seqs.faa -d seqs -o out -f 6 qseqid sseqid corrected_bitscore --approx-id 50 --query-cover 90 -k1000 -c1 --more-sensitive -b6 --multiprocessing --tmpdir tmp --parallel_tmp --log

During the run, I'm observing that some of the processes either hang or segfault. After recovering with the --mp-recover option and restarting the alignment process, some of these processes complete (some may still fail). The hang or segfault typically occurs at the "Computing Alignments..." step. Peak RSS is 115GB.

I've run this command on anywhere from 8 to 72 nodes and at multiple levels of sensitivity. It doesn't seem to depend on the number of nodes, and I've seen it at every sensitivity level I've tried: fast, default and more-sensitive. I've tried both the v2.1.8 and v2.1.9 releases of diamond.

May be related to #732 and #747. The issue poster in #732 mentioned that their issue was resolved by downgrading to v2.0.15. If I were to make this downgrade, would it make a meaningful difference to the quality or speed of the alignment?

There could be some merit to the idea that this issue occurs when trying to align a small number of sequences. Running the diamond deepclust workflow with the same steps (fast, default, more-sensitive) but on a single machine with more memory (900GB), so that there are only 4 blocks instead of 12, I don't see the segfault issue, but this takes many days to complete.


fengqingling · 2024-04-22T01:33:24Z

I'm having a similar issue. I have over 10,000 analyses to run, so I use a Python for loop to run blastp on each one individually:
diamond blastp --more-sensitive -p 40 -q {input_file} -d {dmnd} --evalue 1e-5 -f 6 --out {result} --query-cover cover --subject-cover cover -k 0 --id 40
However, for some reason, diamond stalls on one of the queries, even though that query doesn't contain many sequences. The problem seems to be memory-related, as it only happens when my server is running other (non-diamond) tasks. Yet the server still has plenty of memory and CPU left over at that point.
My diamond version is v2.1.8.162.
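
For a loop like the one described above, it can help to record which runs hang and which ones crash, so the failing inputs can be retried or attached to a bug report. Below is a minimal sketch of that idea, not the poster's actual script: `run_with_watchdog`, `run_batch` and the timeout values are hypothetical names and numbers chosen for illustration, and the demo uses stand-in Python commands rather than diamond so it runs anywhere. Detecting a segfault via a negative return code assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_with_watchdog(cmd, timeout_s):
    """Run one command and classify how it ended.

    Returns "ok", "hang" (ran longer than timeout_s and was killed),
    "segfault" (terminated by SIGSEGV, POSIX only) or "failed".
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return "hang"
    if proc.returncode == 0:
        return "ok"
    # On POSIX, a negative return code -N means "terminated by signal N".
    if proc.returncode == -signal.SIGSEGV:
        return "segfault"
    return "failed"


def run_batch(jobs, timeout_s):
    """jobs maps a label to an argv list; returns {label: status}."""
    return {label: run_with_watchdog(cmd, timeout_s) for label, cmd in jobs.items()}


if __name__ == "__main__":
    # Stand-in commands (assumption): replace each argv list with the real
    # diamond blastp command for one input file.
    demo_jobs = {
        "finishes": [sys.executable, "-c", "pass"],
        "stalls": [sys.executable, "-c", "import time; time.sleep(10)"],
    }
    for label, status in run_batch(demo_jobs, timeout_s=1).items():
        print(label, status)
```

In real use, each argv list would be the diamond blastp command from the loop with the input, database and output paths filled in, and the timeout would need to be generous enough that a slow but healthy run is not mistaken for a hang.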

bbuchfink · 2024-10-21T18:46:28Z

I will try to reproduce the problem. Unfortunately, problems like this that only occur randomly are not easy to track down.

beazerj changed the title ~~When running diamond blastp in multiprocessing mode, some processes hang or seqfault non deterministically.~~ When running diamond blastp in multiprocessing mode, some processes hang or segfault non deterministically. Mar 29, 2024
