When running diamond blastp in multiprocessing mode, some processes hang or segfault non deterministically. #801

Open
beazerj opened this issue Mar 29, 2024 · 2 comments

Comments

@beazerj

beazerj commented Mar 29, 2024

I'm running diamond blastp in multiprocessing mode on multiple machines (gcloud c2-cpu-standard-60 machines, each with 60 CPUs and 260 GB memory). Here is the specific command for the blastp search:

diamond blastp -q seqs.faa -d seqs -o out -f 6 qseqid sseqid corrected_bitscore --approx-id 50 --query-cover 90 -k1000 -c1 --more-sensitive -b6 --multiprocessing --tmpdir tmp --parallel_tmp --log

During the run, some of the processes either hang or segfault. After recovering with the --mp-recover option and restarting the alignment, some of these processes complete, though some may still fail. The hang or segfault typically occurs at the "Computing Alignments..." step. Peak RSS is 115 GB.
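To tell hangs apart from segfaults when collecting reports like this, each worker can be launched through a small wrapper. The following is a minimal sketch, assuming a POSIX system and that the process is started directly rather than through a shell; `classify_run` and the timeout value are hypothetical and not part of diamond.

```python
import signal
import subprocess


def classify_run(cmd, timeout_s):
    """Run cmd (a list of arguments) and return 'ok', 'hang', 'segfault' or 'error'."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if it has not finished after timeout_s seconds.
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "hang"
    if proc.returncode == 0:
        return "ok"
    # On POSIX, a negative return code means the child was terminated
    # by that signal number; SIGSEGV indicates a segmentation fault.
    if proc.returncode == -signal.SIGSEGV:
        return "segfault"
    return "error"
```

Here `cmd` would be the diamond command above split into a list of arguments, and `timeout_s` should be chosen well above the longest expected runtime of a healthy worker. A worker classified as "hang" or "segfault" is then a candidate for the --mp-recover and restart procedure described above.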

I've run this command on anywhere from 8 to 72 nodes and at multiple levels of sensitivity. It doesn't seem to depend on the number of nodes, and I've seen it at every sensitivity level I've tried: fast, default, and more-sensitive. I've tried both the v2.1.8 and v2.1.9 releases of diamond.

This may be related to #732 and #747. The poster of #732 mentioned that their issue was resolved by downgrading to v2.0.15. If I were to make this downgrade, would it make a meaningful difference to the quality or speed of the alignment?

There could be some merit to the idea that this issue occurs when trying to align a small number of sequences. When I run the diamond deepclust workflow with the same steps (fast, default, more-sensitive) on a single machine with more memory (900 GB), so that there are only 4 blocks instead of 12, I don't see the segfault issue, but the run takes many days to complete.

beazerj changed the title from "When running diamond blastp in multiprocessing mode, some processes hang or seqfault non deterministically." to "When running diamond blastp in multiprocessing mode, some processes hang or segfault non deterministically." on Mar 29, 2024
@fengqingling

I'm having a similar issue. I have over 10,000 analyses to run, so I use a Python for loop to run blastp on each one individually:
diamond blastp --more-sensitive -p 40 -q {input_file} -d {dmnd} --evalue 1e-5 -f 6 --out {result} --query-cover cover --subject-cover cover -k 0 --id 40
However, for some reason, diamond stalls on one of the tasks, even though that task doesn't contain many sequences. This error seems to be memory-related, as it only happens when my server is running other tasks (not diamond ones). But in reality, the server has plenty of memory and CPU left over.
My diamond version is v2.1.8.162.
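One way to check the memory hypothesis is to log the available memory just before each blastp call in the loop. The following is a minimal sketch, assuming a Linux server where /proc/meminfo is readable; `parse_meminfo` and `available_gib` are hypothetical helper names and not part of diamond.

```python
def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values.

    Most entries are reported in kB; a few (such as HugePages_Total)
    are plain counts and carry no unit.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_gib(path="/proc/meminfo"):
    """Return the kernel's MemAvailable estimate in GiB."""
    with open(path) as fh:
        info = parse_meminfo(fh.read())
    # MemAvailable is reported in kB, so divide by 1024**2 to get GiB.
    return info["MemAvailable"] / (1024 ** 2)
```

Printing `available_gib()` together with the input file name before each run gives a record that can be compared against the tasks that stall, which would show whether they coincide with low available memory.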

@bbuchfink
Owner

I will try to reproduce the problem. Unfortunately, it's not easy to track down this sort of problem, which only occurs randomly.
