mapreduce: avoid deadlock by forcing the accumulator type. #2596

Merged (1 commit) on Dec 18, 2024

maleadt (Member) commented Dec 16, 2024

Otherwise we may union-split across a shfl invocation, resulting in a deadlock.

Fixes #2595
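
For readers unfamiliar with this failure mode, the following is a minimal CPU-only sketch in plain Julia. It is an illustration under stated assumptions, not the kernel code changed by this PR; `unstable_sum` and `stable_sum` are hypothetical names. When the accumulator's type depends on the data, the compiler may union-split, that is, emit one branch per possible type. On the GPU, if each branch ends up with its own shfl call, threads of the same warp that take different branches can presumably wait on each other forever. Forcing the accumulator to a single type removes that branch.

```julia
# Hypothetical CPU-only illustration; none of these names come from CUDA.jl.

# The accumulator starts as Float32, but `+` with a Float64 element widens it.
# The result type therefore depends on whether the loop body ever runs, so
# `acc` can only be inferred as a small Union of the two types.
function unstable_sum(neutral::Float32, xs::Vector{Float64})
    acc = neutral
    for x in xs
        acc = acc + x          # Float32 + Float64 -> Float64
    end
    return acc
end

# Forcing the accumulator type: every update is converted back to Float32,
# so `acc` has one concrete type on every path.
function stable_sum(neutral::Float32, xs::Vector{Float64})
    acc = neutral
    for x in xs
        acc = convert(Float32, acc + x)
    end
    return acc
end

# The unstable version returns a different type depending on the input size.
println(typeof(unstable_sum(0f0, Float64[])))    # Float32 (loop never runs)
println(typeof(unstable_sum(0f0, [1.0, 2.0])))   # Float64

# The stable version always returns Float32.
println(typeof(stable_sum(0f0, Float64[])))      # Float32
println(typeof(stable_sum(0f0, [1.0, 2.0])))     # Float32
```

In the REPL, `@code_warntype unstable_sum(0f0, [1.0])` should highlight the union-typed accumulator. The size dependence of the result type in this sketch is at least consistent with the size-dependent failure reported in the linked issue.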

maleadt added the "cuda array" and "bugfix" labels on Dec 16, 2024
maleadt force-pushed the tb/mapreduce_deadlock branch 2 times, most recently from 1b7b3b2 to c4e131b, on December 17, 2024 05:55
maleadt force-pushed the tb/mapreduce_deadlock branch from c4e131b to 22a89f9 on December 18, 2024 10:27

maleadt (Member, Author) commented Dec 18, 2024

These segfaults are annoying; they only seem to happen on CI...

Thread 6 "julia" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 1179041]
ijl_process_events () at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jl_uv.c:389
389	/cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jl_uv.c: No such file or directory.
(gdb) bt
#0  ijl_process_events () at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jl_uv.c:389
#1  0x00007ffff71d3f37 in ijl_task_get_next (trypoptask=<optimized out>, q=<optimized out>,
    checkempty=<optimized out>) at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/scheduler.c:610
#2  0x00007fffe25a9993 in julia_poptask_66344 () at task.jl:1012
#3  0x00007fffe386f313 in julia_wait_65850 () at task.jl:1021
#4  0x00007fffe30d4494 in julia_#wait#731_65869 () at condition.jl:130
#5  0x00007fff4cf399ae in ?? ()
#6  0x00007fff16d30010 in ?? ()
#7  0x389b91a0f883f464 in ?? ()
#8  0x00007fff31b83c28 in ?? ()
#9  0x00007fff16d30080 in ?? ()
#10 0x00007fff31b83cd0 in ?? ()
#11 0x00007fff31b83ad0 in ?? ()
#12 0x389b91a0f403f464 in ?? ()
#13 0x389b6b37b033f464 in ?? ()
#14 0x00007fff00000000 in ?? ()
#15 0x00007fff103c68c0 in ?? ()
#16 0x00007fff103afa70 in ?? ()
#17 0x0000000000000004 in ?? ()
#18 0x00007fff103c5c60 in ?? ()
#19 0x0000000000000008 in ?? ()
#20 0x0000000000000040 in ?? ()
#21 0x0000000000000000 in ?? ()

codecov bot commented Dec 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.63%. Comparing base (4e9513b) to head (22a89f9).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2596      +/-   ##
==========================================
- Coverage   73.64%   73.63%   -0.02%     
==========================================
  Files         157      157              
  Lines       15204    15204              
==========================================
- Hits        11197    11195       -2     
- Misses       4007     4009       +2     


maleadt merged commit 95894f0 into master on Dec 18, 2024
2 checks passed
maleadt deleted the tb/mapreduce_deadlock branch on December 18, 2024 14:58
avik-pal pushed a commit to avik-pal/CUDA.jl that referenced this pull request Jan 11, 2025
…2596)

Linked issue (may be closed by merging this pull request): mapreducedim! size-dependent fail when narrowing float element types