
Failed to convert IdealGlmMhdEquations to bits type #11

Closed · huiyuxie opened this issue Aug 12, 2023 · 5 comments
Labels: bug (Something isn't working)

huiyuxie commented Aug 12, 2023

In order to test nonconservative_terms::True related kernels in 2D and 3D, I used https://github.com/huiyuxie/trixi_cuda/blob/main/tests/mhd_alfven_wave_2d.jl and https://github.com/huiyuxie/trixi_cuda/blob/main/tests/mhd_alfven_wave_3d.jl (the corresponding 1D file does not fit), as @ranocha suggested. But IdealGlmMhdEquations2D and IdealGlmMhdEquations3D (and probably also IdealGlmMhdEquations1D) failed to convert to a bits type in kernels like https://github.com/huiyuxie/trixi_cuda/blob/a81eccd6a6fda336d7877c5cda73a48a4c6b2c92/cuda_dg_2d.jl#L190-L225 and caused errors like

ERROR: GPU compilation of MethodInstance for symmetric_noncons_flux_kernel!(::CuDeviceArray{Float32, 5, 1}, ::CuDeviceArray{Float32, 5, 1}, ::CuDeviceArray{Float32, 5, 1}, ::CuDeviceArray{Float32, 5, 1}, ::CuDeviceArray{Float32, 4, 1}, ::CuDeviceMatrix{Float32, 1}, ::IdealGlmMhdEquations2D{Float32}, ::typeof(flux_central), ::typeof(flux_nonconservative_powell)) failed
KernelError: passing and using non-bitstype argument

Argument 8 to your kernel function is of type IdealGlmMhdEquations2D{Float32}, which is not isbits:


Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/NVLGB/src/validation.jl:96
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/NVLGB/src/driver.jl:99 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:253 [inlined]
  [4] codegen(output::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing, ctx::LLVM.ThreadSafeContext)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/NVLGB/src/driver.jl:97
  [5] codegen
    @ ~/.julia/packages/GPUCompiler/NVLGB/src/driver.jl:92 [inlined]
  [6] compile(target::Symbol, job::GPUCompiler.CompilerJob; libraries::Bool, toplevel::Bool, optimize::Bool, cleanup::Bool, strip::Bool, validate::Bool, only_entry::Bool, ctx::LLVM.ThreadSafeContext)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/NVLGB/src/driver.jl:88
  [7] compile
    @ ~/.julia/packages/GPUCompiler/NVLGB/src/driver.jl:79 [inlined]
  [8] compile(job::GPUCompiler.CompilerJob, ctx::LLVM.ThreadSafeContext)
    @ CUDA ~/.julia/packages/CUDA/pCcGc/src/compiler/compilation.jl:125
  [9] #1032
    @ ~/.julia/packages/CUDA/pCcGc/src/compiler/compilation.jl:120 [inlined]
 [10] LLVM.ThreadSafeContext(f::CUDA.var"#1032#1033"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}})
    @ LLVM ~/.julia/packages/LLVM/5aiiG/src/executionengine/ts_module.jl:14
 [11] JuliaContext
    @ ~/.julia/packages/GPUCompiler/NVLGB/src/driver.jl:35 [inlined]
 [12] compile
    @ ~/.julia/packages/CUDA/pCcGc/src/compiler/compilation.jl:119 [inlined]
 [13] actual_compilation(cache::Dict{Any, Any}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/NVLGB/src/execution.jl:125
 [14] cached_compilation(cache::Dict{Any, Any}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/NVLGB/src/execution.jl:103
 [15] macro expansion
    @ ~/.julia/packages/CUDA/pCcGc/src/compiler/execution.jl:318 [inlined]
 [16] macro expansion
    @ ./lock.jl:267 [inlined]
 [17] cufunction(f::typeof(symmetric_noncons_flux_kernel!), tt::Type{Tuple{CuDeviceArray{Float32, 5, 1}, CuDeviceArray{Float32, 5, 1}, CuDeviceArray{Float32, 5, 1}, CuDeviceArray{Float32, 5, 1}, CuDeviceArray{Float32, 4, 1}, CuDeviceMatrix{Float32, 1}, IdealGlmMhdEquations2D{Float32}, typeof(flux_central), typeof(flux_nonconservative_powell)}}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/pCcGc/src/compiler/execution.jl:313
 [18] cufunction
    @ ~/.julia/packages/CUDA/pCcGc/src/compiler/execution.jl:310 [inlined]
 [19] macro expansion
    @ ~/.julia/packages/CUDA/pCcGc/src/compiler/execution.jl:104 [inlined]
 [20] cuda_volume_integral!(du::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}, u::CuArray{Float32, 4, CUDA.Mem.DeviceBuffer}, mesh::TreeMesh{2, SerialTree{2}}, nonconservative_terms::True, equations::IdealGlmMhdEquations2D{Float32}, volume_integral::VolumeIntegralFluxDifferencing{Tuple{typeof(flux_central), typeof(flux_nonconservative_powell)}}, dg::DGSEM{LobattoLegendreBasis{Float64, 4, SVector{4, Float64}, Matrix{Float64}, Matrix{Float64}, Matrix{Float64}}, LobattoLegendreMortarL2{Float64, 4, Matrix{Float64}, Matrix{Float64}}, SurfaceIntegralWeakForm{Tuple{FluxLaxFriedrichs{typeof(max_abs_speed_naive)}, typeof(flux_nonconservative_powell)}}, VolumeIntegralFluxDifferencing{Tuple{typeof(flux_central), typeof(flux_nonconservative_powell)}}})
    @ Main ~/trixi_cuda/cuda_dg_2d.jl:340
 [21] top-level scope
    @ ~/trixi_cuda/cuda_dg_2d.jl:984

This is the first time I have encountered an issue like this, so I switched directly to other test files, such as https://github.com/huiyuxie/trixi_cuda/blob/main/tests/shallowwater_well_balanced_2d.jl for 2D, and this time it worked and passed the accuracy tests.

(1) As mentioned above, I used https://github.com/huiyuxie/trixi_cuda/blob/main/tests/shallowwater_well_balanced_1d.jl for testing 1D kernels and https://github.com/huiyuxie/trixi_cuda/blob/main/tests/shallowwater_well_balanced_2d.jl for 2D. But I cannot find any suitable test example for 3D (except for IdealGlmMhdEquations). Are there any recommended examples for 3D? Thanks!
(2) Why did the conversion of IdealGlmMhdEquations to a bits type fail? I inspected the differences between these equations and the others, and I think it may be because of the mutable keyword, as here: https://github.com/trixi-framework/Trixi.jl/blob/68df09d5a21bd8f7393df90dab915247f9498505/src/equations/ideal_glm_mhd_2d.jl#L14-L24. Is this keyword really necessary? If so, I think https://cuda.juliagpu.org/stable/tutorials/custom_structs/ could help, but I am not entirely sure, since it does not cover how to handle mutable structs (a minimal sketch of my suspicion follows below).
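
To illustrate the suspicion, here is a minimal sketch (the struct names are made up and are not the actual Trixi.jl types): CUDA.jl requires kernel arguments to be isbits, and the mutable keyword alone breaks that check even when all fields are plain numbers.

```julia
# Hypothetical stand-ins for the equation structs, not the Trixi.jl definitions
mutable struct MutableEquations{RealT}
    gamma::RealT
    c_h::RealT  # GLM cleaning speed, updated during the simulation
end

struct ImmutableEquations{RealT}
    gamma::RealT
    c_h::RealT
end

isbitstype(MutableEquations{Float32})    # false -> rejected as a CUDA kernel argument
isbitstype(ImmutableEquations{Float32})  # true  -> can be passed to a kernel
```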

ranocha commented Aug 12, 2023

The shallow water equations just do not really make sense in 3D, so there is no 3D setup with them.

The IdealGlmMhdEquations are a mutable struct, not an immutable one. The reason is that we adapt the GLM cleaning speed
https://github.com/trixi-framework/Trixi.jl/blob/68df09d5a21bd8f7393df90dab915247f9498505/src/equations/ideal_glm_mhd_3d.jl#L18
in the GlmSpeedCallback at
https://github.com/trixi-framework/Trixi.jl/blob/68df09d5a21bd8f7393df90dab915247f9498505/src/callbacks_step/glm_speed.jl#L79
As a workaround, you can try to make the IdealGlmMhdEquations immutable (by removing the keyword mutable before struct) and run everything without the GlmSpeedCallback.
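
For illustration, a minimal sketch of how this workaround could look (the names here are made up, and c_h is assumed to stay constant for the whole run because the GlmSpeedCallback is dropped):

```julia
using CUDA

# Hypothetical stand-in for IdealGlmMhdEquations2D with `mutable` removed
struct ImmutableGlmMhdEquations{RealT}
    gamma::RealT
    c_h::RealT  # fixed for the whole run since GlmSpeedCallback is not used
end

function scale_by_gamma_kernel!(out, equations)
    i = threadIdx().x
    @inbounds out[i] *= equations.gamma
    return nothing
end

equations = ImmutableGlmMhdEquations{Float32}(5.0f0 / 3.0f0, 1.0f0)
out = CUDA.ones(Float32, 32)
@cuda threads=32 scale_by_gamma_kernel!(out, equations)  # works: the argument is isbits
```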

huiyuxie added a commit that referenced this issue Aug 14, 2023
huiyuxie commented Aug 14, 2023

I have tried removing the mutable keyword; it worked and also passed the tests.

But if the mutable struct is needed for other purposes, this does not seem like a good solution. I asked a similar question here: https://discourse.julialang.org/t/how-to-pass-a-mutable-struct-to-cuda-kernel-argument/102804 (one possible pattern is sketched below).
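
One pattern from the CUDA.jl custom structs tutorial that might carry over (all names below are made up for illustration and are not existing Trixi.jl API): keep the mutable struct on the host and add an Adapt.jl hook that converts it into an immutable, isbits counterpart whenever it is passed to a kernel.

```julia
using CUDA, Adapt

# Hypothetical host-side type, standing in for the mutable IdealGlmMhdEquations
mutable struct HostEquations{RealT}
    gamma::RealT
    c_h::RealT  # can still be mutated by a callback between steps
end

# Immutable, isbits counterpart that actually reaches the device
struct DeviceEquations{RealT}
    gamma::RealT
    c_h::RealT
end

# @cuda converts its arguments via Adapt, so this hook swaps in the isbits snapshot
Adapt.adapt_structure(to, eq::HostEquations) = DeviceEquations(eq.gamma, eq.c_h)

eq = HostEquations{Float32}(5.0f0 / 3.0f0, 1.0f0)
isbits(cudaconvert(eq))  # true: kernels receive DeviceEquations, not HostEquations
```

The host copy stays mutable, so a callback like the GlmSpeedCallback could keep updating c_h between steps, while each kernel launch only sees a frozen immutable snapshot.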

ranocha commented Aug 15, 2023

At least we know why it failed and have a workaround of sorts that allows you to test some kernels for the MHD equations. Thanks for looking for a real solution to this issue!

huiyuxie commented

Trying to fix this here: trixi-framework/Trixi.jl#2050

huiyuxie commented

Also here: trixi-framework/Trixi.jl#2052
