-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report errors when apply surface_flux
for 3D
#3
Comments
Which versions of
are you using? |
Thanks for helping me solve the issue! Julia v1.9.0
Trixi v0.5.28
StaticArrays v1.5.26 |
That's an ugly error. Did you try to follow the
given in the stacktrace? It may be that we're running into some compiler heuristics that work fine for small tuples but not as well for slightly larger tuples - which is why it manifests only for 3D. |
Aha, I think I understood something. I am using Julia v1.9.2 but that shouldn't matter. First, I cloned this repo and ran (trixi_cuda) pkg> instantiate Then, I executed julia> include("header.jl")
julia> include("cuda_dg_3d.jl") and ran the code # CUDA kernel for calculating surface fluxes
function surface_flux_kernel1!(surface_flux_arr, interfaces_u, orientations,
equations::AbstractEquations{3}, surface_flux::Any)
j2 = (blockIdx().x - 1) * blockDim().x + threadIdx().x
j3 = (blockIdx().y - 1) * blockDim().y + threadIdx().y
k = (blockIdx().z - 1) * blockDim().z + threadIdx().z
if (j2 <= size(surface_flux_arr, 3) && j3 <= size(surface_flux_arr, 4) && k <= size(surface_flux_arr, 5))
u_ll, u_rr = get_surface_node_vars(interfaces_u, equations, j2, j3, k)
orientation = orientations[k]
surface_flux_node = surface_flux(u_ll, u_rr, orientation, equations)
@inbounds begin
for j1j1 in axes(surface_flux_arr, 2)
surface_flux_arr[1, j1j1, j2, j3, k] = surface_flux_node[j1j1]
end
end
end
return nothing
end
du, u = copy_to_gpu!(du, u)
cuda_volume_integral!(
du, u, mesh,
have_nonconservative_terms(equations), equations,
solver.volume_integral, solver)
cuda_prolong2interfaces!(u, mesh, cache)
surface_flux = solver.surface_integral.surface_flux
interfaces_u = CuArray{Float32}(cache.interfaces.u)
neighbor_ids = CuArray{Int32}(cache.interfaces.neighbor_ids)
orientations = CuArray{Int32}(cache.interfaces.orientations)
surface_flux_arr = CuArray{Float32}(undef, 1, size(interfaces_u)[2:end]...)
size_arr = CuArray{Float32}(undef, size(interfaces_u, 3), size(interfaces_u, 4), size(interfaces_u, 5))
surface_flux_kernel1 = @cuda launch = false surface_flux_kernel1!(
surface_flux_arr, interfaces_u, orientations, equations, surface_flux) This gave me the error you mentioned above. Next, I ran julia> code_typed(err[1].exception)
1-element Vector{Any}:
CodeInfo(
...
└──── %650 = Base.mul_float(%648, %649)::Float64
109 ┄ %651 = φ (#105 => %627, #107 => %640, #108 => %650)::Float64
│ %652 = φ (#105 => %623, #107 => %636, #108 => %646)::Union{Float32, Float64}
│ %653 = φ (#105 => %622, #107 => %635, #108 => %643)::Union{Float32, Float64}
│ %654 = φ (#105 => %621, #107 => %632, #108 => %642)::Union{Float32, Float64}
│ %655 = φ (#105 => %184, #107 => %233, #108 => %282)::Float32
│ %656 = invoke Main.SVector(%655::Float32, %654::Vararg{Any}, %653, %652, %651)::Any
... This type instability can be observed more nicely via julia> code_warntype(err[1].exception)
...
surface_flux_node::Any
...
│ %64 = Base.getindex(surface_flux_node, j1j1)::Any
... It basically boils down to julia> begin
u_ll = SVector(1.0f0, 0.1f0, 0.2f0, 0.3f0, 2.0f0)
u_rr = u_ll
equations = CompressibleEulerEquations3D(1.4f0)
orientation = 1
@code_warntype flux_lax_friedrichs(u_ll, u_rr, orientation, equations)
end
MethodInstance for (::FluxLaxFriedrichs{typeof(max_abs_speed_naive)})(::SVector{5, Float32}, ::SVector{5, Float32}, ::Int64, ::CompressibleEulerEquations3D{Float32})
from (numflux::FluxPlusDissipation)(u_ll, u_rr, orientation_or_normal_direction, equations) @ Main /tmp/trixi_cuda/trixi/src/equations/numerical_fluxes.jl:38
Arguments
numflux::Core.Const(FluxLaxFriedrichs(max_abs_speed_naive))
u_ll::SVector{5, Float32}
u_rr::SVector{5, Float32}
orientation_or_normal_direction::Int64
equations::CompressibleEulerEquations3D{Float32}
Locals
dissipation::DissipationLocalLaxFriedrichs{typeof(max_abs_speed_naive)}
numerical_flux::typeof(flux_central)
Body::Any
1 ─ nothing
│ (numerical_flux = Base.getproperty(numflux, :numerical_flux))
│ (dissipation = Base.getproperty(numflux, :dissipation))
│ %4 = (numerical_flux)(u_ll, u_rr, orientation_or_normal_direction, equations)::Any
│ %5 = (dissipation)(u_ll, u_rr, orientation_or_normal_direction, equations)::SVector{5, Float64}
│ %6 = (%4 + %5)::Any
└── return %6
julia> begin
u_ll = SVector(1.0f0, 0.1f0, 0.2f0, 0.3f0, 2.0f0)
u_rr = u_ll
equations = CompressibleEulerEquations3D(1.4f0)
orientation = 1
@code_warntype flux_central(u_ll, u_rr, orientation, equations)
end
MethodInstance for flux_central(::SVector{5, Float32}, ::SVector{5, Float32}, ::Int64, ::CompressibleEulerEquations3D{Float32})
from flux_central(u_ll, u_rr, orientation_or_normal_direction, equations::AbstractEquations) @ Main /tmp/trixi_cuda/trixi/src/equations/numerical_fluxes.jl:17
Arguments
#self#::Core.Const(flux_central)
u_ll::SVector{5, Float32}
u_rr::SVector{5, Float32}
orientation_or_normal_direction::Int64
equations::CompressibleEulerEquations3D{Float32}
Locals
f_rr::Any
f_ll::Any
Body::Any
1 ─ nothing
│ (f_ll = Main.flux(u_ll, orientation_or_normal_direction, equations))
│ (f_rr = Main.flux(u_rr, orientation_or_normal_direction, equations))
│ %4 = (f_ll + f_rr)::Any
│ %5 = (0.5 * %4)::Any
└── return %5
julia> begin
u_ll = SVector(1.0f0, 0.1f0, 0.2f0, 0.3f0, 2.0f0)
u_rr = u_ll
equations = CompressibleEulerEquations3D(1.4f0)
orientation = 1
@code_warntype flux(u_ll, orientation, equations)
end
...
f5::Float64
f4::Union{Float32, Float64}
f3::Union{Float32, Float64}
f2::Union{Float32, Float64}
f1::Float32
...
6 ┄ %60 = Main.SVector(f1, f2, f3, f4, f5)::Any
└── return %60 The problem is that we have explicit constants like Could you please check whether it works if you use |
Another option is to rewrite the code to use 32 bit floating point literals for constants, see #4 for example. |
We have an open issue in Trixi.jl for this - trixi-framework/Trixi.jl#591 - which I just revived. We should discuss it with other Trixi.jl developers there. |
Thank you so much for providing the detailed debugging process and finally solving it for me! I have changed the float type from |
Kernel
surface_flux_kernel!()
worked for both 1D and 2D but failed for 3D.1D version
surface_flux_kernel!()
functionhttps://github.com/huiyuxie/trixi_cuda/blob/a6e268a54203e725c67997f7f5a33bd82001fd60/cuda_dg_1d.jl#L173-L190
2D version
surface_flux_kernel!()
functionhttps://github.com/huiyuxie/trixi_cuda/blob/a6e268a54203e725c67997f7f5a33bd82001fd60/cuda_dg_2d.jl#L196-L215
But for 3D equations, we provided two ways for comparing the benchmark results but both failed
https://github.com/huiyuxie/trixi_cuda/blob/0ed812ce977b5ebc80f7a92c042864acb1625a46/cuda_dg_3d.jl#L205-L248
The errors are the same like
Both kernels for 3D worked for
test/linear_scalar_advection_3d.jl
but both failed fortest/compressible_euler_3d.jl
. Having checked the flux computing process for 1D, 2D, and 3D compressible euler equations but found nothing extremely different. So currently the issue cannot be solved.The text was updated successfully, but these errors were encountered: