Add tests using other real types #591
As far as I can tell,

This will require some changes because of Julia's automatic type promotion for mixed-precision floating point arithmetic. For example, this flux calculation includes multiplication with `0.5`, a `Float64` literal that promotes `Float32` arguments to `Float64`.

The machine code for e.g. the
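For concreteness, the promotion behavior can be checked in the REPL (a quick illustration, not part of the original comment):

```julia
julia> typeof(0.5 * 1.0f0)    # Float64 literal promotes Float32 input
Float64

julia> typeof(0.5f0 * 1.0f0)  # Float32 literal keeps Float32
Float32

julia> typeof(0.5f0 * 1.0)    # and still promotes correctly for Float64 input
Float64
```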
Thanks for your input, @stillyslalom! Sure, that's part of the process. The first step will be to fix everything such that we can run simulations using plain `Float32`.

Of course, it would be a great PR to
Right now, I hesitate a bit to use rational constants. Unless (something like) JuliaLang/julia#41258 is merged, there can and will be surprising problems. For example, the case of one half reported by @stillyslalom above is fine:

```julia
julia> f(x) = x * 1//2
f (generic function with 1 method)

julia> @code_native debuginfo=:none syntax=:intel f(1.5)
	.text
	movabs	rax, offset .rodata.cst8
	vmulsd	xmm0, xmm0, qword ptr [rax]
	ret
	nop
```

but other rational constants will be problematic, e.g.

```julia
julia> f(x) = x * 1//3
f (generic function with 1 method)

julia> @code_native debuginfo=:none syntax=:intel f(1.5)
	.text
	sub	rsp, 40
	vmovsd	qword ptr [rsp + 8], xmm0
	movabs	rax, offset divgcd
	lea	rdi, [rsp + 16]
	mov	esi, 1
	mov	edx, 3
	call	rax
	mov	rax, qword ptr [rsp + 24]
	test	rax, rax
	js	L54
	mov	rcx, qword ptr [rsp + 16]
	jmp	L66
L54:
	neg	rax
	js	L91
	xor	ecx, ecx
	sub	rcx, qword ptr [rsp + 16]
L66:
	vcvtsi2sd	xmm0, xmm1, rcx
	vcvtsi2sd	xmm1, xmm1, rax
	vdivsd	xmm0, xmm0, xmm1
	vmulsd	xmm0, xmm0, qword ptr [rsp + 8]
	add	rsp, 40
	ret
L91:
	movabs	rax, offset jl_system_image_data
	mov	qword ptr [rsp + 32], rax
	movabs	rax, offset jl_invoke
	movabs	rdi, offset jl_system_image_data
	lea	rsi, [rsp + 32]
	movabs	rcx, 140365791294448
	mov	edx, 1
	call	rax
	ud2
	nop	word ptr cs:[rax + rax]
```

I don't really like the (only?) ways to get good code using something like

```julia
julia> f(x::T) where {T} = x * (T(1) / T(3))
f (generic function with 1 method)

julia> @code_native debuginfo=:none syntax=:intel f(1.5)
	.text
	movabs	rax, offset .rodata.cst8
	vmulsd	xmm0, xmm0, qword ptr [rax]
	ret
	nop

julia> f(x) = x * Base.unsafe_rational(Int, 1, 3)
f (generic function with 1 method)

julia> @code_native debuginfo=:none syntax=:intel f(1.5)
	.text
	movabs	rax, offset .rodata.cst8
	vmulsd	xmm0, xmm0, qword ptr [rax]
	ret
	nop
```

since that's considerably less readable, I think.
I agree. I could live with writing fractions using double
Now that JuliaLang/julia#41258 is merged, we might be able to use this form of providing constants in Julia v1.8 (or whenever the change is included in a new release). Edit: I guess it's v1.8 - it's already available in the nightly builds:

```julia
julia> f(x) = x * 1//2
f (generic function with 1 method)

julia> @code_native debuginfo=:none f(1.5)
	.text
	.file	"f"
	.section	.rodata.cst8,"aM",@progbits,8
	.p2align	3                       # -- Begin function julia_f_54
.LCPI0_0:
	.quad	0x3fe0000000000000      # double 0.5
	.text
	.globl	julia_f_54
	.p2align	4, 0x90
	.type	julia_f_54,@function
julia_f_54:                             # @julia_f_54
	.cfi_startproc
# %bb.0:                                # %top
	movabsq	$.LCPI0_0, %rax
	vmulsd	(%rax), %xmm0, %xmm0
	retq
.Lfunc_end0:
	.size	julia_f_54, .Lfunc_end0-julia_f_54
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

julia> f(x) = x * 2//3
f (generic function with 1 method)

julia> @code_native debuginfo=:none f(1.5)
	.text
	.file	"f"
	.section	.rodata.cst8,"aM",@progbits,8
	.p2align	3                       # -- Begin function julia_f_109
.LCPI0_0:
	.quad	0x3fe5555555555555      # double 0.66666666666666663
	.text
	.globl	julia_f_109
	.p2align	4, 0x90
	.type	julia_f_109,@function
julia_f_109:                            # @julia_f_109
	.cfi_startproc
# %bb.0:                                # %top
	movabsq	$.LCPI0_0, %rax
	vmulsd	(%rax), %xmm0, %xmm0
	retq
.Lfunc_end0:
	.size	julia_f_109, .Lfunc_end0-julia_f_109
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

julia> versioninfo()
Julia Version 1.8.0-DEV.1125
Commit fa8be6fda5 (2021-12-12 19:26 UTC)
```
Our current code doesn't work well with CUDA.jl:

```julia
julia> surface_flux_kernel1 = @cuda launch = false surface_flux_kernel1!(surface_flux_arr, interfaces_u, orientations, equations, surface_flux)
ERROR: InvalidIRError: compiling MethodInstance for surface_flux_kernel1!(::CuDeviceArray{Float32, 5, 1}, ::CuDeviceArray{Float32, 5, 1}, ::CuDeviceVector{Int32, 1}, ::CompressibleEulerEquations3D{Float32}, ::FluxLaxFriedrichs{typeof(max_abs_speed_naive)}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to __throw_rational_argerror_typemin(T) @ Base rational.jl:20)
Stacktrace:
 [1] checked_den
   @ ./rational.jl:24
 [2] Rational
   @ ./rational.jl:36
 [3] Rational
   @ ./rational.jl:39
 [4] //
   @ ./rational.jl:62
 [5] flux
   @ /tmp/trixi_cuda/trixi/src/equations/compressible_euler_3d.jl:379
 [6] flux_central
   @ /tmp/trixi_cuda/trixi/src/equations/numerical_fluxes.jl:20
...
```

Thus, it seems to be better to use plain floating point constants instead of `Rational`s here.

Any thoughts, @trixi-framework/developers?
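For reference (an illustration added here, not part of the original report): the unsupported call is the error branch of `Rational`'s checked constructor (`checked_den` in `rational.jl`, visible in the stacktrace above), which the GPU compiler cannot lower even though it never fires for literal constants. The same branch can be triggered on the CPU:

```julia
julia> 1 // typemin(Int)
ERROR: ArgumentError: invalid rational: denominator can't be typemin(Int64)
```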
Result from a local discussion: Maybe a macro? And maybe with the option to be disabled via Preferences.jl?
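A minimal sketch of what such a macro could look like (entirely hypothetical: the name `@trixi_constants` and the assumption that a type parameter `RealT` is in scope are made up for illustration):

```julia
# Hypothetical sketch: rewrite Float64 literals into conversions to a type
# parameter `RealT` provided by the surrounding method.
macro trixi_constants(expr)
    rewrite(x) = x isa Float64 ? :(convert(RealT, $x)) :
                 x isa Expr    ? Expr(x.head, map(rewrite, x.args)...) : x
    return esc(rewrite(expr))
end

# Usage: constants follow the argument type instead of forcing Float64.
f(x::RealT) where {RealT <: AbstractFloat} = @trixi_constants 0.5 * x + 0.25

f(1.0f0) isa Float32  # true
f(1.0)   isa Float64  # true
```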
After thinking more about it, I tend to prefer an explicit solution where we write
We have discussed this again and agreed to proceed as follows:

Trixi.jl/src/solvers/dgsem_tree/indicators.jl, Lines 116 to 118 in b1a84a6
I just came across DispatchDoctor.jl, which provides tools to guarantee type stability without any overhead. Is it worth thinking about using this package?
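For those unfamiliar with the package: its `@stable` macro wraps function definitions and raises an error when a call is type-unstable. A small example along the lines of the package's README:

```julia
using DispatchDoctor: @stable

@stable function relu(x)
    if x > 0
        return x
    else
        return 0.0   # Float64 literal: type-unstable for non-Float64 input
    end
end

relu(1.0)    # fine: both branches return Float64
relu(1.0f0)  # throws TypeInstabilityError (Union{Float32, Float64} inferred)
```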
Could be worth looking at. I'm just a bit concerned about the "should" in the corresponding statement from the README - and the warning that it will increase precompile time quite a bit.
There is some discussion on the "should" part in this discourse thread.
I would see it mostly useful for testing/CI. But that would require wrapping basically everything in `@stable`.

Another aspect to consider is that we will still need tests to ensure that we return the correct type, e.g., for `ln_mean`.

I don't want to be pessimistic - just discuss and list pro/contra arguments. It's definitely an interesting package and has been on my TODO list of things to look into more deeply.
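Independent of DispatchDoctor.jl, such return-type tests can be written directly with the tools from the Test standard library; a small sketch (the function `f` is just a stand-in, not a Trixi function):

```julia
using Test

f(x) = 0.5f0 * x  # stand-in for a flux or math.jl function

# @inferred fails if the inferred return type does not match the runtime type;
# the isa checks pin down the expected concrete types.
@test @inferred(f(1.0f0)) isa Float32
@test @inferred(f(1.0)) isa Float64
```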
Please check the task box once you see something is already finished :) @ranocha Also, please change to improve
We should check whether everything works using

- [ ] `Float32`
- [ ] `Float64` (current default)

It would also be nice to see the impact on performance.

Tasks

- [ ] Change `0.5 *` or `1/2 *` to `0.5f0 *`
- [ ] Use `oftype` and `convert` in initial conditions, source terms etc. (see the sketch after this list)
- [ ] Adapt `ln_mean` and other math.jl functions for `Float32` (adapt auxiliary math functions for `Float32` #2048)
- [ ] … (`solve`)
- [ ] `TreeMesh`, hardcoded to `Float64`
  (Trixi.jl/src/meshes/tree_mesh.jl, Lines 102 to 104 in 345f53a)
- [ ] `StructuredMesh`
- [ ] `UnstructuredMesh2D`
- [ ] `P4estMesh`
- [ ] `T8codeMesh`, hardcoded to `Float64`
  (Trixi.jl/src/meshes/t8code_mesh.jl, Line 34 in 345f53a)
- [ ] `DGMulti`, hardcoded in several places
  (Trixi.jl/src/solvers/dgmulti/shock_capturing.jl, Line 32 in 345f53a; `ln_mean`)
- [ ] `StepsizeCallback`
- [ ] `AnalysisCallback`
- [ ] `SaveSolutionCallback`
- [ ] … `Float32` #2048
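As an illustration of the `oftype`/`convert` task item above, a minimal sketch in plain Julia (the function name and signature are made up for illustration, not Trixi's API):

```julia
# Hypothetical sketch: derive the real type from the arguments instead of
# hard-coding Float64 literals in an initial condition.
function initial_condition_wave(x, t)
    RealT = eltype(x)                    # Float32 or Float64
    c = convert(RealT, 0.1)              # instead of the literal 0.1
    return one(RealT) + c * sin(2 * oftype(t, pi) * (x[1] - t))
end

initial_condition_wave((0.5f0,), 0.0f0) isa Float32  # true
initial_condition_wave((0.5,), 0.0) isa Float64      # true
```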