Towards partitioned RHS's (1D) #1967

DanielDoehring · 2024-06-05T07:29:33Z

This draft shows one way we could realize partitioned RHS's with relatively few changes to the existing code.
Opinions on this, @sloede?

Related to #21

github-actions · 2024-06-05T07:29:47Z

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

The PR has a single goal that is clear from the PR title and/or description.
All code changes represent a single set of modifications that logically belong together.
No more than 500 lines of code are changed or there is no obvious way to split the PR into multiple PRs.

Code quality

The code can be understood easily.
Newly introduced names for variables etc. are self-descriptive and consistent with existing naming conventions.
There are no redundancies that can be removed by simple modularization/refactoring.
There are no leftover debug statements or commented code sections.
The code adheres to our conventions and style guide, and to the Julia guidelines.

Documentation

New functions and types are documented with a docstring or top-level comment.
Relevant publications are referenced in docstrings (see example for formatting).
Inline comments are used to document longer or unusual code sections.
Comments describe intent ("why?") and not just functionality ("what?").
If the PR introduces a significant change or new feature, it is documented in NEWS.md with its PR number.

Testing

The PR passes all tests.
New or modified lines of code are covered by tests.
New or modified tests run in less than 10 seconds.

Performance

There are no type instabilities or memory allocations in performance-critical parts.
If the PR intent is to improve performance, before/after time measurements are posted in the PR.

Verification

The correctness of the code was verified using appropriate tests.
If new equations/methods are added, a convergence test has been run and the results
are posted in the PR.

Created with ❤️ by the Trixi.jl community.

codecov · 2024-06-05T10:21:36Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.36%. Comparing base (dcf1b58) to head (eca08c0).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1967   +/-   ##
=======================================
  Coverage   96.36%   96.36%           
=======================================
  Files         480      480           
  Lines       38028    38028           
=======================================
  Hits        36645    36645           
  Misses       1383     1383

Flag	Coverage Δ
unittests	`96.36% <100.00%> (ø)`


sloede · 2024-06-21T04:27:55Z

> This draft is one way how we could realize partitioned RHS's with relatively little changes to the existing code.
> Opinions on this @sloede ?

From a first look, I like the idea!

A small variation could be to have a modified function without a default argument and to create new functions with the old API that call it with eachelement(...) etc. But I don't know if that's really more readable. Have you checked whether your change impacts performance (since the compiler may no longer be able to infer as much as before)?

Another thought: to me, range strongly implies an ordered subset of an array, e.g., something like 5:10 or 10:2:20. However, in this case there is no such restriction; it is just a bunch of indices. So maybe {element,interface,whatever}_indices would be more apt?
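
To make the two API shapes and the naming point concrete, here is a minimal sketch with toy stand-ins. `ToyCache`, `double_elements!`, `double_selected_elements!`, and `double_all_elements!` are hypothetical names, and the `eachelement` below is a local toy, not Trixi.jl's function. Variant A uses a default argument; variant B is a kernel that always takes indices plus a wrapper with the old signature. In Julia an optional positional argument is itself shorthand for defining an additional method, so the two variants should end up structurally very similar.

```julia
# Toy stand-in for a solver cache; not Trixi.jl's actual type.
struct ToyCache
    n_elements::Int
end

# Toy stand-in for eachelement: all element indices as a range.
eachelement(cache::ToyCache) = Base.OneTo(cache.n_elements)

# Variant A: default argument, so old call sites keep working unchanged.
function double_elements!(du, cache::ToyCache,
                          element_indices = eachelement(cache))
    for element in element_indices
        du[element] *= 2
    end
    return nothing
end

# Variant B: a kernel without a default argument ...
function double_selected_elements!(du, element_indices)
    for element in element_indices
        du[element] *= 2
    end
    return nothing
end

# ... plus a wrapper that keeps the old API.
function double_all_elements!(du, cache::ToyCache)
    return double_selected_elements!(du, eachelement(cache))
end

cache = ToyCache(5)

du_a = ones(5)
double_elements!(du_a, cache)            # all elements
double_elements!(du_a, cache, [4, 1])    # an unordered bunch of indices, not a range

du_b = ones(5)
double_all_elements!(du_b, cache)
double_selected_elements!(du_b, [4, 1])
```

The second call in each variant passes `[4, 1]`, which is neither sorted nor contiguous; that is the case the name `element_indices` describes better than `element_range`.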

sloede · 2024-06-21T04:28:38Z

In general, I think this would be a good topic to bring up at a Trixi.jl meeting for discussion (possibly with a heads-up in Slack so that people can think about it beforehand).

DanielDoehring · 2024-06-21T11:25:02Z

> An small variation could be to have a modified function without default argument and create new functions with the old API that call it with eachelement(...) etc. But I don't know if that's really more readable.

True, but in principle this adds the overhead of calling another function, right? This should be explored in a benchmarking run.

> Have you checked if your change impacts the performance (since now maybe the compiler is not able to infer as much as before)?

No, not yet.

> Another thought is that to me, range strongly implies an ordered subset of an array, e.g., something like 5:10 or 10:2:20. However, in this case there is no such a restriction - it is just a bunch of indices. So maybe {element,interface,whatever}_indices would be more apt?

Yeah that sounds reasonable 👍

sloede · 2024-06-21T11:34:06Z

> > An small variation could be to have a modified function without default argument and create new functions with the old API that call it with eachelement(...) etc. But I don't know if that's really more readable.

> True, but this adds in principle the overhead of calling another function, right? But should be explored in a benchmarking run.

Yes and yes. In the end, it is likely that it doesn't make a difference performance-wise.

ranocha · 2024-06-25T14:21:32Z

Sounds reasonable. Could you please run some benchmarks?

DanielDoehring · 2024-06-27T08:56:05Z

Currently, we do not benchmark any 1D simulation:

Trixi.jl/benchmark/benchmarks.jl

Lines 13 to 39 in 961f2e7

    for elixir in [joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_extended.jl"),
                   joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_amr_nonperiodic.jl"),
                   joinpath(examples_dir(), "tree_2d_dgsem", "elixir_euler_ec.jl"),
                   joinpath(examples_dir(), "tree_2d_dgsem", "elixir_euler_vortex_mortar.jl"),
                   joinpath(examples_dir(), "tree_2d_dgsem", "elixir_euler_vortex_mortar_shockcapturing.jl"),
                   joinpath(examples_dir(), "tree_2d_dgsem", "elixir_mhd_ec.jl"),
                   joinpath(examples_dir(), "structured_2d_dgsem", "elixir_advection_extended.jl"),
                   joinpath(examples_dir(), "structured_2d_dgsem", "elixir_advection_nonperiodic.jl"),
                   joinpath(examples_dir(), "structured_2d_dgsem", "elixir_euler_ec.jl"),
                   joinpath(examples_dir(), "structured_2d_dgsem", "elixir_euler_source_terms_nonperiodic.jl"),
                   joinpath(examples_dir(), "structured_2d_dgsem", "elixir_mhd_ec.jl"),
                   joinpath(examples_dir(), "unstructured_2d_dgsem", "elixir_euler_wall_bc.jl"), # this is the only elixir working for polydeg=3
                   joinpath(examples_dir(), "p4est_2d_dgsem", "elixir_advection_extended.jl"),
                   joinpath(@__DIR__, "elixir_2d_euler_vortex_tree.jl"),
                   joinpath(@__DIR__, "elixir_2d_euler_vortex_structured.jl"),
                   joinpath(@__DIR__, "elixir_2d_euler_vortex_unstructured.jl"),
                   joinpath(@__DIR__, "elixir_2d_euler_vortex_p4est.jl"),
                   joinpath(examples_dir(), "tree_3d_dgsem", "elixir_advection_extended.jl"),
                   joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_ec.jl"),
                   joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_mortar.jl"),
                   joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_shockcapturing.jl"),
                   joinpath(examples_dir(), "tree_3d_dgsem", "elixir_mhd_ec.jl"),
                   joinpath(examples_dir(), "structured_3d_dgsem", "elixir_advection_nonperiodic_curved.jl"),
                   joinpath(examples_dir(), "structured_3d_dgsem", "elixir_euler_ec.jl"),
                   joinpath(examples_dir(), "structured_3d_dgsem", "elixir_euler_source_terms_nonperiodic_curved.jl"),
                   joinpath(examples_dir(), "structured_3d_dgsem", "elixir_mhd_ec.jl"),
                   joinpath(examples_dir(), "p4est_3d_dgsem", "elixir_advection_basic.jl"),]

I could add some 1D elixirs locally or we extend the benchmarks by some representative elixirs for 1D.

Otherwise, I could extend these changes to 2D to get benchmark results for this.
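
A sketch of what such a 1D extension might look like, using the 1D benchmark groups that show up in the reports further down in this thread. `examples_dir()` is replaced by a local stand-in here so that the snippet runs without Trixi.jl, and the variable name `elixirs_1d` is hypothetical.

```julia
# Stand-in for Trixi's examples_dir(), assumed here to point at the
# examples folder; the path below is a placeholder.
examples_dir() = joinpath("Trixi.jl", "examples")

# Candidate 1D elixirs, following the joinpath pattern of the list above.
elixirs_1d = [
    joinpath(examples_dir(), "structured_1d_dgsem", "elixir_euler_sedov.jl"),
    joinpath(examples_dir(), "tree_1d_dgsem", "elixir_mhd_ec.jl"),
    joinpath(examples_dir(), "tree_1d_dgsem", "elixir_navierstokes_convergence_walls_amr.jl"),
    joinpath(examples_dir(), "tree_1d_dgsem", "elixir_shallowwater_well_balanced_nonperiodic.jl"),
]

for elixir in elixirs_1d
    println(elixir)
end
```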

ranocha · 2024-06-28T07:18:33Z

I think it would be nice if you could add some representative 1D elixirs to the benchmarks in a new PR

DanielDoehring · 2024-07-25T04:53:38Z

Okay, so here are some benchmark results (one thread only; the two-thread report is basically empty).
I am now running an artificial comparison of main against main to see whether these performance losses are real.

Benchmark Report for /home/daniel/git/Trixi.jl

Job Properties

Time of benchmarks:
- Target: 19 Jul 2024 - 20:15
- Baseline: 20 Jul 2024 - 03:21
Package commits:
- Target: c8abd0
- Baseline: 91eac3
Julia commits:
- Target: 48d4fd
- Baseline: 48d4fd
Julia command flags:
- Target: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
- Baseline: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
Environment variables:
- Target: None
- Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID	time ratio	memory ratio
`["benchmark/elixir_2d_euler_vortex_p4est.jl", "p3_rhs!"]`	1.05 (5%) ❌	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_tree.jl", "p3_analysis"]`	1.16 (5%) ❌	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_analysis"]`	1.07 (5%) ❌	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_rhs!"]`	1.06 (5%) ❌	1.00 (1%)
`["latency", "mhd_2d"]`	0.99 (5%)	0.95 (1%) ✅
`["p4est_2d_dgsem/elixir_advection_extended.jl", "p3_rhs!"]`	1.05 (5%) ❌	1.00 (1%)
`["structured_1d_dgsem/elixir_euler_sedov.jl", "p3_rhs!"]`	1.07 (5%) ❌	1.00 (1%)
`["structured_2d_dgsem/elixir_advection_extended.jl", "p3_rhs!"]`	1.06 (5%) ❌	1.00 (1%)
`["structured_2d_dgsem/elixir_advection_extended.jl", "p7_rhs!"]`	1.05 (5%) ❌	1.00 (1%)
`["structured_2d_dgsem/elixir_euler_ec.jl", "p3_rhs!"]`	1.09 (5%) ❌	1.00 (1%)
`["structured_2d_dgsem/elixir_mhd_ec.jl", "p3_analysis"]`	1.06 (5%) ❌	1.00 (1%)
`["structured_2d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"]`	1.06 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_mhd_ec.jl", "p3_analysis"]`	1.08 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"]`	1.06 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p3_analysis"]`	1.05 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p3_rhs!"]`	1.08 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p7_analysis"]`	1.06 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p7_rhs!"]`	1.05 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl", "p3_rhs!"]`	1.07 (5%) ❌	1.00 (1%)
`["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl", "p7_rhs!"]`	1.12 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl", "p3_analysis"]`	1.09 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_advection_extended.jl", "p3_analysis"]`	1.09 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_advection_extended.jl", "p3_rhs!"]`	1.07 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_ec.jl", "p3_analysis"]`	1.14 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_ec.jl", "p3_rhs!"]`	1.07 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_analysis"]`	1.19 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_rhs!"]`	1.08 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_mhd_ec.jl", "p3_analysis"]`	1.07 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"]`	1.13 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_analysis"]`	1.08 (5%) ❌	1.00 (1%)
`["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"]`	1.05 (5%) ❌	1.00 (1%)
`["tree_3d_dgsem/elixir_advection_extended.jl", "p3_analysis"]`	1.12 (5%) ❌	1.00 (1%)
`["tree_3d_dgsem/elixir_euler_ec.jl", "p3_analysis"]`	1.10 (5%) ❌	1.00 (1%)
`["tree_3d_dgsem/elixir_euler_mortar.jl", "p3_analysis"]`	1.07 (5%) ❌	1.00 (1%)
`["tree_3d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"]`	1.09 (5%) ❌	1.00 (1%)
`["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p7_rhs!"]`	1.06 (5%) ❌	1.00 (1%)
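
The marking rule stated above the table can be sketched as follows. This is a simplified reading of the report text, assuming the percentage in parentheses is the tolerance around 1.0; `classify_ratio` is a hypothetical helper, not a function of the benchmarking packages. The displayed ratios are rounded, so an entry shown as 1.05 with a ❌ presumably sits slightly above the threshold.

```julia
# Classify a target/baseline ratio against a tolerance band around 1.0.
# Hypothetical helper that mirrors the wording of the report.
function classify_ratio(ratio::Real; tolerance::Real = 0.05)
    if ratio > 1 + tolerance
        return :regression      # shown as ❌ in the report
    elseif ratio < 1 - tolerance
        return :improvement     # shown as ✅ in the report
    else
        return :invariant       # no mark
    end
end

# Time ratios taken from the table above (5% tolerance).
for ratio in (1.16, 1.19, 0.99)
    println(ratio, " => ", classify_ratio(ratio))
end

# Memory ratio of the latency benchmark (1% tolerance).
println(0.95, " => ", classify_ratio(0.95; tolerance = 0.01))
```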

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["benchmark/elixir_2d_euler_vortex_p4est.jl"]
["benchmark/elixir_2d_euler_vortex_structured.jl"]
["benchmark/elixir_2d_euler_vortex_tree.jl"]
["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
["latency"]
["p4est_2d_dgsem/elixir_advection_extended.jl"]
["p4est_3d_dgsem/elixir_advection_basic.jl"]
["structured_1d_dgsem/elixir_euler_sedov.jl"]
["structured_2d_dgsem/elixir_advection_extended.jl"]
["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
["structured_2d_dgsem/elixir_euler_ec.jl"]
["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
["structured_2d_dgsem/elixir_mhd_ec.jl"]
["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl"]
["structured_3d_dgsem/elixir_euler_ec.jl"]
["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic_curved.jl"]
["structured_3d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_extended.jl"]
["tree_2d_dgsem/elixir_euler_ec.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
["tree_2d_dgsem/elixir_mhd_ec.jl"]
["tree_3d_dgsem/elixir_advection_extended.jl"]
["tree_3d_dgsem/elixir_euler_ec.jl"]
["tree_3d_dgsem/elixir_euler_mortar.jl"]
["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
["tree_3d_dgsem/elixir_mhd_ec.jl"]
["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]

Julia versioninfo

Target

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz     646850 s         50 s      38939 s  355231854 s          0 s
  Memory: 377.4298286437988 GB (378345.30859375 MB free)
  Uptime: 278077.63 sec
  Load Avg:  1.0  1.0  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz     895859 s         62 s      47475 s  387683751 s          0 s
  Memory: 377.4298286437988 GB (378315.21875 MB free)
  Uptime: 303633.39 sec
  Load Avg:  1.0  1.0  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

DanielDoehring · 2024-08-12T06:52:37Z

Results of main vs main:

1 Thread:

Benchmark Report for /home/daniel/git/Trixi.jl

Job Properties

Time of benchmarks:
- Target: 25 Jul 2024 - 14:00
- Baseline: 25 Jul 2024 - 21:07
Package commits:
- Target: 91eac3
- Baseline: 91eac3
Julia commits:
- Target: 48d4fd
- Baseline: 48d4fd
Julia command flags:
- Target: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
- Baseline: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
Environment variables:
- Target: None
- Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID	time ratio	memory ratio
`["benchmark/elixir_2d_euler_vortex_p4est.jl", "p7_analysis"]`	0.95 (5%) ✅	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_structured.jl", "p3_analysis"]`	0.94 (5%) ✅	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_structured.jl", "p7_analysis"]`	0.94 (5%) ✅	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_tree.jl", "p7_analysis"]`	0.92 (5%) ✅	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_tree.jl", "p7_rhs!"]`	0.93 (5%) ✅	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_analysis"]`	0.94 (5%) ✅	1.00 (1%)
`["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p7_analysis"]`	0.91 (5%) ✅	1.00 (1%)
`["latency", "mhd_2d"]`	1.01 (5%)	1.06 (1%) ❌
`["p4est_2d_dgsem/elixir_advection_extended.jl", "p7_analysis"]`	0.94 (5%) ✅	1.00 (1%)
`["p4est_3d_dgsem/elixir_advection_basic.jl", "p3_analysis"]`	0.92 (5%) ✅	1.00 (1%)
`["p4est_3d_dgsem/elixir_advection_basic.jl", "p7_analysis"]`	0.92 (5%) ✅	1.00 (1%)
`["structured_1d_dgsem/elixir_euler_sedov.jl", "p3_rhs!"]`	0.95 (5%) ✅	1.00 (1%)
`["structured_1d_dgsem/elixir_euler_sedov.jl", "p7_rhs!"]`	0.95 (5%) ✅	1.00 (1%)
`["structured_2d_dgsem/elixir_advection_extended.jl", "p7_analysis"]`	0.95 (5%) ✅	1.00 (1%)
`["structured_2d_dgsem/elixir_euler_ec.jl", "p7_analysis"]`	0.95 (5%) ✅	1.00 (1%)
`["structured_2d_dgsem/elixir_mhd_ec.jl", "p7_analysis"]`	0.91 (5%) ✅	1.00 (1%)
`["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl", "p3_analysis"]`	0.94 (5%) ✅	1.00 (1%)
`["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl", "p7_analysis"]`	0.93 (5%) ✅	1.00 (1%)
`["structured_3d_dgsem/elixir_euler_ec.jl", "p3_analysis"]`	0.95 (5%) ✅	1.00 (1%)
`["structured_3d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"]`	0.93 (5%) ✅	1.00 (1%)
`["tree_2d_dgsem/elixir_advection_extended.jl", "p7_analysis"]`	0.93 (5%) ✅	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_analysis"]`	0.93 (5%) ✅	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p7_analysis"]`	0.94 (5%) ✅	1.00 (1%)
`["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p7_rhs!"]`	0.94 (5%) ✅	1.00 (1%)
`["tree_3d_dgsem/elixir_advection_extended.jl", "p7_analysis"]`	0.95 (5%) ✅	1.00 (1%)
`["tree_3d_dgsem/elixir_euler_mortar.jl", "p3_analysis"]`	0.90 (5%) ✅	1.00 (1%)
`["tree_3d_dgsem/elixir_euler_shockcapturing.jl", "p7_rhs!"]`	0.95 (5%) ✅	1.00 (1%)
`["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p7_analysis"]`	0.94 (5%) ✅	1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["benchmark/elixir_2d_euler_vortex_p4est.jl"]
["benchmark/elixir_2d_euler_vortex_structured.jl"]
["benchmark/elixir_2d_euler_vortex_tree.jl"]
["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
["latency"]
["p4est_2d_dgsem/elixir_advection_extended.jl"]
["p4est_3d_dgsem/elixir_advection_basic.jl"]
["structured_1d_dgsem/elixir_euler_sedov.jl"]
["structured_2d_dgsem/elixir_advection_extended.jl"]
["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
["structured_2d_dgsem/elixir_euler_ec.jl"]
["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
["structured_2d_dgsem/elixir_mhd_ec.jl"]
["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl"]
["structured_3d_dgsem/elixir_euler_ec.jl"]
["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic_curved.jl"]
["structured_3d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_extended.jl"]
["tree_2d_dgsem/elixir_euler_ec.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
["tree_2d_dgsem/elixir_mhd_ec.jl"]
["tree_3d_dgsem/elixir_advection_extended.jl"]
["tree_3d_dgsem/elixir_euler_ec.jl"]
["tree_3d_dgsem/elixir_euler_mortar.jl"]
["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
["tree_3d_dgsem/elixir_mhd_ec.jl"]
["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]

Julia versioninfo

Target

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz    9410557 s        173 s     861143 s  980369069 s          0 s
  Memory: 377.4298286437988 GB (376935.27734375 MB free)
  Uptime: 774020.13 sec
  Load Avg:  1.05  1.01  1.03
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  4294 MHz   10307518 s        173 s     891043 s  1012239646 s          0 s
  Memory: 377.4298286437988 GB (378011.94140625 MB free)
  Uptime: 799648.39 sec
  Load Avg:  1.06  1.02  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

2 Threads:

Benchmark Report for /home/daniel/git/Trixi.jl

Job Properties

Time of benchmarks:
- Target: 26 Jul 2024 - 01:20
- Baseline: 26 Jul 2024 - 05:32
Package commits:
- Target: 91eac3
- Baseline: 91eac3
Julia commits:
- Target: 48d4fd
- Baseline: 48d4fd
Julia command flags:
- Target: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
- Baseline: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
Environment variables:
- Target: None
- Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID	time ratio	memory ratio
`["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_analysis"]`	1.07 (5%) ❌	1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["benchmark/elixir_2d_euler_vortex_p4est.jl"]
["benchmark/elixir_2d_euler_vortex_structured.jl"]
["benchmark/elixir_2d_euler_vortex_tree.jl"]
["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
["latency"]
["p4est_2d_dgsem/elixir_advection_extended.jl"]
["p4est_3d_dgsem/elixir_advection_basic.jl"]
["structured_1d_dgsem/elixir_euler_sedov.jl"]
["structured_2d_dgsem/elixir_advection_extended.jl"]
["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
["structured_2d_dgsem/elixir_euler_ec.jl"]
["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
["structured_2d_dgsem/elixir_mhd_ec.jl"]
["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl"]
["structured_3d_dgsem/elixir_euler_ec.jl"]
["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic_curved.jl"]
["structured_3d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_extended.jl"]
["tree_2d_dgsem/elixir_euler_ec.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
["tree_2d_dgsem/elixir_mhd_ec.jl"]
["tree_3d_dgsem/elixir_advection_extended.jl"]
["tree_3d_dgsem/elixir_euler_ec.jl"]
["tree_3d_dgsem/elixir_euler_mortar.jl"]
["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
["tree_3d_dgsem/elixir_mhd_ec.jl"]
["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]

Julia versioninfo

Target

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  4296 MHz   10561224 s        173 s     903795 s  1031362101 s          0 s
  Memory: 377.4298286437988 GB (377872.78125 MB free)
  Uptime: 814797.44 sec
  Load Avg:  1.48  1.64  1.61
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  4297 MHz   10814263 s        174 s     916671 s  1050464699 s          0 s
  Memory: 377.4298286437988 GB (377916.60546875 MB free)
  Uptime: 829930.61 sec
  Load Avg:  1.38  1.61  1.59
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)
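
One crude way to read the two single-thread tables together is to treat the main-vs-main deviation as a noise estimate for the same benchmark. The sketch below does this for four IDs that appear in both tables; the ratios are copied from the reports (the earlier one with target c8abd0 and baseline 91eac3, and the main-vs-main one), while the heuristic itself and the names `rows` and `above_noise` are assumptions. A single main-vs-main run is not a proper noise measurement, so this only indicates which entries deserve a closer look.

```julia
# (benchmark ID, time ratio from the earlier report, time ratio main vs. main)
rows = [
    ("benchmark/elixir_2d_euler_vortex_unstructured.jl p3_analysis", 1.07, 0.94),
    ("structured_1d_dgsem/elixir_euler_sedov.jl p3_rhs!",            1.07, 0.95),
    ("tree_2d_dgsem/elixir_euler_vortex_mortar.jl p3_analysis",      1.19, 0.93),
    ("tree_3d_dgsem/elixir_euler_mortar.jl p3_analysis",             1.07, 0.90),
]

# Hypothetical heuristic: a change counts as "above noise" only if it deviates
# from 1.0 by more than the main-vs-main run did for the same benchmark.
above_noise = [abs(pr - 1) > abs(noise - 1) for (_, pr, noise) in rows]

for ((id, pr, noise), flag) in zip(rows, above_noise)
    println(rpad(id, 62), " earlier: ", pr, "  main/main: ", noise,
            "  above noise: ", flag)
end
```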

DanielDoehring · 2024-08-28T06:37:49Z

@vchuravy can you maybe take a look at this (if you have the time)? I really lack the experience to be able to judge these results :/


DanielDoehring · 2024-10-30T10:26:50Z

I took a look at the lowered form of pure_and_blended_element_ids! and compared the new version, which takes the element_indices parameter, with the currently existing one.

Here is the lowered code of the new version:

julia> @code_lowered Trixi.pure_and_blended_element_ids!(element_ids_dg, element_ids_dgfv, alpha, solver, cache, eachelement(solver, cache))
CodeInfo(
1 ─       Trixi.empty!(element_ids_dg)
│         Trixi.empty!(element_ids_dgfv)
│   %3  = element_indices
│         @_8 = Base.iterate(%3)
│   %5  = @_8 === nothing
│   %6  = Base.not_int(%5)
└──       goto #7 if not %6
2 ┄ %8  = @_8
│         element = Core.getfield(%8, 1)
│   %10 = Core.getfield(%8, 2)
│   %11 = Base.getindex(alpha, element)
│   %12 = (:atol,)
│   %13 = Core.apply_type(Core.NamedTuple, %12)
│   %14 = Core.tuple(1.0e-12)
│   %15 = (%13)(%14)
│         dg_only = Core.kwcall(%15, Trixi.isapprox, %11, 0)
└──       goto #4 if not dg_only
3 ─       Trixi.push!(element_ids_dg, element)
└──       goto #5
4 ─       Trixi.push!(element_ids_dgfv, element)
5 ┄       @_8 = Base.iterate(%3, %10)
│   %22 = @_8 === nothing
│   %23 = Base.not_int(%22)
└──       goto #7 if not %23
6 ─       goto #2
7 ┄       return Trixi.nothing
)

For the currently implemented one:

julia> @code_lowered Trixi.pure_and_blended_element_ids!(element_ids_dg, element_ids_dgfv, alpha, solver, cache)
CodeInfo(
1 ─       Trixi.empty!(element_ids_dg)
│         Trixi.empty!(element_ids_dgfv)
│   %3  = Trixi.eachelement(dg, cache)
│         @_7 = Base.iterate(%3)
│   %5  = @_7 === nothing
│   %6  = Base.not_int(%5)
└──       goto #7 if not %6
2 ┄ %8  = @_7
│         element = Core.getfield(%8, 1)
│   %10 = Core.getfield(%8, 2)
│   %11 = Base.getindex(alpha, element)
│   %12 = (:atol,)
│   %13 = Core.apply_type(Core.NamedTuple, %12)
│   %14 = Core.tuple(1.0e-12)
│   %15 = (%13)(%14)
│         dg_only = Core.kwcall(%15, Trixi.isapprox, %11, 0)
└──       goto #4 if not dg_only
3 ─       Trixi.push!(element_ids_dg, element)
└──       goto #5
4 ─       Trixi.push!(element_ids_dgfv, element)
5 ┄       @_7 = Base.iterate(%3, %10)
│   %22 = @_7 === nothing
│   %23 = Base.not_int(%22)
└──       goto #7 if not %23
6 ─       goto #2
7 ┄       return Trixi.nothing
)

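Comparing the two in Kompare: the listings differ only in how %3 is obtained (the element_indices argument vs. Trixi.eachelement(dg, cache)) and in the slot name (@_8 vs. @_7).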

Looks safe to me? @ranocha @vchuravy
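
For readers less used to lowered code, the listings correspond to a loop of roughly the following shape. This is a reconstruction from the lowered code above, reduced to a self-contained toy: `toy_pure_and_blended_element_ids!` is a hypothetical name and the `solver` and `cache` arguments are dropped, so it is not a copy of the Trixi.jl source.

```julia
# Split the given elements into pure DG ones (alpha ≈ 0) and blended DG-FV
# ones. Reconstructed from the lowered listings; toy version for illustration.
function toy_pure_and_blended_element_ids!(element_ids_dg, element_ids_dgfv,
                                           alpha, element_indices)
    empty!(element_ids_dg)
    empty!(element_ids_dgfv)

    for element in element_indices
        # Same tolerance as in the lowered code: atol = 1.0e-12
        dg_only = isapprox(alpha[element], 0, atol = 1.0e-12)
        if dg_only
            push!(element_ids_dg, element)
        else
            push!(element_ids_dgfv, element)
        end
    end

    return nothing
end

alpha = [0.0, 0.5, 1.0e-14, 0.2]
element_ids_dg = Int[]
element_ids_dgfv = Int[]

# Only elements 1, 3 and 4 belong to this (hypothetical) partition.
toy_pure_and_blended_element_ids!(element_ids_dg, element_ids_dgfv,
                                  alpha, [1, 3, 4])
```

Passing `eachindex(alpha)` instead of `[1, 3, 4]` recovers the behavior of looping over all elements, which is the role `eachelement(solver, cache)` plays in the REPL call above.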


ranocha · 2024-10-31T15:41:52Z

Should be fine, I hope... Maybe you can run one or two benchmarks manually to check.


DanielDoehring · 2024-11-07T12:21:07Z

Here are benchmarks for the 1D tests only (first with one thread, then with two). Looks as expected.

Benchmark Report for /storage/home/daniel/git/Trixi.jl

Job Properties

Time of benchmarks:
- Target: 7 Nov 2024 - 12:25
- Baseline: 7 Nov 2024 - 12:39
Package commits:
- Target: 44e46a
- Baseline: 933ac4
Julia commits:
- Target: 67dffc
- Baseline: 67dffc
Julia command flags:
- Target: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
- Baseline: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
Environment variables:
- Target: None
- Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID	time ratio	memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["latency"]
["structured_1d_dgsem/elixir_euler_sedov.jl"]
["tree_1d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]

Julia versioninfo

Target

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      18814 s          0 s       4845 s   14978003 s          0 s
  Memory: 377.4223098754883 GB (380672.515625 MB free)
  Uptime: 11722.15 sec
  Load Avg:  1.02  0.97  0.64
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      27231 s          0 s       5519 s   16037812 s          0 s
  Memory: 377.4223098754883 GB (380634.75390625 MB free)
  Uptime: 12557.29 sec
  Load Avg:  1.0  1.05  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Benchmark Report for /storage/home/daniel/git/Trixi.jl

Job Properties

Time of benchmarks:
- Target: 7 Nov 2024 - 12:53
- Baseline: 7 Nov 2024 - 13:07
Package commits:
- Target: 44e46a
- Baseline: 933ac4
Julia commits:
- Target: 67dffc
- Baseline: 67dffc
Julia command flags:
- Target: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
- Baseline: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
Environment variables:
- Target: None
- Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID	time ratio	memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

["latency"]
["structured_1d_dgsem/elixir_euler_sedov.jl"]
["tree_1d_dgsem/elixir_mhd_ec.jl"]
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]

Julia versioninfo

Target

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      36084 s          0 s       6217 s   17101589 s          0 s
  Memory: 377.4223098754883 GB (380693.77734375 MB free)
  Uptime: 13395.9 sec
  Load Avg:  1.31  1.11  1.04
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      44935 s          0 s       6886 s   18161142 s          0 s
  Memory: 377.4223098754883 GB (380618.16796875 MB free)
  Uptime: 14231.19 sec
  Load Avg:  1.31  1.11  1.04
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

DanielDoehring added 6 commits March 21, 2024 12:39

PartitionRHS_Comp_1D

a424eb1

partitioned rhs structured & tree

20acb37

remove PERK Multi rhs, make range argument optional

e56855b

remove changes for 2d 3d SC

ea12778

fmt

cf3813d

Merge branch 'main' into PartitionedRHS_1D

da4ced6

DanielDoehring added 4 commits June 5, 2024 09:36

typo

749c670

remove

24c8c78

debug

947687e

fmt

0e76dca

Merge branch 'main' into PartitionedRHS_1D

78a7786

This was referenced Jul 8, 2024

Time integrator in the paired explicit Runge Kutta scheme: Perk p3 single ext DanielDoehring/Trixi.jl#35

Closed

Update benchmarks.jl: 1D #2009

Merged

DanielDoehring and others added 3 commits July 18, 2024 10:20

Merge branch 'main' into PartitionedRHS_1D

e60fb3f

Merge branch 'main' into PartitionedRHS_1D

c9607d7

Merge branch 'main' into PartitionedRHS_1D

c8abd07

DanielDoehring added the enhancement (New feature or request) label Aug 14, 2024

"range" to "indices"

622cc8e

Merge branch 'PartitionedRHS_1D' of github.com:DanielDoehring/Trixi.jl into PartitionedRHS_1D

a382bc8

DanielDoehring added 5 commits October 30, 2024 11:29

fmt

5e17307

Merge branch 'main' into PartitionedRHS_1D

f442710

Merge branch 'main' into PartitionedRHS_1D

0a5ff79

specialize ambiguous funcs

f3f1cad

Merge branch 'PartitionedRHS_1D' of github.com:DanielDoehring/Trixi.jl into PartitionedRHS_1D

fd980f4

DanielDoehring added 8 commits November 4, 2024 09:10

Merge branch 'main' into PartitionedRHS_1D

4d94dab

structured 1D rhs with indices

347f09b

1d tree rhs with indices

9023747

Merge branch 'PartitionedRHS_1D' of github.com:DanielDoehring/Trixi.jl into PartitionedRHS_1D

c3ed54d

SC 1D indices

f468f46

parabolic 1D

40dc503

consistency

64b5e03

Merge branch 'main' into PartitionedRHS_1D

3f8578f

DanielDoehring marked this pull request as ready for review November 7, 2024 12:21

Merge branch 'main' into PartitionedRHS_1D

eca08c0




