Towards partitioned RHS's (1D) #1967

Open
wants to merge 30 commits into
base: main

Conversation

DanielDoehring
Contributor

@DanielDoehring DanielDoehring commented Jun 5, 2024

This draft shows one way we could realize partitioned RHS's with relatively few changes to the existing code.
Opinions on this, @sloede?

Related to #21
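
For readers new to the idea: a partitioned RHS evaluates the right-hand side only on a chosen subset of elements. Below is a minimal sketch of the pattern this draft appears to follow, judging from the `element_indices` parameter discussed further down the thread: each kernel takes an optional index argument that defaults to all elements. `toy_rhs!` and its update rule are hypothetical stand-ins, not Trixi.jl code.

```julia
# Hypothetical toy kernel: the update of each "element" depends only on u,
# so evaluating it on disjoint index sets matches one full pass.
function toy_rhs!(du, u, element_indices = eachindex(u))
    for element in element_indices
        du[element] = -2.0 * u[element]
    end
    return nothing
end

u = collect(1.0:6.0)

# Full evaluation through the default argument.
du_full = zeros(length(u))
toy_rhs!(du_full, u)

# Partitioned evaluation: two arbitrary, disjoint index sets.
du_part = zeros(length(u))
toy_rhs!(du_part, u, [1, 4, 5])
toy_rhs!(du_part, u, [2, 3, 6])

println(du_full == du_part)
```

Existing callers keep the short call signature, while a partitioned time integrator could pass its own index sets.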

Contributor

github-actions bot commented Jun 5, 2024

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

  • The PR has a single goal that is clear from the PR title and/or description.
  • All code changes represent a single set of modifications that logically belong together.
  • No more than 500 lines of code are changed or there is no obvious way to split the PR into multiple PRs.

Code quality

  • The code can be understood easily.
  • Newly introduced names for variables etc. are self-descriptive and consistent with existing naming conventions.
  • There are no redundancies that can be removed by simple modularization/refactoring.
  • There are no leftover debug statements or commented code sections.
  • The code adheres to our conventions and style guide, and to the Julia guidelines.

Documentation

  • New functions and types are documented with a docstring or top-level comment.
  • Relevant publications are referenced in docstrings (see example for formatting).
  • Inline comments are used to document longer or unusual code sections.
  • Comments describe intent ("why?") and not just functionality ("what?").
  • If the PR introduces a significant change or new feature, it is documented in NEWS.md with its PR number.

Testing

  • The PR passes all tests.
  • New or modified lines of code are covered by tests.
  • New or modified tests run in less than 10 seconds.

Performance

  • There are no type instabilities or memory allocations in performance-critical parts.
  • If the PR intent is to improve performance, before/after time measurements are posted in the PR.

Verification

  • The correctness of the code was verified using appropriate tests.
  • If new equations/methods are added, a convergence test has been run and the results
    are posted in the PR.

Created with ❤️ by the Trixi.jl community.


codecov bot commented Jun 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.36%. Comparing base (dcf1b58) to head (eca08c0).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1967   +/-   ##
=======================================
  Coverage   96.36%   96.36%           
=======================================
  Files         480      480           
  Lines       38028    38028           
=======================================
  Hits        36645    36645           
  Misses       1383     1383           
Flag Coverage Δ
unittests 96.36% <100.00%> (ø)


@sloede
Member

sloede commented Jun 21, 2024

> This draft shows one way we could realize partitioned RHS's with relatively few changes to the existing code.
> Opinions on this, @sloede?

From a first look, I like the idea!

A small variation could be to have a modified function without a default argument and create new functions with the old API that call it with eachelement(...) etc. But I don't know if that's really more readable. Have you checked whether your change impacts the performance (since the compiler may no longer be able to infer as much as before)?
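
The two spellings can be put side by side in a small sketch. The kernels `scale_default!` and `scale_wrapped!` are hypothetical and only illustrate the API shape. Julia implements optional positional arguments by generating a forwarding method, so both spellings should end up with the same two methods:

```julia
# Spelling 1: optional positional argument (hypothetical toy kernel).
function scale_default!(du, u, element_indices = eachindex(u))
    for element in element_indices
        du[element] = 3.0 * u[element]
    end
    return nothing
end

# Spelling 2: required index argument plus a wrapper with the old API.
function scale_wrapped!(du, u, element_indices)
    for element in element_indices
        du[element] = 3.0 * u[element]
    end
    return nothing
end
scale_wrapped!(du, u) = scale_wrapped!(du, u, eachindex(u))

u = [1.0, 2.0, 3.0]
du_default = zeros(3)
du_wrapped = zeros(3)
scale_default!(du_default, u)
scale_wrapped!(du_wrapped, u)

println(du_default == du_wrapped)
println(length(methods(scale_default!)), " ", length(methods(scale_wrapped!)))
```

If that holds for the real kernels as well, the choice between the two is mostly a question of readability.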

Another thought is that to me, range strongly implies an ordered subset of an array, e.g., something like 5:10 or 10:2:20. However, in this case there is no such restriction - it is just a bunch of indices. So maybe {element,interface,whatever}_indices would be more apt?
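
A short sketch of the naming point: the same loop accepts true ranges as well as an arbitrary, unordered index vector, and only the former are `AbstractRange`s. The helper `sum_at` and the sample data are made up for illustration.

```julia
ordered = 5:10           # UnitRange
strided = 10:2:20        # StepRange
arbitrary = [7, 2, 9]    # plain Vector{Int}, not ordered

# Hypothetical helper: works for any iterable collection of indices.
sum_at(data, element_indices) = sum(data[i] for i in element_indices)

data = collect(1.0:20.0)

for indices in (ordered, strided, arbitrary)
    println(typeof(indices), ": ", indices isa AbstractRange, ", sum = ",
            sum_at(data, indices))
end
```

Since nothing in the loop relies on ordering or a fixed stride, a name ending in `_indices` describes the argument more accurately than `_range`.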

@sloede
Member

sloede commented Jun 21, 2024

In general, I think this would be a good topic to bring up at a Trixi.jl meeting for discussion (possibly with a heads-up in Slack so that people can think about it beforehand).

@DanielDoehring
Contributor Author

> A small variation could be to have a modified function without a default argument and create new functions with the old API that call it with eachelement(...) etc. But I don't know if that's really more readable.

True, but in principle this adds the overhead of calling another function, right? That should be explored in a benchmarking run, though.

> Have you checked whether your change impacts the performance (since the compiler may no longer be able to infer as much as before)?

No, not yet.

> Another thought is that to me, range strongly implies an ordered subset of an array, e.g., something like 5:10 or 10:2:20. However, in this case there is no such restriction - it is just a bunch of indices. So maybe {element,interface,whatever}_indices would be more apt?

Yeah that sounds reasonable 👍

@sloede
Member

sloede commented Jun 21, 2024

> > A small variation could be to have a modified function without a default argument and create new functions with the old API that call it with eachelement(...) etc. But I don't know if that's really more readable.
>
> True, but in principle this adds the overhead of calling another function, right? That should be explored in a benchmarking run, though.

Yes and yes. In the end, it is likely that it doesn't make a difference performance-wise.
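
One cheap first check, before any timing run, is whether the return type is still inferred concretely when the index set is passed in as an argument. The sketch below uses `@inferred` from the `Test` standard library on a hypothetical kernel `accumulate_over`. It says nothing about run time, which still needs a proper benchmark.

```julia
using Test  # standard library; provides @inferred

# Hypothetical kernel with an optional index argument.
function accumulate_over(u, element_indices = eachindex(u))
    total = zero(eltype(u))
    for element in element_indices
        total += u[element]
    end
    return total
end

u = rand(100)

# @inferred throws if the inferred return type differs from the actual one.
@inferred accumulate_over(u)              # default: all indices
@inferred accumulate_over(u, 1:50)        # a range
@inferred accumulate_over(u, [3, 1, 4])   # an arbitrary index vector

println("return type inferred for all three index types")
```

Each index type gets its own specialization, so inference alone should not suffer from the extra argument. Whether the generated code is equally fast is a separate question.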

@ranocha
Member

ranocha commented Jun 25, 2024

Sounds reasonable. Could you please run some benchmarks?

@DanielDoehring
Contributor Author

Currently, we do not benchmark any 1D simulation:

for elixir in [joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_extended.jl"),
               joinpath(examples_dir(), "tree_2d_dgsem", "elixir_advection_amr_nonperiodic.jl"),
               joinpath(examples_dir(), "tree_2d_dgsem", "elixir_euler_ec.jl"),
               joinpath(examples_dir(), "tree_2d_dgsem", "elixir_euler_vortex_mortar.jl"),
               joinpath(examples_dir(), "tree_2d_dgsem", "elixir_euler_vortex_mortar_shockcapturing.jl"),
               joinpath(examples_dir(), "tree_2d_dgsem", "elixir_mhd_ec.jl"),
               joinpath(examples_dir(), "structured_2d_dgsem", "elixir_advection_extended.jl"),
               joinpath(examples_dir(), "structured_2d_dgsem", "elixir_advection_nonperiodic.jl"),
               joinpath(examples_dir(), "structured_2d_dgsem", "elixir_euler_ec.jl"),
               joinpath(examples_dir(), "structured_2d_dgsem", "elixir_euler_source_terms_nonperiodic.jl"),
               joinpath(examples_dir(), "structured_2d_dgsem", "elixir_mhd_ec.jl"),
               joinpath(examples_dir(), "unstructured_2d_dgsem", "elixir_euler_wall_bc.jl"), # this is the only elixir working for polydeg=3
               joinpath(examples_dir(), "p4est_2d_dgsem", "elixir_advection_extended.jl"),
               joinpath(@__DIR__, "elixir_2d_euler_vortex_tree.jl"),
               joinpath(@__DIR__, "elixir_2d_euler_vortex_structured.jl"),
               joinpath(@__DIR__, "elixir_2d_euler_vortex_unstructured.jl"),
               joinpath(@__DIR__, "elixir_2d_euler_vortex_p4est.jl"),
               joinpath(examples_dir(), "tree_3d_dgsem", "elixir_advection_extended.jl"),
               joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_ec.jl"),
               joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_mortar.jl"),
               joinpath(examples_dir(), "tree_3d_dgsem", "elixir_euler_shockcapturing.jl"),
               joinpath(examples_dir(), "tree_3d_dgsem", "elixir_mhd_ec.jl"),
               joinpath(examples_dir(), "structured_3d_dgsem", "elixir_advection_nonperiodic_curved.jl"),
               joinpath(examples_dir(), "structured_3d_dgsem", "elixir_euler_ec.jl"),
               joinpath(examples_dir(), "structured_3d_dgsem", "elixir_euler_source_terms_nonperiodic_curved.jl"),
               joinpath(examples_dir(), "structured_3d_dgsem", "elixir_mhd_ec.jl"),
               joinpath(examples_dir(), "p4est_3d_dgsem", "elixir_advection_basic.jl"),]

I could add some 1D elixirs locally, or we could extend the benchmarks with some representative 1D elixirs.

Otherwise, I could extend these changes to 2D to get benchmark results for this.
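
As a rough sketch of what such an extension could look like, the snippet below builds paths for the four 1D elixirs whose groups show up in the later benchmark reports in this thread. `toy_examples_dir` is a stand-in for Trixi's `examples_dir()` so that the snippet runs without Trixi.jl; how the entries are actually wired into the benchmark suite is not shown here.

```julia
# Stand-in for Trixi's `examples_dir()`; the real function returns the
# package's examples directory.
toy_examples_dir() = joinpath("Trixi.jl", "examples")

# 1D elixirs matching the 1D benchmark groups listed in the later reports.
elixirs_1d = [joinpath(toy_examples_dir(), "structured_1d_dgsem", "elixir_euler_sedov.jl"),
              joinpath(toy_examples_dir(), "tree_1d_dgsem", "elixir_mhd_ec.jl"),
              joinpath(toy_examples_dir(), "tree_1d_dgsem", "elixir_navierstokes_convergence_walls_amr.jl"),
              joinpath(toy_examples_dir(), "tree_1d_dgsem", "elixir_shallowwater_well_balanced_nonperiodic.jl")]

for elixir in elixirs_1d
    println(elixir)
end
```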

@ranocha
Member

ranocha commented Jun 28, 2024

I think it would be nice if you could add some representative 1D elixirs to the benchmarks in a new PR.

@DanielDoehring
Contributor Author

Okay, here are some benchmark results (one thread only; the two-thread run produced basically no output).
I am now running an artificial comparison of main against main to see whether these performance losses are real.

Benchmark Report for /home/daniel/git/Trixi.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jul 2024 - 20:15
    • Baseline: 20 Jul 2024 - 03:21
  • Package commits:
    • Target: c8abd0
    • Baseline: 91eac3
  • Julia commits:
    • Target: 48d4fd
    • Baseline: 48d4fd
  • Julia command flags:
    • Target: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
    • Baseline: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID | time ratio | memory ratio
["benchmark/elixir_2d_euler_vortex_p4est.jl", "p3_rhs!"] | 1.05 (5%) ❌ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_tree.jl", "p3_analysis"] | 1.16 (5%) ❌ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_analysis"] | 1.07 (5%) ❌ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_rhs!"] | 1.06 (5%) ❌ | 1.00 (1%)
["latency", "mhd_2d"] | 0.99 (5%) | 0.95 (1%) ✅
["p4est_2d_dgsem/elixir_advection_extended.jl", "p3_rhs!"] | 1.05 (5%) ❌ | 1.00 (1%)
["structured_1d_dgsem/elixir_euler_sedov.jl", "p3_rhs!"] | 1.07 (5%) ❌ | 1.00 (1%)
["structured_2d_dgsem/elixir_advection_extended.jl", "p3_rhs!"] | 1.06 (5%) ❌ | 1.00 (1%)
["structured_2d_dgsem/elixir_advection_extended.jl", "p7_rhs!"] | 1.05 (5%) ❌ | 1.00 (1%)
["structured_2d_dgsem/elixir_euler_ec.jl", "p3_rhs!"] | 1.09 (5%) ❌ | 1.00 (1%)
["structured_2d_dgsem/elixir_mhd_ec.jl", "p3_analysis"] | 1.06 (5%) ❌ | 1.00 (1%)
["structured_2d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 1.06 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_mhd_ec.jl", "p3_analysis"] | 1.08 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"] | 1.06 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p3_analysis"] | 1.05 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p3_rhs!"] | 1.08 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p7_analysis"] | 1.06 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl", "p7_rhs!"] | 1.05 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl", "p3_rhs!"] | 1.07 (5%) ❌ | 1.00 (1%)
["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl", "p7_rhs!"] | 1.12 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl", "p3_analysis"] | 1.09 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_advection_extended.jl", "p3_analysis"] | 1.09 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_advection_extended.jl", "p3_rhs!"] | 1.07 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_ec.jl", "p3_analysis"] | 1.14 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_ec.jl", "p3_rhs!"] | 1.07 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_analysis"] | 1.19 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_rhs!"] | 1.08 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_mhd_ec.jl", "p3_analysis"] | 1.07 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 1.13 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_analysis"] | 1.08 (5%) ❌ | 1.00 (1%)
["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"] | 1.05 (5%) ❌ | 1.00 (1%)
["tree_3d_dgsem/elixir_advection_extended.jl", "p3_analysis"] | 1.12 (5%) ❌ | 1.00 (1%)
["tree_3d_dgsem/elixir_euler_ec.jl", "p3_analysis"] | 1.10 (5%) ❌ | 1.00 (1%)
["tree_3d_dgsem/elixir_euler_mortar.jl", "p3_analysis"] | 1.07 (5%) ❌ | 1.00 (1%)
["tree_3d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 1.09 (5%) ❌ | 1.00 (1%)
["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p7_rhs!"] | 1.06 (5%) ❌ | 1.00 (1%)
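
For reading the table, here is a minimal sketch of the classification rule, assuming the percentage in parentheses is the tolerance applied to each ratio (5% for time, 1% for memory). `classify_ratio` is a hypothetical helper written for illustration, not the benchmarking package's own code.

```julia
# Classify a target/baseline ratio with a symmetric relative tolerance.
function classify_ratio(ratio, tolerance)
    if ratio > 1 + tolerance
        return :regression
    elseif ratio < 1 - tolerance
        return :improvement
    else
        return :invariant
    end
end

# Values taken from rows of the table above.
println(classify_ratio(1.16, 0.05))  # time ratio of the tree 2D vortex "p3_analysis" row
println(classify_ratio(1.00, 0.01))  # memory ratio of the same row
println(classify_ratio(0.99, 0.05))  # time ratio of the latency row
println(classify_ratio(0.95, 0.01))  # memory ratio of the latency row
```

The displayed ratios are rounded, so a row shown as 1.05 with ❌ presumably had an unrounded ratio slightly above the 5% threshold.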

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["benchmark/elixir_2d_euler_vortex_p4est.jl"]
  • ["benchmark/elixir_2d_euler_vortex_structured.jl"]
  • ["benchmark/elixir_2d_euler_vortex_tree.jl"]
  • ["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
  • ["latency"]
  • ["p4est_2d_dgsem/elixir_advection_extended.jl"]
  • ["p4est_3d_dgsem/elixir_advection_basic.jl"]
  • ["structured_1d_dgsem/elixir_euler_sedov.jl"]
  • ["structured_2d_dgsem/elixir_advection_extended.jl"]
  • ["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
  • ["structured_2d_dgsem/elixir_euler_ec.jl"]
  • ["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
  • ["structured_2d_dgsem/elixir_mhd_ec.jl"]
  • ["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl"]
  • ["structured_3d_dgsem/elixir_euler_ec.jl"]
  • ["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic_curved.jl"]
  • ["structured_3d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
  • ["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]
  • ["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
  • ["tree_2d_dgsem/elixir_advection_extended.jl"]
  • ["tree_2d_dgsem/elixir_euler_ec.jl"]
  • ["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
  • ["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
  • ["tree_2d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_3d_dgsem/elixir_advection_extended.jl"]
  • ["tree_3d_dgsem/elixir_euler_ec.jl"]
  • ["tree_3d_dgsem/elixir_euler_mortar.jl"]
  • ["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
  • ["tree_3d_dgsem/elixir_mhd_ec.jl"]
  • ["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]

Julia versioninfo

Target

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz     646850 s         50 s      38939 s  355231854 s          0 s
  Memory: 377.4298286437988 GB (378345.30859375 MB free)
  Uptime: 278077.63 sec
  Load Avg:  1.0  1.0  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz     895859 s         62 s      47475 s  387683751 s          0 s
  Memory: 377.4298286437988 GB (378315.21875 MB free)
  Uptime: 303633.39 sec
  Load Avg:  1.0  1.0  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

@DanielDoehring
Contributor Author

Results of main vs main:

1 Thread:

Benchmark Report for /home/daniel/git/Trixi.jl

Job Properties

  • Time of benchmarks:
    • Target: 25 Jul 2024 - 14:00
    • Baseline: 25 Jul 2024 - 21:07
  • Package commits:
    • Target: 91eac3
    • Baseline: 91eac3
  • Julia commits:
    • Target: 48d4fd
    • Baseline: 48d4fd
  • Julia command flags:
    • Target: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
    • Baseline: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID | time ratio | memory ratio
["benchmark/elixir_2d_euler_vortex_p4est.jl", "p7_analysis"] | 0.95 (5%) ✅ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_structured.jl", "p3_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_structured.jl", "p7_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_tree.jl", "p7_analysis"] | 0.92 (5%) ✅ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_tree.jl", "p7_rhs!"] | 0.93 (5%) ✅ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)
["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p7_analysis"] | 0.91 (5%) ✅ | 1.00 (1%)
["latency", "mhd_2d"] | 1.01 (5%) | 1.06 (1%) ❌
["p4est_2d_dgsem/elixir_advection_extended.jl", "p7_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)
["p4est_3d_dgsem/elixir_advection_basic.jl", "p3_analysis"] | 0.92 (5%) ✅ | 1.00 (1%)
["p4est_3d_dgsem/elixir_advection_basic.jl", "p7_analysis"] | 0.92 (5%) ✅ | 1.00 (1%)
["structured_1d_dgsem/elixir_euler_sedov.jl", "p3_rhs!"] | 0.95 (5%) ✅ | 1.00 (1%)
["structured_1d_dgsem/elixir_euler_sedov.jl", "p7_rhs!"] | 0.95 (5%) ✅ | 1.00 (1%)
["structured_2d_dgsem/elixir_advection_extended.jl", "p7_analysis"] | 0.95 (5%) ✅ | 1.00 (1%)
["structured_2d_dgsem/elixir_euler_ec.jl", "p7_analysis"] | 0.95 (5%) ✅ | 1.00 (1%)
["structured_2d_dgsem/elixir_mhd_ec.jl", "p7_analysis"] | 0.91 (5%) ✅ | 1.00 (1%)
["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl", "p3_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)
["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl", "p7_analysis"] | 0.93 (5%) ✅ | 1.00 (1%)
["structured_3d_dgsem/elixir_euler_ec.jl", "p3_analysis"] | 0.95 (5%) ✅ | 1.00 (1%)
["structured_3d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 0.93 (5%) ✅ | 1.00 (1%)
["tree_2d_dgsem/elixir_advection_extended.jl", "p7_analysis"] | 0.93 (5%) ✅ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_analysis"] | 0.93 (5%) ✅ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p7_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p7_rhs!"] | 0.94 (5%) ✅ | 1.00 (1%)
["tree_3d_dgsem/elixir_advection_extended.jl", "p7_analysis"] | 0.95 (5%) ✅ | 1.00 (1%)
["tree_3d_dgsem/elixir_euler_mortar.jl", "p3_analysis"] | 0.90 (5%) ✅ | 1.00 (1%)
["tree_3d_dgsem/elixir_euler_shockcapturing.jl", "p7_rhs!"] | 0.95 (5%) ✅ | 1.00 (1%)
["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p7_analysis"] | 0.94 (5%) ✅ | 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["benchmark/elixir_2d_euler_vortex_p4est.jl"]
  • ["benchmark/elixir_2d_euler_vortex_structured.jl"]
  • ["benchmark/elixir_2d_euler_vortex_tree.jl"]
  • ["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
  • ["latency"]
  • ["p4est_2d_dgsem/elixir_advection_extended.jl"]
  • ["p4est_3d_dgsem/elixir_advection_basic.jl"]
  • ["structured_1d_dgsem/elixir_euler_sedov.jl"]
  • ["structured_2d_dgsem/elixir_advection_extended.jl"]
  • ["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
  • ["structured_2d_dgsem/elixir_euler_ec.jl"]
  • ["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
  • ["structured_2d_dgsem/elixir_mhd_ec.jl"]
  • ["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl"]
  • ["structured_3d_dgsem/elixir_euler_ec.jl"]
  • ["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic_curved.jl"]
  • ["structured_3d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
  • ["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]
  • ["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
  • ["tree_2d_dgsem/elixir_advection_extended.jl"]
  • ["tree_2d_dgsem/elixir_euler_ec.jl"]
  • ["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
  • ["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
  • ["tree_2d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_3d_dgsem/elixir_advection_extended.jl"]
  • ["tree_3d_dgsem/elixir_euler_ec.jl"]
  • ["tree_3d_dgsem/elixir_euler_mortar.jl"]
  • ["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
  • ["tree_3d_dgsem/elixir_mhd_ec.jl"]
  • ["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]

Julia versioninfo

Target

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz    9410557 s        173 s     861143 s  980369069 s          0 s
  Memory: 377.4298286437988 GB (376935.27734375 MB free)
  Uptime: 774020.13 sec
  Load Avg:  1.05  1.01  1.03
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  4294 MHz   10307518 s        173 s     891043 s  1012239646 s          0 s
  Memory: 377.4298286437988 GB (378011.94140625 MB free)
  Uptime: 799648.39 sec
  Load Avg:  1.06  1.02  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

2 Threads:

Benchmark Report for /home/daniel/git/Trixi.jl

Job Properties

  • Time of benchmarks:
    • Target: 26 Jul 2024 - 01:20
    • Baseline: 26 Jul 2024 - 05:32
  • Package commits:
    • Target: 91eac3
    • Baseline: 91eac3
  • Julia commits:
    • Target: 48d4fd
    • Baseline: 48d4fd
  • Julia command flags:
    • Target: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
    • Baseline: -C,native,-J/snap/julia/100/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID | time ratio | memory ratio
["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_analysis"] | 1.07 (5%) ❌ | 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["benchmark/elixir_2d_euler_vortex_p4est.jl"]
  • ["benchmark/elixir_2d_euler_vortex_structured.jl"]
  • ["benchmark/elixir_2d_euler_vortex_tree.jl"]
  • ["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
  • ["latency"]
  • ["p4est_2d_dgsem/elixir_advection_extended.jl"]
  • ["p4est_3d_dgsem/elixir_advection_basic.jl"]
  • ["structured_1d_dgsem/elixir_euler_sedov.jl"]
  • ["structured_2d_dgsem/elixir_advection_extended.jl"]
  • ["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
  • ["structured_2d_dgsem/elixir_euler_ec.jl"]
  • ["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
  • ["structured_2d_dgsem/elixir_mhd_ec.jl"]
  • ["structured_3d_dgsem/elixir_advection_nonperiodic_curved.jl"]
  • ["structured_3d_dgsem/elixir_euler_ec.jl"]
  • ["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic_curved.jl"]
  • ["structured_3d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
  • ["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]
  • ["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
  • ["tree_2d_dgsem/elixir_advection_extended.jl"]
  • ["tree_2d_dgsem/elixir_euler_ec.jl"]
  • ["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
  • ["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
  • ["tree_2d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_3d_dgsem/elixir_advection_extended.jl"]
  • ["tree_3d_dgsem/elixir_euler_ec.jl"]
  • ["tree_3d_dgsem/elixir_euler_mortar.jl"]
  • ["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
  • ["tree_3d_dgsem/elixir_mhd_ec.jl"]
  • ["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]

Julia versioninfo

Target

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  4296 MHz   10561224 s        173 s     903795 s  1031362101 s          0 s
  Memory: 377.4298286437988 GB (377872.78125 MB free)
  Uptime: 814797.44 sec
  Load Avg:  1.48  1.64  1.61
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  4297 MHz   10814263 s        174 s     916671 s  1050464699 s          0 s
  Memory: 377.4298286437988 GB (377916.60546875 MB free)
  Uptime: 829930.61 sec
  Load Avg:  1.38  1.61  1.59
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

@DanielDoehring DanielDoehring added the enhancement label Aug 14, 2024
@DanielDoehring
Contributor Author

@vchuravy can you maybe take a look at this (if you have the time)? I really lack the experience to be able to judge these results :/

@DanielDoehring
Contributor Author

DanielDoehring commented Oct 30, 2024

I compared the lowered form of pure_and_blended_element_ids! for the new version with the element_indices parameter against the currently existing one.

Here is the lowered code of the new version:

julia> @code_lowered Trixi.pure_and_blended_element_ids!(element_ids_dg, element_ids_dgfv, alpha, solver, cache, eachelement(solver, cache))
CodeInfo(
1 ─       Trixi.empty!(element_ids_dg)
│         Trixi.empty!(element_ids_dgfv)
│   %3  = element_indices
│         @_8 = Base.iterate(%3)
│   %5  = @_8 === nothing
│   %6  = Base.not_int(%5)
└──       goto #7 if not %6
2 ┄ %8  = @_8
│         element = Core.getfield(%8, 1)
│   %10 = Core.getfield(%8, 2)
│   %11 = Base.getindex(alpha, element)
│   %12 = (:atol,)
│   %13 = Core.apply_type(Core.NamedTuple, %12)
│   %14 = Core.tuple(1.0e-12)
│   %15 = (%13)(%14)
│         dg_only = Core.kwcall(%15, Trixi.isapprox, %11, 0)
└──       goto #4 if not dg_only
3 ─       Trixi.push!(element_ids_dg, element)
└──       goto #5
4 ─       Trixi.push!(element_ids_dgfv, element)
5 ┄       @_8 = Base.iterate(%3, %10)
│   %22 = @_8 === nothing
│   %23 = Base.not_int(%22)
└──       goto #7 if not %23
6 ─       goto #2
7 ┄       return Trixi.nothing
)

For the currently implemented one:

julia> @code_lowered Trixi.pure_and_blended_element_ids!(element_ids_dg, element_ids_dgfv, alpha, solver, cache)
CodeInfo(
1 ─       Trixi.empty!(element_ids_dg)
│         Trixi.empty!(element_ids_dgfv)
│   %3  = Trixi.eachelement(dg, cache)
│         @_7 = Base.iterate(%3)
│   %5  = @_7 === nothing
│   %6  = Base.not_int(%5)
└──       goto #7 if not %6
2 ┄ %8  = @_7
│         element = Core.getfield(%8, 1)
│   %10 = Core.getfield(%8, 2)
│   %11 = Base.getindex(alpha, element)
│   %12 = (:atol,)
│   %13 = Core.apply_type(Core.NamedTuple, %12)
│   %14 = Core.tuple(1.0e-12)
│   %15 = (%13)(%14)
│         dg_only = Core.kwcall(%15, Trixi.isapprox, %11, 0)
└──       goto #4 if not dg_only
3 ─       Trixi.push!(element_ids_dg, element)
└──       goto #5
4 ─       Trixi.push!(element_ids_dgfv, element)
5 ┄       @_7 = Base.iterate(%3, %10)
│   %22 = @_7 === nothing
│   %23 = Base.not_int(%22)
└──       goto #7 if not %23
6 ─       goto #2
7 ┄       return Trixi.nothing
)

The Kompare diff of the two:

[Image: Screenshot from 2024-10-30 11-26-06]

Looks safe to me? @ranocha @vchuravy
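
A side note for anyone repeating this comparison: lowered code is produced before type inference, so two matching listings show that the loop structure is the same but not that the compiler infers the same types for every kind of index collection. The sketch below does the comparison programmatically on two hypothetical toy functions (`split_ids_all!` and `split_ids_subset!`, modeled loosely on the listings above) and also queries the inferred return type with `code_typed`.

```julia
# Toy version iterating over all elements.
function split_ids_all!(ids_small, ids_large, alpha)
    empty!(ids_small)
    empty!(ids_large)
    for element in eachindex(alpha)
        if isapprox(alpha[element], 0, atol = 1e-12)
            push!(ids_small, element)
        else
            push!(ids_large, element)
        end
    end
    return nothing
end

# Toy version iterating over the indices passed in by the caller.
function split_ids_subset!(ids_small, ids_large, alpha, element_indices)
    empty!(ids_small)
    empty!(ids_large)
    for element in element_indices
        if isapprox(alpha[element], 0, atol = 1e-12)
            push!(ids_small, element)
        else
            push!(ids_large, element)
        end
    end
    return nothing
end

types_all = (Vector{Int}, Vector{Int}, Vector{Float64})
types_subset = (types_all..., Vector{Int})

# Lowered form: one CodeInfo per matching method.
lowered_all = only(code_lowered(split_ids_all!, types_all))
lowered_subset = only(code_lowered(split_ids_subset!, types_subset))
println(length(lowered_all.code), " vs ", length(lowered_subset.code),
        " lowered statements")

# Typed form: a CodeInfo => return type pair, available after inference.
typed_subset = only(code_typed(split_ids_subset!, types_subset))
println("inferred return type: ", last(typed_subset))
```

For a closer look at inference, `@code_warntype` on the real call would be the usual next step.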

@ranocha
Member

ranocha commented Oct 31, 2024

Should be fine, I hope... Maybe you can run one or two benchmarks manually to check.

@DanielDoehring
Contributor Author

Here are benchmarks for the 1D tests only. Looks as expected.

Benchmark Report for /storage/home/daniel/git/Trixi.jl

Job Properties

  • Time of benchmarks:
    • Target: 7 Nov 2024 - 12:25
    • Baseline: 7 Nov 2024 - 12:39
  • Package commits:
    • Target: 44e46a
    • Baseline: 933ac4
  • Julia commits:
    • Target: 67dffc
    • Baseline: 67dffc
  • Julia command flags:
    • Target: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
    • Baseline: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID | time ratio | memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["latency"]
  • ["structured_1d_dgsem/elixir_euler_sedov.jl"]
  • ["tree_1d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
  • ["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]

Julia versioninfo

Target

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      18814 s          0 s       4845 s   14978003 s          0 s
  Memory: 377.4223098754883 GB (380672.515625 MB free)
  Uptime: 11722.15 sec
  Load Avg:  1.02  0.97  0.64
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      27231 s          0 s       5519 s   16037812 s          0 s
  Memory: 377.4223098754883 GB (380634.75390625 MB free)
  Uptime: 12557.29 sec
  Load Avg:  1.0  1.05  1.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 128 virtual cores)

Benchmark Report for /storage/home/daniel/git/Trixi.jl

Job Properties

  • Time of benchmarks:
    • Target: 7 Nov 2024 - 12:53
    • Baseline: 7 Nov 2024 - 13:07
  • Package commits:
    • Target: 44e46a
    • Baseline: 933ac4
  • Julia commits:
    • Target: 67dffc
    • Baseline: 67dffc
  • Julia command flags:
    • Target: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
    • Baseline: -C,native,-J/storage/home/daniel/julia-1.10.6/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

ID | time ratio | memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["latency"]
  • ["structured_1d_dgsem/elixir_euler_sedov.jl"]
  • ["tree_1d_dgsem/elixir_mhd_ec.jl"]
  • ["tree_1d_dgsem/elixir_navierstokes_convergence_walls_amr.jl"]
  • ["tree_1d_dgsem/elixir_shallowwater_well_balanced_nonperiodic.jl"]

Julia versioninfo

Target

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      36084 s          0 s       6217 s   17101589 s          0 s
  Memory: 377.4223098754883 GB (380693.77734375 MB free)
  Uptime: 13395.9 sec
  Load Avg:  1.31  1.11  1.04
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

Baseline

Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 24.04.1 LTS
  uname: Linux 6.8.0-48-generic #48-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 27 14:04:52 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 9374F 32-Core Processor: 
                  speed         user         nice          sys         idle          irq
       #1-128  1500 MHz      44935 s          0 s       6886 s   18161142 s          0 s
  Memory: 377.4223098754883 GB (380618.16796875 MB free)
  Uptime: 14231.19 sec
  Load Avg:  1.31  1.11  1.04
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 2 default, 0 interactive, 1 GC (on 128 virtual cores)

@DanielDoehring DanielDoehring marked this pull request as ready for review November 7, 2024 12:21
Labels
enhancement New feature or request