WIP: Unify divb analyze routines for equations with magnetic field #1712
base: main
Conversation
Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

- Purpose and scope
- Code quality
- Documentation
- Testing
- Performance
- Verification

Created with ❤️ by the Trixi.jl community.
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff            @@
##             main    #1712     +/-  ##
=========================================
+ Coverage   74.61%   82.99%   +8.38%
=========================================
  Files         431      431
  Lines       34600    34654      +54
=========================================
+ Hits        25815    28761    +2946
+ Misses       8785     5893    -2892

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
Some small suggestions, but otherwise this looks like a good approach for unifying these analysis quantities! You should, however, also ping Andrew or Hendrik once you're ready for a final look.
B1_kj, _, _ = magnetic_field(view(u, :, k, j, element), equations)
_, B2_ik, _ = magnetic_field(view(u, :, i, k, element), equations)
Suggested change:
- B1_kj, _, _ = magnetic_field(view(u, :, k, j, element), equations)
- _, B2_ik, _ = magnetic_field(view(u, :, i, k, element), equations)
+ B1_kj, _, _ = magnetic_field(get_node_vars(u, equations, dg, k, j, element), equations)
+ _, B2_ik, _ = magnetic_field(get_node_vars(u, equations, dg, i, k, element), equations)
To avoid the ugly `view`? If this works, it should also be applied to the other places in this PR.
It works for a short simulation, but it seems that it creates a lot of allocations. I tested with the following:
julia> using Trixi
julia> using BenchmarkTools
julia> include("../examples/tree_2d_dgsem/elixir_mhd_ec.jl")
julia> @benchmark Trixi.analyze($Val(:l2_divb), $ode.u0, $ode.u0, 0.0, $mesh, $equations, $solver, $semi.cache)
I got the following for the implementation with `view` on Rocinante:
BenchmarkTools.Trial: 4272 samples with 1 evaluation.
Range (min … max): 661.995 μs … 9.138 ms ┊ GC (min … max): 0.00% … 88.29%
Time (median): 898.936 μs ┊ GC (median): 0.00%
Time (mean ± σ): 1.164 ms ± 1.153 ms ┊ GC (mean ± σ): 20.53% ± 17.92%
█▆▆▆▅▅▃▁ ▁
█████████▅▃▅▅▃▁▁▁▁▁▆▃▇▁▁▁▁▁▆▇▇▇▆▁▁▄▄▃▃▃▃▄▅▁▄▄▃▃▄▄▅▅▅▇▄▆▆▆▆▆ █
662 μs Histogram: log(frequency) by time 7.53 ms <
Memory estimate: 2.13 MiB, allocs estimate: 28685.
But the implementation with `get_node_vars` crashes with the following (I removed many lines in the middle of the error message for readability):
[644396] signal (11.1): Segmentation fault
in expression starting at REPL[4]:1
getindex at ./essentials.jl:14 [inlined]
#309 at /mnt/hd1/scratch/aruedara/20231107_divB/Trixi.jl/src/solvers/dg.jl:542 [inlined]
macro expansion at ./ntuple.jl:72 [inlined]
ntuple at ./ntuple.jl:69 [inlined]
get_node_vars at /mnt/hd1/scratch/aruedara/20231107_divB/Trixi.jl/src/solvers/dg.jl:542 [inlined]
.
.
.
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
true_main at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:573
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:717
main at julia (unknown line)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 83483411 (Pool: 83454153; Big: 29258); GC: 106
Segmentation fault (core dumped)
Is this test OK? Should I keep the implementation with `view`?
No, I think there is an underlying issue that I just do not understand.
Can you try to separate the `get_node_vars` call from the `magnetic_field` call? That is, to use something like
Suggested change:
- B1_kj, _, _ = magnetic_field(view(u, :, k, j, element), equations)
- _, B2_ik, _ = magnetic_field(view(u, :, i, k, element), equations)
+ u_local_kj = get_node_vars(u, equations, dg, k, j, element)
+ u_local_ik = get_node_vars(u, equations, dg, i, k, element)
+ B1_kj, _, _ = magnetic_field(u_local_kj, equations)
+ _, B2_ik, _ = magnetic_field(u_local_ik, equations)
such that we know where exactly the issue comes from?
That works 🙂 :
BenchmarkTools.Trial: 3875 samples with 1 evaluation.
Range (min … max): 685.359 μs … 9.795 ms ┊ GC (min … max): 0.00% … 83.22%
Time (median): 992.257 μs ┊ GC (median): 0.00%
Time (mean ± σ): 1.284 ms ± 1.303 ms ┊ GC (mean ± σ): 20.88% ± 17.77%
█▅▆▆▆▅▄▂▁ ▁
█████████▆▆▃▃▁▁▁▁▁▁▁▆▅▆▁▁▁▁▁▄▆▇▅▅▃▃▃▅▃▆▅▅▅▁▁▁▃▅▃▃▅▄▇▆▇▆▆▆▄▆ █
685 μs Histogram: log(frequency) by time 8.2 ms <
Memory estimate: 2.13 MiB, allocs estimate: 28685.
Is this preferred to the `view` implementation?
> Is this preferred to the `view` implementation?
Yes. We introduced the `get_node_vars` function and friends specifically for the purpose of not having to use `view`s (which used to allocate) and also to hide memory layout details from the application.
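For illustration, here is a simplified, self-contained sketch of the idea behind such an accessor (not Trixi's actual implementation; the array shape and the helper name `get_node_vars_sketch` are made up for this example): it gathers the conservative variables at one node into a stack-allocated `SVector`, so callers like `magnetic_field` never see the underlying memory layout and no view has to be created.

```julia
using StaticArrays

# Toy solution array: 9 conservative variables on 4x4 nodes per element, 2 elements.
# For the 2D ideal GLM-MHD variable ordering, B1/B2/B3 would sit at indices 6-8.
nvariables, nnodes, nelements = 9, 4, 2
u = rand(nvariables, nnodes, nnodes, nelements)

# Gather the variables at node (i, j) of `element` into a stack-allocated SVector,
# hiding the (variable, i, j, element) memory layout from the caller.
function get_node_vars_sketch(u, ::Val{NVARS}, i, j, element) where {NVARS}
    return SVector(ntuple(v -> u[v, i, j, element], Val(NVARS)))
end

u_node = get_node_vars_sketch(u, Val(9), 2, 3, 1)
B1, B2, B3 = u_node[6], u_node[7], u_node[8]  # magnetic field components in this toy layout
```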
@ranocha out of curiosity, do you have an explanation why my original formulation caused a segfault?
You called it with `u_ode` as argument, but the function is written to accept `u`? But I would have expected that both versions lead to the same problem.
> You called it with `u_ode` as argument but the function is written to accept `u`?
No, I called it exactly as detailed above, with `ode.u0` for both `u` and `du` (`du` is not used in the function).
I'm seeing weird behavior... Yesterday, when I tried Michael's first code suggestion, I got the error message twice in a row. Today, I decided to give it another go, and the benchmark is working fine (now 12 times in a row). Maybe the problem yesterday was that someone else was running a simulation on Rocinante(?).
That's what I'm saying: `ode.u0` is a vector, but we accept the wrapped array.
Do you run it with any `--check-bounds` flags?
Ahh, ok! I didn't realize that... No, I ran it with `--check-bounds=no`. Without that flag it crashes with
BoundsError: attempt to access 36864-element Vector{Float64} at index [1, 2, 1, 1]
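A minimal sketch reconstructing that failure mode (illustrative only, not taken from the PR): indexing a flat `Vector` with four indices is only valid while all trailing indices are 1, so the access reported above is out of bounds, and with `--check-bounds=no` the same access becomes undefined behavior, which would explain the intermittent segfault.

```julia
# `ode.u0` is a flat Vector, but the analyze routine indexes it like a 4D array.
v = zeros(36864)  # same length as the ODE vector in the error message above

v[1, 1, 1, 1]  # fine: extra trailing indices are allowed as long as they are all 1

try
    v[1, 2, 1, 1]  # out of bounds -> BoundsError when bounds checking is enabled
catch err
    @show err
end

# With `--check-bounds=no` (or `@inbounds`), the same access is undefined behavior:
# it may read unrelated memory, return garbage, or segfault, depending on what
# happens to live next to the vector, which is consistent with the intermittent crash.
```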
I corrected my testing procedure and now I'm using:
julia> using Trixi
julia> using BenchmarkTools
julia> include("../examples/tree_2d_dgsem/elixir_mhd_ec.jl")
julia> u = Trixi.wrap_array(ode.u0, mesh, equations, solver, semi.cache)
julia> @benchmark Trixi.analyze($Val(:l2_divb), $u, $u, 0.0, $mesh, $equations, $solver, $semi.cache)
This seems to work fine using the first suggestion by @sloede (with bounds checking enabled):
BenchmarkTools.Trial: 5838 samples with 1 evaluation.
Range (min … max): 700.557 μs … 3.900 ms ┊ GC (min … max): 0.00% … 69.72%
Time (median): 724.738 μs ┊ GC (median): 0.00%
Time (mean ± σ): 853.904 μs ± 585.713 μs ┊ GC (mean ± σ): 14.73% ± 16.32%
█▂ ▃ ▁
██▃▃▁▁▁▁▆▄▅▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇█ █
701 μs Histogram: log(frequency) by time 3.55 ms <
Memory estimate: 2.31 MiB, allocs estimate: 32785.
For reference, this is the original implementation in Trixi:
BenchmarkTools.Trial: 5729 samples with 1 evaluation.
Range (min … max): 711.988 μs … 3.850 ms ┊ GC (min … max): 0.00% … 71.50%
Time (median): 738.548 μs ┊ GC (median): 0.00%
Time (mean ± σ): 870.079 μs ± 592.862 μs ┊ GC (mean ± σ): 14.63% ± 16.28%
█▃ ▃ ▁
██▄▃▁▁▆▆▄▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄█ █
712 μs Histogram: log(frequency) by time 3.6 ms <
Memory estimate: 2.31 MiB, allocs estimate: 32785.
So it seems that it works fine with `get_node_vars`.
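As a possible follow-up, a small script along these lines could be used to keep an eye on the allocations of the unified routine (a hedged sketch only: it just reuses the calls from the benchmarks above, and `trixi_include`/`examples_dir` are Trixi.jl's standard helpers for loading an elixir):

```julia
using Trixi

# Load the same elixir as in the benchmarks above.
trixi_include(joinpath(examples_dir(), "tree_2d_dgsem", "elixir_mhd_ec.jl"))

# Wrap the flat ODE vector into the multi-dimensional array the analyze routine expects.
u = Trixi.wrap_array(ode.u0, mesh, equations, solver, semi.cache)

# Warm up (to exclude compilation), then record the allocations of a single call.
Trixi.analyze(Val(:l2_divb), u, u, 0.0, mesh, equations, solver, semi.cache)
allocs = @allocated Trixi.analyze(Val(:l2_divb), u, u, 0.0,
                                  mesh, equations, solver, semi.cache)
@info "l2_divb analysis" allocs
```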