Tests stuff to fix mpi error for macos and windows runs #134

Status: Open, 8 commits to be merged into base branch t8codemesh-fv

bennibolm (Owner) commented Dec 9, 2024

The error is not specific to MPI: it also occurs in serial runs. It appears to originate in the save solution callback.
On Windows, the crash occurs inside t8_forest_write_vtk_ext.

Full error message:

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x7ffce64ff2e1 -- _ZL36t8_forest_vtk_vertices_scalar_kernelP9t8_forestiP7t8_treeiPK10t8_elementP16t8_eclass_schemeiP6_iobufPiPPv19T8_VTK_KERNEL_MODUS at C:\Users\benja\.julia\artifacts\bad2755ea4d60c6ce1cf44e236f40866f7b931a3\bin\libt8.dll (unknown line)
in expression starting at C:\Users\benja\Documents\git\Trixi.jl\examples\t8code_2d_fv\elixir_advection_basic.jl:90
_ZL36t8_forest_vtk_vertices_scalar_kernelP9t8_forestiP7t8_treeiPK10t8_elementP16t8_eclass_schemeiP6_iobufPiPPv19T8_VTK_KERNEL_MODUS at C:\Users\benja\.julia\artifacts\bad2755ea4d60c6ce1cf44e236f40866f7b931a3\bin\libt8.dll (unknown line)
_ZL29t8_forest_vtk_write_cell_dataP9t8_forestP6_iobufPKcS4_S4_iPFiS0_iP7t8_treeiPK10t8_elementP16t8_eclass_schemeiS2_PiPPv19T8_VTK_KERNEL_MODUSEiSD_ at C:\Users\benja\.julia\artifacts\bad2755ea4d60c6ce1cf44e236f40866f7b931a3\bin\libt8.dll (unknown line)
_Z25t8_forest_vtk_write_ASCIIP9t8_forestPKciiiiiiP19t8_vtk_data_field_t at C:\Users\benja\.julia\artifacts\bad2755ea4d60c6ce1cf44e236f40866f7b931a3\bin\libt8.dll (unknown line)
t8_forest_vtk_write_file at C:\Users\benja\.julia\artifacts\bad2755ea4d60c6ce1cf44e236f40866f7b931a3\bin\libt8.dll (unknown line)
t8_forest_write_vtk_ext at C:\Users\benja\.julia\packages\T8code\O3bnx\src\Libt8.jl:16691
output_data_to_vtu at C:\Users\benja\Documents\git\Trixi.jl\src\meshes\t8code_mesh.jl:1964
#save_solution_file#1542 at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:270
save_solution_file at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:251 [inlined]
#save_solution_file#1541 at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:240 [inlined]
save_solution_file at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:233 [inlined]
macro expansion at C:\Users\benja\.julia\packages\TrixiBase\Mq1xp\src\trixi_timeit.jl:64 [inlined]
#save_solution_file#1538 at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:226 [inlined]
save_solution_file at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:200 [inlined]
macro expansion at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:192 [inlined]
macro expansion at C:\Users\benja\.julia\packages\TrixiBase\Mq1xp\src\trixi_timeit.jl:64 [inlined]
SaveSolutionCallback at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:187
initialize_save_cb! at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:145
initialize_save_cb! at C:\Users\benja\Documents\git\Trixi.jl\src\callbacks_step\save_solution.jl:135 [inlined]
initialize! at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\callbacks.jl:18 [inlined]
initialize! at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\callbacks.jl:14
unknown function (ip: 000001fcdb1828d0)
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
do_apply at C:/workdir/src\builtins.c:768
initialize! at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\callbacks.jl:14
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
do_apply at C:/workdir/src\builtins.c:768
initialize! at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\callbacks.jl:14
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
do_apply at C:/workdir/src\builtins.c:768
initialize! at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\callbacks.jl:14
initialize! at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\callbacks.jl:7 [inlined]
initialize_callbacks! at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:654 [inlined]
#__init#747 at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:517
__init at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:10 [inlined]
__init at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:10 [inlined]
__init at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:10 [inlined]
__init at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:10 [inlined]
__init at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:10 [inlined]
#__solve#746 at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:5 [inlined]
__solve at C:\Users\benja\.julia\packages\OrdinaryDiffEq\NBaQM\src\solve.jl:1 [inlined]
#solve_call#34 at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\solve.jl:561
solve_call at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\solve.jl:527
unknown function (ip: 000001fcdb17883f)
#solve_up#42 at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\solve.jl:1010
solve_up at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\solve.jl:996 [inlined]
#solve#40 at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\solve.jl:933
solve at C:\Users\benja\.julia\packages\DiffEqBase\a6p43\src\solve.jl:923
unknown function (ip: 000001fcdb13b713)
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
do_call at C:/workdir/src\interpreter.c:126
eval_value at C:/workdir/src\interpreter.c:223
eval_stmt_value at C:/workdir/src\interpreter.c:174 [inlined]
eval_body at C:/workdir/src\interpreter.c:635
jl_interpret_toplevel_thunk at C:/workdir/src\interpreter.c:775
jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:934
jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:877
jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:877
ijl_toplevel_eval at C:/workdir/src\toplevel.c:943 [inlined]
ijl_toplevel_eval_in at C:/workdir/src\toplevel.c:985
eval at .\boot.jl:385 [inlined]
include_string at .\loading.jl:2139
_include at .\loading.jl:2199
include at .\Base.jl:496 [inlined]
#trixi_include#1 at C:\Users\benja\.julia\packages\TrixiBase\Mq1xp\src\trixi_include.jl:48
trixi_include at C:\Users\benja\.julia\packages\TrixiBase\Mq1xp\src\trixi_include.jl:33 [inlined]
#trixi_include#4 at C:\Users\benja\.julia\packages\TrixiBase\Mq1xp\src\trixi_include.jl:52 [inlined]
trixi_include at C:\Users\benja\.julia\packages\TrixiBase\Mq1xp\src\trixi_include.jl:51
unknown function (ip: 000001fcdb1088fb)
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
do_call at C:/workdir/src\interpreter.c:126
eval_value at C:/workdir/src\interpreter.c:223
eval_stmt_value at C:/workdir/src\interpreter.c:174 [inlined]
eval_body at C:/workdir/src\interpreter.c:635
jl_interpret_toplevel_thunk at C:/workdir/src\interpreter.c:775
jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:934
jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:877
ijl_toplevel_eval at C:/workdir/src\toplevel.c:943 [inlined]
ijl_toplevel_eval_in at C:/workdir/src\toplevel.c:985
eval at .\boot.jl:385 [inlined]
eval_user_input at C:\workdir\usr\share\julia\stdlib\v1.10\REPL\src\REPL.jl:150
repl_backend_loop at C:\workdir\usr\share\julia\stdlib\v1.10\REPL\src\REPL.jl:246
#start_repl_backend#46 at C:\workdir\usr\share\julia\stdlib\v1.10\REPL\src\REPL.jl:231
start_repl_backend at C:\workdir\usr\share\julia\stdlib\v1.10\REPL\src\REPL.jl:228
#run_repl#59 at C:\workdir\usr\share\julia\stdlib\v1.10\REPL\src\REPL.jl:389
run_repl at C:\workdir\usr\share\julia\stdlib\v1.10\REPL\src\REPL.jl:375
jfptr_run_repl_96275.1 at C:\Users\benja\.julia\juliaup\julia-1.10.7+0.x64.w64.mingw32\lib\julia\sys.dll (unknown line)
#1014 at .\client.jl:437
jfptr_YY.1014_87040.1 at C:\Users\benja\.julia\juliaup\julia-1.10.7+0.x64.w64.mingw32\lib\julia\sys.dll (unknown line)
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
jl_f__call_latest at C:/workdir/src\builtins.c:812
invokelatest at .\essentials.jl:889 [inlined]
run_main_repl at .\client.jl:421
exec_options at .\client.jl:338
_start at .\client.jl:557
jfptr__start_87065.1 at C:\Users\benja\.julia\juliaup\julia-1.10.7+0.x64.w64.mingw32\lib\julia\sys.dll (unknown line)
jl_apply at C:/workdir/src\julia.h:1982 [inlined]
true_main at C:/workdir/src\jlapi.c:582
jl_repl_entrypoint at C:/workdir/src\jlapi.c:731
mainCRTStartup at C:/workdir/cli\loader_exe.c:58
BaseThreadInitThunk at C:\Windows\System32\KERNEL32.DLL (unknown line)
RtlUserThreadStart at C:\Windows\SYSTEM32\ntdll.dll (unknown line)
Allocations: 13590129 (Pool: 13582780; Big: 7349); GC: 21
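
A trace of this length is easier to read once it is reduced to the frames from a single binary or source file. The following is a minimal sketch, assuming the trace has been saved as plain text; `frames_in` is a hypothetical helper written for illustration and is not part of Trixi.jl or T8code.jl.

```julia
# Hypothetical helper: return the function names of all frames whose
# location (the text after " at ") contains `needle`.
function frames_in(trace::AbstractString, needle::AbstractString)
    frames = String[]
    for line in split(trace, '\n')
        # Skip lines that are not of the form "<function> at <location>".
        occursin(" at ", line) || continue
        name, location = split(line, " at "; limit = 2)
        occursin(needle, location) && push!(frames, String(strip(name)))
    end
    return frames
end

# Three frames copied from the trace above (raw string keeps the backslashes).
sample = raw"""
t8_forest_vtk_write_file at C:\Users\benja\.julia\artifacts\bad2755ea4d60c6ce1cf44e236f40866f7b931a3\bin\libt8.dll (unknown line)
t8_forest_write_vtk_ext at C:\Users\benja\.julia\packages\T8code\O3bnx\src\Libt8.jl:16691
output_data_to_vtu at C:\Users\benja\Documents\git\Trixi.jl\src\meshes\t8code_mesh.jl:1964
"""

native_frames = frames_in(sample, "libt8.dll")
mesh_frames = frames_in(sample, "t8code_mesh.jl")
println(native_frames)
println(mesh_frames)
```

The match is case-sensitive, so the Julia wrapper file Libt8.jl is not confused with the native library libt8.dll.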

Things I learned:

  • Errors occur in all FV elixirs in the MPI tests on macOS and Windows
  • With summary_callback() and solve() disabled: no output, rhs! is never called, no errors
  • With t_end=0, rhs! is still called and the errors occur
  • analysis_callback seems to work properly
  • stepsize_callback seems to work properly

Tested:

  • ❌ Disabled the (t8code-fv) elixir_advection_basic.jl in the MPI tests: still failing. error
  • ❌ Disabled the call of solve in elixir_advection_basic.jl: expected error for basic, but still the same error for gauss. error
  • Disabled the call of solve in elixir_advection_basic.jl; set t_end for gauss; deactivated all tests for error numbers; deactivated basic tests for allocations
    -> ✔️ basic runs through; no output except warnings since there is no solve call. ❌ gauss fails with the known errors (although t_end=0)
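
The elimination approach used in the list above (switch something off, rerun, compare) can be sketched as a leave-one-out loop. This is only an illustration under stated assumptions: `isolate_culprits` and `fake_run_ok` are hypothetical names, and `fake_run_ok` is a stand-in that fails whenever the save solution callback is enabled, mirroring the observation in this PR rather than running any elixir.

```julia
# Hypothetical helper: leave out one callback at a time and record the
# callbacks whose removal makes the run succeed.
function isolate_culprits(names::Vector{Symbol}, run_ok::Function)
    culprits = Symbol[]
    for name in names
        remaining = filter(!=(name), names)
        # If the run succeeds without `name`, that callback is a suspect.
        run_ok(remaining) && push!(culprits, name)
    end
    return culprits
end

# Stand-in for a real elixir run (assumption, not measured behavior):
# it reports failure whenever :save_solution is among the enabled callbacks.
fake_run_ok(enabled) = !(:save_solution in enabled)

names = [:summary_callback, :save_solution, :analysis_callback,
         :alive_callback, :stepsize_callback]
culprits = isolate_culprits(names, fake_run_ok)
println(culprits)
```

Because an access violation in native code usually terminates the Julia process, a real `run_ok` would probably have to launch each run in a separate process and inspect its exit status.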

I just noticed that these warnings appear even for the DG runs with MPI on Windows:
[t8] WARNING: Trying to use shared memory but intranode and internode communicators are not set. You should call t8_shmem_init before initializing a shared memory array.

github-actions bot commented Dec 9, 2024

Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

Purpose and scope

  • The PR has a single goal that is clear from the PR title and/or description.
  • All code changes represent a single set of modifications that logically belong together.
  • No more than 500 lines of code are changed or there is no obvious way to split the PR into multiple PRs.

Code quality

  • The code can be understood easily.
  • Newly introduced names for variables etc. are self-descriptive and consistent with existing naming conventions.
  • There are no redundancies that can be removed by simple modularization/refactoring.
  • There are no leftover debug statements or commented code sections.
  • The code adheres to our conventions and style guide, and to the Julia guidelines.

Documentation

  • New functions and types are documented with a docstring or top-level comment.
  • Relevant publications are referenced in docstrings (see example for formatting).
  • Inline comments are used to document longer or unusual code sections.
  • Comments describe intent ("why?") and not just functionality ("what?").
  • If the PR introduces a significant change or new feature, it is documented in NEWS.md with its PR number.

Testing

  • The PR passes all tests.
  • New or modified lines of code are covered by tests.
  • New or modified tests run in less than 10 seconds.

Performance

  • There are no type instabilities or memory allocations in performance-critical parts.
  • If the PR intent is to improve performance, before/after time measurements are posted in the PR.

Verification

  • The correctness of the code was verified using appropriate tests.
  • If new equations/methods are added, a convergence test has been run and the results
    are posted in the PR.

Created with ❤️ by the Trixi.jl community.

Comment on lines +20 to +21
# l2=[0.08551397247817498],
# linf=[0.12087467695430498])

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
# l2=[0.08551397247817498],
# linf=[0.12087467695430498])
# l2=[0.08551397247817498],
# linf=[0.12087467695430498])

Comment on lines +33 to +34
# l2=[0.008142380494734171],
# linf=[0.018687916234976898])

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
# l2=[0.008142380494734171],
# linf=[0.018687916234976898])
# l2=[0.008142380494734171],
# linf=[0.018687916234976898])

Comment on lines +53 to +54
# l2=[0.5598148317954682],
# linf=[0.6301130236005371])

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
# l2=[0.5598148317954682],
# linf=[0.6301130236005371])
# l2=[0.5598148317954682],
# linf=[0.6301130236005371])

Comment on lines +67 to +68
# l2=[0.5899077806567905],
# linf=[0.8972489222157533])

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
# l2=[0.5899077806567905],
# linf=[0.8972489222157533])
# l2=[0.5899077806567905],
# linf=[0.8972489222157533])

@@ -33,7 +33,7 @@ save_solution = SaveSolutionCallback(interval = 10,

stepsize_callback = StepsizeCallback(cfl = 0.5)

-callbacks = CallbackSet(summary_callback, save_solution, analysis_callback, alive_callback,
+callbacks = CallbackSet(#=summary_callback,=# save_solution, #=analysis_callback,=# alive_callback,

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
callbacks = CallbackSet(#=summary_callback,=# save_solution, #=analysis_callback,=# alive_callback,
callbacks = CallbackSet(save_solution, alive_callback, #=analysis_callback,=#
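
The change above disables individual callbacks with Julia's inline block comments, `#= ... =#`. A minimal sketch of how that syntax behaves inside a call follows; `collect_args` is a hypothetical stand-in for CallbackSet so that the snippet runs without Trixi.jl.

```julia
# Hypothetical stand-in for CallbackSet: return the arguments it received.
collect_args(args...) = collect(args)

# `#= ... =#` is Julia's block comment. Placed inside a call, it removes an
# argument together with its trailing comma and leaves the rest of the line
# untouched, so re-enabling the argument only requires deleting the markers.
enabled = collect_args(#=:summary_callback,=# :save_solution,
                       #=:analysis_callback,=# :alive_callback)
println(enabled)
```

Only the arguments outside the comment markers reach the call, which makes this a convenient way to toggle callbacks temporarily while debugging.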
