Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handles restarting problem of adaptive time integration methods #1565

Merged
merged 124 commits into from
Sep 15, 2023

Conversation

ArseniyKholod
Copy link
Contributor

@ArseniyKholod ArseniyKholod commented Jul 13, 2023

If time integration method uses time step controller like PID controller, than to obtain an exact solution after restarting the computation we have to save some additional data in SaveRestartCallback. This is a first shot.

@codecov
Copy link

codecov bot commented Jul 14, 2023

Codecov Report

Patch coverage: 91.30% and project coverage change: -6.60% ⚠️

Comparison is base (92dedde) 92.21% compared to head (a41f45f) 85.61%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1565      +/-   ##
==========================================
- Coverage   92.21%   85.61%   -6.60%     
==========================================
  Files         418      418              
  Lines       34223    34203      -20     
==========================================
- Hits        31558    29281    -2277     
- Misses       2665     4922    +2257     
Flag Coverage Δ
unittests 85.61% <91.30%> (-6.60%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
src/Trixi.jl 43.48% <ø> (ø)
src/callbacks_step/save_restart_dg.jl 25.73% <90.00%> (-35.14%) ⬇️
src/callbacks_step/save_restart.jl 91.67% <92.31%> (+0.14%) ⬆️

... and 65 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ArseniyKholod
Copy link
Contributor Author

ArseniyKholod commented Jul 21, 2023

General concept is, that we have to save some data for integrator.controller, to obtain exactly the same solution as without restart. So I added additional functions that saves them in SaveRestartCallback. Problem in a restarting file is that we have to load these data after creating an integrator, that occurs inside solve() function. It's the reason why I decided to implement a loading stuff inside new callback LoadRestartCallback. While initializing this callback we load all data for integrator.controller. After initializing this callback is never called, because function that activate callback returns always false.

As was said in a discussion in trixi_students I tested that implementation for these solvers for serial and parallel execution:
All with adaptive timestep from Stability-Preserving-Runge-Kutta-Methods-for-Hyperbolic-PDEs:

  • SSPRK43()
  • SSPRK432()

And some with adaptive timestep from Low storage methods:

  • CKLLSRK85_4FM_4R
  • RDPK3Sp35
  • RDPK3SpFSAL35
  • RDPK3SpFSAL49

All of tested methods works correctly with new callback, and works either with some additional L_inf, L2 error or unstable without it.

@ArseniyKholod
Copy link
Contributor Author

ArseniyKholod commented Jul 21, 2023

@SimonCan, what do you think about this idea?
Of coarse I'll add some asserts a bit later and other checks.
Now implementation is valid only for PIDController, PIController and IController, others will not produce an error, but also solutions with and without restart will slightly differ. I'll add a detailed support for other controllers, if it makes sence.

Example of using new callback you can find above.

@ArseniyKholod ArseniyKholod requested a review from SimonCan July 21, 2023 13:41
Copy link
Contributor

@SimonCan SimonCan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.
Could you add example elixirs like the ones in your comments? I was thinking of asimilar design as in examples/structured_2d_dgsem/elixir_advection_restart.jl where they call first call elixir_advection_extended.jl and then load the data for a restart.

Project.toml Outdated Show resolved Hide resolved
src/Trixi.jl Outdated Show resolved Hide resolved
src/callbacks_step/load_restart.jl Outdated Show resolved Hide resolved
@ArseniyKholod
Copy link
Contributor Author

ArseniyKholod commented Aug 25, 2023

Michael, @sloede, may be you have some quick ideas or already saw such problem, why this error could appear? Something wrong with allocations. But error appeared first time when I added only some comments, without changing a code, strange, I'm completely at a loss.
https://github.com/trixi-framework/Trixi.jl/actions/runs/5977210124/job/16216690034?pr=1565#step:7:3322
The same error I got in my new PR (#1614), these PRs seem to have nothing in common.
https://github.com/trixi-framework/Trixi.jl/actions/runs/5977692959/job/16218213885?pr=1614#step:7:3322
One more failing test: https://github.com/trixi-framework/Trixi.jl/actions/runs/5977692959/job/16218213284?pr=1614#step:7:8346

src/callbacks_step/save_restart.jl Outdated Show resolved Hide resolved
src/callbacks_step/save_restart.jl Outdated Show resolved Hide resolved
src/callbacks_step/save_restart_dg.jl Outdated Show resolved Hide resolved
Copy link
Member

@sloede sloede left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! It would be good to also get a review from @ranocha

Copy link
Member

@ranocha ranocha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you add a test using a PIDController?

@ArseniyKholod
Copy link
Contributor Author

ArseniyKholod commented Sep 15, 2023

Did you add a test using a PIDController?

Yes, in test_mpi_tree.jl with RDPK3SpFSAL49 that uses PIDController by default. And code part, that is used only by PIDController, is covered by coverage test

@ranocha ranocha merged commit 6069149 into trixi-framework:main Sep 15, 2023
29 of 31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants