From 9b5f190047decb848c3bb2613468b58bc2ef4941 Mon Sep 17 00:00:00 2001 From: Theresa Pollinger Date: Mon, 11 Nov 2024 08:42:11 +0900 Subject: [PATCH] docs: "PDE solver" in all markdown files --- README.md | 17 ++++----- docs/advanced_topics.md | 8 ++--- docs/combination_technique.md | 6 ++-- docs/getting_started.md | 2 +- docs/parallelism.md | 12 +++---- docs/simple_tutorial.md | 48 ++++++++++++++------------ examples/selalib_distributed/README.md | 2 +- 7 files changed, 50 insertions(+), 45 deletions(-) diff --git a/README.md b/README.md index 8bc0b485..d0448570 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ While it originates from the excellent very different code, such that it has become its own project. DisCoTec is designed as a framework that can run multiple instances of a -(black-box) grid-based solver implementation. +(black-box) grid-based PDE solver implementation. The most basic example we use is a [mass-conserving FDM/FVM constant advection upwinding solver](/examples/distributed_advection/). An example of a separate, coupled solver is [SeLaLib](/examples/selalib_distributed/). @@ -28,7 +28,7 @@ Garcke [2013](https://link.springer.com/chapter/10.1007/978-3-642-31703-3_3), Harding [2016](https://link.springer.com/chapter/10.1007/978-3-319-28262-6_4)) can be used to alleviate the curse of dimensionality encountered in high-dimensional simulations. -Instead of using your solver on a single structured full grid (where every +Instead of using your PDE solver on a single structured full grid (where every dimension is finely resolved), you would use it on many different structured full grids (each of them differently resolved). We call these coarsely-resolved grids component grids. @@ -50,9 +50,9 @@ reductions in compute and memory requirements. ### Parallelism in DisCoTec -The DisCoTec framework can work with existing MPI parallelized solver codes +The DisCoTec framework can work with existing MPI parallelized PDE solver codes operating on structured grids. -In addition to the parallelism provided by the solver, it adds the combination +In addition to the parallelism provided by the PDE solver, it adds the combination technique's parallelism. This is achieved through *process groups* (pgs): `MPI_COMM_WORLD` is subdivided into equal-sized process groups @@ -85,7 +85,7 @@ a size that uses most of the available main memory of the entire system. ## When to Use DisCoTec? -If you are using a structured grid solver and want to increase its +If you are using a structured grid PDE solver and want to increase its accuracy while not spending additional compute or memory resources on it, DisCoTec may be a viable option. The codes most likely in this situation are the ones that solve @@ -97,13 +97,13 @@ resource constrained, DisCoTec could be for you, too! Use its multiscale benefits without worrying about any multiscale yourself 😊 -Why not try it [with your own solver](https://discotec.readthedocs.io/en/latest/simple_tutorial.html)? +Why not try it [with your own PDE solver](https://discotec.readthedocs.io/en/latest/simple_tutorial.html)? ### What Numerical Advantage Can I Expect? Depends on your problem! [Figure 3.6 here](http://elib.uni-stuttgart.de/handle/11682/14229) -shows a first-order accurate 2D solver achieving +shows a first-order accurate 2D PDE solver achieving approximately second-order accuracy with the Combination Technique considering the total number of DOF. (Figure omitted due to licensing, first published @@ -113,7 +113,8 @@ the total number of DOF. 1. 
If memory and/or time constraints are not your limiting factor; you can easily achieve the numerical accuracy you need with your resources. -2. If your solver just does not fit the discretization constraints imposed by DisCoTec: +2. If your PDE solver just does not fit the discretization constraints + imposed by DisCoTec: - a rectilinear (or mapped to rectilinear) domain - structured rectilinear grids in your main data structure (=typically the unknown function), stored as a linearized array diff --git a/docs/advanced_topics.md b/docs/advanced_topics.md index cac5156c..cf088421 100644 --- a/docs/advanced_topics.md +++ b/docs/advanced_topics.md @@ -149,9 +149,9 @@ The last reference below shows how conservation of mass and L2 stability is only provided by the latter two. In practice, we have observed that using hierarchical hat functions and long combination intervals (many time steps per combination) is fine with relatively -laminar simulations. +laminar PDE solutions. But in the turbulent regime, it becomes necessary to use the CDF wavelets and to -combine after every solver time step to avoid numerical instability. +combine after every PDE solver time step to avoid numerical instability. If you find yourself in need of higher orders of accuracy or conservation, you could add higher-order CDF wavelets to `DistributedHierarchization.hpp`. @@ -208,7 +208,7 @@ widely-distributed simulations and are discussed below. The GENE and SeLaLib examples use a separate folder for each component grid, and generate the input parameter files at the beginning of the main program. -The task then changes the directory at initialization and for the solver update, +The task then changes the directory at initialization and for the PDE solver update, so that outputs will be placed there. The derived quantities like energy can then be [combined as a postprocessing step](https://github.com/SGpp/DisCoTec/blob/main/examples/selalib_distributed/postprocessing/combine_selalib_diagnostics.cpp#L38). @@ -243,7 +243,7 @@ with LZ4. with the Widely-Distributed Sparse Grid Combination Technique’. In: SC ’23. Association for Computing Machinery, Nov. 11, 2023. url: . -### Using Solvers Written In Other Programming Languages +### Using PDE Solvers Written In Other Programming Languages Your functions need the same described interface and need to somehow expose it to the C++ compiler. diff --git a/docs/combination_technique.md b/docs/combination_technique.md index 33cc64d1..0c5ede88 100644 --- a/docs/combination_technique.md +++ b/docs/combination_technique.md @@ -5,8 +5,10 @@ The sparse grid combination technique (Griebel et al. Garcke [2013](https://link.springer.com/chapter/10.1007/978-3-642-31703-3_3), Harding [2016](https://link.springer.com/chapter/10.1007/978-3-319-28262-6_4)) can be used to alleviate the curse of dimensionality encountered in -high-dimensional simulations. -Instead of using your solver on a single structured full grid (where every +high-dimensional problems. +Such problems are encountered as partial differential equations (PDEs) +in many fields of science and engineering. +Instead of using your PDE solver on a single structured full grid (where every dimension is finely resolved), you would use it on many different structured full grids (each of them differently resolved). We call these coarsely-resolved grids component grids. 
diff --git a/docs/getting_started.md b/docs/getting_started.md index f3891112..b8ce729d 100644 --- a/docs/getting_started.md +++ b/docs/getting_started.md @@ -151,7 +151,7 @@ Make sure that it exists and describes the parameters you want to run. As with the tests, make sure the correct MPI is loaded to your path. The exact format and naming in `ctparam` is not (yet) standardized, to allow -adaptation for different solver applications. +adaptation for different PDE solver applications. Please refer to existing parameter files and example implementations. (pinning-with-various-mpi-implementations)= diff --git a/docs/parallelism.md b/docs/parallelism.md index b05579d2..f364533f 100644 --- a/docs/parallelism.md +++ b/docs/parallelism.md @@ -1,8 +1,8 @@ # Parallelism in DisCoTec -The DisCoTec framework can work with existing MPI parallelized solver codes +The DisCoTec framework can work with existing MPI-parallelized PDE solver codes operating on structured grids. -In addition to the parallelism provided by the solver, it adds the combination +In addition to the parallelism provided by the PDE solver, it adds the combination technique's parallelism. This is achieved mainly through *process groups* (pgs): `MPI_COMM_WORLD` is subdivided into equal-sized process groups @@ -16,8 +16,8 @@ Figure originally published in (Pollinger [2024](https://elib.uni-stuttgart.de/h All component grids are distributed to process groups (either statically, or dynamically through the manager rank). -During the solver time step and most of the combination, MPI communication only -happens within the process groups. +During the PDE solver time step and most of the combination step, MPI communication +only happens within the process groups. Conversely, for the sparse grid reduction using the combination coefficients, MPI communication only happens between a rank and its colleagues in the other process groups, e.g., rank 0 in group 0 will only talk to rank 0 in all other groups. @@ -56,8 +56,8 @@ If the process groups become too large, the MPI communication of the multiscale transform starts to dominate the combination time. If there are too many pgs, the combination reduction will dominate the combination time. -However, the times required for the solver stay relatively constant; -they are determined by the solver's own scaling and the load balancing quality. +However, the times required for the PDE solver stay relatively constant; +they are determined by the PDE solver's own scaling and the load balancing quality. There are only few codes that allow weak scaling up to this problem size: a size that uses most of the available main memory of the entire system. diff --git a/docs/simple_tutorial.md b/docs/simple_tutorial.md index aeaffba1..7a5eb773 100644 --- a/docs/simple_tutorial.md +++ b/docs/simple_tutorial.md @@ -1,13 +1,14 @@ # Tutorial: Using Your Code With DisCoTec -So you want to use your timestepped simulation with the DisCoTec framework. +So you want to solve your timestepped PDE problem with the DisCoTec framework. +You already have a PDE solver that can do it on a regular structured grid. This Tutorial gives you an outline of the steps required. ## Your Code Interface: Init, Timestep, Get/Set Full Grid, Finish The first step is to prepare your code in a way that it can be called by DisCoTec. -Typically, your simulation code will look similar to this (in pseudocode): +Typically, your PDE solver code will look similar to this (in pseudocode): ```python int num_points_x = ... 
@@ -45,10 +46,10 @@ conditions and all `num_points_` are chosen as powers of two. You need to transform your current solver code into stateful functions, or a stateful data structure. -Let's say you introduce a class `YourSimulation`: computations before the time +Let's say you introduce a class `YourSolver`: computations before the time loop go into its constructor or an `init()` function (if your code uses MPI, this is also where a -sub-communicator should be passed for `YourSimulation` to use). +sub-communicator should be passed for `YourSolver` to use). Computation in the time loop goes into its `run()` function. Anything after the time loop goes into the destructor or a function `finalize()`. @@ -71,7 +72,7 @@ int num_points_vx = ... - ... - ) - helper_data_structures.initialize(...) -+ my_simulation = YourSimulation( ++ my_solver = YourSolver( + num_points_x, + num_points_y, + num_points_z, @@ -84,13 +85,13 @@ float time_now = 0.0; while(time_now < end_time) { # time loop - do_timestep(grid, time_step) -+ my_simulation.run(time_step) ++ my_solver.run(time_step) time_now += time_step } - properties = compute_properties(grid) - write_output(properties, grid) -+ my_simulation.finalize() ++ my_solver.finalize() ``` This setup assumes that we can pass the number of grid points per dimension in @@ -101,17 +102,17 @@ The most portable and memory-efficient way of doing so is to provide a pointer to the beginning of the contiguous array. Let's call this getter `get_tensor_pointer`. -## Make Your Simulation a DisCoTec Task and Your Grid a DisCoTec DistributedFullGrid +## Make Your PDE Solver a DisCoTec Task and Your Grid a DisCoTec DistributedFullGrid -From now on, we assume that the interface to `YourSimulation` is wrapped in a -C++ header `YourSimulation.h`, roughly like this: +From now on, we assume that the interface to `YourSolver` is wrapped in a +C++ header `YourSolver.h`, roughly like this: ```cpp #include -class YourSimulation { +class YourSolver { public: - YourSimulation(int num_points_x, int num_points_y, int num_points_vx, int num_points_vy, ...); + YourSolver(int num_points_x, int num_points_y, int num_points_vx, int num_points_vy, ...); //...rule of 5... void run(double time_step); @@ -123,11 +124,11 @@ class YourSimulation { ``` Now, DisCoTec comes into play. Create a new folder or project that can access -both `YourSimulation` and DisCoTec. -For the combination technique, multiple instances of `YourSimulation` will be +both `YourSolver` and DisCoTec. +For the combination technique, multiple instances of `YourSolver` will be instantiated, each at different resolutions. The `Task` class is the interface you will need to implement. 
-With `YourSimulation`, that will be as simple as:
+With `YourSolver`, that will be as simple as:
 
 ```cpp
 #include 
@@ -138,7 +139,7 @@ With `YourSimulation`, that will be as simple as:
 #include "fullgrid/DistributedFullGrid.hpp"
 #include "task/Task.hpp"
 #include "utils/PowerOfTwo.hpp"
 #include "utils/Types.hpp"
-#include "YourSimulation.h"
+#include "YourSolver.h"
 class YourTask : public combigrid::Task {
  public:
@@ -168,9 +169,10 @@ class YourTask : public combigrid::Task {
     // if all are 1, we are only using the parallelism between grids
     std::vector p = {1, 1, 1, 1};
-    // if using MPI within your simulation, pass p and the lcomm communicator to sim_, too
+    // if using MPI within your solver,
+    // pass p and the lcomm communicator to sim_, too
     sim_ =
-        std::make_unique<YourSimulation>(num_points_x, num_points_y, num_points_vx, num_points_vy);
+        std::make_unique<YourSolver>(num_points_x, num_points_y, num_points_vx, num_points_vy);
     // wrap tensor in a DistributedFullGrid
     dfg_ = std::make_unique<DistributedFullGrid<CombiDataType>>(
         this->getDim(), this->getLevelVector(), lcomm, this->getBoundary(),
@@ -186,7 +188,7 @@
 
   void setZero() override { dfg_->setZero(); }
 
-  std::unique_ptr<YourSimulation> sim_;
+  std::unique_ptr<YourSolver> sim_;
   std::unique_ptr<DistributedFullGrid<CombiDataType>> dfg_;
   double dt_;
 };
@@ -198,14 +200,14 @@
 DisCoTec just assumes that you pass it a pointer to a correctly-sized contiguous
 array.
 The size of the whole "global" grid is $2^{l_d}$, with $l_d$ the level vector
 for each dimension $d$.
-Every MPI process in your simulation should then have $2^{l_d} / p_d$ grid
-points, where $p$ is the cartesian process vector.
+Every MPI process in your solver should then have $2^{l_d} / p_d$ grid
+points, where $p$ is the Cartesian process vector.
 
 ## Make Your MPI Processes Workers
 
 Now, you can use the `YourTask` class to instantiate many tasks as part of a
 combination scheme.
-There are a lot of choices to make regarding the combined simulation,
+There are a lot of choices to make regarding the combined simulation run,
 which is why more than half of the example code is dedicated to defining
 parameters and generating input data structures:
 
@@ -277,7 +279,7 @@ int main(int argc, char** argv) {
 These parameter values should be suitable for a very small-scale
 proof-of-concept.
 
-The last scope assigns tasks (i.e. your simulation instances) to process groups
+The last scope assigns tasks (i.e. your PDE solver instances) to process groups
 statically.
 
 The remaining part of the code looks a lot like your initial time loop again:
diff --git a/examples/selalib_distributed/README.md b/examples/selalib_distributed/README.md
index 0394ead4..69948064 100644
--- a/examples/selalib_distributed/README.md
+++ b/examples/selalib_distributed/README.md
@@ -20,7 +20,7 @@ make install
 ```
 
 where `test_cpp_interface` can be used to test the general C/Fortran interface,
-`sim_bsl_vp_3d3v_cart_dd_slim` is the mononlithic solver for our test case, and
+`sim_bsl_vp_3d3v_cart_dd_slim` is the monolithic PDE solver for our test case, and
 `sll_m_sim_bsl_vp_3d3v_cart_dd_slim_interface` builds the libraries needed for
 this `selalib_distributed` example.
 (This may take some time and usually fails if tried in parallel with `make -j`).