WIP: update python example #350
base: gh-pages
Conversation
Also need to add new diagrams. Other suggestions for improvements are welcome.
These won't be difficult to do. If you do TikZ, the source for the existing figures is in files/initialize-scatter-compute-gather.tex. If you don't do TikZ, give me an idea of how you'd like things to change and I'll handle it.
Thanks. This is a helpful starting point.
There are two core reasons you would want to use MPI: you need more compute, or you need more memory. I think the memory calculation provided the motivation to use MPI in the first place: the problem grew to the point where it no longer fit on a single node. It's not an efficient algorithm, we know that, and that could even be part of the teaching. Making the point that we would be much better served by thinking about a better algorithm rather than jumping straight into MPI has a lot of value. However, many learners will not be in a position to rewrite the algorithm or implement MPI themselves; they will have an MPI-capable code they were told to use and that they need to run faster or with more memory. To me, the point of the example is to show them what they need to consider when they start hitting these memory/CPU limits. The memory wall is the easiest of these to hit (deliberately) during a training course.

Having said all that, I agree that we could make this part of the lesson a little more fluid. This example is essentially shared with http://www.hpc-carpentry.org/hpc-parallel-novice/ , and I think it is really worth having a common discussion on it so that we can first agree on an improved design (taking the lesson objectives into account) and then harmonize the two. For example, one could build up the example piece by piece in http://www.hpc-carpentry.org/hpc-parallel-novice/ , while in HPC Intro it could be presented as "someone gave me this code that runs that serial job in parallel", which is probably a pretty common use case.
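As a point of reference for this discussion, a minimal sketch of the split-compute-reduce pattern the example is built around could look like the following. This is not the lesson's actual code: it assumes mpi4py and numpy are installed, and the script name and sample count are illustrative.

```python
# pi_mpi.py -- sketch only; run with e.g. `mpirun -n 4 python pi_mpi.py`
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

n_total = 10_000_000          # total number of random samples (illustrative)
n_local = n_total // size     # each rank handles its own share

rng = np.random.default_rng(seed=rank)   # different random stream per rank
x = rng.random(n_local)
y = rng.random(n_local)
hits_local = np.count_nonzero(x * x + y * y <= 1.0)

# Combine the partial counts on rank 0 (the "gather" step of the pattern)
hits_total = comm.reduce(hits_local, op=MPI.SUM, root=0)

if rank == 0:
    print("pi ~", 4.0 * hits_total / n_total)
```

Only the integer counts travel between ranks, so the per-rank memory is set by `n_local`, which is the kind of consideration the example is meant to surface.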
The memory wall in the pi calculation is artificial and reflects not-so-great programming, even if it can be demonstrated with an easy calculation. It is good to think about code refactoring and optimization, but the main purpose of this lesson should be to log in, work with files, be introduced to programs that run on multiple nodes, and submit a job. It may be better to add a further example where the memory constraint is not artificial, for example a matrix calculation or a large-data calculation. One could download text files (Wikipedia dumps, log files, Project Gutenberg books) or generate random text, for example using https://github.com/TheAbhijeet/lorem_text , so that students do not overwhelm a server and block access for others. One could also have a discussion exercise.
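If a text-based example is preferred, a rough sketch of generating a large local input file (so no shared download server is needed) could be as simple as the following. It uses only the standard library rather than the lorem_text package, and the file name, size, and word list are illustrative.

```python
# Generate a large random text file for a word-count style exercise.
import random

words = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
         "adipiscing", "elit", "sed", "do", "eiusmod", "tempor"]

target_bytes = 100 * 1024 * 1024  # ~100 MB of text (illustrative)
written = 0
with open("random_text.txt", "w") as fh:
    while written < target_bytes:
        line = " ".join(random.choices(words, k=12)) + "\n"
        fh.write(line)
        written += len(line)
```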
I understand your point, but calculating pi is artificial as well. We use it because it is easy to understand (which was debated when it was introduced) and serves the teaching goals. Maybe there are better choices but, like you say, the main purpose of this part of the lesson is to run on multiple nodes. For that, learners need an example that motivates this and that they can trivially wrap their heads around, so that they can focus primarily on the things we are actually trying to teach in the lesson.
@mikerenfro Got it to work. Thanks for the suggestion.
Rewrite the example to remove the memory estimation and simplify it.
It may be helpful to loop over chunks, using numpy arrays within each chunk, to keep memory requirements low while still getting good speed (see the sketch below).
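A minimal sketch of that chunked approach, assuming plain numpy; the chunk size and sample count are illustrative:

```python
# Memory stays bounded by the chunk size, while each chunk is still vectorised.
import numpy as np

def estimate_pi(n_samples, chunk_size=1_000_000, seed=0):
    rng = np.random.default_rng(seed)
    hits = 0
    remaining = n_samples
    while remaining > 0:
        n = min(chunk_size, remaining)
        x = rng.random(n)
        y = rng.random(n)
        hits += np.count_nonzero(x * x + y * y <= 1.0)
        remaining -= n
    return 4.0 * hits / n_samples

print(estimate_pi(10_000_000))
```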