Skip to content
This repository has been archived by the owner on Aug 14, 2024. It is now read-only.

Initial development of Amdahl's Law plotting #28

Merged
merged 10 commits into from
Aug 18, 2022
Merged

Conversation

mikerenfro
Copy link
Contributor

@mikerenfro mikerenfro commented Aug 10, 2022

For #13. May still need some work on this, but here's a start for the Snakefile and Python code to convert the run*.log files into a speedup graph like the one shown below. Other items of note:

  • Since we're not doing real work in the MPI ranks, we can expand the horizontal axis out considerably beyond 8 processors. This gets us closer to the speedup limit, and de-emphasizes the bits of experimental error in the timing results.
  • We may need to explicitly specify the number of processes used in mpirun, especially in an HPC environment, as mpirun may query the scheduler to know how many processes it "should" run.
  • My time values aren't particularly close to the theoretical values. Not sure if the Conde-based mpirun is deficient or if I'm missing something else. Some runs have been better than others, but for a synthetic example, I'd have expected my values to be practically perfect on every run. -- Followup: Trevor's results were basically perfect, so it's possibly a local problem. Will check later with current Amdahl code to verify.

amdahl-080-percent

@bkmgit
Copy link
Contributor

bkmgit commented Aug 11, 2022

@mikerenfro
Copy link
Contributor Author

I can go either way on that. If the plotting code is intended to be black box like the Amdahl script is, then I don't mind making it more legible and showing the theoretical speedup line. If they're expected to code it up on their own, then a simpler plotting code is fine.

One other reason I added the extra line was because my timing results were so far off from the expected values. Not sure if that's due to conda's mpirun or something present on both my Mac and my HPC. Adding the theoretical line could lead to discussions of "yeah, and your results are right on that line because this is a somewhat synthetic problem," "yeah, your results are off the line because measurements aren't perfect," or others rather than explaining why different learners' graphs look visibly different.

@tkphd
Copy link
Member

tkphd commented Aug 12, 2022

It's possible, but I believe it would be more complicated to build the NumPy arrays than it is to build the Pandas DataFrame. The input data for the plot exists as a number of JSON files written by the compute nodes to disk, essentially one "row" per file. The process of building the DataFrame, as Mike programmed it, is relatively straight-forward to explain. If you can come up with a competitive solution with NumPy, I'd certainly be interested to see it.

The plotting script is intended to be a gray box: learners can open the file and see what's going on inside, but it is not necessary to do so, and we will certainly not spend time in a Snakemake lesson on writing a Python plotting script.

@mikerenfro mikerenfro changed the title WIP: Initial development of Amdahl's Law plotting Initial development of Amdahl's Law plotting Aug 12, 2022
@mikerenfro mikerenfro marked this pull request as ready for review August 12, 2022 18:18
@tkphd tkphd requested review from reid-a and ocaisa August 12, 2022 18:20
@tkphd
Copy link
Member

tkphd commented Aug 12, 2022

The current version seems to nail Amdahl.

amdahl-80-pct-parallel

@bkmgit
Copy link
Contributor

bkmgit commented Aug 12, 2022

Possibly writing out a csv file rather than json would make it easier, but if Python will be required for this, then using Pandas is ok.

Copy link

@vinisalazar vinisalazar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! I added some suggestions, I hope they are helpful. The config.yaml looks great with the comments.

code/Snakefile Outdated Show resolved Hide resolved
code/Snakefile Outdated
conda:
"amdahl"
shell:
"python3 plot_amdahl.py {input} > {output}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snakemake does have a specific feature to handle scripts, so I think may be worth mentioning that. I'm not saying that this rule should be change to use the script directive, but rather that users should be aware of that, maybe a callout box would work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! I'm not sure how script interacts with cluster, though, and we want the plotting to be done on a compute node.

code/plot_amdahl.py Outdated Show resolved Hide resolved
@tkphd
Copy link
Member

tkphd commented Aug 12, 2022

@bkmgit if you would like to offer a complete solution using NumPy, we will consider it, but that is outside the scope of this PR.

@tkphd
Copy link
Member

tkphd commented Aug 12, 2022

Done tinkering with this for now. Here's what the scaling plot currently looks like.

amdahl-scaling-study

There's also a PR on amdahl that improves the random jitter code (to accept a jitter proportion on the CLI and only increase runtime, unless the proportion is negative, in which case only decrease runtime). If you're in a reviewing mood, please take a look!

Copy link
Member

@tkphd tkphd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I might be biased.

@reid-a
Copy link
Member

reid-a commented Aug 18, 2022

I think it's useful to get a version of this code into the repo. In the spirit of not letting the perfect be the enemy of the good, I favor merging.

@reid-a reid-a merged commit e4d6844 into gh-pages Aug 18, 2022
@tkphd tkphd deleted the amdahl_plotting branch August 18, 2022 21:24
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants