Batch: Process-parallel directory scan and initial file read #426

Draft
wants to merge 9 commits into
base: main

@mlange05 (Collaborator) commented Nov 4, 2024

Initial implementation of simple parallelism in the batch-processing scheduler. This PR refactors the two "trivially parallel" steps of the Scheduler initialisation (scanning the source directories and parsing the full Sourcefile into IR) to run on Python's built-in ProcessPoolExecutor. It also adds a dummy SerialExecutor implementation that exposes the same interface but runs serially in the calling process, so the existing behaviour remains available.

In a little more detail:

  • Strictly separate Sourcefile creation from Item creation and remove get_or_create_file_item_from_path
  • Add an executor object to the Scheduler that falls back to the provided SerialExecutor if num_workers=0 is selected, and otherwise creates a ProcessPoolExecutor(max_workers=num_workers).
  • Invoke Sourcefile.from_path on the executor, passing the assembled frontend_args
  • Perform the full file parse (Sourcefile.make_complete) using executor.map over the source objects and parser args, then store the returned copy of each source on the corresponding Item
  • Pickle-safety fixes for sym.Cast objects and Module AST objects
  • Increase log-level for Scheduler enrichment to INFO, as it can now become quite dominant during the setup phase
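The executor selection described above can be sketched as follows. This is a minimal illustration, not the code in this PR: it assumes SerialExecutor can be built by subclassing `concurrent.futures.Executor`, and the names `make_executor` and `word_count` are hypothetical helpers introduced only for the example.

```python
from concurrent.futures import Executor, Future, ProcessPoolExecutor


class SerialExecutor(Executor):
    """Illustrative stand-in that runs submitted work in the calling process."""

    def submit(self, fn, /, *args, **kwargs):
        # Run the call immediately and wrap the outcome in a completed Future,
        # so callers can use .result() exactly as with a process pool.
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def make_executor(num_workers=0):
    """Hypothetical factory: serial fallback for 0 workers, process pool otherwise."""
    if num_workers > 0:
        return ProcessPoolExecutor(max_workers=num_workers)
    return SerialExecutor()


def word_count(text, sep):
    """Toy task standing in for a per-file parse; must be picklable for a pool."""
    return len(text.split(sep))


# Executor.map is inherited from the base class and is built on submit(),
# so the serial variant supports the same map-over-arguments pattern.
with make_executor(num_workers=0) as executor:
    counts = list(executor.map(word_count, ["a b c", "d e"], [" ", " "]))
print(counts)
```

Because both executors share the `submit`/`map` interface, the calling code does not need to branch on the worker count. With a real process pool, the task function and its arguments must be picklable, which is why the pickle-safety fixes listed above are needed.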

Performance

To test performance, I've mimicked the H24-dev plan generation (without explicitly provided header paths), but locally enabled full source parses in the plan step. With the new ProcessPoolExecutor and a low number of workers, the process-pipe and serialisation mechanics add significant overhead: the full source parse takes 204.73s with one worker, compared to 79.78s serially. With more workers we still get a reasonable quality-of-life improvement (36.45s with 12 workers).

Sequential, equivalent to previous:

$ loki-transform.py plan --mode idem --config arpifs/loki_physics.config -s arpifs/phys_ec/ -s surf/external/ -s surf/module --plan-file=../../my_plan.cmake --num-workers=0
[Loki] Creating CMake plan file from config: arpifs/loki_physics.config
[Loki::Scheduler] Scheduler:: Initial file parse in 14.63s
...
[Loki::Scheduler] Performed initial source scan in 21.46s
[Loki::Scheduler] Performed full source parse in 79.78s
[Loki::Scheduler] Enriched call tree in 0.53s
[Loki] Scheduler writing CMake plan: ../../my_plan.cmake

Sequential, but through ProcessPoolExecutor:

$ loki-transform.py plan --mode idem --config arpifs/loki_physics.config -s arpifs/phys_ec/ -s surf/external/ -s surf/module --plan-file=../../my_plan.cmake --num-workers=1
[Loki] Creating CMake plan file from config: arpifs/loki_physics.config
[Loki::Scheduler] Scheduler:: Initial file parse in 13.06s
...
[Loki::Scheduler] Performed initial source scan in 20.50s
[Loki::Scheduler] Performed full source parse in 204.73s
[Loki::Scheduler] Enriched call tree in 20.53s
[Loki] Scheduler writing CMake plan: ../../my_plan.cmake

And with 12 worker processes:

$ loki-transform.py plan --mode idem --config arpifs/loki_physics.config -s arpifs/phys_ec/ -s surf/external/ -s surf/module --plan-file=../../my_plan.cmake --num-workers=12
[Loki] Creating CMake plan file from config: arpifs/loki_physics.config
[Loki::Scheduler] Scheduler:: Initial file parse in 2.11s
...
[Loki::Scheduler] Performed initial source scan in 9.57s
[Loki::Scheduler] Performed full source parse in 36.45s
[Loki::Scheduler] Enriched call tree in 20.57s
[Loki] Scheduler writing CMake plan: ../../my_plan.cmake
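To make the comparison between the three runs easier to read, the per-step ratios can be computed from the logged timings. The snippet below is only a convenience sketch; the dictionary layout and the `speedup` helper are illustrative, and the numbers are copied from the log output above.

```python
# Wall-clock timings in seconds, copied from the three runs above,
# keyed by the --num-workers value.
timings = {
    0: {"scan": 21.46, "parse": 79.78, "enrich": 0.53},
    1: {"scan": 20.50, "parse": 204.73, "enrich": 20.53},
    12: {"scan": 9.57, "parse": 36.45, "enrich": 20.57},
}


def speedup(step, num_workers, baseline=0):
    """Ratio of baseline time to the time with `num_workers`; above 1 is faster."""
    return timings[baseline][step] / timings[num_workers][step]


for num_workers in (1, 12):
    for step in ("scan", "parse", "enrich"):
        print(f"num_workers={num_workers:>2} {step:<6} {speedup(step, num_workers):.2f}x")
```

With 12 workers the full source parse and the initial source scan are each roughly 2.2x faster than the serial run, while a single pooled worker makes the parse about 2.6x slower. The enrichment step takes around 20s with either pool size, compared to 0.53s serially.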

@mlange05 mlange05 requested a review from reuterbal November 4, 2024 14:20

github-actions bot commented Nov 4, 2024

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/426/index.html
