Batch: Process-parallel directory scan and initial file read #426

mlange05 · 2024-11-04T14:20:17Z

Initial implementation of simple parallelism in the batch-processing scheduler. This PR refactors the two "trivially parallel" steps of the Scheduler initialisation (scanning the source directories and parsing the full Sourcefile into IR) to use Python's built-in ProcessPoolExecutor. It also adds a dummy SerialExecutor implementation that exposes the same interface but runs serially in the original process, thus keeping the existing behaviour available.

In a little more detail:

Strictly separate Sourcefile creation from Item creation and remove get_or_create_file_item_from_path
Add an executor object to the Scheduler that falls back to the provided SerialExecutor if num_workers=0 is selected, and otherwise creates a ProcessPoolExecutor(max_workers=num_workers).
Invoke Sourcefile.from_path on the executor, passing the assembled frontend_args
Perform the full file parse (Sourcefile.make_complete) via executor.map over the source objects and parser arguments, then update the corresponding Item with the returned copy of source
Pickle-safety fixes for sym.Cast objects and Module AST objects
Increase log-level for Scheduler enrichment to INFO, as it can now become quite dominant during the setup phase
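
The executor selection and the executor.map call described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the SerialExecutor shown here is a hypothetical re-implementation built on concurrent.futures.Executor, and create_executor, parse_source and parse_all are made-up names standing in for the Scheduler logic and Sourcefile.make_complete.

```python
from concurrent.futures import Executor, Future, ProcessPoolExecutor


class SerialExecutor(Executor):
    """Hypothetical stand-in: runs each task immediately in the calling process."""

    def submit(self, fn, /, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Report failures through the Future, as a process pool would
            future.set_exception(exc)
        return future

    # map(), shutdown() and the context-manager protocol are inherited
    # from Executor; the inherited map() is built on submit().


def create_executor(num_workers=0):
    """Assumed selection rule: 0 workers means serial, otherwise a process pool."""
    if num_workers > 0:
        return ProcessPoolExecutor(max_workers=num_workers)
    return SerialExecutor()


def parse_source(source, frontend_args):
    """Illustrative placeholder for the per-file parse.

    Defined at module level so that a process pool can pickle it.
    """
    return f"{source} parsed with {frontend_args['frontend']}"


def parse_all(sources, frontend_args, num_workers=0):
    """Map the parse over paired sources and arguments, preserving order."""
    with create_executor(num_workers) as executor:
        return list(executor.map(parse_source, sources, frontend_args))
```

With num_workers=0 no subprocess is started, so the call behaves like a plain loop. With num_workers>0 the function, its arguments and its return values all have to cross a process boundary via pickle, which is why the pickle-safety fixes are part of the same change.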

Performance

To test performance, I've mimicked the H24-dev plan generation (without explicitly provided header paths), but locally enabled full source parses in the plan step. With the new ProcessPoolExecutor and a low number of worker processes, we see significant overhead from the process-pipe-and-serialisation mechanics; with a somewhat larger number of processes, we still get a reasonable quality-of-life improvement.

Sequential, equivalent to previous:

$ loki-transform.py plan --mode idem --config arpifs/loki_physics.config -s arpifs/phys_ec/ -s surf/external/ -s surf/module --plan-file=../../my_plan.cmake --num-workers=0
[Loki] Creating CMake plan file from config: arpifs/loki_physics.config
[Loki::Scheduler] Scheduler:: Initial file parse in 14.63s
...
[Loki::Scheduler] Performed initial source scan in 21.46s
[Loki::Scheduler] Performed full source parse in 79.78s
[Loki::Scheduler] Enriched call tree in 0.53s
[Loki] Scheduler writing CMake plan: ../../my_plan.cmake

Sequential, but through ProcessPoolExecutor:

$ loki-transform.py plan --mode idem --config arpifs/loki_physics.config -s arpifs/phys_ec/ -s surf/external/ -s surf/module --plan-file=../../my_plan.cmake --num-workers=1
[Loki] Creating CMake plan file from config: arpifs/loki_physics.config
[Loki::Scheduler] Scheduler:: Initial file parse in 13.06s
...
[Loki::Scheduler] Performed initial source scan in 20.50s
[Loki::Scheduler] Performed full source parse in 204.73s
[Loki::Scheduler] Enriched call tree in 20.53s
[Loki] Scheduler writing CMake plan: ../../my_plan.cmake

And with 12 worker processes:

$ loki-transform.py plan --mode idem --config arpifs/loki_physics.config -s arpifs/phys_ec/ -s surf/external/ -s surf/module --plan-file=../../my_plan.cmake --num-workers=12
[Loki] Creating CMake plan file from config: arpifs/loki_physics.config
[Loki::Scheduler] Scheduler:: Initial file parse in 2.11s
...
[Loki::Scheduler] Performed initial source scan in 9.57s
[Loki::Scheduler] Performed full source parse in 36.45s
[Loki::Scheduler] Enriched call tree in 20.57s
[Loki] Scheduler writing CMake plan: ../../my_plan.cmake
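
To put the three runs side by side, the timings reported in the logs above can be turned into ratios against the serial baseline. The numbers are copied from the log output; the helper functions are only an illustrative sketch, and summing the three "Performed"/"Enriched" phases into a total is an assumption about how the phases add up.

```python
# Timings in seconds, copied from the three log excerpts above
timings = {
    "num_workers=0": {"scan": 21.46, "parse": 79.78, "enrich": 0.53},
    "num_workers=1": {"scan": 20.50, "parse": 204.73, "enrich": 20.53},
    "num_workers=12": {"scan": 9.57, "parse": 36.45, "enrich": 20.57},
}
baseline = timings["num_workers=0"]


def speedup(run, phase):
    """Ratio of serial time to the given run's time; > 1 means faster."""
    return baseline[phase] / timings[run][phase]


def total(run):
    """Sum of the three reported phases (assumes they do not overlap)."""
    return sum(timings[run].values())


for run in timings:
    ratios = ", ".join(f"{p} x{speedup(run, p):.2f}" for p in baseline)
    print(f"{run}: total {total(run):.2f}s ({ratios})")
```

Read this way, the full parse is slower than serial with a single worker and roughly twice as fast with 12 workers, while the enrichment step is reported at about 20 s in both executor-based runs compared with 0.53 s in the serial run.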

This PR also adds a small draft test for expression pickling.
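
As a rough idea of what such a pickling test checks, here is a minimal round-trip sketch. ToyCast and ToyModule are hypothetical toy classes, not Loki's sym.Cast or Module, and the approach of dropping a parser handle in __getstate__ is an assumption made for illustration.

```python
import pickle
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCast:
    """Toy expression node; NOT Loki's sym.Cast."""
    name: str
    expression: str
    kind: str = ""


class ToyModule:
    """Toy container that may hold a non-picklable parser handle in `_ast`."""

    def __init__(self, name, ast=None):
        self.name = name
        if ast is not None:
            self._ast = ast

    def __getstate__(self):
        state = self.__dict__.copy()
        # Drop the AST handle only if it is present
        state.pop("_ast", None)
        return state


def roundtrip(obj):
    """Serialise and deserialise, as happens when crossing a process boundary."""
    return pickle.loads(pickle.dumps(obj))


cast = ToyCast("real", "x + 1", kind="8")
assert roundtrip(cast) == cast

# A lambda cannot be pickled, so it stands in for a raw parser object here
module = roundtrip(ToyModule("my_mod", ast=lambda: None))
assert module.name == "my_mod"
assert not hasattr(module, "_ast")
```

The equality check matters because the worker process returns a copy of each object, so a node that does not compare equal to its own round-tripped copy would behave differently in parallel and serial mode.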

github-actions · 2024-11-04T14:23:20Z

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/426/index.html

mlange05 added 9 commits November 4, 2024 13:30

Batch: Separate initial source read from item creation during scan

d0f30f3

Batch: Simple parallelisation of initial source scan

df35d7e

Loki-transform: Add num_workers argument to plan and convert

c36b6c0

Module: Only drop _ast if we have it

56f52a8

Scheduler: Use and keep one single ProcessPoolExecutor object

1ca7044

Expression: Fix constructor of Cast symbols and fix pickling

375f017

Also adds small draft test for expression pickling.

Scheduler: Process initial full parse in parallel

a84b89e

Batch: Add a SerialExecutor for serial mode with compatible API

3b11275

Batch: Log Scheduler-level enrichment at INFO level

099ea19

mlange05 requested a review from reuterbal November 4, 2024 14:20
