[Feature] Improve scheduling algorithm for loops #34
Labels
enhancement
New feature or request
performance
Issue related to the performance of the compiler
refactor
This issue is related to code
Description of the problem or scenario potentially improvable:
Currently all code is generated ad hoc; there is not much logic in the greedy algorithm that packs operations together. This is because we assume that the sizes of all loops are known, but there could be cases where they are not.
Description of the solution
Create a pass/stage that analyzes the code size, latency, and cycles of loops ONLY.
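A minimal sketch of what such a loop-only analysis pass might look like, in Python. The `Op`, `Loop`, and `LoopStats` types, their fields, and `analyze_loop` are hypothetical names used for illustration, not part of the compiler. The cost model (summing per-operation latencies as if the body ran sequentially) is also an assumption. The trip count is optional so that loops whose size is not known at compile time can still be analyzed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Op:
    """Hypothetical operation record; the fields are assumptions."""
    name: str
    size: int     # encoded size of the operation (unit is an assumption)
    latency: int  # cycles the operation takes to complete


@dataclass(frozen=True)
class Loop:
    """Hypothetical loop record: a body plus an optional trip count."""
    body: List[Op]
    trip_count: Optional[int] = None  # None when the loop size is unknown


@dataclass(frozen=True)
class LoopStats:
    code_size: int
    latency_per_iteration: int
    total_cycles: Optional[int]  # None when the trip count is unknown


def analyze_loop(loop: Loop) -> LoopStats:
    """Compute size/latency/cycle figures for a single loop.

    Assumes a purely sequential cost model: one iteration costs the sum
    of the latencies of the operations in the body.
    """
    code_size = sum(op.size for op in loop.body)
    latency = sum(op.latency for op in loop.body)
    if loop.trip_count is None:
        total_cycles = None  # cannot be bounded without a trip count
    else:
        total_cycles = latency * loop.trip_count
    return LoopStats(code_size, latency, total_cycles)
```

A scheduler could consult `total_cycles` when it is available and fall back to `latency_per_iteration` when the trip count is unknown.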
Side effects or any other known issues caused by this feature:
Better scheduling for loops. This stage could slow compilation down considerably if there are many loops, so it shouldn't be long or heavy. Aim for parallelism whenever possible.
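One way to keep the stage cheap when there are many loops is to analyze them independently and in parallel. The sketch below assumes each loop can be analyzed without shared mutable state; `analyze_loops` and its parameters are hypothetical names. It uses a thread pool from the Python standard library. In standard CPython builds, threads do not speed up CPU-bound work, so a process pool would be the likely substitute there.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

LoopT = TypeVar("LoopT")
StatsT = TypeVar("StatsT")


def analyze_loops(
    loops: Sequence[LoopT],
    analyze: Callable[[LoopT], StatsT],
    max_workers: int = 4,
) -> List[StatsT]:
    """Run a per-loop analysis over all loops, in parallel when worthwhile.

    `analyze` is any function that inspects a single loop and returns its
    statistics; it is assumed to be independent across loops.
    """
    # Skip the pool overhead when there is nothing to parallelize.
    if len(loops) < 2:
        return [analyze(loop) for loop in loops]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(analyze, loops))
```

Because results come back in input order, later stages can pair each loop with its statistics by position.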