Optimize load distribution between nodes #719
Open
Related to #701
Implement load distribution between nodes based on flops and memory.
Partitioning Strategy:
- Add a `flops` attribute to the `Partition` class in `exo/topology/partitioning_strategy.py`.
- Update the `map_partitions_to_shards` function to consider `flops` when mapping partitions to shards.
- Add a `get_flops` method to the `PartitioningStrategy` class to calculate the flops of each partition.

Ring Memory Weighted Partitioning Strategy:
- Update the `partition` method in `exo/topology/ring_memory_weighted_partitioning_strategy.py` to consider both memory and flops for partitioning.
- Add a `calculate_flops_weight` helper function to calculate the weight of each node based on its flops.

Node Class:
- Update the `Node` class in `exo/orchestration/node.py` to implement logic to sort nodes by flops for load distribution.
- Add a `sort_nodes_by_flops` method to sort nodes by their flops.

Inference Engine:
- Add a `get_flops` method to the `InferenceEngine` class in `exo/inference/inference_engine.py` to get the flops of the current node.

Sharded Inference Engine:
- Add a `get_flops` method to the `MLXDynamicShardInferenceEngine` class in `exo/inference/mlx/sharded_inference_engine.py` to get the flops of the current node.
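
For reviewers, here is a minimal, self-contained sketch of how memory and flops could be combined into a single partition weight. It is not the code in this PR. `NodeInfo`, the `memory_share` blending factor, and the linear blend itself are assumptions made for illustration, since the description does not say how the two signals are weighted. The names `Partition`, `calculate_flops_weight`, `sort_nodes_by_flops`, and `partition` are taken from the description, but their signatures and bodies here are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical stand-in for the capabilities a node reports."""
    node_id: str
    memory: int   # any consistent unit, e.g. MB
    flops: float  # any consistent unit, e.g. TFLOPS


@dataclass(frozen=True)
class Partition:
    """A node's slice of the model, as a fraction of the range [0, 1]."""
    node_id: str
    start: float
    end: float
    flops: float  # flops of the node that owns this slice


def calculate_flops_weight(node: NodeInfo, nodes: List[NodeInfo]) -> float:
    """Return the node's share of the cluster's total flops."""
    total = sum(n.flops for n in nodes)
    # Fall back to an even split if no node reports any flops.
    return node.flops / total if total > 0 else 1.0 / len(nodes)


def sort_nodes_by_flops(nodes: List[NodeInfo]) -> List[NodeInfo]:
    """Order nodes from fastest to slowest; node_id breaks ties."""
    return sorted(nodes, key=lambda n: (n.flops, n.node_id), reverse=True)


def partition(nodes: List[NodeInfo], memory_share: float = 0.5) -> List[Partition]:
    """Split [0, 1] between nodes using a blend of memory and flops shares.

    memory_share is an assumed tuning knob: 1.0 weights by memory only,
    0.0 weights by flops only.
    """
    if not nodes:
        return []
    total_memory = sum(n.memory for n in nodes)
    ordered = sort_nodes_by_flops(nodes)

    partitions: List[Partition] = []
    start = 0.0
    for i, node in enumerate(ordered):
        memory_weight = (
            node.memory / total_memory if total_memory > 0 else 1.0 / len(nodes)
        )
        flops_weight = calculate_flops_weight(node, nodes)
        weight = memory_share * memory_weight + (1.0 - memory_share) * flops_weight
        # Pin the final boundary to 1.0 so rounding never leaves a gap.
        end = 1.0 if i == len(ordered) - 1 else start + weight
        partitions.append(Partition(node.node_id, start, end, node.flops))
        start = end
    return partitions


if __name__ == "__main__":
    cluster = [NodeInfo("a", memory=16, flops=10.0), NodeInfo("b", memory=16, flops=30.0)]
    for p in partition(cluster):
        print(f"{p.node_id}: [{p.start:.3f}, {p.end:.3f})")
```

Because both shares sum to 1 across the cluster, any blend of them also sums to 1, so the slices tile the whole range without renormalizing. Whether a linear blend is the right trade-off (memory is a hard limit, flops only affect speed) is a design question worth settling in review.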