Add pass hoisting RT.await_future out of scf.forall loops #748
Merged
antoniupop approved these changes on Mar 18, 2024
compilers/concrete-compiler/compiler/lib/Dialect/FHELinalg/Transforms/Tiling.cpp
compilers/concrete-compiler/compiler/include/concretelang/Dialect/RT/Transforms/Passes.td
compilers/concrete-compiler/compiler/lib/Dialect/RT/Transforms/HoistAwaitFuturePass.cpp
compilers/concrete-compiler/compiler/lib/Dialect/RT/Transforms/HoistAwaitFuturePass.cpp
Outdated
andidr force-pushed the andi/tiling-optimizations branch 3 times, most recently from 7b07f30 to edc871d on April 4, 2024 09:55
BourgerieQuentin approved these changes on Apr 4, 2024
...compiler/compiler/lib/Conversion/MLIRLowerableDialectsToLLVM/MLIRLowerableDialectsToLLVM.cpp
Outdated
...ete-compiler/compiler/lib/Conversion/TFHEGlobalParametrization/TFHEGlobalParametrization.cpp
compilers/concrete-compiler/compiler/lib/Dialect/RT/Transforms/HoistAwaitFuturePass.cpp
It would also be good to have check tests for the hoisting pass.
This adds a new option `dump-fhe-df-parallelized` to `concretecompiler` that dumps the IR after the generation of data-flow tasks.
…ecific code This introduces a new function `normalizeInductionVar()` to the static loop utility code in `concretelang/Analysis/StaticLoops.h` with code extracted for IV normalization from the batching code and changes the batching code to make use of the factored function.
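As a rough illustration of what induction variable normalization means in general, the sketch below rewrites a loop with arbitrary lower bound and positive step into one that starts at 0 with step 1. The Python helper names `normalized_trip_count` and `original_iv` are hypothetical and do not reflect the signature or implementation of `normalizeInductionVar()` in `concretelang/Analysis/StaticLoops.h`.

```python
def normalized_trip_count(lb: int, ub: int, step: int) -> int:
    """Number of iterations of `for i in range(lb, ub, step)` for step > 0.

    After normalization the loop runs `for j in range(trip_count)`.
    """
    if step <= 0:
        raise ValueError("this sketch only handles positive steps")
    # Ceiling division of (ub - lb) by step, clamped at zero for empty loops.
    return max(0, -(-(ub - lb) // step))


def original_iv(lb: int, step: int, j: int) -> int:
    """Recover the original induction variable from the normalized one."""
    return lb + j * step


# The normalized loop visits the same values as the original loop.
lb, ub, step = 3, 20, 4
values = [original_iv(lb, step, j) for j in range(normalized_trip_count(lb, ub, step))]
```

Every use of the original induction variable inside the loop body is replaced by the expression `lb + j * step`, so the body observes the same sequence of values.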
…ersion patterns Some of the TFHE to Concrete conversion patterns implicitly assume that operands are ciphertexts and thus that the converted types have a higher number of dimensions than the original types. However, for non-ciphertext types, the number of dimensions before and after the conversion must be the same. This commit adds a check to the respective conversion patterns triggering a simple type conversion that preserves the number of dimensions for non-ciphertext types.
andidr force-pushed the andi/tiling-optimizations branch from edc871d to c09e11c on April 8, 2024 13:43
Done.
…th nested blocks The current scheme used by reinstantiating conversion patterns in `lib/Conversion/Utils/Dialects` for operations with blocks is to create a new operation with empty blocks, to move the operations from the old blocks, and then to replace any references to block arguments. However, such in-place updates of the types of block arguments leave conversion patterns for operations nested in the blocks without the ability to determine the original types of values from before the update. This change uses proper signature conversion for block arguments, such that the original types of block arguments with converted types are preserved, while the new types are made available through the dialect conversion infrastructure via the respective adaptors.
… patterns for RT tasks
… bufferization This adds support for `memref.alloc`, `memref.load`, `memref.store`, `memref.copy` and `memref.subview` to the RT task bufferization pass.
andidr force-pushed the andi/tiling-optimizations branch from c09e11c to 7430587 on April 8, 2024 13:54
…oops The new pass hoists `RT.await_future` operations whose results are yielded by `scf.forall` operations out of the loops in order to avoid over-synchronization of data-flow tasks. E.g., the following IR:

```
scf.forall (%arg) in (16)
  shared_outs(%o1 = %sometensor, %o2 = %someothertensor)
  -> (tensor<...>, tensor<...>)
{
  ...
  %rph = "RT.build_return_ptr_placeholder"() :
    () -> !RT.rtptr<!RT.future<tensor<...>>>
  "RT.create_async_task"(..., %rph, ...) { ... } : ...
  %future = "RT.deref_return_ptr_placeholder"(%rph) :
    (!RT.rtptr<!RT.future<...>>) -> !RT.future<tensor<...>>
  %res = "RT.await_future"(%future) :
    (!RT.future<tensor<...>>) -> tensor<...>
  ...
  scf.forall.in_parallel {
    ...
    tensor.parallel_insert_slice %res into %o1[..., %arg, ...] [...] [...] :
      tensor<...> into tensor<...>
    ...
  }
}
```

is transformed into:

```
%tensoroffutures = tensor.empty() : tensor<16x!RT.future<tensor<...>>>

scf.forall (%arg) in (16)
  shared_outs(%otfut = %tensoroffutures, %o2 = %someothertensor)
  -> (tensor<...>, tensor<...>)
{
  ...
  %rph = "RT.build_return_ptr_placeholder"() :
    () -> !RT.rtptr<!RT.future<tensor<...>>>
  "RT.create_async_task"(..., %rph, ...) { ... } : ...
  %future = "RT.deref_return_ptr_placeholder"(%rph) :
    (!RT.rtptr<!RT.future<...>>) -> !RT.future<tensor<...>>
  %wrappedfuture = tensor.from_elements %future :
    tensor<1x!RT.future<tensor<...>>>
  ...
  scf.forall.in_parallel {
    ...
    tensor.parallel_insert_slice %wrappedfuture into %otfut[%arg] [1] [1] :
      tensor<1x!RT.future<tensor<...>>> into tensor<16x!RT.future<tensor<...>>>
    ...
  }
}

scf.forall (%arg) in (16) shared_outs(%o = %sometensor) -> (tensor<...>) {
  %future = tensor.extract %tensoroffutures[%arg] :
    tensor<16x!RT.future<tensor<...>>>
  %res = "RT.await_future"(%future) :
    (!RT.future<tensor<...>>) -> tensor<...>
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %res into %o[..., %arg, ...] [...] [...] :
      tensor<...> into tensor<...>
  }
}
```
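The effect of the hoisting can be sketched outside of MLIR with ordinary futures. The following Python analogy is not the pass itself and uses only the standard library; `task`, `await_inside_loop` and `await_hoisted` are hypothetical names, and the sequential `for` loops merely stand in for `scf.forall`. In the first variant each iteration waits for its own task before the next one is created; in the second, the first loop only stores futures and a separate loop waits for them.

```python
from concurrent.futures import ThreadPoolExecutor


def task(i: int) -> int:
    # Stand-in for the body of an asynchronous data-flow task.
    return i * i


def await_inside_loop(executor: ThreadPoolExecutor, n: int) -> list:
    out = [None] * n
    for i in range(n):
        future = executor.submit(task, i)
        # Waiting here serializes task creation and task completion.
        out[i] = future.result()
    return out


def await_hoisted(executor: ThreadPoolExecutor, n: int) -> list:
    # First loop: create all tasks and keep only their futures
    # (the analogue of the tensor of futures).
    futures = [executor.submit(task, i) for i in range(n)]
    # Second loop: wait for the results after all tasks have been created.
    return [f.result() for f in futures]
```

Both variants produce the same results; the hoisted one simply allows all tasks to be in flight before the first wait.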
andidr force-pushed the andi/tiling-optimizations branch from 7430587 to f506f5f on April 8, 2024 14:16
andidr force-pushed the andi/tiling-optimizations branch from 278c9dc to f506f5f on April 9, 2024 12:54
BourgerieQuentin approved these changes on Apr 9, 2024