Add e2e test suite for the Attention - CPU Backend #17751

Merged
merged 35 commits into from
Aug 19, 2024

Conversation

erman-gurses
Member

@erman-gurses erman-gurses commented Jun 27, 2024

Add an e2e test suite for attention. For now, it only covers FP16 on the CPU backend, and results are checked against an FP32 reference implementation.
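
As a rough illustration of the check described above, here is a minimal sketch in plain Python, not the code from this PR. It assumes single-batch attention on nested lists; Python's 64-bit floats stand in for the FP32 reference, `struct`'s `"e"` format emulates FP16 rounding, and the helper names (`to_f16`, `reference_attention`, `all_close`) and the tolerance are hypothetical.

```python
import math
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def reference_attention(query, key, value, scale):
    """Single-batch attention: softmax(scale * Q @ K^T) @ V on nested lists."""
    out = []
    for q_row in query:
        # Scaled dot product of one query row against every key row.
        scores = [scale * sum(q * k for q, k in zip(q_row, k_row)) for k_row in key]
        # Numerically stable softmax over the key dimension.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the value rows.
        out.append([
            sum(w * v_row[n] for w, v_row in zip(weights, value)) / total
            for n in range(len(value[0]))
        ])
    return out


def all_close(actual, expected, atol):
    """Element-wise absolute-tolerance comparison of two 2-D nested lists."""
    return all(
        abs(a - e) <= atol
        for a_row, e_row in zip(actual, expected)
        for a, e in zip(a_row, e_row)
    )


# Hypothetical tiny inputs, already rounded to FP16 like the test tensors.
query = [[to_f16(0.1 * (i + j)) for j in range(4)] for i in range(3)]
key = [[to_f16(0.05 * (i - j)) for j in range(4)] for i in range(5)]
value = [[to_f16(0.2 * (i + j)) for j in range(2)] for i in range(5)]

expected = reference_attention(query, key, value, scale=0.5)
# Stand-in for the device result: the reference output rounded to FP16.
actual = [[to_f16(x) for x in row] for row in expected]
assert all_close(actual, expected, atol=1e-2)  # tolerance is an assumption
```

Computing the reference at higher precision than the result under test means the comparison tolerance only has to absorb the FP16 rounding error, not error in the reference itself.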

@iree-org iree-org deleted a comment from google-cla bot Jun 27, 2024
@erman-gurses erman-gurses changed the title Add e2e tests for FA2 Add e2e test suite for FA2 Jun 27, 2024
@erman-gurses
Member Author

erman-gurses commented Jul 22, 2024

@ScottTodd, I need your advice on the pre-commit and DCO failures below.

@ScottTodd
Member

@ScottTodd, I need your advice on the pre-commit and DCO failures below.

For pre-commit, see https://iree.dev/developers/general/contributing/#coding-style-guidelines. The logs are telling you that a generated file needs to be updated by running `python build_tools/bazel_to_cmake/bazel_to_cmake.py` (or just `pre-commit run`).

For DCO, see https://iree.dev/developers/general/contributing/#developer-certificate-of-origin. The logs for the action also include steps you can take to resolve it.

@erman-gurses erman-gurses marked this pull request as ready for review August 6, 2024 14:07
@erman-gurses erman-gurses requested a review from benvanik as a code owner August 6, 2024 14:07
Contributor

@Groverkss Groverkss left a comment

Looks good from the attention side of things! Very excited to see this. I'll let Scott drive the rest of the review on the infra side of things.

Contributor

@pashu123 pashu123 left a comment

Some comments. Looking at the cc implementation now.

@ScottTodd ScottTodd self-requested a review August 6, 2024 16:36
@erman-gurses erman-gurses changed the title Add e2e test suite for FA2 Add e2e test suite for the Attention Aug 7, 2024
Member

@ScottTodd ScottTodd left a comment

Please add more information to the pull request description, including a link to the tracking issue (#17892). See also https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue if you want this PR to close the issue.

@erman-gurses erman-gurses requested a review from ScottTodd August 15, 2024 17:55
@erman-gurses erman-gurses requested a review from pashu123 August 16, 2024 13:16
@erman-gurses erman-gurses self-assigned this Aug 16, 2024
@ScottTodd ScottTodd added infrastructure Relating to build systems, CI, or testing codegen Shared code generation infrastructure and dialects codegen/llvm LLVM code generation compiler backend labels Aug 16, 2024
Signed-off-by: ERMAN GURSES <[email protected]>
Contributor

@Groverkss Groverkss left a comment

LGTM from attention side. Please wait for Scott's approval also.

@erman-gurses erman-gurses changed the title Add e2e test suite for the Attention Add e2e test suite for the Attention - CPU Backend Aug 19, 2024
@erman-gurses
Member Author

All checks have passed

I think @ScottTodd has already approved the PR.

@erman-gurses erman-gurses merged commit 2d629c6 into iree-org:main Aug 19, 2024
36 checks passed
@ScottTodd
Member

This had some failures overnight:

  • Timeouts (after 60 seconds) in iree/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_large_llvm-cpu_local-task on GitHub-hosted Windows and macOS runners

  • Compilation error on arm64: https://github.com/iree-org/iree/actions/runs/10468944505/job/28990909321#step:4:9815:

    [415/1150] Generating /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb from e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir
    FAILED: tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb 
    cd /work/build-arm64/tests/e2e/attention && /work/build-arm64/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir -o /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb --iree-hal-executable-object-search-path=\"/work/build-arm64\" --iree-llvmcpu-embedded-linker-path=\"/work/build-arm64/llvm-project/bin/lld\" --iree-llvmcpu-wasm-linker-path=\"/work/build-arm64/llvm-project/bin/lld\"
    /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:4:14: error: Yield operand #2 is not equivalent to the corresponding iter bbArg
      %result1 = iree_linalg_ext.attention {
                 ^
    /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:1:1: note: called from
    func.func @attention_2_1024_128_256_64_dtype_f16_f16_f16_f16(%query: tensor<2x1024x128xf16>, %key: tensor<2x256x128xf16>, %value: tensor<2x256x64xf16>, %scale: f32) -> tensor<2x1024x64xf16> {
    ^
    /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:4:14: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"llvm-cpu", "embedded-elf-arm_64", {cpu = "generic", cpu_features = "+reserve-x18", data_layout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32", native_vector_size = 16 : i64, target_triple = "aarch64-unknown-unknown-eabi-elf"}>
      %result1 = iree_linalg_ext.attention {
                 ^
    /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:1:1: note: called from
    func.func @attention_2_1024_128_256_64_dtype_f16_f16_f16_f16(%query: tensor<2x1024x128xf16>, %key: tensor<2x256x128xf16>, %value: tensor<2x256x64xf16>, %scale: f32) -> tensor<2x1024x64xf16> {
    ^
    failed to translate executables
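
The function name in the log above appears to encode the attention shape in the order `BATCH_M_K1_K2_N`, followed by four element types. The sketch below shows how the tensor types in the failing signature can be recovered from that name. It is not the generator's code: `shapes_from_name` is a hypothetical helper, and the order of the four dtypes (query, key, value, result) is an assumption, since all four are `f16` here.

```python
def shapes_from_name(name):
    """Derive the MLIR tensor types implied by a generated attention test name."""
    parts = name.split("_")
    if len(parts) != 11 or parts[0] != "attention" or parts[6] != "dtype":
        raise ValueError(f"unexpected test function name: {name}")
    batch, m, k1, k2, n = (int(p) for p in parts[1:6])
    # Assumed dtype order: query, key, value, result.
    q_type, k_type, v_type, out_type = parts[7:11]

    def tensor(dims, dtype):
        return "tensor<" + "x".join(str(d) for d in dims) + "x" + dtype + ">"

    return {
        "query": tensor((batch, m, k1), q_type),
        "key": tensor((batch, k2, k1), k_type),
        "value": tensor((batch, k2, n), v_type),
        "result": tensor((batch, m, n), out_type),
    }


shapes = shapes_from_name("attention_2_1024_128_256_64_dtype_f16_f16_f16_f16")
for operand, tensor_type in shapes.items():
    print(f"{operand}: {tensor_type}")
```

For the medium test that fails on arm64, this reproduces the four tensor types shown in the `func.func` signature of the error message.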
    

ScottTodd added a commit that referenced this pull request Aug 20, 2024
ScottTodd added a commit that referenced this pull request Aug 20, 2024
Reverts #17751. A few of the new tests are failing on various platforms:

* Timeouts (after 60 seconds) in `iree/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_large_llvm-cpu_local-task` on GitHub-hosted Windows and macOS runners
* https://github.com/iree-org/iree/actions/runs/10468974350/job/28990992473#step:8:2477
* https://github.com/iree-org/iree/actions/runs/10468947894/job/28990909629#step:9:3076
    
    ```
1529/1568 Test #969: iree/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_large_llvm-cpu_local-task .............................***Timeout 60.07 sec
---
TEST[attention_2_2048_256_512_128_dtype_f16_f16_f16_f16_2_2048_256_512_128_256_1.0_0]
---
    Attention shape (BATCHxMxK1xK2xN): 2x2048x256x512x256x128
    ```

* Compilation error on arm64: https://github.com/iree-org/iree/actions/runs/10468944505/job/28990909321#step:4:9815:

    ```
[415/1150] Generating /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb from e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir
FAILED: tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb
cd /work/build-arm64/tests/e2e/attention && /work/build-arm64/tools/iree-compile --output-format=vm-bytecode --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=llvm-cpu /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir -o /work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.vmfb --iree-hal-executable-object-search-path=\"/work/build-arm64\" --iree-llvmcpu-embedded-linker-path=\"/work/build-arm64/llvm-project/bin/lld\" --iree-llvmcpu-wasm-linker-path=\"/work/build-arm64/llvm-project/bin/lld\"
/work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:4:14: error: Yield operand #2 is not equivalent to the corresponding iter bbArg
  %result1 = iree_linalg_ext.attention {
             ^
/work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:1:1: note: called from
func.func @attention_2_1024_128_256_64_dtype_f16_f16_f16_f16(%query: tensor<2x1024x128xf16>, %key: tensor<2x256x128xf16>, %value: tensor<2x256x64xf16>, %scale: f32) -> tensor<2x1024x64xf16> {
^
/work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:4:14: error: failed to run translation of source executable to target executable for backend #hal.executable.target<"llvm-cpu", "embedded-elf-arm_64", {cpu = "generic", cpu_features = "+reserve-x18", data_layout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32", native_vector_size = 16 : i64, target_triple = "aarch64-unknown-unknown-eabi-elf"}>
  %result1 = iree_linalg_ext.attention {
             ^
/work/build-arm64/tests/e2e/attention/e2e_attention_cpu_f16_f16_f16_medium_llvm-cpu_local-task_attention.mlir:1:1: note: called from
func.func @attention_2_1024_128_256_64_dtype_f16_f16_f16_f16(%query: tensor<2x1024x128xf16>, %key: tensor<2x256x128xf16>, %value: tensor<2x256x64xf16>, %scale: f32) -> tensor<2x1024x64xf16> {
^
failed to translate executables
    ```
erman-gurses added a commit that referenced this pull request Aug 22, 2024
Add the e2e test suite for the Attention. It only checks CPU FP16, and
the reference implementation is FP32.
#17751
#18302

---------

Signed-off-by: erman-gurses <[email protected]>