[SYCL][NFCI] Finalize switch to SPV_KHR_cooperative_matrix #16045

Merged on Nov 21, 2024 (4 commits)

MrSidims
Contributor

No description provided.

Signed-off-by: Sidorov, Dmitry <[email protected]>
@MrSidims
Contributor Author

Tested locally on Linux on both CPU and GPU. The patch shouldn't cause any regressions, unless the driver versions used in pre-commit CI here differ from the drivers used locally (set up by scripts).
Not yet tested on Windows: I'm having trouble finding a DG2 machine.

Contributor

@YuriPlyakhin YuriPlyakhin left a comment

The code changes themselves LGTM.

@MrSidims, in addition to stability, could you please also verify performance, just to make sure no performance regressions are introduced?

Contributor

@dkhaldi dkhaldi left a comment

Let's investigate failures on Windows DG2 before merging this

@MrSidims
Contributor Author

@dkhaldi on Windows DG2, only these 2 tests are failing:
Failed Tests (2):
SYCL :: Matrix/SPVCooperativeMatrix/joint_matrix_bf16_fill_k_cache_SLM.cpp
SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_SLM.cpp

But there are indeed regressions. joint_matrix_out_bounds.cpp was added only for JointMatrices, and with the switch it actually fails. The reason is that __spirv_CooperativeMatrixLoadCheckedINTEL (and Store) are not handled by IGC for Cooperative Matrices at all. This behavior doesn't follow the SPIR-V specification, which defines OpCooperativeMatrixLoadCheckedINTEL to be used only with OpTypeCooperativeMatrixKHR. Support for TypeJointMatrixINTEL was left in the translator only as an exception for the switch. Prefetch also seems not to be handled by IGC when used with TypeCooperativeMatrixKHR.

We have to postpone the switch to cooperative matrices, as well as promoting the SYCL extension from experimental to supported. Note that only the new functionality doesn't work; the old functionality works as expected.
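
For readers unfamiliar with the "checked" (out-of-bounds aware) loads discussed above, here is a minimal host-side sketch of the general idea in plain C++. It is not the SYCL or SPIR-V API: `load_tile_checked` is a hypothetical helper, and zero-filling the out-of-bounds elements is an assumption of this sketch rather than a statement about what the specification or IGC does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of SYCL or SPIR-V.
// Copies a tile_rows x tile_cols tile that starts at (row0, col0) out of a
// row-major rows x cols matrix. Elements that fall outside the matrix are
// left at zero (an assumption of this sketch).
std::vector<float> load_tile_checked(const std::vector<float> &src,
                                     std::size_t rows, std::size_t cols,
                                     std::size_t row0, std::size_t col0,
                                     std::size_t tile_rows,
                                     std::size_t tile_cols) {
  std::vector<float> tile(tile_rows * tile_cols, 0.0f);
  for (std::size_t r = 0; r < tile_rows; ++r) {
    for (std::size_t c = 0; c < tile_cols; ++c) {
      const std::size_t sr = row0 + r; // source row
      const std::size_t sc = col0 + c; // source column
      // Only read elements that are inside the matrix bounds.
      if (sr < rows && sc < cols)
        tile[r * tile_cols + c] = src[sr * cols + sc];
    }
  }
  return tile;
}
```

An unchecked load would read `src[sr * cols + sc]` unconditionally, which is why a tile that overlaps the matrix edge needs the checked variant.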

@MrSidims MrSidims closed this Nov 13, 2024
@MrSidims
Contributor Author

Closing the PR as we can't merge it

@MrSidims MrSidims reopened this Nov 20, 2024
@MrSidims
Contributor Author

@dkhaldi @YuriPlyakhin apparently I had a problem with the driver. With the agama driver, the performance tests pass (with OOB loads/stores). No performance regressions observed.

@MrSidims
Contributor Author

@dkhaldi @intel/llvm-reviewers-runtime please take a look

Contributor

@dkhaldi dkhaldi left a comment

LGTM! Let's brace for impact :)

@MrSidims MrSidims requested a review from a team November 20, 2024 20:03
@MrSidims
Contributor Author

@intel/llvm-reviewers-runtime friendly ping to review the SYCL headers and, probably, the 'unsupported feature' test changes.
@intel/llvm-gatekeepers friendly ping to merge the patch: all of the changes are related to matrices, and the 'unsupported feature' test changes are quite trivial.

@dm-vodopyanov dm-vodopyanov merged commit 3edd618 into intel:sycl Nov 21, 2024
13 checks passed
@sarnex
Contributor

sarnex commented Nov 21, 2024

@MrSidims Looks like this broke some build lit tests, can you take a look?

Failed Tests (7):
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-hip.hpp
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-intel.hpp
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-tensorcores.hpp
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-unified-utils.hpp
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-unified.hpp
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix.hpp
  SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/static-query-use.hpp

@AlexeySachkov
Contributor

> @MrSidims Looks like this broke some build lit tests, can you take a look?
>
> Failed Tests (7):
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-hip.hpp
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-intel.hpp
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-tensorcores.hpp
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-unified-utils.hpp
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix-unified.hpp
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/matrix.hpp
>   SYCL :: self-contained-headers/sycl/ext/oneapi/matrix/static-query-use.hpp

Those tests were temporarily disabled in #16150 by @MrSidims

@sarnex
Contributor

sarnex commented Nov 21, 2024

Cool, thx!


7 participants