Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Codegen][Tuner] default tuning spec available per-SKU #19748

Closed
wants to merge 2 commits into from

Conversation

bangtianliu
Copy link
Contributor

@bangtianliu bangtianliu commented Jan 21, 2025

This PR addresses the tasks outlined in the following issues: #19720 and #19721.

  • adds mi308x to the list of known HIP targets.
  • Update the default tuning strategy to be to be per-SKU first, and then per-architecture as a fallback. start with MI300X and MI308X
    • Introduced a default tuning specification for MI308X.
    • For MI300x, the tuning will default to the existing gfx942 specification, as an MI300X-specific tuning spec has not been added yet.

Signed-off-by: Bangtian Liu <[email protected]>
@bangtianliu bangtianliu marked this pull request as ready for review January 21, 2025 20:27
Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should split this into three (or more) PRs:

  • Add a known target for mi308
  • Implement support for per-sku tuning specs
  • Populate tuning specs for mi308

@@ -34,6 +34,8 @@
#define DBGS() (llvm::dbgs() << "[" DEBUG_TYPE "]: ")
#define LDBG(X) LLVM_DEBUG(DBGS() << X << "\n")

extern llvm::cl::opt<std::string> clTestTarget;
Copy link
Member

@kuhar kuhar Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, we should no reference global variables like this. This makes the code hard to maintain and LLVM converged on external storage for flags meant to be external, and having these flag storage variables declared in headers.

Here specifically, we should not rely on test flags in this code. All the information we use must come from the gpu target attr. If the information we need there is not available, we should work on adding it.

// Entry point
//===----------------------------------------------------------------------===//

transform.named_sequence @__kernel_config(%variant_op: !transform.any_op {transform.consumed}) -> !transform.any_op
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can stage this PR such that we first commit a simple tuning spec to make sure everything works, and then separately work on adding configs we care about.

These configs should be tested to make sure they apply on the intended code, and we should quantify the improvements making sure we don't populate these specs with configs that give us only marginal improvements. This needs to be backed by data.

@kuhar
Copy link
Member

kuhar commented Jan 26, 2025

This already landed as a few smaller PRs

@kuhar kuhar closed this Jan 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants