Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIE] add StickyWAW mutator to postscheduler/postpipeliner #222

Open
wants to merge 3 commits into
base: aie-public
Choose a base branch
from

Conversation

martien-de-jong
Copy link
Collaborator

Not checked on QoR yet, but I do see a significant improvement in one of the postpipeliner unit tests

Martien de Jong added 2 commits October 24, 2024 12:15
This is necessary to try sufficient stages, but makes it very difficult to reach the earlier
NCopies == 1 crash situation.
@andcarminati
Copy link
Collaborator

Hi @martien-de-jong, in the last update of this mutator, we left an orphan command line option:

static cl::opt<unsigned> WAWStickyRegistersMemOpsThreshold(
    "aie-waw-sticky-register-mem-threshold", cl::Hidden, cl::init(4),
    cl::desc("Number of memory instructions to enable the register exclusion "
             "heuristic in WAW sticky registers dep. removal"));

Is it possible to include this removal in this PR?

Copy link
Collaborator

@andcarminati andcarminati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@gbossu
Copy link
Collaborator

gbossu commented Oct 24, 2024

I'd be happy to see QoR numbers, especially for Neg :)

@@ -1254,6 +1255,8 @@ void AIEScheduleDAGMI::schedule() {
// If it succeeds, we need to implement it, if we fail we fall back on the
// normal loop schedule
SchedImpl->buildGraph(*this, AA);
postProcessDAG();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHECK : Why do we need the call to "postProcessDAG" ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the DAG mutations are required for correctness, e.g. giving a correct latency to memory edges

@gbossu
Copy link
Collaborator

gbossu commented Oct 28, 2024

I'd rather not merge this now, running the DAG mutators for pipelining seems to have a nasty side effect on the final loop schedule when pipelining fails. In the post-processing loop of softmax, which essentially does VLD -> VMUL -> VST.CONV, we now have two NOPs at the end, probably because we see a memory dep between VST.CONV and the VLD of the next iteration.

Copy link
Collaborator

@gbossu gbossu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking as "request changes" to make sure we don't inadvertently merge this before looking into #222 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants