-
Notifications
You must be signed in to change notification settings - Fork 694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix invalid IR from scalarrepl-param-hlsl in ReplaceConstantWithInst #6556
Fix invalid IR from scalarrepl-param-hlsl in ReplaceConstantWithInst #6556
Conversation
Note that this was a joint effort with @dneto0. See the test file in the commit. Also see this compiler explorer link with the same example that segfaults. The following explains the fix: The bug is in Here's the relevant IR:
There are two memcpys. (dest is first arg, src is second arg)
Now let's enter After these replacements, the first part of that basic block is this:
We see that %12 is in the first memcpy, but defined two instructions later. That's bad. The bug is in this statement: If such a use C is an instruction I, then it replaces uses of C in I with V. That's only safe to do if V dominates I. So the fix uses a big hammer: compute the dominator tree so we can guard the replacement with that dominance check. Although it may seem heavyweight to recompute the dominator tree repeatedly, it's the safe thing to do. Earlier in the code it says: // SROA_Parameter_HLSL has no access to a domtree, if one is needed, it'll
// be generated So it seems in the spirit to compute the dominator tree on demand in this pass. Note that early in ReplaceMemcpy, it has a promising comment and mechanism to early-out: // If the source of the memcpy (Src) doesn't dominate all users of dest (V),
// full replacement isn't possible without complicated PHI insertion
// This will likely replace with ld/st which will be replaced in mem2reg
if (Instruction *SrcI = dyn_cast<Instruction>(Src))
if (!DominateAllUsers(SrcI, V, DT))
return false; But DominateAllUsers doesn't quite do the job for us. It calls DominateAllUsersDom (where it has a dominator tree), and it does this code: // Use `DT` to trace all users and make sure `I`'s BB dominates them all
static bool DominateAllUsersDom(Instruction *I, Value *V, DominatorTree *DT) {
BasicBlock *BB = I->getParent();
Function *F = I->getParent()->getParent();
for (auto U = V->user_begin(); U != V->user_end();) {
Instruction *UI = dyn_cast<Instruction>(*(U++));
// If not an instruction or from a differnt function, nothing to check, move
// along.
if (!UI || UI->getParent()->getParent() != F)
continue;
if (!DT->dominates(BB, UI->getParent()))
return false;
if (isa<GetElementPtrInst>(UI) || isa<BitCastInst>(UI)) {
if (!DominateAllUsersDom(I, UI, DT))
return false;
}
}
return true;
} First, in our case value V is a constant ( One local fix here could be if the use is a constant, then recurse and follow the users of that enclosing constant, until you hit an instruction. This has a problem that it could blow up the search. And even if you got out far enough to reach an instruction you would get to the first memcpy. And then the test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid the tools/clang/test/DXC
directory becoming a dumping ground, can you move these tests into a new subdirectory, maybe tools/clang/test/DXC/Passes/ScalarReplHLSL
?
tools/clang/test/DXC/scalarrepl-param-hlsl-const-to-local-and-back.hlsl
Outdated
Show resolved
Hide resolved
tools/clang/test/DXC/scalarrepl-param-hlsl-const-to-local-and-back.hlsl
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to understand why the existing dominance check was insufficient.
@pow2clk, you asked good questions yesterday about why this change is needed. Here's a deeper dive into what's happening when we run DXC against the example code, You asked about why the initial call to DominateAllUsers wasn't good enough to determine that we shouldn't proceed with replacing the memcpy. I set a breakpoint in Call stack:
Looking at the call stack, we see that we are processing globals and allocas:
Which corresponds to:
In
Note that In HLSL, this refers to:
In IR, this looks like:
We're currently lowering the 2nd memcpy - the one from In } else if (!isa<CallInst>(Src)) {
// Resource ptr should not be replaced.
// Need to make sure src not updated after current memcpy.
// Check Src only have 1 store now.
// If Src has more than 1 store but only used once by memcpy, check if
// the stores dominate the memcpy.
hlutil::PointerStatus SrcPS(Src, size, /*bLdStOnly*/ false);
SrcPS.analyze(typeSys, bStructElt);
if (SrcPS.storedType != hlutil::PointerStatus::StoredType::Stored ||
(SrcPS.loadedType ==
hlutil::PointerStatus::LoadedType::MemcopySrcOnce &&
allStoresDominateInst(Src, MC, DT))) {
if (ReplaceMemcpy(V, Src, MC, annotation, typeSys, DL, DT)) { <------------ HERE
if (V->user_empty())
return true;
return LowerMemcpy(V, annotation, typeSys, DL, DT, bAllowReplace);
}
} It's true that Src (
So now we're in
So this is supposed to check whether Src, the At the very start of the call to
Indeed, the
I assume the reason for this is that Our use-case is one where there are 2 memcpys, where the first copies from
We're processing the second memcpy's destination, The problem isn't the fact that Src doesn't dominate uses of Dest, but rather, than in replacing Dest
Turns into this, right after the call to
Note that the first call to memcpy changed so that it's source went from Now let's see why does
This use is the one shown above that is the source of that first memcpy. This is not, itself, an static bool ReplaceConstantWithInst(Constant *C, Value *V,
IRBuilder<> &Builder) {
bool replacedAll = true;
Function *F = Builder.GetInsertBlock()->getParent();
Instruction *VInst = dyn_cast<Instruction>(V);
DominatorTree DT;
if (VInst) {
DT.recalculate(*F);
}
for (auto it = C->user_begin(); it != C->user_end();) {
User *U = *(it++);
if (Instruction *I = dyn_cast<Instruction>(U)) {
if (I->getParent()->getParent() != F)
continue;
if (VInst && DT.dominates(VInst, I)) {
I->replaceUsesOfWith(C, V);
} else {
replacedAll = false;
}
} else {
<----------------------------------- HERE
// Skip unused ConstantExpr.
if (U->user_empty())
continue;
ConstantExpr *CE = cast<ConstantExpr>(U);
Instruction *Inst = CE->getAsInstruction();
Builder.Insert(Inst);
Inst->replaceUsesOfWith(C, V);
if (!ReplaceConstantWithInst(CE, Inst, Builder)) {
replacedAll = false;
}
}
}
C->removeDeadConstantUsers();
return replacedAll;
} Since the use, U, is a
We then call
Before we recurse into if (Constant *C = dyn_cast<Constant>(V)) {
updateLifetimeForReplacement(V, Src);
if (TyV == TySrc) {
if (isa<Constant>(Src)) {
V->replaceAllUsesWith(Src);
} else {
// Replace Constant with a non-Constant.
IRBuilder<> Builder(MC); <---------------------------- HERE
if (!ReplaceConstantWithInst(C, Src, Builder)) {
return false;
}
}
} else { So this new instruction is inserted after the first memcpy in which we're going to replace it soon. We now recurse, calling So the bug is that we should not replace uses of a value C in I, with another one, V, if I dominates V; or conversely, we should only replace uses of value C in I with another one V if V dominates I. This is exactly what this patch does: static bool ReplaceConstantWithInst(Constant *C, Value *V,
IRBuilder<> &Builder) {
bool replacedAll = true;
Function *F = Builder.GetInsertBlock()->getParent();
Instruction *VInst = dyn_cast<Instruction>(V);
DominatorTree DT;
if (VInst) {
DT.recalculate(*F);
}
for (auto it = C->user_begin(); it != C->user_end();) {
User *U = *(it++);
if (Instruction *I = dyn_cast<Instruction>(U)) {
if (I->getParent()->getParent() != F)
continue;
if (VInst && DT.dominates(VInst, I)) { <------------ HERE
I->replaceUsesOfWith(C, V);
} else {
replacedAll = false;
}
} else {
// Skip unused ConstantExpr.
if (U->user_empty())
continue;
ConstantExpr *CE = cast<ConstantExpr>(U);
Instruction *Inst = CE->getAsInstruction();
Builder.Insert(Inst);
Inst->replaceUsesOfWith(C, V);
if (!ReplaceConstantWithInst(CE, Inst, Builder)) {
replacedAll = false;
}
}
}
C->removeDeadConstantUsers();
return replacedAll;
} Now maybe there's another way to solve this by making sure to insert our new instruction above all uses? Maybe instead of creating a builder around the 2nd memcpy, we should create it around the alloca of the Src Hopefully this clarifies what's going on with this bug, and why our patch works this way. Basically, while replacing constants, we may end up injecting new instructions above the current memcpy, but then have to be careful not to replace references to it in instructions before that memcpy. |
ReplaceConstantWithInst(C, V) replaces uses of C in the current function with V. If such a use C is an instruction I, the it replaces uses of C in I with V. However, this function did not make sure to only perform this replacement if V dominates I. As a result, it may end up replacing uses of C in instructions before the definition of V. The fix is to lazily compute the dominator tree in ReplaceConstantWithInst so that we can guard the replacement with that dominance check.
dd202d2
to
407ce70
Compare
@llvm-beanz addressed all your comments and 👍 each one. |
ReplaceConstantWithInst(C, V) replaces uses of C in the current function with V. If such a use C is an instruction I, the it replaces uses of C in I with V. However, this function did not make sure to only perform this replacement if V dominates I. As a result, it may end up replacing uses of C in instructions before the definition of V.
The fix is to lazily compute the dominator tree in ReplaceConstantWithInst so that we can guard the replacement with that dominance check.