Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
JIT: limited version of forward substitution for some relops
Add a new optimization to redundant branch opts that looks at prior statements
in the same block for redundant relop trees. If one is found, we check to see
if it can be forward-substituted down to the terminal jump tree. And if it can
we duplicate the computation down at the jump.

This removes many of the cases we see in our generated code where we materialize
a boolean value in a register and then immedately test that register to see if
it is true/false, and then use that second test to drive branching -- instead we
now use the initial test logic to drive the jump and so the boolean value only
exists in the flags.
  • Loading branch information
AndyAyersMS committed Oct 29, 2021
commit 1cd5a999e034e61c58554ac19173b149fa6d5547
1 change: 1 addition & 0 deletions src/coreclr/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -7468,6 +7468,7 @@ class Compiler
// Redundant branch opts
//
PhaseStatus optRedundantBranches();
bool optRedundantRelop(BasicBlock* const block);
bool optRedundantBranch(BasicBlock* const block);
bool optJumpThread(BasicBlock* const block, BasicBlock* const domBlock, bool domIsSameRelop);
bool optReachable(BasicBlock* const fromBlock, BasicBlock* const toBlock, BasicBlock* const excludedBlock);
Expand Down
284 changes: 283 additions & 1 deletion src/coreclr/jit/redundantbranchopts.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ PhaseStatus Compiler::optRedundantBranches()
//
if (block->bbJumpKind == BBJ_COND)
{
madeChanges |= m_compiler->optRedundantRelop(block);
madeChanges |= m_compiler->optRedundantBranch(block);
}
}
Expand Down Expand Up @@ -98,7 +99,7 @@ bool Compiler::optRedundantBranch(BasicBlock* const block)

GenTree* const tree = jumpTree->AsOp()->gtOp1;

if (!(tree->OperKind() & GTK_RELOP))
if (!tree->OperIsCompare())
{
return false;
}
Expand Down Expand Up @@ -691,6 +692,287 @@ bool Compiler::optJumpThread(BasicBlock* const block, BasicBlock* const domBlock
return true;
}

//------------------------------------------------------------------------
// optRedundantRelop: see if the value of tree is redundant given earlier
// relops in this block.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful if you give an example here of the type of transformations done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one -- was that what you had in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great! Much more detailed even than what I imagined :-)

//
// Arguments:
// block - block of interest (BBJ_COND)
//
// Returns:
// true, if changes were made.
//
bool Compiler::optRedundantRelop(BasicBlock* const block)
{
Statement* const stmt = block->lastStmt();

if (stmt == nullptr)
{
return false;
}

GenTree* const jumpTree = stmt->GetRootNode();

if (!jumpTree->OperIs(GT_JTRUE))
{
return false;
}

GenTree* const tree = jumpTree->AsOp()->gtOp1;

if (!tree->OperIsCompare())
{
return false;
}

// If tree has side effects other than GTF_EXCEPT, bail.
//
if ((tree->gtFlags & GTF_SIDE_EFFECT) != 0)
{
if ((tree->gtFlags & GTF_SIDE_EFFECT) != GTF_EXCEPT)
{
return false;
}
}

// If relop's value is known, bail.
//
const ValueNum treeVN = tree->GetVN(VNK_Liberal);

if (vnStore->IsVNConstant(treeVN))
{
JITDUMP(" -- no, jump tree cond is constant\n");
return false;
}

// If there's just one statement, bail.
//
if (stmt == block->firstStmt())
{
JITDUMP(" -- no, no prior stmt\n");
return false;
}

JITDUMP("\noptRedundantRelop in " FMT_BB "; jump tree is\n", block->bbNum);
DISPTREE(jumpTree);

// We're going to search back to find the earliest tree in block that
// * makes the current relop redundant;
// * can safely and profitably forward substituted to the jump.
//
Statement* prevStmt = stmt;
GenTree* candidateTree = nullptr;
bool reverse = false;

// We need to keep track of which locals might be killed by
// the trees between the expression we want to forward substitute
// and the jump.
//
// We don't use a varset here because we are indexing by local ID,
// not by tracked index.
//
// The table size here also implicitly limits how far back we'll search.
//
enum
{
DEFINED_LOCALS_SIZE = 10
};
unsigned definedLocals[DEFINED_LOCALS_SIZE];
unsigned definedLocalsCount = 0;

while (true)
{
prevStmt = prevStmt->GetPrevStmt();

// Backwards statement walks wrap around, so if we get
// back to stmt we've seen everything there is to see.
//
if (prevStmt == stmt)
{
break;
}

// We are looking for ASG(lcl, ...)
//
GenTree* const prevTree = prevStmt->GetRootNode();

JITDUMP(" ... checking previous tree\n");
DISPTREE(prevTree);

// Ignore nops.
//
if (prevTree->OperIs(GT_NOP))
{
continue;
}

// If prevTree has side effects other than GTF_EXCEPT or GTF_ASG, bail.
//
if ((prevTree->gtFlags & GTF_SIDE_EFFECT) != (prevTree->gtFlags & (GTF_EXCEPT | GTF_ASG)))
{
JITDUMP(" -- prev tree has side effects\n");
break;
}
Comment on lines +887 to +891
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears we only keep track of locals assigned at the root.

Do we need to check (prevTreeRHS->gtFlags & GTF_ASG) == 0 to guard against nested definitions?


if (!prevTree->OperIs(GT_ASG))
{
JITDUMP(" -- prev tree not ASG\n");
break;
}

GenTree* const prevTreeLHS = prevTree->AsOp()->gtOp1;
GenTree* const prevTreeRHS = prevTree->AsOp()->gtOp2;

if (!prevTreeLHS->OperIs(GT_LCL_VAR))
{
JITDUMP(" -- prev tree not ASG(LCL...)\n");
break;
}

// If we are seeing PHIs we have run out of interesting stmts.
//
if (prevTreeRHS->OperIs(GT_PHI))
{
JITDUMP(" -- prev tree is a phi\n");
break;
}

// If the VN of RHS is the VN of the current tree, or is "related", consider foward sub.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo: foward

//
// Todo: generalize to allow when normal VN of RHS == normal VN of tree, and RHS except set contains tree except
// set.
// The concern here is whether can we forward sub an excepting (but otherwise non-side effecting tree) and
// guarantee
// the same side effect behavior.
//
// A common case we see is that there's a compare of an array element guarded by a bounds check,
// so prevTree is something like:
//
// (ASG lcl, (COMMA (bds-check), (NE (array-indir), ...)))
//
// This may have the same normal VN as our tree, but also has an exception set.
//
// Another common pattern is that the prevTree contains a call.
//
const ValueNum domCmpVN = prevTreeRHS->GetVN(VNK_Liberal);
const ValueNum domCmpSwpVN = vnStore->GetRelatedRelop(domCmpVN, ValueNumStore::VN_RELATION_KIND::VRK_Swap);
const ValueNum domCmpRevVN = vnStore->GetRelatedRelop(domCmpVN, ValueNumStore::VN_RELATION_KIND::VRK_Reverse);
const ValueNum domCmpSwpRevVN =
vnStore->GetRelatedRelop(domCmpVN, ValueNumStore::VN_RELATION_KIND::VRK_SwapReverse);

const bool matchCmp = ((domCmpVN != ValueNumStore::NoVN) && (domCmpVN == treeVN));
const bool matchSwp = ((domCmpSwpVN != ValueNumStore::NoVN) && (domCmpSwpVN == treeVN));
const bool matchRev = ((domCmpRevVN != ValueNumStore::NoVN) && (domCmpRevVN == treeVN));
const bool matchSwpRev = ((domCmpSwpRevVN != ValueNumStore::NoVN) && (domCmpSwpRevVN == treeVN));
const bool domIsSameRelop = matchCmp || matchSwp;
const bool domIsRevRelop = matchRev || matchSwpRev;

// If the tree is some unrelated computation, keep looking farther up.
//
if (!domIsSameRelop && !domIsRevRelop)
{
JITDUMP(" -- prev tree VN is not related\n");
continue;
}
Comment on lines +981 to +985
Copy link
Member

@jakobbotsch jakobbotsch Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how this works. Doesn't this effectively make the interference check below useless? E.g.

Foo(3, 4);

void Foo(int a, int b)
{
  bool foo = a < b;
  a = 15;
  if (foo)
    Console.WriteLine(a);
  else
    Console.WriteLine(b);
}

15 does not have the same VN as a < b, so the assignment to a will be skipped in the interference check and the program will always print 4 when it should print 15.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should be adding a to the interfering set here before continuing the search.


JITDUMP(" -- prev tree has %srelop with %s liberal VN\n", matchSwp || matchSwpRev ? "swapped " : "",
domIsSameRelop ? "the same" : "a reverse");

// If lcl is not tracked, assume we can't safely reason about interference
// or liveness.
//
const unsigned prevTreeLcl = prevTreeLHS->AsLclVarCommon()->GetLclNum();
LclVarDsc* const prevTreeLclDsc = lvaGetDesc(prevTreeLcl);

if (!prevTreeLclDsc->lvTracked)
{
JITDUMP(" -- prev tree defs untracked V%02u\n", prevTreeLcl);
break;
}

// If we've run out of room to keep track of defined locals, bail.
//
if (definedLocalsCount >= DEFINED_LOCALS_SIZE)
{
JITDUMP(" -- ran out of space for tracking kills\n");
break;
}

// See if we can safely move a copy of prevTreeRHS later, to replace tree.
// We can, if none of its lcls are killed.
//
definedLocals[definedLocalsCount++] = prevTreeLcl;

for (unsigned int i = 0; i < definedLocalsCount; i++)
{
if (gtHasRef(prevTreeRHS, definedLocals[i], /*def only*/ false))
{
JITDUMP(" -- prev tree ref to V%02u interferes\n", definedLocals[i]);
break;
}
}

// Heuristic: only forward sub a relop
//
if (!prevTreeRHS->OperIsCompare())
{
JITDUMP(" -- prev tree is not relop\n");
continue;
}

// Heuristic: if the lcl defined here is live out, forward sub
// won't result in the original becoming dead. If the local is not
// live out that still may happen, but it's less likely.
//
if (VarSetOps::IsMember(this, block->bbLiveOut, prevTreeLclDsc->lvVarIndex))
{
JITDUMP(" -- prev tree lcl V%02u is live-out\n", prevTreeLcl);
continue;
}

JITDUMP(" -- prev tree is viable candidate for relop fwd sub!\n");
candidateTree = prevTreeRHS;
reverse = domIsRevRelop;
}

if (candidateTree == nullptr)
{
return false;
}

// Looks good -- we are going to forward-sub candidateTree
//
GenTree* const substituteTree = gtCloneExpr(candidateTree);

// If we need the reverse compare, make it so
//
if (reverse)
{
substituteTree->SetOper(GenTree::ReverseRelop(substituteTree->OperGet()));

// This substitute tree will have the VN the same as the original.
// (modulo excep sets...? hmmm)
substituteTree->SetVNsFromNode(tree);
}

// We just intentionally created a duplicate -- we don't want CSE to undo this.
// (hopefully, the original is now dead).
//
// Also this relop is now a subtree of a jump.
//
substituteTree->gtFlags |= (GTF_DONT_CSE | GTF_RELOP_JMP_USED);

gtReplaceTree(stmt, tree, substituteTree);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, gtReplaceTree was dead code before this change and as such is sort of in the process of being deleted (in #59912)...

It appears here we have the use edge, so maybe just replace the operand and resequence the statement explicitly? It should be a bit cheaper I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just replace the operand and resequence the statement explicitly?

Sure.

gtUpdateStmtSideEffects(stmt);

DEBUG_DESTROY_NODE(tree);

JITDUMP(" -- done! new jump tree is\n");
DISPTREE(jumpTree);

return true;
}

//------------------------------------------------------------------------
// optReachable: see if there's a path from one block to another,
// including paths involving EH flow.
Expand Down
15 changes: 0 additions & 15 deletions src/coreclr/jit/valuenum.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4653,21 +4653,6 @@ ValueNum ValueNumStore::GetRelatedRelop(ValueNum vn, VN_RELATION_KIND vrk)
ValueNum newVN = VNForFunc(TYP_INT, newFunc, funcAttr.m_args[swap ? 1 : 0], funcAttr.m_args[swap ? 0 : 1]);
ValueNum result = VNWithExc(newVN, excepVN);

#ifdef DEBUG
if (m_pComp->verbose)
{
printf("%s of ", swap ? (reverse ? "swap-reverse" : "swap") : "reverse");
m_pComp->vnPrint(vn, 1);
printf(" => ");
m_pComp->vnPrint(newVN, 1);
if (result != newVN)
{
m_pComp->vnPrint(result, 1);
}
printf("\n");
}
#endif

return result;
}

Expand Down