Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
59 commits
Select commit Hold shift + click to select a range
32236ac
Detect inner loop and add 10 bytes of padding at the beginning
kunalspathak Oct 8, 2020
a2ec5d1
generate nop in previous blocks
kunalspathak Oct 9, 2020
89b5812
TODO: figure out if anything needs to be done in optCanonicalizeLoop
kunalspathak Oct 9, 2020
a84b8be
Add COMPlus_JitAlignLoopMinBlockWeight and COMPlus_JitAlignLoopMaxCod…
kunalspathak Oct 9, 2020
a51759a
Reuse AlignLoops flag for dynamic loop alignment
kunalspathak Oct 10, 2020
2f57c78
Detect back edge and count no. of instructions before doing loop alig…
kunalspathak Oct 14, 2020
b4848f8
fix bugs
kunalspathak Oct 15, 2020
fe92c62
propagate the basic block flag
kunalspathak Oct 15, 2020
51c3171
Switch from instrCount to codeSize
kunalspathak Oct 16, 2020
0b227a8
JitAlignLoopWith32BPadding
kunalspathak Oct 16, 2020
a9c9764
Add emitLoopAlign32Bytes()
kunalspathak Oct 23, 2020
7f1f787
wip
kunalspathak Oct 23, 2020
aac18a4
Add logic to avoid emitting nop if not needed
kunalspathak Oct 28, 2020
5c3c40e
fix a condition
kunalspathak Oct 28, 2020
fb91d41
Several things:
kunalspathak Oct 30, 2020
0b8eb78
Added JitAlignLoopAdaptive algorithm
kunalspathak Nov 5, 2020
072a113
wip
kunalspathak Nov 5, 2020
7e1328a
revert emitarm64.cpp changes
kunalspathak Nov 5, 2020
d49e84d
fix errors during merge
kunalspathak Nov 5, 2020
f6b5135
fix build errors
kunalspathak Nov 5, 2020
256aae5
refactoring and cleanup
kunalspathak Nov 6, 2020
26c3c84
refactoring and build errors fix
kunalspathak Nov 7, 2020
6bb1a75
jit format
kunalspathak Nov 7, 2020
03f6ba6
one more build error
kunalspathak Nov 7, 2020
8036061
Add emitLoopAlignAdjustments()
kunalspathak Nov 11, 2020
809fc85
Update emitLoopAlignAdjustments to just include loopSize calc
kunalspathak Nov 11, 2020
5b9b7a0
Remove #ifdef ADAPTIVE_LOOP_ALIGNMENT
kunalspathak Nov 11, 2020
5584d71
Code cleanup
kunalspathak Nov 12, 2020
f66ee24
minor fixes
kunalspathak Nov 12, 2020
f2f935d
Fix issues:
kunalspathak Nov 12, 2020
99ea31f
Other fixes
kunalspathak Nov 12, 2020
8f64963
Remove align_loops flag from coreclr
kunalspathak Dec 4, 2020
1c85c3c
Review feedback
kunalspathak Dec 4, 2020
ef0b149
jit format
kunalspathak Dec 4, 2020
599ad43
Add FEATURE_LOOP_ALIGN
kunalspathak Dec 5, 2020
97fd373
remove special case for align
kunalspathak Dec 9, 2020
37b0cdb
Do not propagate BBF_LOOP_ALIGN in certain cases
kunalspathak Dec 14, 2020
c0cc8af
Introduce instrDescAlign and emitLastAlignedIgNum
kunalspathak Dec 15, 2020
26f7e61
Several changes:
kunalspathak Dec 15, 2020
305b812
jit format
kunalspathak Dec 16, 2020
f8bdfec
fix issue related to needLabel
kunalspathak Dec 16, 2020
a205cc0
align memory correctly in superpmi
kunalspathak Dec 17, 2020
d576e9c
Few more fixes:
kunalspathak Dec 17, 2020
28480d1
minor JITDUMP messages
kunalspathak Dec 18, 2020
bf03842
Review comments
kunalspathak Dec 18, 2020
dae9749
missing check
kunalspathak Dec 18, 2020
a153742
Mark the last align IG the one that has non-zero padding
kunalspathak Dec 18, 2020
ef02fbb
More review comments
kunalspathak Dec 19, 2020
e7e0d68
Propagate BBF_LOOP_ALIGN for compacting blocks
kunalspathak Dec 19, 2020
4b0e64d
Handle ALIGN_LOOP flag for loops that are unrolled
kunalspathak Dec 20, 2020
6fddce8
jit format
kunalspathak Dec 20, 2020
bad5685
Loop size upto last back-edge instead of first back-edge
kunalspathak Dec 21, 2020
8c30a96
Take loop weight in consideration
kunalspathak Dec 21, 2020
f32f560
remove align flag if loop is no longer valid
kunalspathak Jan 4, 2021
23177b0
Adjust loop block weight to 4 instead of 8
kunalspathak Jan 6, 2021
9f3cb2d
missing space after rebase
kunalspathak Jan 6, 2021
74620ed
fix the enum values after rebase
kunalspathak Jan 7, 2021
b100226
review feedback
kunalspathak Jan 9, 2021
dbeb7d6
Add missing #ifdef DEBUG
kunalspathak Jan 12, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
More review comments
  • Loading branch information
kunalspathak committed Jan 9, 2021
commit ef02fbb360e8754327c1cc7831d5a50d8eaef4d6
10 changes: 7 additions & 3 deletions src/coreclr/jit/compiler.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2624,14 +2624,18 @@ void Compiler::compInitOptions(JitFlags* jitFlags)
opts.compJitAlignLoopMaxCodeSize = (unsigned short)JitConfig.JitAlignLoopMaxCodeSize();
#else
opts.compJitAlignLoopAdaptive = true;
opts.compJitAlignLoopBoundary = DEFAULT_ALIGN_LOOP_BOUNDARY;
opts.compJitAlignLoopMinBlockWeight = DEFAULT_ALIGN_LOOP_MIN_BLOCK_WEIGHT;
#endif

// Adaptive alignment works on 32B boundary
if (opts.compJitAlignLoopAdaptive)
{
opts.compJitAlignLoopBoundary = DEFAULT_ALIGN_LOOP_BOUNDARY;
opts.compJitAlignPaddingLimit = (opts.compJitAlignLoopBoundary >> 1) - 1;
}
else
{
opts.compJitAlignPaddingLimit = opts.compJitAlignLoopBoundary - 1;
}

assert(isPow2(opts.compJitAlignLoopBoundary));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like opts.compJitAlignLoopBoundary doesn' get set to anything in release if we are not doing adaptive alignment.

And whatever value it does get set to should be overridable in debug, in case you want to experiment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed setting opts.compJitAlignLoopBoundary for non-adaptive / release case. I will do it. Are you also suggesting to give a way to set compJitAlignLoopBoundary even for adaptive? Currently, the logic I have for adaptive is restricted to have a single align instruction of max 15 bytes. I will have to think and restructure some code to make that alignment boundary flexible for adaptive scenario. If you agree, I can do that in a separate PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it can wait.


#if REGEN_SHORTCUTS || REGEN_CALLPAT
Expand Down
3 changes: 3 additions & 0 deletions src/coreclr/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -9069,6 +9069,9 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
// be done. By default, 32B.
unsigned short compJitAlignLoopBoundary;

// Padding limit to align a loop.
unsigned short compJitAlignPaddingLimit;

// If set, perform adaptive loop alignment that limits number of padding based on loop size.
bool compJitAlignLoopAdaptive;

Expand Down
86 changes: 36 additions & 50 deletions src/coreclr/jit/emit.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4612,6 +4612,11 @@ void emitter::emitJumpDistBind()

#ifdef FEATURE_LOOP_ALIGN

//-----------------------------------------------------------------------------
// emitLoopAlignment: Insert an align instruction at the end of emitCurIG and
// mark it as IGF_LOOP_ALIGN to indicate that next IG is a
// loop needing alignment.
//
void emitter::emitLoopAlignment()
{
if ((emitComp->opts.compJitAlignLoopBoundary > 16) && (!emitComp->opts.compJitAlignLoopAdaptive))
Expand All @@ -4632,7 +4637,15 @@ void emitter::emitLoopAlignment()
}

//-----------------------------------------------------------------------------
// For loopHeaderIg, find the size of the smallest possible loop that doesn't exceed maxLoopSize.
// getLoopSize: Starting from loopHeaderIg, find the size of the smallest possible loop
// such that it doesn't exceed the maxLoopSize.
//
// Arguments:
// igLoopHeader - The header IG of a loop
// maxLoopSize - Maximum loop size. If the loop is bigger than this value, we will just
// return this value.
//
// Returns: size of a loop in bytes.
//
unsigned emitter::getLoopSize(insGroup* igLoopHeader, unsigned maxLoopSize)
{
Expand All @@ -4649,17 +4662,7 @@ unsigned emitter::getLoopSize(insGroup* igLoopHeader, unsigned maxLoopSize)

// In such cases, the current loop size should exclude the align instruction size reserved for
// next loop.
unsigned maxPaddingAllowed;
if (emitComp->opts.compJitAlignLoopAdaptive)
{
maxPaddingAllowed = (emitComp->opts.compJitAlignLoopBoundary >> 1) - 1;
}
else
{
maxPaddingAllowed = emitComp->opts.compJitAlignLoopBoundary - 1;
}

loopSize -= maxPaddingAllowed;
loopSize -= emitComp->opts.compJitAlignPaddingLimit;
}
if ((igInLoop->igLoopBackEdge == igLoopHeader) || (loopSize > maxLoopSize))
{
Expand All @@ -4675,7 +4678,7 @@ unsigned emitter::getLoopSize(insGroup* igLoopHeader, unsigned maxLoopSize)
// if currIG has back-edge to dstIG.
//
// Notes:
// If the current loop covers a loop that is already marked as align, then remove
// If the current loop encloses a loop that is already marked as align, then remove
// the alignment flag present on IG before dstIG.
//
void emitter::emitSetLoopBackEdge(BasicBlock* loopTopBlock)
Expand Down Expand Up @@ -4724,7 +4727,7 @@ void emitter::emitSetLoopBackEdge(BasicBlock* loopTopBlock)
alignInstr->idaIG->igFlags &= ~IGF_LOOP_ALIGN;
}

JITDUMP("** Skip alignment for loop IG%02u ~ IG%02u, because it covers an aligned loop IG%02u ~ IG%02u.\n",
JITDUMP("** Skip alignment for loop IG%02u ~ IG%02u, because it encloses an aligned loop IG%02u ~ IG%02u.\n",
currLoopStart, currLoopEnd, emitLastInnerLoopStartIgNum, emitLastInnerLoopEndIgNum);
}
}
Expand All @@ -4749,51 +4752,33 @@ void emitter::emitLoopAlignAdjustments()

JITDUMP("*************** In emitLoopAlignAdjustments()\n");

unsigned short estimatedPaddingNeeded, alignmentBoundary = emitComp->opts.compJitAlignLoopBoundary;
unsigned maxLoopSize = 0;
unsigned short estimatedPaddingNeeded = emitComp->opts.compJitAlignPaddingLimit;
unsigned short alignmentBoundary = emitComp->opts.compJitAlignLoopBoundary;

if (emitComp->opts.compJitAlignLoopAdaptive)
{
// For adaptive, adjust the loop size depending on the alignment boundary
int maxBlocksAllowedForLoop = genLog2((unsigned)alignmentBoundary) - 1;
maxLoopSize = alignmentBoundary * maxBlocksAllowedForLoop;
estimatedPaddingNeeded = (alignmentBoundary >> 1) - 1;
}
else
{
// For non-adaptive, just take whatever is supplied using COMPlus_ variables
maxLoopSize = emitComp->opts.compJitAlignLoopMaxCodeSize;
estimatedPaddingNeeded = alignmentBoundary - 1;
}

unsigned alignBytesRemoved = 0;
unsigned loopSize = 0;
unsigned loopIGOffset = 0;
instrDescAlign* alignInstr = emitAlignList;

// track the IG that was adjusted so we can update the offsets
insGroup* lastIGAdj = emitAlignList->idaIG;

for (; alignInstr != nullptr; alignInstr = alignInstr->idaNext)
{
assert(alignInstr->idIns() == INS_align);

insGroup* alignIG = alignInstr->idaIG;

// Adjust offsets of all IGs until the current IG
while (lastIGAdj->igNum <= alignIG->igNum)
{
lastIGAdj->igOffs -= alignBytesRemoved;
lastIGAdj = lastIGAdj->igNext;
}

loopIGOffset = alignIG->igOffs + alignIG->igSize;

// igSize also includes INS_align instruction, take it off.
loopIGOffset -= estimatedPaddingNeeded;

// IG can be marked as not needing alignment if during setting igLoopBackEdge, it is detected
// that the igLoopBackEdge covers an IG that is marked for alignment.
// that the igLoopBackEdge encloses an IG that is marked for alignment.
unsigned actualPaddingNeeded =
alignIG->isLoopAlign() ? emitCalculatePaddingForLoopAlignment(alignIG, loopIGOffset DEBUG_ARG(false)) : 0;

Expand All @@ -4816,7 +4801,7 @@ void emitter::emitLoopAlignAdjustments()

if (emitComp->opts.compJitAlignLoopAdaptive)
{
assert(actualPaddingNeeded < 15);
assert(actualPaddingNeeded < MAX_ENCODED_SIZE);
alignInstr->idCodeSize(actualPaddingNeeded);
}
else
Expand All @@ -4825,14 +4810,14 @@ void emitter::emitLoopAlignAdjustments()

#ifdef DEBUG

int instrAdjusted = (alignmentBoundary + 14) / 15;
int instrAdjusted = (alignmentBoundary + (MAX_ENCODED_SIZE - 1)) / MAX_ENCODED_SIZE;
#endif
// Adjust the padding amount in all align instructions in this IG
instrDescAlign *alignInstrToAdj = alignInstr, *prevAlignInstr = nullptr;
for (; alignInstrToAdj != nullptr && alignInstrToAdj->idaIG == alignInstr->idaIG;
alignInstrToAdj = alignInstrToAdj->idaNext)
{
unsigned newPadding = min(paddingToAdj, 15);
unsigned newPadding = min(paddingToAdj, MAX_ENCODED_SIZE);
alignInstrToAdj->idCodeSize(newPadding);
paddingToAdj -= newPadding;
prevAlignInstr = alignInstrToAdj;
Expand All @@ -4851,6 +4836,16 @@ void emitter::emitLoopAlignAdjustments()
estimatedPaddingNeeded, actualPaddingNeeded);
}

// Adjust the offset of all IGs starting from next IG until we reach the IG having the next
// align instruction or the end of IG list.
insGroup* adjOffIG = alignIG->igNext;
insGroup* adjOffUptoIG = alignInstr->idaNext != nullptr ? alignInstr->idaNext->idaIG : emitIGlast;
while ((adjOffIG != nullptr) && (adjOffIG->igNum <= adjOffUptoIG->igNum))
{
adjOffIG->igOffs -= alignBytesRemoved;
adjOffIG = adjOffIG->igNext;
}

if (actualPaddingNeeded > 0)
{
// Record the last IG that has align instruction. No overestimation
Expand All @@ -4859,13 +4854,6 @@ void emitter::emitLoopAlignAdjustments()
}
}

// Do adjustments of remaining IGs
while (lastIGAdj != nullptr)
{
lastIGAdj->igOffs -= alignBytesRemoved;
lastIGAdj = lastIGAdj->igNext;
}

#ifdef DEBUG
emitCheckIGoffsets();
#endif
Expand Down Expand Up @@ -5935,11 +5923,6 @@ unsigned emitter::emitEndCodeGen(Compiler* comp,
{
printf("\n");
}

if (emitComp->verbose)
{
printf("Allocated method code size = %4u , actual size = %4u\n", emitTotalCodeSize, cp - codeBlock);
}
#endif

unsigned actualCodeSize = emitCurCodeOffs(cp);
Expand All @@ -5954,6 +5937,9 @@ unsigned emitter::emitEndCodeGen(Compiler* comp,
// If you add this padding during the emitIGlist loop, then it will
// emit offsets after the loop with wrong value (for example for GC ref variables).
unsigned unusedSize = emitTotalCodeSize - actualCodeSize;

JITDUMP("Allocated method code size = %4u , actual size = %4u, unused size = %4u\n", emitTotalCodeSize, actualCodeSize, unusedSize);

for (unsigned i = 0; i < unusedSize; ++i)
{
*cp++ = DEFAULT_CODE_BUFFER_INIT;
Expand Down
1 change: 0 additions & 1 deletion src/coreclr/jit/emitxarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13753,7 +13753,6 @@ size_t emitter::emitOutputInstr(insGroup* ig, instrDesc* id, BYTE** dp)

// Make sure we set the instruction descriptor size correctly
assert(sz == emitSizeOfInsDsc(id));
// assert((sz == emitSizeOfInsDsc(id)) || (ins == INS_align));

#if !FEATURE_FIXED_OUT_ARGS
bool updateStackLevel = !emitIGisInProlog(ig) && !emitIGisInEpilog(ig);
Expand Down