Merged
Changes from 49 commits
64 commits
07de069
R2R -> InstrumentedTier0 for hot R2R methods
EgorBo Jun 23, 2022
a7bbfaf
Merge branch 'main' of github.com:dotnet/runtime into tier0instrumented
EgorBo Jun 23, 2022
4b4967b
Oops, it turns out that I didn't properly patch re-used callcountingstub
EgorBo Jun 24, 2022
1392368
remove TieredPGO
EgorBo Jun 24, 2022
1a41355
fix bug in the importer
EgorBo Jun 24, 2022
4828f3e
Add ability to instrument optimized code
EgorBo Jun 24, 2022
1fa264b
fix 32bit
EgorBo Jun 24, 2022
71928f3
fix ret type
EgorBo Jun 24, 2022
127b7d8
Clean up
EgorBo Jun 24, 2022
2cee8a9
Merge branch 'main' of github.com:dotnet/runtime into tier0instrumented
EgorBo Jun 24, 2022
911af8a
Clean up
EgorBo Jun 25, 2022
0316842
test fix
EgorBo Jun 26, 2022
2022e68
test2
EgorBo Jun 26, 2022
246dd95
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Jul 14, 2022
3a718af
Address feedback
EgorBo Jul 15, 2022
41733fe
Don't re-use callcountingstubs for now
EgorBo Jul 15, 2022
21c8f6e
Clean up
EgorBo Jul 15, 2022
f37012d
Address feedback
EgorBo Jul 15, 2022
db9c71d
Address feedback
EgorBo Jul 15, 2022
59b2bc9
Add docs
EgorBo Jul 15, 2022
ca5c347
Enable other strategies
EgorBo Jul 16, 2022
a6318a0
Clean up
EgorBo Jul 16, 2022
f75e289
Fix GetInitialOptimizationTier
EgorBo Jul 16, 2022
e9e12ea
Enable optimized Instrumentations
EgorBo Jul 16, 2022
81496c4
Update diagram
EgorBo Jul 16, 2022
be329a1
Fix assert
EgorBo Jul 16, 2022
48e375e
Update
EgorBo Jul 16, 2022
1ac3e19
Add test
EgorBo Jul 16, 2022
f6b457a
Update clrconfigvalues.h
EgorBo Jul 17, 2022
7b9e5d4
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Jul 17, 2022
cbddf5b
Update clrconfigvalues.h
EgorBo Jul 17, 2022
9345815
Update clrconfigvalues.h
EgorBo Jul 17, 2022
86523e7
Merge branch 'new-tier' of github.com:EgorBo/runtime-1 into new-tier
EgorBo Aug 6, 2022
6a0c05c
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Aug 6, 2022
a073ed7
Resolve conflicts, address feedback
EgorBo Aug 6, 2022
75da822
Apply suggestions from code review
EgorBo Aug 6, 2022
e83eb68
Remove InstrumentedTiers_strat4.csproj
EgorBo Aug 6, 2022
b1f4d2e
Address feedback
EgorBo Aug 6, 2022
acdce3d
Merge branch 'new-tier' of github.com:EgorBo/runtime-1 into new-tier
EgorBo Aug 6, 2022
27f7228
update DynamicPgo-InstrumentedTiers.md
EgorBo Aug 6, 2022
2e478c0
Clean up
EgorBo Aug 6, 2022
557ec0c
Clean up
EgorBo Aug 6, 2022
48cf945
clean up
EgorBo Aug 6, 2022
ca491a0
Rename arg in AsyncPromoteToTier1
EgorBo Aug 6, 2022
25bddf0
Address feedback
EgorBo Aug 6, 2022
f777274
Address feedback
EgorBo Aug 6, 2022
13e1211
Address feedback
EgorBo Aug 7, 2022
15c0e2d
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Aug 7, 2022
3d88252
Fix issues found during testing
EgorBo Aug 7, 2022
60a77e5
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Oct 13, 2022
df19c68
Simplify PR, get rid of strategies
EgorBo Oct 14, 2022
f46679c
Enable TieredPGO_InstrumentOnlyHotCode by default
EgorBo Oct 14, 2022
4d77f85
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Oct 17, 2022
c9bd079
Fix an assert in optimized instrumented tier
EgorBo Oct 18, 2022
d3f205b
fix osr tests
EgorBo Oct 18, 2022
ef4ae58
Update docs
EgorBo Oct 18, 2022
7fd0749
Update DynamicPgo-InstrumentedTiers.md
EgorBo Oct 18, 2022
c4395f8
Address feedback
EgorBo Oct 18, 2022
7fbcf17
Merge branch 'main' of github.com:dotnet/runtime into new-tier
EgorBo Oct 22, 2022
a183996
Address Andy's feedback
EgorBo Oct 22, 2022
ff471e1
Disable edge-profiling with a comment
EgorBo Oct 22, 2022
b202b9f
fix assert
EgorBo Oct 22, 2022
64eb6df
Update docs/design/features/DynamicPgo-InstrumentedTiers.md
EgorBo Oct 24, 2022
07339dc
Fix "mb" in the doc, add one more example to the "perf impact" section
EgorBo Oct 25, 2022
79 changes: 79 additions & 0 deletions docs/design/features/DynamicPgo-InstrumentedTiers.md
@@ -0,0 +1,79 @@
# Instrumented Tiers

_Disclaimer: the functionality described in this doc is still in the preview stage and is not enabled by default even for `DOTNET_TieredPGO=1`._

[#70941](https://github.com/dotnet/runtime/pull/70941) introduced new opt-in strategies for Tiered Compilation + TieredPGO mainly to address
two existing limitations of the current design:
1) R2R code never benefits from Dynamic PGO as it's not instrumented and is promoted straight to Tier1 when it's hot
2) Instrumentation in Tier0 comes with a big overhead and it's better to only instrument hot Tier0 code (whether it's ILOnly or R2R)

A good example explaining both problems is this TechEmpower benchmark (plaintext-plaintext):

![Plaintext](DynamicPgo-InstrumentedTiers-Plaintext.png)

Legend:
* Red - `DOTNET_TieredPGO=0`, `DOTNET_ReadyToRun=1` (default)
* Black - `DOTNET_TieredPGO=1`, `DOTNET_ReadyToRun=1`
* Yellow - `DOTNET_TieredPGO=1`, `DOTNET_ReadyToRun=0`

The yellow line provides the highest level of performance (RPS) at the cost of startup speed (and, hence, the time it takes to process the first request). This happens because the benchmark is quite simple and most of its code is already prejitted, so we can only instrument it when we drop R2R completely and compile everything from scratch. It also explains why the black line (Dynamic PGO enabled, but still relying on R2R) didn't show much improvement. With a separate instrumentation tier for hot R2R code we achieve "Yellow"-level performance while maintaining the same startup speed as before. Also, for the mode where we have to compile a lot of code to Tier0, switching to the "instrument only hot Tier0 code" strategy shows a ~8% time-to-first-request reduction across all TE benchmarks.

![Plaintext](DynamicPgo-InstrumentedTiers-Plaintext-opt.png)
(_predicted results according to local runs of crank with custom binaries_)

# Tiered compilation workflow in TieredPGO mode

The following diagram explains how the instrumentation for hot R2R code works under the hood when TieredPGO is enabled (it's disabled by default):

```mermaid
flowchart
prestub(.NET Function) -->|Compilation| hasAO{"Marked with<br/>[AggressiveOpts]?"}
hasAO-->|Yes|tier1ao["JIT to <b><ins>Tier1</ins></b><br/><br/>(that attribute is extremely<br/> rarely a good idea)"]
hasAO-->|No|hasR2R
hasR2R{"Is prejitted (R2R)<br/>and ReadyToRun==1?"} -->|No| istrTier0Q

istrTier0Q{"<b>TieredPGO_Strategy:</b><br/>Instrument only<br/>hot Tier0 code?"}
istrTier0Q-->|No, always instrument tier0|tier0
istrTier0Q-->|Yes, only hot|tier000
tier000["JIT to <b><ins>Tier0</ins></b><br/><br/>(not optimized, not instrumented,<br/> with patchpoints)"]-->|Running...|ishot555
ishot555{"Is hot?<br/>(called >30 times)"}
ishot555-.->|No,<br/>keep running...|ishot555
ishot555-->|Yes|tier0

hasR2R -->|Yes| R2R
R2R["Use <b><ins>R2R</ins></b> code<br/><br/>(optimized, not instrumented,<br/>with patchpoints)"] -->|Running...|ishot1
ishot1{"Is hot?<br/>(called >30 times)"}-.->|No,<br/>keep running...|ishot1
ishot1--->|"Yes"|instrumentR2R

instrumentR2R{"<b>TieredPGO_Strategy:</b><br/>Instrument hot<br/>R2R'd code?"}
instrumentR2R-->|Yes, instrument R2R'd code|istier1inst
instrumentR2R-->|No, don't instrument R2R'd code|tier1nopgo["JIT to <b><ins>Tier1</ins></b><br/><br/>(no dynamic profile data)"]

tier0["JIT to <b><ins>InstrumentedTier</ins></b><br/><br/>(not optimized, instrumented,<br/> with patchpoints)"]-->|Running...|ishot5
tier1pgo2["JIT to <b><ins>Tier1</ins></b><br/><br/>(optimized with profile data)"]
tier1pgo2_1["JIT to <b><ins>Tier1</ins></b><br/><br/>(optimized with profile data)"]

istier1inst{"<b>TieredPGO_Strategy:</b><br/>Enable optimizations<br/>for InstrumentedTier?"}-->|"No"|tier0_1
istier1inst--->|"Yes"|tier1inst["JIT to <b><ins>InstrumentedTierOptimized</ins></b><br/><br/>(optimized, instrumented, <br/>with patchpoints)"]
tier1inst-->|Running...|ishot5_1
ishot5{"Is hot?<br/>(called >30 times)"}-->|Yes|tier1pgo2
ishot5-.->|No,<br/>keep running...|ishot5


ishot5_1{"Is hot?<br/>(called >30 times)"}
ishot5_1-.->|No,<br/>keep running...|ishot5_1
ishot5_1{"Is hot?<br/>(called >30 times)"}-->|Yes|tier1pgo2_1

tier0_1["JIT to <b><ins>InstrumentedTier</ins></b><br/><br/>(not optimized, instrumented,<br/> with patchpoints)"]
tier0_1-->|Running...|ishot5_1
```
(_VSCode doesn't support mermaid diagrams, consider installing external add-ins_)
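
The promotion paths in the diagram can be read as a small state machine. Below is a minimal sketch of that reading, not the runtime's implementation: `Tier`, `PgoOptions`, `InitialTier` and `NextTier` are hypothetical names introduced for illustration, the three booleans stand for the three `TieredPGO_Strategy` questions in the diagram, and OSR/patchpoint transitions are not modeled.

```cpp
// Hypothetical model of the diagram; none of these names exist in the runtime.
enum class Tier { R2R, Tier0, Instrumented, InstrumentedOptimized, Tier1 };

struct PgoOptions {
    bool instrumentOnlyHotTier0;   // "Instrument only hot Tier0 code?"
    bool instrumentHotR2R;         // "Instrument hot R2R'd code?"
    bool optimizeInstrumentedTier; // "Enable optimizations for InstrumentedTier?"
};

// Tier used the first time a method is compiled (or its R2R code is picked up).
inline Tier InitialTier(bool aggressiveOpts, bool hasR2RCode, const PgoOptions& o) {
    if (aggressiveOpts) return Tier::Tier1;  // [AggressiveOpts] goes straight to Tier1
    if (hasR2RCode) return Tier::R2R;        // prejitted code is used as-is
    return o.instrumentOnlyHotTier0 ? Tier::Tier0 : Tier::Instrumented;
}

// Tier to promote to once the current code is considered hot.
inline Tier NextTier(Tier current, const PgoOptions& o) {
    switch (current) {
        case Tier::Tier0:
            return Tier::Instrumented;       // hot Tier0 code gets instrumented
        case Tier::R2R:
            if (!o.instrumentHotR2R) return Tier::Tier1;  // no dynamic profile data
            return o.optimizeInstrumentedTier ? Tier::InstrumentedOptimized
                                              : Tier::Instrumented;
        case Tier::Instrumented:
        case Tier::InstrumentedOptimized:
        case Tier::Tier1:
            return Tier::Tier1;              // final tier, optimized with profile data
    }
    return Tier::Tier1;
}
```

Under this model, with all three options enabled, a prejitted method would go R2R, then InstrumentedOptimized, then Tier1, while a non-prejitted method would go Tier0, then Instrumented, then Tier1.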

## Pros & cons of using optimizations inside the instrumented tiers

Pros:
* Lower overhead from instrumentation (and thanks to optimizations we _can_ optimize probes and emit fewer of them)
* Optimized code is able to inline methods, so we won't be producing new compilation units for even small methods

Cons:
* Currently, we won't instrument inlinees, so we'll probably miss a lot of opportunities and produce a less accurate profile, leading to a less optimized final tier
6 changes: 3 additions & 3 deletions docs/design/features/DynamicPgo.md
@@ -257,9 +257,9 @@ If we confidently could identify the top N% of methods (say 5%) then one could i
R2R methods bypass Tier0 and so don't get instrumentation in the current TieredPGO prototype. We probably don't want to instrument the code in the R2R image. And many of these R2R methods are key framework methods that are important for performance. So we need to find a way to get data for these methods.

There are a few basic ideas:
* Leverage IBC. If there is IBC data in the R2R image then we can make that data available to the JIT. It may not be as relevant as in-process collected data, but it's quite likely better than synthetic data or no data.
* Sampled instrumentation for R2R methods. Produce an instrumented version and run it every so often before the method gets promoted to Tier1. This may be costly, especially if we have to use unoptimized methods for instrumentation, as we'll do quite a bit of extra jitting.
* Make R2R methods go through Tier0 on their way to Tier1. Likely introduces an unacceptable perf hit.
1) Leverage IBC. If there is IBC data in the R2R image then we can make that data available to the JIT. It may not be as relevant as in-process collected data, but it's quite likely better than synthetic data or no data.
2) Sampled instrumentation for R2R methods. Produce an instrumented version and run it every so often before the method gets promoted to Tier1. This may be costly, especially if we have to use unoptimized methods for instrumentation, as we'll do quite a bit of extra jitting.
3) Make R2R methods go through a separate instrumentation tier on their way to Tier1, see [DynamicPgo-InstrumentedTiers.md](DynamicPgo-InstrumentedTiers.md) prototype.

#### Dynamic PGO, QuickJitForLoops, OSR

6 changes: 6 additions & 0 deletions src/coreclr/debug/daccess/request.cpp
@@ -1183,6 +1183,12 @@ HRESULT ClrDataAccess::GetTieredVersions(
case NativeCodeVersion::OptimizationTierOptimized:
nativeCodeAddrs[count].OptimizationTier = DacpTieredVersionData::OptimizationTier_Optimized;
break;
case NativeCodeVersion::OptimizationTierInstrumented:
nativeCodeAddrs[count].OptimizationTier = DacpTieredVersionData::OptimizationTier_InstrumentedTier;
break;
case NativeCodeVersion::OptimizationTierInstrumentedOptimized:
nativeCodeAddrs[count].OptimizationTier = DacpTieredVersionData::OptimizationTier_InstrumentedTierOptimized;
break;
}
}
else if (pMD->IsJitOptimizationDisabled())
20 changes: 20 additions & 0 deletions src/coreclr/inc/clrconfigvalues.h
@@ -612,6 +612,26 @@ RETAIL_CONFIG_STRING_INFO(INTERNAL_PGODataPath, W("PGODataPath"), "Read/Write PG
RETAIL_CONFIG_DWORD_INFO(INTERNAL_ReadPGOData, W("ReadPGOData"), 0, "Read PGO data")
RETAIL_CONFIG_DWORD_INFO(INTERNAL_WritePGOData, W("WritePGOData"), 0, "Write PGO data")
RETAIL_CONFIG_DWORD_INFO(EXTERNAL_TieredPGO, W("TieredPGO"), 0, "Instrument Tier0 code and make counts available to Tier1")

// TieredPGO_Strategy values:
//
// 0) Instrument any non-prejitted code
// 1) Instrument any non-prejitted code and only hot R2R code
// 2) Instrument any non-prejitted code and only hot R2R code (use optimizations in the instrumented tier for hot R2R)
// 3) Instrument only hot non-prejitted code and only hot R2R code
// 4) Instrument only hot non-prejitted code and only hot R2R code (use optimizations in the instrumented tier for hot R2R)
//
//
// Pros & cons of using optimizations inside the instrumented tiers (mode '2' and '4')
// Pros:
// * Lower overhead from instrumentation (and thanks to optimizations we _can_ optimize probes and emit less of those)
// * Optimized code is able to inline methods so we won't be producing new Compilation units for even small methods
//
// Cons:
// * Currently, we won't instrument inlinees -> we'll probably miss a lot of opportunities and produce less accurate profile
// leading to a less optimized final tier
//
RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_TieredPGO_Strategy, W("TieredPGO_Strategy"), 0, "Strategy for TieredPGO, see comments in clrconfigvalues.h")
#endif

///
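
The five `TieredPGO_Strategy` values in the comment above combine three independent choices. A minimal sketch of that decoding follows; `PgoStrategy` and `DecodeStrategy` are hypothetical names, and treating out-of-range values like `0` is an assumption rather than something the diff specifies.

```cpp
// Hypothetical decoding of the TieredPGO_Strategy values described in the comment block.
struct PgoStrategy {
    bool instrumentOnlyHotTier0;   // modes 3 and 4: non-prejitted code is instrumented only once hot
    bool instrumentHotR2R;         // modes 1-4: hot R2R code gets an instrumented tier
    bool optimizeInstrumentedR2R;  // modes 2 and 4: that instrumented tier is optimized
};

inline PgoStrategy DecodeStrategy(unsigned value) {
    switch (value) {
        case 1: return {false, true, false};
        case 2: return {false, true, true};
        case 3: return {true, true, false};
        case 4: return {true, true, true};
        default:
            // 0: instrument any non-prejitted code, never R2R code.
            // Assumption: unknown values fall back to the same behavior.
            return {false, false, false};
    }
}
```

Seen this way, modes `2` and `4` differ from `1` and `3` only in whether the instrumented tier for hot R2R code is optimized, which is the trade-off the pros and cons list describes.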
2 changes: 2 additions & 0 deletions src/coreclr/inc/dacprivate.h
@@ -610,6 +610,8 @@ struct MSLAYOUT DacpTieredVersionData
OptimizationTier_OptimizedTier1,
OptimizationTier_ReadyToRun,
OptimizationTier_OptimizedTier1OSR,
OptimizationTier_InstrumentedTier,
OptimizationTier_InstrumentedTierOptimized,
};

CLRDATA_ADDRESS NativeCodeAddr;
10 changes: 10 additions & 0 deletions src/coreclr/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -9097,6 +9097,16 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
}
#endif

bool IsInstrumented() const
{
return jitFlags->IsSet(JitFlags::JIT_FLAG_BBINSTR);
}

bool IsInstrumentedOptimized() const
{
return IsInstrumented() && jitFlags->IsSet(JitFlags::JIT_FLAG_TIER1);
}

// true if we should use the PINVOKE_{BEGIN,END} helpers instead of generating
// PInvoke transitions inline. Normally used by R2R, but also used when generating a reverse pinvoke frame, as
// the current logic for frame setup initializes and pushes
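
The two new predicates classify a compilation purely from its JIT flags: instrumentation is requested via `JIT_FLAG_BBINSTR`, and "instrumented optimized" is that flag combined with `JIT_FLAG_TIER1`. The sketch below shows the same logic on a stand-in flag set; `MiniJitFlags` and the bit positions are assumptions made for illustration and do not match the real `JitFlags` layout.

```cpp
#include <cstdint>

// Hypothetical bit positions; the real values live in the JIT's flag definitions.
enum JitFlag : unsigned { FLAG_TIER0 = 0, FLAG_TIER1 = 1, FLAG_BBINSTR = 2 };

// Stand-in for JitFlags: a 64-bit mask with Set/IsSet/GetRawFlags.
class MiniJitFlags {
public:
    void Set(JitFlag f) { m_bits |= (std::uint64_t{1} << f); }
    bool IsSet(JitFlag f) const { return (m_bits & (std::uint64_t{1} << f)) != 0; }
    std::uint64_t GetRawFlags() const { return m_bits; }

private:
    std::uint64_t m_bits = 0;
};

// Same shape as the predicates added to compiler.h.
inline bool IsInstrumented(const MiniJitFlags& f) { return f.IsSet(FLAG_BBINSTR); }

inline bool IsInstrumentedOptimized(const MiniJitFlags& f) {
    return IsInstrumented(f) && f.IsSet(FLAG_TIER1);
}
```

In this model, a Tier0 compilation with instrumentation is instrumented but not instrumented-optimized, and a plain Tier1 compilation is neither.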
8 changes: 4 additions & 4 deletions src/coreclr/jit/fgprofile.cpp
@@ -383,7 +383,7 @@ void BlockCountInstrumentor::Prepare(bool preImport)
//
// If we see any, we need to adjust our instrumentation pattern.
//
if (m_comp->opts.IsOSR() && ((m_comp->optMethodFlags & OMF_HAS_TAILCALL_SUCCESSOR) != 0))
if (m_comp->opts.IsInstrumentedOptimized() && ((m_comp->optMethodFlags & OMF_HAS_TAILCALL_SUCCESSOR) != 0))
{
JITDUMP("OSR + PGO + potential tail call --- preparing to relocate block probes\n");

@@ -1887,8 +1887,8 @@ PhaseStatus Compiler::fgPrepareToInstrumentMethod()
(JitConfig.TC_PartialCompilation() > 0);
const bool prejit = opts.jitFlags->IsSet(JitFlags::JIT_FLAG_PREJIT);
const bool tier0WithPatchpoints = opts.jitFlags->IsSet(JitFlags::JIT_FLAG_TIER0) && mayHavePatchpoints;
const bool osrMethod = opts.IsOSR();
const bool useEdgeProfiles = (JitConfig.JitEdgeProfiling() > 0) && !prejit && !tier0WithPatchpoints && !osrMethod;
const bool instrOpt = opts.IsInstrumentedOptimized();
const bool useEdgeProfiles = (JitConfig.JitEdgeProfiling() > 0) && !prejit && !tier0WithPatchpoints && !instrOpt;
Member

Do we really need block profiles for full instrumented/optimized methods? Seems like edge profiles might work, unless perhaps the thing you need is the special handling for tail calls.

if so can you add a clarifying comment here?

Member Author

Unfortunately we still need it for now, I'll allocate some time to investigate what is needed to enable edge-profiling after this PR


if (useEdgeProfiles)
{
@@ -1899,7 +1899,7 @@ PhaseStatus Compiler::fgPrepareToInstrumentMethod()
JITDUMP("Using block profiling, because %s\n",
(JitConfig.JitEdgeProfiling() == 0)
? "edge profiles disabled"
: prejit ? "prejitting" : osrMethod ? "OSR" : "tier0 with patchpoints");
: prejit ? "prejitting" : instrOpt ? "optimized instr" : "tier0 with patchpoints");

fgCountInstrumentor = new (this, CMK_Pgo) BlockCountInstrumentor(this);
}
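
The hunks above boil down to one decision: edge profiles are used only when edge profiling is enabled and the method is not prejitted, not Tier0 with patchpoints, and not an optimized instrumented compilation. Otherwise block profiles are used and the `JITDUMP` reports the first matching reason. A minimal sketch of that decision, with `ProbeKind`, `ProbeChoice` and `ChooseProbes` as hypothetical names:

```cpp
#include <string>

enum class ProbeKind { Edge, Block };

struct ProbeChoice {
    ProbeKind   kind;
    std::string reason;  // empty when edge profiles are used
};

// Mirrors the condition and the reason order from fgPrepareToInstrumentMethod.
inline ProbeChoice ChooseProbes(bool edgeProfilingEnabled, bool prejit,
                                bool tier0WithPatchpoints, bool instrOpt) {
    if (edgeProfilingEnabled && !prejit && !tier0WithPatchpoints && !instrOpt) {
        return {ProbeKind::Edge, ""};
    }
    const char* reason = !edgeProfilingEnabled ? "edge profiles disabled"
                         : prejit              ? "prejitting"
                         : instrOpt            ? "optimized instr"
                                               : "tier0 with patchpoints";
    return {ProbeKind::Block, reason};
}
```

Because the reasons are checked in a fixed order, a method that is both prejitted and optimized-instrumented would report "prejitting" in this sketch.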
13 changes: 11 additions & 2 deletions src/coreclr/jit/importer.cpp
@@ -9602,7 +9602,16 @@ var_types Compiler::impImportCall(OPCODE opcode,
{
return impImportJitTestLabelMark(sig->numArgs);
}
#endif // DEBUG

// static ulong JitHelpers_JitFlags() => 0;
// can be defined anywhere and will be replaced by Debug-version of RyuJIT
if ((mflags & CORINFO_FLG_STATIC) && (sig->numArgs == 0) && (sig->retType == CorInfoType::CORINFO_TYPE_ULONG) &&
(strcmp("JitHelpers_JitFlags", eeGetMethodName(methHnd, nullptr)) == 0))
{
call = gtNewLconNode((__int64)opts.jitFlags->GetRawFlags());
goto DONE_CALL;
}
#endif
Member

Should we only recognize this "intrinsic" under some config setting? Or else make it into a real / official intrinsic?

Member Author

I'll prepare a public api proposal

Member

@jkotas jkotas Oct 18, 2022

Can we use tracing for this instead of intrinsics like this?

We need to be able to replay the sequence of tiered compilations, without modifying user code.

Member Author

@jkotas I needed this intrinsic to make sure the correct version of the code was applied and executed. It helped me catch a bug: everything looked fine at the trace level, and JitDisasm also displayed codegen for all code versions, but it turned out one of them was not applied because I forgot to update EntryPoint (this was in the initial implementation, where I was patching existing call counting stubs instead of creating new ones). I can delete this one now.

Member

You should be able to figure out using tracing what kind of code is executing.

Member Author

Makes sense, I don't really need this now so will remove once the outerloop tests finish


// <NICE> Factor this into getCallInfo </NICE>
bool isSpecialIntrinsic = false;
@@ -22224,7 +22233,7 @@ bool Compiler::impConsiderCallProbe(GenTreeCall* call, IL_OFFSET ilOffset)
return false;
}

assert(opts.OptimizationDisabled() || opts.IsOSR());
assert(opts.OptimizationDisabled() || opts.IsInstrumentedOptimized());
assert(!compIsForInlining());

// During importation, optionally flag this block as one that
5 changes: 5 additions & 0 deletions src/coreclr/jit/jitee.h
@@ -157,6 +157,11 @@ class JitFlags
return m_jitFlags == 0;
}

unsigned __int64 GetRawFlags() const
{
return m_jitFlags;
}

void SetFromFlags(CORJIT_FLAGS flags)
{
// We don't want to have to check every one, so we assume it is exactly the same values as the JitFlag
10 changes: 5 additions & 5 deletions src/coreclr/vm/callcounting.cpp
@@ -574,7 +574,7 @@ bool CallCountingManager::SetCodeEntryPoint(
// For a default code version that is not tier 0, call counting will have been disabled by this time (checked
// below). Avoid the redundant and not-insignificant expense of GetOptimizationTier() on a default code version.
!activeCodeVersion.IsDefaultVersion() &&
activeCodeVersion.GetOptimizationTier() != NativeCodeVersion::OptimizationTier0
activeCodeVersion.IsFinalTier()
) ||
!g_pConfig->TieredCompilation_CallCounting())
{
@@ -602,7 +602,7 @@ bool CallCountingManager::SetCodeEntryPoint(
return true;
}

_ASSERTE(activeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
_ASSERTE(!activeCodeVersion.IsFinalTier());

// If the tiering delay is active, postpone further work
if (GetAppDomain()
@@ -649,7 +649,7 @@ bool CallCountingManager::SetCodeEntryPoint(
}
else
{
_ASSERTE(activeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
_ASSERTE(!activeCodeVersion.IsFinalTier());

// If the tiering delay is active, postpone further work
if (GetAppDomain()
@@ -659,7 +659,7 @@ bool CallCountingManager::SetCodeEntryPoint(
return true;
}

CallCount callCountThreshold = (CallCount)g_pConfig->TieredCompilation_CallCountThreshold();
CallCount callCountThreshold = g_pConfig->TieredCompilation_CallCountThreshold();
_ASSERTE(callCountThreshold != 0);

NewHolder<CallCountingInfo> callCountingInfoHolder = new CallCountingInfo(activeCodeVersion, callCountThreshold);
@@ -780,7 +780,7 @@ PCODE CallCountingManager::OnCallCountThresholdReached(TransitionBlock *transiti
// used going forward under appropriate locking to synchronize further with deletion.
GCX_PREEMP_THREAD_EXISTS(CURRENT_THREAD);

_ASSERTE(codeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
_ASSERTE(!codeVersion.IsFinalTier());

codeEntryPoint = codeVersion.GetNativeCode();
do
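
With these changes, call counting applies to any code version that is not a final tier, so an instrumented version is counted against the threshold just like Tier0 code. The sketch below shows only the counting idea, a countdown from a threshold that fires once; `CallCounter` is a hypothetical class and does not reflect how the runtime's call counting stubs are implemented.

```cpp
#include <cstdint>

// Hypothetical per-code-version counter: counts down from the threshold.
class CallCounter {
public:
    explicit CallCounter(std::uint16_t threshold) : m_remaining(threshold) {}

    // Returns true exactly once, on the call that exhausts the threshold.
    // Later calls return false because the counter is already at zero.
    bool OnCall() { return m_remaining != 0 && --m_remaining == 0; }

    std::uint16_t Remaining() const { return m_remaining; }

private:
    std::uint16_t m_remaining;
};
```

In this sketch, a counter built with a threshold of 30 (the value used in the design doc's diagram) would signal promotion on the 30th call, and a fresh counter would be created for the next non-final code version.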
19 changes: 14 additions & 5 deletions src/coreclr/vm/codeversion.cpp
@@ -151,7 +151,11 @@ NativeCodeVersion::OptimizationTier NativeCodeVersionNode::GetOptimizationTier()
void NativeCodeVersionNode::SetOptimizationTier(NativeCodeVersion::OptimizationTier tier)
{
LIMITED_METHOD_CONTRACT;
_ASSERTE(tier >= m_optTier);

_ASSERTE(
tier == m_optTier ||
(m_optTier != NativeCodeVersion::OptimizationTier::OptimizationTier1 &&
m_optTier != NativeCodeVersion::OptimizationTier::OptimizationTierOptimized));

m_optTier = tier;
}
@@ -333,6 +337,13 @@ NativeCodeVersion::OptimizationTier NativeCodeVersion::GetOptimizationTier() con
}
}

bool NativeCodeVersion::IsFinalTier() const
{
LIMITED_METHOD_DAC_CONTRACT;
OptimizationTier tier = GetOptimizationTier();
return tier == OptimizationTier1 || tier == OptimizationTierOptimized;
}

#ifndef DACCESS_COMPILE
void NativeCodeVersion::SetOptimizationTier(OptimizationTier tier)
{
@@ -808,7 +819,7 @@ bool ILCodeVersion::HasAnyOptimizedNativeCodeVersion(NativeCodeVersion tier0Nati
_ASSERTE(!tier0NativeCodeVersion.IsNull());
_ASSERTE(tier0NativeCodeVersion.GetILCodeVersion() == *this);
_ASSERTE(tier0NativeCodeVersion.GetMethodDesc()->IsEligibleForTieredCompilation());
_ASSERTE(tier0NativeCodeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
_ASSERTE(!tier0NativeCodeVersion.IsFinalTier());

NativeCodeVersionCollection nativeCodeVersions = GetNativeCodeVersions(tier0NativeCodeVersion.GetMethodDesc());
for (auto itEnd = nativeCodeVersions.End(), it = nativeCodeVersions.Begin(); it != itEnd; ++it)
@@ -1708,9 +1719,7 @@ PCODE CodeVersionManager::PublishVersionableCodeIfNecessary(
{
#ifdef FEATURE_TIERED_COMPILATION
_ASSERTE(!config->ShouldCountCalls() || pMethodDesc->IsEligibleForTieredCompilation());
_ASSERTE(
!config->ShouldCountCalls() ||
activeVersion.GetOptimizationTier() == NativeCodeVersion::OptimizationTier0);
_ASSERTE(!config->ShouldCountCalls() || !activeVersion.IsFinalTier());
if (config->ShouldCountCalls()) // the generated code was at a tier that is call-counted
{
// This is the first call to a call-counted code version of the method
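
The old `tier >= m_optTier` assertion in `SetOptimizationTier` relied on tiers only ever increasing numerically. Because the new instrumented tiers are appended after `OptimizationTierOptimized`, a promotion from an instrumented tier to Tier1 is a numeric decrease, which is presumably why the assertion was rewritten in terms of final tiers. A minimal sketch of the new rule, using a stand-in `Tier` enum in the same order as `NativeCodeVersion::OptimizationTier` and a hypothetical helper `IsAllowedTierChange`:

```cpp
// Stand-in enum, same order as NativeCodeVersion::OptimizationTier.
enum class Tier { Tier0, Tier1, Tier1OSR, Optimized, Instrumented, InstrumentedOptimized };

// Same definition as NativeCodeVersion::IsFinalTier in the diff.
inline bool IsFinalTier(Tier t) {
    return t == Tier::Tier1 || t == Tier::Optimized;
}

// Mirrors the relaxed assertion: either nothing changes,
// or the current tier is not yet a final one.
inline bool IsAllowedTierChange(Tier current, Tier next) {
    return next == current || !IsFinalTier(current);
}
```

For example, `Instrumented` to `Tier1` is allowed here even though `Tier1` has the smaller numeric value, while `Tier1` to `Tier0` is rejected.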
4 changes: 4 additions & 0 deletions src/coreclr/vm/codeversion.h
@@ -71,15 +71,19 @@ class NativeCodeVersion
BOOL SetNativeCodeInterlocked(PCODE pCode, PCODE pExpected = NULL);
#endif

// NOTE: Don't change existing values to avoid breaking changes in event tracing
enum OptimizationTier
{
OptimizationTier0,
OptimizationTier1,
OptimizationTier1OSR,
OptimizationTierOptimized, // may do less optimizations than tier 1
OptimizationTierInstrumented,
OptimizationTierInstrumentedOptimized,
};
#ifdef FEATURE_TIERED_COMPILATION
OptimizationTier GetOptimizationTier() const;
bool IsFinalTier() const;
#ifndef DACCESS_COMPILE
void SetOptimizationTier(OptimizationTier tier);
#endif
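
The new NOTE asks that existing enumerator values stay fixed because event tracing depends on them, which is why the two instrumented tiers are appended at the end. One way to make such a constraint checkable at compile time is a set of `static_assert`s. This is only a sketch of that idea on a copy of the enum, not something the PR adds:

```cpp
// Copy of the enum from the diff, for illustration only.
enum OptimizationTier {
    OptimizationTier0,
    OptimizationTier1,
    OptimizationTier1OSR,
    OptimizationTierOptimized,
    OptimizationTierInstrumented,
    OptimizationTierInstrumentedOptimized,
};

// Pin the numeric values so that inserting a new tier in the middle fails to compile.
static_assert(OptimizationTier0 == 0, "value is visible to event tracing");
static_assert(OptimizationTier1 == 1, "value is visible to event tracing");
static_assert(OptimizationTier1OSR == 2, "value is visible to event tracing");
static_assert(OptimizationTierOptimized == 3, "value is visible to event tracing");
static_assert(OptimizationTierInstrumented == 4, "new tiers are appended at the end");
static_assert(OptimizationTierInstrumentedOptimized == 5, "new tiers are appended at the end");
```

The numeric values follow directly from the declaration order shown in the diff.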