Merged
Better strategy for ReserveInitialMemory on arm64
EgorBo committed Jun 14, 2022
commit 704b74985d5936604fa50fe0ecfd6321cf842086
4 changes: 4 additions & 0 deletions src/coreclr/inc/clrconfigvalues.h
@@ -243,6 +243,10 @@ CONFIG_DWORD_INFO(INTERNAL_ContinueOnAssert, W("ContinueOnAssert"), 0, "If set,
CONFIG_DWORD_INFO(INTERNAL_InjectFatalError, W("InjectFatalError"), 0, "")
CONFIG_DWORD_INFO(INTERNAL_InjectFault, W("InjectFault"), 0, "")
CONFIG_DWORD_INFO(INTERNAL_SuppressChecks, W("SuppressChecks"),0, "")

// If we manage to reserve the initial memory close to coreclr we might get better performance,
// but it's better to turn this off when running benchmarks for more stable results (always reserve far from coreclr)
RETAIL_CONFIG_DWORD_INFO(UNSUPPORTED_ReserveInitialMemoryNearClr, W("UNSUPPORTED_DontReserveInitialMemoryNearClr"), 0, "Don't try to reserve the initial memory close to coreclr")
#ifdef FEATURE_EH_FUNCLETS
CONFIG_DWORD_INFO(INTERNAL_SuppressLockViolationsOnReentryFromOS, W("SuppressLockViolationsOnReentryFromOS"), 0, "64 bit OOM tests re-enter the CLR via RtlVirtualUnwind. This indicates whether to suppress resulting locking violations.")
#endif // FEATURE_EH_FUNCLETS
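
Later in this change the new knob is read with `CLRConfigNoCache::Get(..., &getenv)` followed by `IsSet()`. As a rough illustration of that pattern (presence of an environment variable acts as the switch), here is a minimal sketch. `IsEnvFlagSet` and `EnvLookup` are hypothetical names, the `DOTNET_` prefix is an assumption of the sketch, and none of this is the actual `CLRConfigNoCache` implementation.

```cpp
#include <cstdlib>
#include <string>

// Signature of a getenv-like lookup function. Taking the lookup as a
// parameter mirrors how the diff passes &getenv, and makes the helper testable.
using EnvLookup = char* (*)(const char*);

// Hypothetical helper (not the CLRConfigNoCache API): returns true when the
// variable <prefix><name> exists, whatever its value is.
// The "DOTNET_" default prefix is an assumption made for this sketch.
bool IsEnvFlagSet(const std::string& name,
                  EnvLookup lookup = &std::getenv,
                  const std::string& prefix = "DOTNET_")
{
    const std::string fullName = prefix + name;
    return lookup(fullName.c_str()) != nullptr;
}
```

In this sketch any value, including `0`, counts as "set", because only the presence of the variable is checked.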
6 changes: 6 additions & 0 deletions src/coreclr/pal/src/include/pal/virtual.h
@@ -180,6 +180,8 @@ class ExecutableMemoryAllocator
int32_t GenerateRandomStartOffset();

private:

#ifdef TARGET_XARCH
// There does not seem to be an easy way find the size of a library on Unix.
// So this constant represents an approximation of the libcoreclr size (on debug build)
// that can be used to calculate an approximate location of the memory that
@@ -191,6 +193,10 @@ class ExecutableMemoryAllocator
// will try to reserve during initialization. We want all JIT-ed code and the
// entire libcoreclr to be located in a 2GB range.
static const int32_t MaxExecutableMemorySize = 0x7FFF0000;
#else
static const int32_t CoreClrLibrarySize = 16 * 1024 * 1024;
static const int32_t MaxExecutableMemorySize = 768 * 1024 * 1024;
#endif
static const int32_t MaxExecutableMemorySizeNearCoreClr = MaxExecutableMemorySize - CoreClrLibrarySize;

// Start address of the reserved virtual address space
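
To make the non-XARCH numbers concrete, the arithmetic behind the new constants can be checked at compile time. This is only a sketch: the constant values are copied from the `#else` branch above, `MiB` and `SmallestAllocSize` are names introduced here for illustration, and the two-thirds floor is taken from the probing loop in `virtual.cpp`.

```cpp
#include <cstdint>

constexpr int32_t MiB = 1024 * 1024; // helper name introduced for this sketch

// Values copied from the non-XARCH branch of the diff.
constexpr int32_t CoreClrLibrarySize = 16 * MiB;
constexpr int32_t MaxExecutableMemorySize = 768 * MiB;
constexpr int32_t MaxExecutableMemorySizeNearCoreClr =
    MaxExecutableMemorySize - CoreClrLibrarySize;

// Floor used by the arm64 probing loop: two thirds of the planned size
// (integer division first, then multiplication, as in the diff).
constexpr int32_t SmallestAllocSize = MaxExecutableMemorySizeNearCoreClr / 3 * 2;

static_assert(MaxExecutableMemorySizeNearCoreClr == 752 * MiB,
              "768 MiB minus the 16 MiB library estimate");
static_assert(SmallestAllocSize > 501 * MiB && SmallestAllocSize < 502 * MiB,
              "the floor is a little over 501 MiB");
```

So on non-XARCH targets the allocator would aim for 752 MiB next to libcoreclr and, per the loop in `virtual.cpp`, never ask for less than roughly 501 MiB there.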
44 changes: 43 additions & 1 deletion src/coreclr/pal/src/map/virtual.cpp
@@ -38,6 +38,7 @@ SET_DEFAULT_DEBUG_CHANNEL(VIRTUAL); // some headers have code with asserts, so d
#include <string.h>
#include <unistd.h>
#include <limits.h>
#include <clrconfignocache.h>

#if HAVE_VM_ALLOCATE
#include <mach/vm_map.h>
@@ -2140,7 +2141,15 @@ void ExecutableMemoryAllocator::TryReserveInitialMemory()
int32_t preferredStartAddressIncrement;
UINT_PTR preferredStartAddress;
UINT_PTR coreclrLoadAddress;
const int32_t MemoryProbingIncrement = 128 * 1024 * 1024;

// If we manage to reserve the initial memory close to coreclr we might get better performance,
// but it's better to turn this off when running benchmarks for more stable results (always reserve far from coreclr)
Member

It does not sound right for the benchmarks to measure something other than what customers see.

Member Author

@EgorBo EgorBo Jun 14, 2022

The problem is that currently on ARM we have quite shaky results, and it's difficult to measure improvements from various changes; e.g. note the "ampere" line compared to x64 Windows and Linux:
[image]

I'll experiment on crank to see what my flags do to a series of measurements, how big the RPS is when we're lucky enough to reserve a piece next to coreclr, and how often that happens. According to my previous measurements, RPS is 5% higher in that case.

Member Author

In fact, in roughly 50% of cases apps won't be able to reserve such a big chunk next to coreclr, so that flag probably simulates the most common case.

CLRConfigNoCache dontReserve = CLRConfigNoCache::Get("DontReserveInitialMemoryNearClr", /*noprefix*/ false, &getenv);
if (dontReserve.IsSet())
{
m_startAddress = nullptr;
goto FALLBACK;
}

// Try to find and reserve an available region of virtual memory that is located
// within 2GB range (defined by the MaxExecutableMemorySizeNearCoreClr constant) from the
@@ -2157,6 +2166,9 @@ void ExecutableMemoryAllocator::TryReserveInitialMemory()
// (thus avoiding reserving memory below 4GB; besides some operating systems do not allow that).
// If libcoreclr is loaded at high addresses then try to reserve memory below its location.
coreclrLoadAddress = (UINT_PTR)PAL_GetSymbolModuleBase((void*)VirtualAlloc);

#ifdef TARGET_XARCH
const int32_t MemoryProbingIncrement = 128 * 1024 * 1024;
if ((coreclrLoadAddress < 0xFFFFFFFF) || ((coreclrLoadAddress - MaxExecutableMemorySizeNearCoreClr) < 0xFFFFFFFF))
{
// Try to allocate above the location of libcoreclr
@@ -2184,9 +2196,39 @@ void ExecutableMemoryAllocator::TryReserveInitialMemory()
preferredStartAddress += preferredStartAddressIncrement;

} while (sizeOfAllocation >= MemoryProbingIncrement);
#else

// Always try to allocate above the location of libcoreclr on arm - we want to start allocating near it (within 128 MB distance)
// however, we want to reserve as much as possible (ideally, 1 GB)
preferredStartAddress = coreclrLoadAddress + CoreClrLibrarySize;

do
{
m_startAddress = ReserveVirtualMemory(pthrCurrent, (void*)preferredStartAddress, sizeOfAllocation, MEM_RESERVE_EXECUTABLE);
if (m_startAddress != nullptr)
{
break;
}

// Try to allocate a smaller region...
sizeOfAllocation -= 64 * 1024 * 1024;
const int32_t smallestAllocSize = MaxExecutableMemorySizeNearCoreClr / 3 * 2;
if (sizeOfAllocation < smallestAllocSize)
{
// ...but not less than 2/3rd of what we initially planned
sizeOfAllocation = smallestAllocSize;
}

// Probe every 8 MB
preferredStartAddress += 8 * 1024 * 1024;

// bail out if preferredStartAddress is already too far from coreclr and we won't be able to use relocs
} while ((preferredStartAddress - coreclrLoadAddress) < (100 * 1024 * 1024));
#endif

if (m_startAddress == nullptr)
{
FALLBACK:
// We were not able to reserve any memory near libcoreclr. Try to reserve approximately 2 GB of address space somewhere
// anyway:
// - This sets aside address space that can be used for executable code, such that jumps/calls between such code may
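
The non-XARCH probing loop in this hunk can be exercised in isolation by swapping the real reservation call for a callback. The following is a sketch under stated assumptions, not the PAL code: `ProbeAboveLibrary`, `Reservation`, and `TryReserve` are hypothetical names, the callback stands in for `ReserveVirtualMemory`, and the initial request is assumed to equal `MaxExecutableMemorySizeNearCoreClr` (the initialization of `sizeOfAllocation` is not visible in the diff).

```cpp
#include <cstdint>
#include <functional>

struct Reservation
{
    uintptr_t start; // 0 means nothing could be reserved near the library
    int32_t size;
};

// Hypothetical stand-in for ReserveVirtualMemory: true if the range is free.
using TryReserve = std::function<bool(uintptr_t start, int32_t size)>;

Reservation ProbeAboveLibrary(uintptr_t libraryBase, const TryReserve& tryReserve)
{
    constexpr int32_t MiB = 1024 * 1024;
    constexpr int32_t LibrarySize = 16 * MiB;                // CoreClrLibrarySize
    constexpr int32_t InitialSize = 768 * MiB - LibrarySize; // assumed starting request
    constexpr int32_t SmallestSize = InitialSize / 3 * 2;    // 2/3 floor from the diff

    uintptr_t candidate = libraryBase + LibrarySize;
    int32_t size = InitialSize;

    do
    {
        if (tryReserve(candidate, size))
        {
            return { candidate, size };
        }

        // Ask for 64 MiB less next time, but never go below the floor.
        size -= 64 * MiB;
        if (size < SmallestSize)
        {
            size = SmallestSize;
        }

        // Move the candidate address up by 8 MiB.
        candidate += 8 * MiB;

        // Give up once the candidate is 100 MiB or more above the library base.
    } while ((candidate - libraryBase) < static_cast<uintptr_t>(100) * MiB);

    return { 0, 0 };
}
```

With a callback that always fails, this sketch makes 11 attempts (at 16 MiB through 96 MiB above the base, in 8 MiB steps) before giving up, which is the point where the real code falls through to the `FALLBACK` path. If only the third candidate is free, the reservation starts 32 MiB above the base with a 624 MiB request.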