Skip to content

Conversation

@janvorli
Copy link
Member

@janvorli janvorli commented Jun 1, 2021

While stress testing runtime on Windows ARM64 to repro an issue, I have
hit an ordering issue in access to m_maxDynamicEntries and
m_pDynamicStaticsInfo in the Module::AllocateDynamicEntry.

In one of a few thousands of runs, I got a runtime crash when the
m_pDynamicStaticsInfo was NULL. I was able to repro this issue reliably
in the above mentioned number of runs. After the fix, it doesn't repro
anymore.

The newId was 1 at the time of the crash, the m_maxDynamicEntries was
around 128. So the crashing thread has got the id 1, then another thread
updated the m_maxDynamicEntries before the crashing thread executed the
if (newId >= m_maxDynamicEntries), so the thread went right to the
m_pDynamicStaticsInfo[newId].pEnclosingMT = pMT;. The
m_pDynamicStaticInfo is written before the m_maxDynamicEntries, so it
was expected that it cannot be NULL, but the crashing thread has seen
the operations in an opposite order due to the CPU access reordering.

The fix is to add VolatileLoad for accessing the m_maxDynamicEntries and
VolatileStore to set it. That ensures that the writes will be visible in
the intended order.

While stress testing runtime on Windows ARM64 to repro an issue, I have
hit an ordering issue in access to m_maxDynamicEntries and
m_pDynamicStaticsInfo in the Module::AllocateDynamicEntry.

In one of a few thousands of runs, I got a runtime crash when the
m_pDynamicStaticsInfo was NULL. I was able to repro this issue reliably
in the above mentioned number of runs. After the fix, it doesn't repro
anymore.

The newId was 1 at the time of the crash, the m_maxDynamicEntries was
around 128. So the crashing thread has got the id 1, then another thread
updated the m_maxDynamicEntries before the crashing thread executed the
`if (newId >= m_maxDynamicEntries)`, so the thread went right to the
`m_pDynamicStaticsInfo[newId].pEnclosingMT = pMT;`. The
m_pDynamicStaticInfo is written before the m_maxDynamicEntries, so it
was expected that it cannot be NULL, but the crashing thread has seen
the operations in an opposite order due to the CPU access reordering.

The fix is to add VolatileLoad for accessing the m_maxDynamicEntries and
VolatileStore to set it. That ensures that the writes will be visible in
the intended order.
@janvorli janvorli added this to the 6.0.0 milestone Jun 1, 2021
@janvorli janvorli requested a review from jkotas June 1, 2021 22:26
@janvorli janvorli self-assigned this Jun 1, 2021
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@janvorli janvorli merged commit 190c69d into dotnet:main Jun 2, 2021
@janvorli janvorli deleted the fix-arm64-access-ordering-issue branch June 2, 2021 00:31
@ghost ghost locked as resolved and limited conversation to collaborators Jul 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants