Conversation

@thomhurst
Owner

Summary

  • Pre-compute and cache typed event receiver arrays at first access instead of filtering on every invocation
  • Single-pass categorization: iterate eligible objects once, classify by interface type
  • Track class instance to invalidate caches on retry (handles new instance creation)

Before

Each event invocation:

  1. Get eligible objects (cached)
  2. Iterate through ALL objects, check is ITestStartEventReceiver
  3. Build list, sort by Order, apply scoped-attribute filtering
  4. Invoke receivers

After

First access:

  1. Get eligible objects
  2. Single pass: categorize each object by all interface types it implements
  3. Sort and filter each category once
  4. Cache all typed arrays

Subsequent accesses:

  1. Return cached array directly (just a property access)
  2. Invoke receivers
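The first-access flow can be sketched roughly as follows. This is a minimal illustration only; the interface shapes and the Order member mirror the TUnit receiver pattern but are simplified stand-ins, not the actual types:

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the receiver interfaces (illustrative only).
public interface IEventReceiver { int Order { get; } }
public interface ITestStartEventReceiver : IEventReceiver { }
public interface ITestEndEventReceiver : IEventReceiver { }

public static class ReceiverCategorizer
{
    // One pass over the eligible objects populates every typed list;
    // each category is then sorted once and cached as an array.
    public static (ITestStartEventReceiver[] Start, ITestEndEventReceiver[] End)
        Categorize(IEnumerable<object> eligibleObjects)
    {
        var start = new List<ITestStartEventReceiver>();
        var end = new List<ITestEndEventReceiver>();

        foreach (var obj in eligibleObjects)
        {
            // An object may implement several receiver interfaces,
            // so these checks are independent, not else-if.
            if (obj is ITestStartEventReceiver s) { start.Add(s); }
            if (obj is ITestEndEventReceiver e) { end.Add(e); }
        }

        return (
            start.OrderBy(r => r.Order).ToArray(),
            end.OrderBy(r => r.Order).ToArray());
    }
}
```

Subsequent event invocations then read the cached arrays directly, with no type checks or sorting.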

Class instance invalidation

The test class instance can change on retries. We track which instance was used when building caches and invalidate all caches if it changes.

Test plan

  • Build succeeds
  • Performance benchmark tests pass (60/60)

🤖 Generated with Claude Code

@thomhurst
Owner Author

Summary

Pre-computes and caches typed event receiver arrays to eliminate repeated filtering and sorting on every event invocation.

Critical Issues

None found ✅

Suggestions

Performance Enhancement Opportunity (Optional)

The current implementation still applies stage filtering after caching (lines EventReceiverOrchestrator.cs:106-108, 155-159). While this is correct and simpler, consider whether pre-computing stage-specific caches would provide additional benefit:

Current approach:

  • Cache: One array per receiver type
  • Runtime: Still filters by stage on each invocation (.NET 8.0+)

Alternative:

  • Cache: Separate arrays per receiver type AND stage
  • Runtime: Direct cache lookup with zero filtering

Given that stage filtering is only used in a few places and the overhead is minimal (simple enum comparison), the current approach is reasonable. Only consider the alternative if profiling shows stage filtering is a measurable cost.

Minor: Cache Invalidation Clarity

The InvalidateCachesIfClassInstanceChanged method in TestContextExtensions.cs:34-47 is called from multiple places. Consider adding a comment explaining when class instances change (retries) to help future maintainers understand this logic.

TUnit Rule Compliance

✅ Performance First - Minimizes allocations via caching and single-pass categorization
✅ No blocking on async - All operations properly async
✅ Modern C# - Uses collection expressions []
✅ Hot path optimization - Caching eliminates repeated work in test execution

Verdict

APPROVE - Solid performance optimization with proper cache invalidation handling.

@thomhurst
Owner Author

Summary

Pre-computes and caches typed event receiver arrays to eliminate repeated filtering and sorting on every event invocation.

Critical Issues

1. Race Condition in Cache Initialization ⚠️

The EnsureEventReceiversCached method has a classic check-then-act race condition:

// TUnit.Engine/Extensions/TestContextExtensions.cs:64-67
if (testContext.CachedTestStartReceivers != null)
{
    return;
}
// ... compute and set all caches

Problem: If two threads call this simultaneously, both may pass the null check and both will compute the receivers. The last one to write wins, wasting the other thread's work.

Impact:

  • Wasted CPU cycles (minor - only happens on first access)
  • Potential for seeing partially-initialized state if one thread reads while another is setting multiple cache fields

Fix: Use double-checked locking with a lock object, or use the LazyInitializer.EnsureInitialized pattern (similar to TrackedObjects in TestContext.cs:196-197).
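A minimal double-checked-locking sketch of that fix (the type, field, and method names here are illustrative, not the actual TUnit members):

```csharp
public sealed class ReceiverCacheHolder
{
    private readonly object _cacheLock = new();
    private volatile object[]? _cachedReceivers;

    public object[] GetReceivers()
    {
        var cached = _cachedReceivers;
        if (cached != null)
        {
            return cached; // fast path: no lock once initialized
        }

        lock (_cacheLock)
        {
            // Re-check inside the lock: another thread may have
            // initialized the cache while we waited.
            _cachedReceivers ??= BuildReceivers();
            return _cachedReceivers;
        }
    }

    // Stand-in for the real cache-building logic.
    private static object[] BuildReceivers() => ["receiver-a", "receiver-b"];
}
```

Building every typed array inside the same lock before the volatile publish means readers can never observe a partially-initialized set of caches.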

2. Race Condition in Instance Change Detection ⚠️

The InvalidateCachesIfClassInstanceChanged method is not atomic:

// TUnit.Engine/Extensions/TestContextExtensions.cs:46-52
var currentClassInstance = testContext.Metadata.TestDetails.ClassInstance;
if (testContext.CachedClassInstance != null &&
    !ReferenceEquals(testContext.CachedClassInstance, currentClassInstance))
{
    testContext.InvalidateEventReceiverCaches();
}
testContext.CachedClassInstance = currentClassInstance;

Problem: Thread A and B can both detect an instance change, both invalidate, then thread A builds caches, thread B invalidates them again unnecessarily.

Fix: Store the instance reference atomically when building caches, not in a separate step.

3. Existing Race Condition in GetEligibleEventObjects ⚠️

The existing code at TestContextExtensions.cs:138-145 (now lines 140-152 after your changes) has the same race condition pattern. While not introduced by this PR, your PR adds more cached state with the same issue.

Recommendation: Address all cache initialization race conditions together using a consistent pattern.

Suggestions

1. Consider Lazy<T> for Thread Safety

Instead of manual null checks, consider using Lazy<T> or LazyInitializer:

// Example pattern from TestContext.cs
internal ConcurrentDictionary<int, HashSet<object>> TrackedObjects =>
    LazyInitializer.EnsureInitialized(ref _trackedObjects)!;

This would eliminate race conditions and make the code clearer.

2. Defer Exception List Allocation

Good optimization in EventReceiverOrchestrator.cs:136 - you already do this! This is the right approach.

Verdict

⚠️ REQUEST CHANGES - Race conditions in cache initialization could cause wasted work or partial state visibility in concurrent scenarios. The performance optimization is good, but needs thread-safety fixes.


Note: TUnit tests can run in parallel, so these race conditions are real concerns, not theoretical. The fix should be straightforward using existing .NET patterns like LazyInitializer or double-checked locking with proper memory barriers.

Instead of filtering eligible objects by interface type on every
event invocation, pre-compute and cache typed receiver arrays.

Changes:
- Add cached receiver arrays to TestContext for each receiver type
- Single-pass categorization: iterate eligible objects once, classify
  by interface type, sort by Order, apply scoped-attribute filtering
- Track class instance to invalidate caches on retry (new instance)
- Update EventReceiverOrchestrator to use cached receivers directly
- Update TestBuilder and TestFilterService to use cached receivers

Benefits:
- Eliminates repeated interface type checks (is ITestStartEventReceiver)
- Eliminates repeated list building and sorting per invocation
- Eliminates repeated scoped-attribute filtering
- Single iteration through eligible objects populates all caches

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@thomhurst
Owner Author

Summary

Pre-computes and caches typed event receiver arrays to eliminate repeated filtering and sorting on every event invocation.

Critical Issues

Race Condition in Cache Initialization ⚠️

The EnsureEventReceiversCached method has a classic check-then-act race condition (TestContextExtensions.cs:40-43 in the diff):

// If any cache is already populated, all are populated (we fill them together)
if (testContext.CachedTestStartReceivers != null)
{
    return;
}

Problem: If two threads invoke different receiver getters simultaneously (e.g., GetTestStartReceivers and GetTestEndReceivers), both may pass the null check and both will compute all receivers. This wastes CPU and could expose partially-initialized state.

Impact:

  • Wasted CPU cycles during concurrent first access
  • Potential visibility of partially-set cache fields across threads (no memory barriers)

Solution: Use the existing pattern from TestContext.cs:196-197:

private ITestStartEventReceiver[]? _cachedTestStartReceivers;
internal ITestStartEventReceiver[] CachedTestStartReceivers =>
    LazyInitializer.EnsureInitialized(ref _cachedTestStartReceivers, 
        () => SortAndFilter(/* ... */))!;

Since all caches are built together, you could use a single Lazy<T> wrapping a struct/tuple containing all five arrays, or use LazyInitializer.EnsureInitialized with a factory function that builds everything and stores it atomically.
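As a sketch of the single-Lazy<T> variant (the record shape and property names are hypothetical, and the five arrays are reduced to object[] for brevity):

```csharp
using System;
using System.Threading;

// All five typed arrays travel together, so they publish atomically.
public sealed record ReceiverCaches(
    object[] TestStart,
    object[] TestEnd,
    object[] TestSkipped,
    object[] TestRegistered,
    object[] LastTestInClass);

public sealed class CachedContext
{
    private readonly Lazy<ReceiverCaches> _caches = new(
        BuildAllCaches,
        // The factory runs exactly once, even under concurrent first access.
        LazyThreadSafetyMode.ExecutionAndPublication);

    public object[] TestStartReceivers => _caches.Value.TestStart;

    // Stand-in for the real single-pass categorization.
    private static ReceiverCaches BuildAllCaches() => new([], [], [], [], []);
}
```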

Instance Change Detection Race Condition ⚠️

The InvalidateCachesIfClassInstanceChanged method (TestContextExtensions.cs:24-30 in diff) has a read-check-invalidate-write pattern that's not atomic:

var currentClassInstance = testContext.Metadata.TestDetails.ClassInstance;
if (testContext.CachedClassInstance != null &&
    !ReferenceEquals(testContext.CachedClassInstance, currentClassInstance))
{
    testContext.InvalidateEventReceiverCaches();
}
testContext.CachedClassInstance = currentClassInstance;

Problem: Thread A and B can both detect a change, both invalidate, then thread A rebuilds caches, thread B invalidates again unnecessarily. Worse, thread A could be mid-rebuild when thread B invalidates.

Solution: Use Interlocked.CompareExchange to atomically check and update the cached instance reference, or integrate this check into the lazy initialization factory function.
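One way to make the check-and-update atomic. This sketch uses Interlocked.Exchange (a single atomic swap) rather than CompareExchange, and the names are illustrative:

```csharp
using System.Threading;

public sealed class InstanceTracker
{
    private object? _cachedClassInstance;

    public bool UpdateAndCheckChanged(object currentInstance)
    {
        // Atomically install the current instance and receive the previous
        // value back in one step, so no two threads can act on the same
        // stale reference.
        var previous = Interlocked.Exchange(ref _cachedClassInstance, currentInstance);

        // A change is only a change if there was a previous instance
        // and it is not the same reference.
        return previous != null && !ReferenceEquals(previous, currentInstance);
    }
}
```

The caller would invalidate the receiver caches only when UpdateAndCheckChanged returns true.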

Suggestions

Consider Consolidating Thread-Safety Pattern

The existing GetEligibleEventObjects method (line 138 in current main, line 10 in diff old code) also has the same check-then-act race. While not introduced by this PR, your changes add more cache fields with the same issue. Consider addressing all caching together using a consistent thread-safe pattern.

Stage Filtering Still Done at Runtime

Minor note: The code still filters by Stage after caching (EventReceiverOrchestrator.cs lines shown in the diff). While this is correct and simple, if profiling shows it to be a bottleneck, consider pre-computing stage-specific caches. Given that stage filtering is just an enum comparison, the current approach is likely fine.

TUnit Rule Compliance

✅ Performance First - Single-pass categorization, eliminated repeated LINQ operations
✅ Modern C# - Uses collection expressions []
✅ Hot Path Optimization - Targets test execution path with caching
⚠️ Thread Safety - Needs proper synchronization for concurrent test execution

Previous Review Status

The author's second comment (2026-01-11T23:06:04Z) correctly identified these race conditions; the issues remain unaddressed in the current diff.

Verdict

⚠️ REQUEST CHANGES - Race conditions in cache initialization must be fixed before merge. TUnit supports parallel test execution, making these real concerns. Use LazyInitializer or equivalent to ensure thread-safe initialization with proper memory barriers.

The engine doesn't call these methods from multiple threads, so
locking overhead isn't worth it. Simplified to:
- Single cache initialization path through EnsureEventReceiversCached
- Cache validation checks class instance hasn't changed
- GetEligibleEventObjects delegates to EnsureEventReceiversCached

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@thomhurst
Owner Author

Summary

Pre-computes and caches typed event receiver arrays to eliminate repetitive filtering, sorting, and scoped-attribute processing on each event invocation, with cache invalidation handling for test retry scenarios.

Critical Issues

1. Stage filtering breaks the caching optimization

Location: TUnit.Engine/Extensions/TestContextExtensions.cs:84-86 and TUnit.Engine/Services/EventReceiverOrchestrator.cs (new lines 104-109, 156-161)

Problem: The caching strategy doesn't account for stage filtering, which happens AFTER retrieving cached receivers. Each test invokes event receivers twice (Early + Late stages), but the cache stores ALL receivers together regardless of stage.

Current flow:

  • Build cache: All ITestStartEventReceiver instances → sorted → scoped-filtered → cached
  • On invocation with stage=Early: Retrieve cached array → iterate and filter by stage → invoke
  • On invocation with stage=Late: Retrieve cached array → iterate and filter by stage → invoke

Impact: The primary benefit of caching (avoiding iteration) is lost since you still iterate through the cached array to filter by stage on every invocation. Looking at TestExecutor.cs:122,129,154,159, each test calls event receivers 4 times (2x start, 2x end) with different stages.

Suggested fix: Either:

  1. Cache receivers separately by stage (Early/Late arrays), OR
  2. Remove stage filtering from invocation and pre-filter during cache build (if stage doesn't change per context)

This doesn't cause incorrect behavior, but it significantly reduces the performance benefit described in the PR.
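Option 1 could look roughly like this (the Stage property, enum, and class names are simplified stand-ins for the TUnit types):

```csharp
using System.Linq;

public enum EventReceiverStage { Early, Late }

public interface ITestStartEventReceiver
{
    EventReceiverStage Stage { get; }
    int Order { get; }
}

public sealed class StageCaches
{
    public ITestStartEventReceiver[] Early { get; }
    public ITestStartEventReceiver[] Late { get; }

    public StageCaches(ITestStartEventReceiver[] allReceivers)
    {
        // Split by stage once at cache-build time; each invocation with a
        // known stage becomes a direct array read with zero filtering.
        Early = allReceivers
            .Where(r => r.Stage == EventReceiverStage.Early)
            .OrderBy(r => r.Order)
            .ToArray();
        Late = allReceivers
            .Where(r => r.Stage == EventReceiverStage.Late)
            .OrderBy(r => r.Order)
            .ToArray();
    }
}
```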

Suggestions

None - the implementation is otherwise sound. The cache invalidation logic for class instance changes is well-thought-out and handles test retry scenarios correctly.

Verdict

⚠️ REQUEST CHANGES - The stage filtering issue defeats the primary performance optimization. The PR works correctly but doesn't deliver the expected perf benefits for the most common case.

@thomhurst
Owner Author

Regarding the race condition concerns: The same test won't have multiple threads accessing those methods at the same time - the engine scopes tests properly. Each TestContext is accessed by a single test execution thread, so the locking overhead isn't necessary.

The cache validation still handles the retry scenario correctly (when class instance changes), but we don't need thread synchronization since concurrent access to the same TestContext doesn't occur in practice.

On .NET 8+, pre-compute separate caches for Early and Late stage
receivers to eliminate runtime stage filtering. This ensures the
caching optimization delivers full performance benefits.

- TestContext: Add stage-specific cache properties for .NET 8+
- TestContextExtensions: Categorize receivers by stage during cache build
- EventReceiverOrchestrator: Use stage-specific caches directly

When stage is null (e.g., skipped tests), both Early and Late
receivers are invoked in order for backward compatibility.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@thomhurst
Owner Author

Summary

Pre-computes and caches typed event receiver arrays to eliminate repeated filtering on every event invocation, with class instance tracking for retry invalidation.

Critical Issues

None found

Suggestions

1. Consider AOT annotations for reflection usage

While the code does not directly call reflection APIs, BuildEligibleEventObjects accesses details.ClassInstance and metadata that may involve reflection internally. Consider verifying whether DynamicallyAccessedMembers annotations are needed on any reflected types, as per Rule 5 (AOT compatibility).

2. Performance validation looks good

The PR correctly addresses hot path optimization (Rule 4):

  • Single-pass categorization eliminates repeated OfType calls
  • Pre-sorted and filtered arrays cached for direct access
  • Class instance tracking handles retry scenarios correctly
  • No new allocations in the repeated invocation path

The PR description mentions that 60/60 performance benchmark tests pass, which validates the performance improvements.

Rule Compliance Check

  • Rule 1 (Dual-Mode): N/A - Changes are post-metadata-collection (unified code path)
  • Rule 2 (Snapshot Testing): No source generator or public API changes, no snapshots needed
  • Rule 3 (No VSTest): No VSTest dependencies introduced
  • Rule 4 (Performance First): Excellent hot path optimization - eliminates allocations on repeated calls
  • Rule 5 (AOT Compatible): Suggest verification (see suggestion 1)

Verdict

APPROVE - No critical issues. This is a solid performance optimization that follows TUnit patterns. The suggestion about AOT annotations is precautionary rather than blocking.

This was referenced Jan 12, 2026