Skip to content

feat: Added initial resilient background service implementation#12

Merged
hughesjs merged 9 commits intomasterfrom
initial-setup
Dec 15, 2025
Merged

feat: Added initial resilient background service implementation#12
hughesjs merged 9 commits intomasterfrom
initial-setup

Conversation

@hughesjs
Copy link
Owner

No description provided.

Repository owner deleted a comment from claude bot Dec 15, 2025
Repository owner deleted a comment from claude bot Dec 15, 2025
Repository owner deleted a comment from claude bot Dec 15, 2025
Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Issue

Bug in test: The test DoesntCatchFireIfInnerLoopThrows references a non-existent property ExecuteTask on line 43 of LazarusServiceTests.cs. The LazarusService class does not expose its ExecuteAsync task, so _ts.ExecuteTask will fail to compile.

Fix: Either expose the ExecuteTask property from BackgroundService in your LazarusService class, or remove this test assertion.


Potential Issue

Race condition in tests: In LazarusServiceTests.cs, test RunsInnerLoop (line 26) does not await StartAsync, which could cause race conditions. The service starts executing immediately, but the test advances time right after without ensuring the service is ready. While this may work with FakeTimeProvider, it is fragile.

Recommendation: Either await the StartAsync call or add a synchronization point to ensure the service has started before advancing time.


Minor Observation

Missing cancellation handling: In LazarusService.cs:40, Task.Delay is called with the cancellation token, but if cancellation occurs during the delay, an OperationCanceledException will be thrown outside the try-catch that handles it (since it is after the try-catch block). This could cause the service to exit without logging.

Recommendation: Wrap the delay in the try-catch, or handle cancellation explicitly after the delay.

Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Bug: Delay After Exception

Issue in LazarusService.cs:20-40

The delay logic is incorrectly placed inside the try block, causing the service to retry immediately after exceptions instead of respecting the configured delay.

Current behavior:

  1. Exception occurs in PerformLoop
  2. Exception is caught (line 34-37)
  3. Loop continues immediately (delay at line 28 was skipped)
  4. Service hammers the failing operation in a tight loop

This defeats the purpose of a resilient service pattern. After an exception, the service should delay before retrying to:

  • Avoid resource exhaustion
  • Give external dependencies time to recover
  • Prevent log spam
  • Match the expected loop timing

Fix: Move the delay outside the try block so it executes regardless of success/failure.


Note: The test at line 65-70 of LazarusServiceTests.cs expects immediate retry after exception ("Retries immediately after exception"), but this behavior is problematic for production use. Consider whether immediate retry is actually desirable, or if the delay should apply after exceptions as well.

Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Bug: Delay After Exception

Issue in LazarusService.cs:20-40

The delay logic is incorrectly placed inside the try block, causing the service to retry immediately after exceptions instead of respecting the configured delay.

Current behavior:

  1. Exception occurs in PerformLoop
  2. Exception is caught (line 34-37)
  3. Loop continues immediately (delay at line 28 was skipped)
  4. Service hammers the failing operation in a tight loop

This defeats the purpose of a resilient service pattern. After an exception, the service should delay before retrying to:

  • Avoid resource exhaustion
  • Give external dependencies time to recover
  • Prevent log spam
  • Match the expected loop timing

Fix: Move the delay outside the try block so it executes regardless of success/failure.


Note: The test at line 65-70 of LazarusServiceTests.cs expects immediate retry after exception ("Retries immediately after exception"), but this behavior is problematic for production use. Consider whether immediate retry is actually desirable, or if the delay should apply after exceptions as well.

Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Substantive Issues Found

1. Logic Error in Exception Handling (LazarusService.cs:30-32)

When OperationCanceledException is caught with cancellationToken.IsCancellationRequested == true, the code logs but continues the loop. Since cancellation was requested, the method should exit instead of continuing. Consider adding break; or return; after logging.

2. Resource Leak in Tests (LazarusServiceTests.cs:105)

SemaphoreSlim _loopSignal is created but never disposed. The TestService class should implement IDisposable and dispose the semaphore, or the LazarusServiceTests.Dispose() should handle it.

3. Potential Race Condition in Tests (LazarusServiceTests.cs:105, 120, 132)

The semaphore is initialized with count=1, which means the first WaitForLoopAsync() will succeed immediately without waiting for the loop to actually run. This could cause race conditions in tests. Consider initializing with count=0 and ensuring proper synchronization.

4. Test Logic Issue in RunsInnerLoop (LazarusServiceTests.cs:24-31)

After StartAsync, calling AdvanceTime() should trigger the loop to run once (Counter becomes 1), but the assertion expects Counter==0 on the first iteration. Verify this test logic is correct - it appears the assertion should be IsEqualTo(i + 1) or the loop structure needs adjustment.

Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Substantive Issues Found

1. Logic Error in Exception Handling (LazarusService.cs:30-32)

When OperationCanceledException is caught with cancellationToken.IsCancellationRequested == true, the code logs but continues the loop. Since cancellation was requested, the method should exit instead of continuing. Consider adding break or return after logging.

2. Resource Leak in Tests (LazarusServiceTests.cs:105)

SemaphoreSlim _loopSignal is created but never disposed. The TestService class should implement IDisposable and dispose the semaphore, or the LazarusServiceTests.Dispose() should handle it.

3. Potential Race Condition in Tests (LazarusServiceTests.cs:105, 120, 132)

The semaphore is initialized with count=1, which means the first WaitForLoopAsync() will succeed immediately without waiting for the loop to actually run. This could cause race conditions in tests. Consider initializing with count=0 and ensuring proper synchronization.

4. Test Logic Issue in RunsInnerLoop (LazarusServiceTests.cs:24-31)

After StartAsync, calling AdvanceTime() should trigger the loop to run once (Counter becomes 1), but the assertion expects Counter==0 on the first iteration. Verify this test logic is correct - it appears the assertion should be IsEqualTo(i + 1) or the loop structure needs adjustment.

@hughesjs hughesjs merged commit 98784b5 into master Dec 15, 2025
2 checks passed
@hughesjs hughesjs deleted the initial-setup branch December 15, 2025 01:17
Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found one substantive issue:

Race condition in loop timing (LazarusService.cs:26-28)

The delay happens before executing the loop body. This means:

  • No work is done until after the first delay on startup
  • If PerformLoop() throws an exception, the next iteration still waits the full _loopDelay, creating inconsistent timing between successful and failed iterations

Consider moving Task.Delay() to the end of the loop body, or implementing a backoff strategy for failed loops.

The rest looks good - solid exception handling, proper cancellation support, and comprehensive tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant