feat: Added initial resilient background service implementation#12
feat: Added initial resilient background service implementation#12
Conversation
…nts a fault in the loop from taking out the service
2482f54 to
1670727
Compare
There was a problem hiding this comment.
Critical Issue
Bug in test: The test DoesntCatchFireIfInnerLoopThrows references a non-existent property ExecuteTask on line 43 of LazarusServiceTests.cs. The LazarusService class does not expose its ExecuteAsync task, so _ts.ExecuteTask will fail to compile.
Fix: Either expose the ExecuteTask property from BackgroundService in your LazarusService class, or remove this test assertion.
Potential Issue
Race condition in tests: In LazarusServiceTests.cs, test RunsInnerLoop (line 26) does not await StartAsync, which could cause race conditions. The service starts executing immediately, but the test advances time right after without ensuring the service is ready. While this may work with FakeTimeProvider, it is fragile.
Recommendation: Either await the StartAsync call or add a synchronization point to ensure the service has started before advancing time.
Minor Observation
Missing cancellation handling: In LazarusService.cs:40, Task.Delay is called with the cancellation token, but if cancellation occurs during the delay, an OperationCanceledException will be thrown outside the try-catch that handles it (since it is after the try-catch block). This could cause the service to exit without logging.
Recommendation: Wrap the delay in the try-catch, or handle cancellation explicitly after the delay.
There was a problem hiding this comment.
Critical Bug: Delay After Exception
Issue in LazarusService.cs:20-40
The delay logic is incorrectly placed inside the try block, causing the service to retry immediately after exceptions instead of respecting the configured delay.
Current behavior:
- Exception occurs in
PerformLoop - Exception is caught (line 34-37)
- Loop continues immediately (delay at line 28 was skipped)
- Service hammers the failing operation in a tight loop
This defeats the purpose of a resilient service pattern. After an exception, the service should delay before retrying to:
- Avoid resource exhaustion
- Give external dependencies time to recover
- Prevent log spam
- Match the expected loop timing
Fix: Move the delay outside the try block so it executes regardless of success/failure.
Note: The test at line 65-70 of LazarusServiceTests.cs expects immediate retry after exception ("Retries immediately after exception"), but this behavior is problematic for production use. Consider whether immediate retry is actually desirable, or if the delay should apply after exceptions as well.
There was a problem hiding this comment.
Critical Bug: Delay After Exception
Issue in LazarusService.cs:20-40
The delay logic is incorrectly placed inside the try block, causing the service to retry immediately after exceptions instead of respecting the configured delay.
Current behavior:
- Exception occurs in
PerformLoop - Exception is caught (line 34-37)
- Loop continues immediately (delay at line 28 was skipped)
- Service hammers the failing operation in a tight loop
This defeats the purpose of a resilient service pattern. After an exception, the service should delay before retrying to:
- Avoid resource exhaustion
- Give external dependencies time to recover
- Prevent log spam
- Match the expected loop timing
Fix: Move the delay outside the try block so it executes regardless of success/failure.
Note: The test at line 65-70 of LazarusServiceTests.cs expects immediate retry after exception ("Retries immediately after exception"), but this behavior is problematic for production use. Consider whether immediate retry is actually desirable, or if the delay should apply after exceptions as well.
There was a problem hiding this comment.
Substantive Issues Found
1. Logic Error in Exception Handling (LazarusService.cs:30-32)
When OperationCanceledException is caught with cancellationToken.IsCancellationRequested == true, the code logs but continues the loop. Since cancellation was requested, the method should exit instead of continuing. Consider adding break; or return; after logging.
2. Resource Leak in Tests (LazarusServiceTests.cs:105)
SemaphoreSlim _loopSignal is created but never disposed. The TestService class should implement IDisposable and dispose the semaphore, or the LazarusServiceTests.Dispose() should handle it.
3. Potential Race Condition in Tests (LazarusServiceTests.cs:105, 120, 132)
The semaphore is initialized with count=1, which means the first WaitForLoopAsync() will succeed immediately without waiting for the loop to actually run. This could cause race conditions in tests. Consider initializing with count=0 and ensuring proper synchronization.
4. Test Logic Issue in RunsInnerLoop (LazarusServiceTests.cs:24-31)
After StartAsync, calling AdvanceTime() should trigger the loop to run once (Counter becomes 1), but the assertion expects Counter==0 on the first iteration. Verify this test logic is correct - it appears the assertion should be IsEqualTo(i + 1) or the loop structure needs adjustment.
There was a problem hiding this comment.
Substantive Issues Found
1. Logic Error in Exception Handling (LazarusService.cs:30-32)
When OperationCanceledException is caught with cancellationToken.IsCancellationRequested == true, the code logs but continues the loop. Since cancellation was requested, the method should exit instead of continuing. Consider adding break or return after logging.
2. Resource Leak in Tests (LazarusServiceTests.cs:105)
SemaphoreSlim _loopSignal is created but never disposed. The TestService class should implement IDisposable and dispose the semaphore, or the LazarusServiceTests.Dispose() should handle it.
3. Potential Race Condition in Tests (LazarusServiceTests.cs:105, 120, 132)
The semaphore is initialized with count=1, which means the first WaitForLoopAsync() will succeed immediately without waiting for the loop to actually run. This could cause race conditions in tests. Consider initializing with count=0 and ensuring proper synchronization.
4. Test Logic Issue in RunsInnerLoop (LazarusServiceTests.cs:24-31)
After StartAsync, calling AdvanceTime() should trigger the loop to run once (Counter becomes 1), but the assertion expects Counter==0 on the first iteration. Verify this test logic is correct - it appears the assertion should be IsEqualTo(i + 1) or the loop structure needs adjustment.
There was a problem hiding this comment.
Found one substantive issue:
Race condition in loop timing (LazarusService.cs:26-28)
The delay happens before executing the loop body. This means:
- No work is done until after the first delay on startup
- If
PerformLoop()throws an exception, the next iteration still waits the full_loopDelay, creating inconsistent timing between successful and failed iterations
Consider moving Task.Delay() to the end of the loop body, or implementing a backoff strategy for failed loops.
The rest looks good - solid exception handling, proper cancellation support, and comprehensive tests.
No description provided.