-
Notifications
You must be signed in to change notification settings - Fork 751
fix: unit tests broken, better cancellation tests #3105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: unit tests broken, better cancellation tests #3105
Conversation
Signed-off-by: michaelfeil <[email protected]>
|
👋 Hi michaelfeil! Thank you for contributing to ai-dynamo/dynamo. Just a reminder: The 🚀 |
WalkthroughReworks cancellation tests’ server lifecycle to run in a dedicated thread coordinated by a threading.Event, introduces a session-scoped runtime fixture, adjusts the server fixture to yield a thread and handler, removes the subprocess wrapper test module, and adds a pre_merge pytest marker to individual test modules. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor PyTest as PyTest Runner
participant CF as conftest.py
participant RT as runtime (session-scoped)
participant SF as server fixture
participant TH as Server Thread
participant EV as stop_event
PyTest->>CF: request runtime
CF-->>PyTest: provide RT (session-scoped)
PyTest->>SF: request server(namespace)
SF->>TH: start thread asyncio.run(init_server(RT, ns, EV))
TH->>TH: init backend service and endpoint
TH->>TH: start serving (async task)
SF-->>PyTest: yield (thread, handler)
rect rgba(230,240,255,0.6)
note over PyTest,TH: Tests execute and interact with handler/server
end
PyTest->>SF: teardown server fixture
SF->>EV: set() (signal shutdown)
TH->>TH: await serve task stop or timeout
TH-->>SF: exit thread
SF->>SF: join thread with timeout
SF-->>PyTest: teardown complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
Pre-merge checks❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/bindings/python/tests/test_cancellation/conftest.py (1)
151-174: Fix task lifecycle: avoid pending-task leaks, await cancellations, and handle timeout deterministically.Current flow can leave
event_done_taskpending, not await a cancelledserve_task, and silently proceed on timeout. Tighten it and surface timeouts.- # Serve the endpoint - this will block until shutdown - serve_task = asyncio.create_task(endpoint.serve_endpoint(handler.generate)) - event_done_task = asyncio.create_task(asyncio.to_thread(stop_event.wait)) - - done, pending = await asyncio.wait( - {serve_task, event_done_task}, - return_when=asyncio.FIRST_COMPLETED, - timeout=30.0, - ) - if serve_task in done: - print("Server task completed") - await serve_task # Propagate exceptions if any - else: - serve_task.cancel() + # Serve the endpoint - this will block until shutdown + serve_task = asyncio.create_task( + endpoint.serve_endpoint(handler.generate), name=f"serve:{namespace}" + ) + stop_wait_task = asyncio.create_task( + asyncio.to_thread(stop_event.wait), name=f"stopwait:{namespace}" + ) + + try: + done, _ = await asyncio.wait( + {serve_task, stop_wait_task}, + return_when=asyncio.FIRST_COMPLETED, + timeout=30.0, + ) + if not done: + raise TimeoutError("server did not start/stop within 30s") + # Stop signal: cancel server if still running + if stop_wait_task in done and not serve_task.done(): + serve_task.cancel() + # Propagate server outcome (including CancelledError) + if serve_task in done: + await serve_task + finally: + for t in (serve_task, stop_wait_task): + if not t.done(): + t.cancel() + with suppress(asyncio.CancelledError): + await t
🧹 Nitpick comments (9)
lib/bindings/python/tests/test_cancellation/conftest.py (3)
176-179: Make the server thread daemon and name it for easier debugging.Prevents process hangs on interpreter shutdown and improves observability.
- thread = threading.Thread( - target=asyncio.run, args=(init_server(runtime, namespace, stop_event),) - ) + thread = threading.Thread( + target=asyncio.run, + args=(init_server(runtime, namespace, stop_event),), + name=f"server:{namespace}", + daemon=True, + )
181-183: Drop fixed startup sleep; rely on client.wait_for_instances().The client fixture already awaits readiness; the sleep just adds nondeterminism and test latency.
- # Give server time to start up - await asyncio.sleep(0.5) + # Client fixture handles readiness via wait_for_instances()
184-186: Fail fast if the server thread doesn’t stop.Surface teardown failures instead of silently continuing with a live thread.
yield thread, handler stop_event.set() - await asyncio.to_thread(thread.join, 5) + await asyncio.to_thread(thread.join, 5) + if thread.is_alive(): + pytest.fail("Server thread failed to shut down within 5s")lib/bindings/python/tests/test_cancellation/test_server_raise_cancelled.py (2)
19-20: Register the custom marker to avoid PytestUnknownMarkWarning.Add
pre_mergeto pytest markers in pyproject/pytest.ini.Example (pyproject.toml):
[tool.pytest.ini_options] markers = [ "pre_merge: run on pre-merge pipelines", ]
35-41: Assertion on exact error string is brittle; assert on intent.Match on key substrings instead of the full message with trailing space.
- assert ( - str(e) - == "a python exception was caught while processing the async generator: CancelledError: " - ) + msg = str(e) + assert "python exception was caught" in msg + assert "CancelledError" in msglib/bindings/python/tests/test_cancellation/test_server_context_cancel.py (2)
19-20: Register the custom marker to avoid PytestUnknownMarkWarning.Same as other modules adding
pre_merge.
35-39: Loosen exact-match assertion to reduce flakiness across backends.Error wording may vary slightly; assert on the salient part.
- assert str(e) == "Stream ended before generation completed" + assert "Stream ended before generation completed" in str(e)lib/bindings/python/tests/test_cancellation/test_client_loop_break.py (2)
20-21: Register the custom marker to avoid PytestUnknownMarkWarning.Same marker advice as siblings.
36-45: Consider explicitly closing the stream after breaking the loop.If supported by the client API, closing ensures cancellation propagates immediately instead of relying on GC.
Possible pattern:
# after break await getattr(stream, "aclose", lambda: None)()
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
lib/bindings/python/tests/test_cancellation/conftest.py(3 hunks)lib/bindings/python/tests/test_cancellation/test_cancellation.py(0 hunks)lib/bindings/python/tests/test_cancellation/test_client_loop_break.py(1 hunks)lib/bindings/python/tests/test_cancellation/test_server_context_cancel.py(1 hunks)lib/bindings/python/tests/test_cancellation/test_server_raise_cancelled.py(1 hunks)
💤 Files with no reviewable changes (1)
- lib/bindings/python/tests/test_cancellation/test_cancellation.py
🧰 Additional context used
🧬 Code graph analysis (1)
lib/bindings/python/tests/test_cancellation/conftest.py (3)
lib/bindings/python/rust/lib.rs (6)
namespace(362-367)endpoint(549-555)serve_endpoint(569-595)generate(753-765)cancel(534-536)shutdown(376-378)lib/bindings/python/rust/engine.rs (2)
generate(95-97)generate(150-283)lib/bindings/python/src/dynamo/_core.pyi (1)
DistributedRuntime(31-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
| @pytest.fixture(scope="session") | ||
| async def runtime(): | ||
| """Create a DistributedRuntime for testing""" | ||
| # TODO(michaelfeil): consider re-using runtime across tests to not have to launch tests via subprocess. | ||
| loop = asyncio.get_running_loop() | ||
| runtime = DistributedRuntime(loop, True) | ||
| yield runtime | ||
| runtime.shutdown() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Session-scoped runtime bound to the first test’s event loop can break later tests.
pytest-asyncio often creates a new loop per test by default. Capturing asyncio.get_running_loop() in a session fixture risks “attached to a different loop” issues. Bind the runtime to a session-scoped loop explicitly and inject it.
Proposed change in this file:
-@pytest.fixture(scope="session")
-async def runtime():
+@pytest.fixture(scope="session")
+async def runtime(event_loop):
"""Create a DistributedRuntime for testing"""
- # TODO(michaelfeil): consider re-using runtime across tests to not have to launch tests via subprocess.
- loop = asyncio.get_running_loop()
- runtime = DistributedRuntime(loop, True)
+ runtime = DistributedRuntime(event_loop, True)
yield runtime
runtime.shutdown()Add a session-scoped event loop fixture (can live in the same conftest or a top-level one):
@pytest.fixture(scope="session")
def event_loop():
loop = asyncio.new_event_loop()
try:
yield loop
finally:
loop.close()Run this to verify pytest-asyncio config and marker registration:
🏁 Script executed:
#!/bin/bash
# Check pytest-asyncio version and mode hints
rg -n --glob '{pyproject.toml,pytest.ini,setup.cfg}' -e 'pytest-asyncio|asyncio_mode|markers|pre_merge' -C2 || true
# Detect existing event_loop fixtures
rg -nP 'def\s+event_loop\s*\(' -g '!**/.venv/**' -C2 || trueLength of output: 1004
Don't bind a session-scoped DistributedRuntime to asyncio.get_running_loop()
pyproject.toml sets asyncio_mode = "auto" and there is no custom event_loop fixture; the current session fixture captures the first running loop and can cause "attached to a different loop"/ScopeMismatch failures. Location: lib/bindings/python/tests/test_cancellation/conftest.py (lines 189–196).
Fix (choose one):
- Recommended: create and manage a dedicated session loop inside the fixture (loop = asyncio.new_event_loop(); pass that loop to DistributedRuntime; close it on teardown).
- Alternative: override pytest-asyncio's event_loop as session-scoped and inject it into the runtime fixture (changes global per-test loop isolation).
|
I made some enhancement on top of this PR commit: #3127 This should fix the runtime already initialized issue and clean up the subprocess launch code. |
Signed-off-by: michaelfeil [email protected]
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
Tests
Refactor
Chores