@grahamking
Contributor

@grahamking grahamking commented Sep 17, 2025

This removes one use of the EtcdClient binding, which is going away.

Also remove some unnecessary logging of lease_id, which used the binding too.

And rename etcd_client() to do_not_use_etcd_client() so no-one adds another use of it.

Summary by CodeRabbit

  • New Features
    • Added a runtime API to clear key-value namespaces, enabling quicker cleanup during development and testing.
  • Refactor
    • Migrated components and examples to use the runtime for namespace clearing and client access, reducing direct dependency on the underlying key-value store.
    • Renamed the public client accessor in bindings and marked the old name as deprecated.
  • Style
    • Simplified startup logs in examples by removing lease ID details.
    • Removed redundant license headers in a few example and test files.
  • Tests
    • Updated tests to use the new client access path.

@copy-pr-bot

copy-pr-bot bot commented Sep 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Sep 17, 2025

Walkthrough

The change replaces direct EtcdKvCache usage and public etcd_client access with runtime-provided APIs. A new temp_clear_namespace method is added to DistributedRuntime (Rust and Python bindings). Call sites now use runtime.temp_clear_namespace and runtime.do_not_use_etcd_client. Some example logs drop lease IDs. Type hints/tests reflect the renaming/removal.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| Runtime API and Python bindings: lib/runtime/src/distributed.rs, lib/bindings/python/rust/lib.rs, lib/bindings/python/src/dynamo/_core.pyi | Added DistributedRuntime.temp_clear_namespace (Rust + Python binding). Renamed DistributedRuntime.etcd_client → do_not_use_etcd_client. Removed EtcdKvCache.clear_all from Python bindings and .pyi stubs. Updated stub docstring to warn deprecation. |
| Backend namespace clearing utilities: components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py, components/backends/trtllm/utils/clear_namespace.py | Replaced EtcdKvCache-based clearing with await runtime.temp_clear_namespace(f"/{namespace}/"). Removed EtcdKvCache imports. Adjusted log messages. Public function signatures unchanged. |
| Etcd client acquisition updates in workers/tests: components/backends/vllm/src/dynamo/vllm/main.py, components/planner/src/dynamo/planner/virtual_connector.py, examples/multimodal/components/worker.py, tests/router/test_router_e2e_with_mockers.py, lib/bindings/python/tests/test_etcd_bindings.py | Switched runtime.etcd_client() calls to runtime.do_not_use_etcd_client(). Control flow otherwise unchanged. Some files also remove license headers (text-only). |
| Examples: remove lease logging: examples/custom_backend/hello_world/hello_world.py, lib/bindings/python/examples/hello_world/server.py | Deleted retrieval/printing of primary lease ID; logging now omits lease info. No behavior changes; one file also removes license header block. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User as Caller/CLI
  participant PyUtil as clear_namespace (Py)
  participant PyRuntime as DistributedRuntime (Py binding)
  participant RsRuntime as DistributedRuntime (Rust)
  participant Etcd as EtcdClient

  User->>PyUtil: clear_namespace(runtime, namespace)
  PyUtil->>PyRuntime: temp_clear_namespace("/<ns>/")
  PyRuntime->>RsRuntime: temp_clear_namespace(name)
  alt Etcd client present
    RsRuntime->>Etcd: kv_get_prefix(name)
    loop for each key
      RsRuntime->>Etcd: kv_delete(key)
      Etcd-->>RsRuntime: ok
    end
    RsRuntime-->>PyRuntime: Ok(())
  else No etcd (static workers)
    RsRuntime-->>PyRuntime: Ok(())
  end
  PyRuntime-->>PyUtil: Ok
  PyUtil-->>User: Done (logged)
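
The first diagram can be approximated in Python as follows. This is only a sketch under stated assumptions: `FakeEtcd` and `FakeRuntime` are hypothetical in-memory stand-ins, not the real `EtcdClient` or `DistributedRuntime`, and the real `kv_get_prefix` may return richer objects than plain key strings.

```python
import asyncio
from typing import Dict, List, Optional


class FakeEtcd:
    """Hypothetical in-memory stand-in for an etcd client."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self.data = dict(data)

    async def kv_get_prefix(self, prefix: str) -> List[str]:
        # Return every stored key that starts with the prefix.
        return [key for key in self.data if key.startswith(prefix)]

    async def kv_delete(self, key: str) -> None:
        self.data.pop(key, None)


class FakeRuntime:
    """Mimics the two branches in the diagram; not the real runtime."""

    def __init__(self, etcd: Optional[FakeEtcd]) -> None:
        self._etcd = etcd

    async def temp_clear_namespace(self, name: str) -> None:
        if self._etcd is None:
            # "No etcd (static workers)" branch: nothing to clear.
            return
        for key in await self._etcd.kv_get_prefix(name):
            await self._etcd.kv_delete(key)


async def clear_namespace(runtime: FakeRuntime, namespace: str) -> None:
    # Same call shape as the updated utilities: prefix is "/<namespace>/".
    await runtime.temp_clear_namespace(f"/{namespace}/")


def demo() -> Dict[str, bytes]:
    etcd = FakeEtcd({"/dev/a": b"1", "/dev/b": b"2", "/prod/a": b"3"})
    asyncio.run(clear_namespace(FakeRuntime(etcd), "dev"))
    return etcd.data
```

In this sketch only keys under `/dev/` are deleted, and a runtime constructed with `None` returns without error, mirroring the `Ok(())` in the static-worker branch.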
sequenceDiagram
  autonumber
  participant Worker as Worker process
  participant PyRuntime as DistributedRuntime (Py binding)
  participant Etcd as EtcdClient
  participant Cfg as configure_ports_with_etcd

  Worker->>PyRuntime: do_not_use_etcd_client()
  PyRuntime-->>Worker: Optional[EtcdClient]
  Worker->>Cfg: configure_ports_with_etcd(etcd_client, ...)
  Cfg->>Etcd: read/update port info
  Etcd-->>Cfg: responses
  Cfg-->>Worker: ports configured
  note over Worker: Flow unchanged aside from client accessor rename

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A bunny hops through keys and space,
Sweeps namespaces with gentle grace.
“Do not use,” it squeaks—new doors ajar,
Runtime now the guiding star.
Lease-less logs, a tidier trail—
Thump-thump! The refactor will prevail. 🐇✨

Pre-merge checks

❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The PR description states the high-level intent (remove EtcdClient usage, drop lease_id logging, rename etcd_client) but does not follow the repository template: it omits a detailed "Details" section listing changed files, the "Where should the reviewer start?" section, and a "Related Issues" entry, leaving reviewers without clear pointers to verify the changes. | Please update the PR description to follow the template by adding a "Details" section that enumerates the key files and precise changes (e.g., lib/runtime/src/distributed.rs, lib/bindings/python/rust/lib.rs, and callers of clear_namespace), a "Where should the reviewer start?" section pointing to the primary diffs, and a "Related Issues" line (issue number or note if none); also indicate any breaking-change/migration notes and relevant tests or runtime considerations. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The PR title "chore(bindings): Provide a binding to clear etcd namespace" accurately summarizes the primary change (adding a temp_clear_namespace binding and related etcd client renames/removals), is concise, and is focused for a reviewer scanning history. |


Signed-off-by: Graham King <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (8)
lib/bindings/python/tests/test_etcd_bindings.py (1)

18-19: Guard against None: do_not_use_etcd_client() can return None.

Without a check this test will raise on None when etcd isn’t running. Add an assertion to fail fast with a helpful message.

-    etcd = runtime.do_not_use_etcd_client()
+    etcd = runtime.do_not_use_etcd_client()
+    assert etcd is not None, "Etcd client not available; ensure etcd is running for this test"
tests/router/test_router_e2e_with_mockers.py (1)

238-239: Add explicit check for None before using etcd in tests.

If the runtime instance lacks an etcd client, this will raise. Fail fast with a clearer message.

-    etcd = runtime.do_not_use_etcd_client()
+    etcd = runtime.do_not_use_etcd_client()
+    assert etcd is not None, "Etcd client not available; runtime_services must start etcd"
examples/multimodal/components/worker.py (1)

423-425: Handle absence of etcd client gracefully.

configure_ports_with_etcd likely expects a client; skip when None to avoid surprises in static/no-etcd runs.

-    etcd_client = runtime.do_not_use_etcd_client()
-    await configure_ports_with_etcd(config, etcd_client)
+    etcd_client = runtime.do_not_use_etcd_client()
+    if etcd_client is not None:
+        await configure_ports_with_etcd(config, etcd_client)
+    else:
+        logger.debug("No etcd client; skipping port configuration via etcd")
components/backends/vllm/src/dynamo/vllm/main.py (1)

72-74: Gate etcd-based port configuration on client presence.

Prevents runtime errors when etcd isn’t wired up.

-    etcd_client = runtime.do_not_use_etcd_client()
-    await configure_ports_with_etcd(config, etcd_client)
+    etcd_client = runtime.do_not_use_etcd_client()
+    if etcd_client is not None:
+        await configure_ports_with_etcd(config, etcd_client)
+    else:
+        logger.debug("No etcd client; skipping port configuration via etcd")
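
Because the same guard is suggested for both worker.py and main.py, it could be factored into one helper. The following is a minimal sketch, assuming a hypothetical helper name `configure_ports_if_available` and passing the configuration coroutine in as a parameter so the example runs without dynamo installed; the real `configure_ports_with_etcd` signature is only known from the diffs above.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, List, Optional

logger = logging.getLogger(__name__)


async def configure_ports_if_available(
    config: Any,
    etcd_client: Optional[Any],
    configure: Callable[[Any, Any], Awaitable[None]],
) -> bool:
    """Run `configure` only when a client exists; report whether it ran."""
    if etcd_client is None:
        logger.debug("No etcd client; skipping port configuration via etcd")
        return False
    await configure(config, etcd_client)
    return True


def demo() -> List[Any]:
    calls: List[Any] = []

    async def fake_configure(config: Any, client: Any) -> None:
        # Hypothetical stand-in for configure_ports_with_etcd.
        calls.append((config, client))

    skipped = asyncio.run(configure_ports_if_available("cfg", None, fake_configure))
    ran = asyncio.run(configure_ports_if_available("cfg", "client", fake_configure))
    return [skipped, ran, len(calls)]
```

Returning a boolean lets a caller decide whether skipping is acceptable or should be treated as a startup error.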
components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py (1)

16-17: Normalize namespace to avoid accidental double slashes.

Minor hardening to handle inputs with leading/trailing slashes.

-    await runtime.temp_clear_namespace(f"/{namespace}/")
-    logging.info(f"Cleared /{namespace}")
+    ns = namespace.strip("/")
+    await runtime.temp_clear_namespace(f"/{ns}/")
+    logging.info(f"Cleared /{ns}")
components/backends/trtllm/utils/clear_namespace.py (1)

17-18: Normalize namespace formatting before clearing.

Prevents path anomalies if users pass "/foo/" vs "foo".

-    await runtime.temp_clear_namespace(f"/{namespace}/")
-    logger.info(f"Cleared /{namespace}")
+    ns = namespace.strip("/")
+    await runtime.temp_clear_namespace(f"/{ns}/")
+    logger.info(f"Cleared /{ns}")
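
To illustrate what the `strip("/")` normalization buys in both utilities, here is a small sketch. The helper name `namespace_prefix` and the empty-namespace guard are assumptions added for illustration; neither appears in the PR.

```python
def namespace_prefix(namespace: str) -> str:
    """Build the "/<namespace>/" prefix, tolerating stray slashes."""
    ns = namespace.strip("/")  # removes all leading and trailing slashes
    if not ns:
        # Assumed guard, not part of the PR: an empty namespace would yield
        # the prefix "//", which is unlikely to be what the caller meant.
        raise ValueError("namespace must not be empty")
    return f"/{ns}/"


def demo() -> list:
    return [namespace_prefix(value) for value in ("foo", "/foo/", "//foo//", "a/b")]
```

Without normalization, passing `"/foo/"` to `f"/{namespace}/"` would produce `"//foo//"`, which would not match keys stored under `"/foo/"`.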
lib/bindings/python/src/dynamo/_core.pyi (1)

44-48: Expose temp_clear_namespace in the stub to match the binding.

The Rust binding adds DistributedRuntime.temp_clear_namespace, but the pyi doesn’t declare it. Add it so type-checkers and IDEs stay in sync.

 class DistributedRuntime:
     """
     The runtime object for dynamo applications
     """
 
     ...
 
+    async def temp_clear_namespace(self, name: str) -> None:
+        """
+        Remove everything under the given etcd prefix.
+        Temporary; will be removed once MDC auto-delete exists.
+        """
+        ...
+
     def namespace(self, name: str) -> Namespace:
         """
         Create a `Namespace` object
         """
         ...
lib/bindings/python/rust/lib.rs (1)

377-382: Surface deprecation at call time.

Emit a runtime warning so callers see the deprecation when they use this method.

 fn do_not_use_etcd_client(&self) -> PyResult<Option<EtcdClient>> {
+        tracing::warn!("do_not_use_etcd_client() is deprecated and will be removed soon; prefer runtime APIs.");
         match self.inner.etcd_client().clone() {
             Some(etcd_client) => Ok(Some(EtcdClient { inner: etcd_client })),
             None => Ok(None),
         }
     }
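
The suggestion above emits the warning from Rust via `tracing::warn!`. For comparison, a similar call-time signal in pure Python could be sketched with the standard `warnings` module. `RuntimeSketch` is a hypothetical stand-in class, not the real binding, and the PR itself does not add such a wrapper.

```python
import warnings
from typing import Any, Optional


class RuntimeSketch:
    """Hypothetical stand-in for the Python-facing runtime object."""

    def __init__(self, etcd_client: Optional[Any] = None) -> None:
        self._etcd_client = etcd_client

    def do_not_use_etcd_client(self) -> Optional[Any]:
        # stacklevel=2 attributes the warning to the caller's line,
        # which is where the deprecated usage needs to be fixed.
        warnings.warn(
            "do_not_use_etcd_client() is deprecated; prefer runtime APIs.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._etcd_client
```

One trade-off to keep in mind: Python hides `DeprecationWarning` by default outside `__main__` and test runners, so a log-based warning like the Rust one may be more visible in production.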
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26889b0 and e5f4a18.

📒 Files selected for processing (12)
  • components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py (1 hunks)
  • components/backends/trtllm/utils/clear_namespace.py (2 hunks)
  • components/backends/vllm/src/dynamo/vllm/main.py (1 hunks)
  • components/planner/src/dynamo/planner/virtual_connector.py (1 hunks)
  • examples/custom_backend/hello_world/hello_world.py (1 hunks)
  • examples/multimodal/components/worker.py (1 hunks)
  • lib/bindings/python/examples/hello_world/server.py (0 hunks)
  • lib/bindings/python/rust/lib.rs (1 hunks)
  • lib/bindings/python/src/dynamo/_core.pyi (2 hunks)
  • lib/bindings/python/tests/test_etcd_bindings.py (1 hunks)
  • lib/runtime/src/distributed.rs (1 hunks)
  • tests/router/test_router_e2e_with_mockers.py (1 hunks)
💤 Files with no reviewable changes (1)
  • lib/bindings/python/examples/hello_world/server.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-01T13:55:03.940Z
Learnt from: nnshah1
PR: ai-dynamo/dynamo#1444
File: tests/fault_tolerance/utils/metrics.py:30-32
Timestamp: 2025-07-01T13:55:03.940Z
Learning: The `dynamo_worker()` decorator in the dynamo codebase returns a wrapper that automatically injects the `runtime` parameter before calling the wrapped function. This means callers only need to provide the non-runtime parameters, while the decorator handles injecting the runtime argument automatically. For example, a function with signature `async def get_metrics(runtime, log_dir)` decorated with `dynamo_worker()` can be called as `get_metrics(log_dir)` because the decorator wrapper injects the runtime parameter.

Applied to files:

  • components/backends/vllm/src/dynamo/vllm/main.py
  • components/backends/trtllm/utils/clear_namespace.py
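
The injection behaviour described in this learning can be sketched as below. This is not the actual `dynamo_worker()` implementation: `inject_runtime` and `StubRuntime` are hypothetical names, and how the real decorator obtains its runtime is not shown in this conversation.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable

AsyncFunc = Callable[..., Awaitable[Any]]


class StubRuntime:
    """Hypothetical placeholder for the real runtime object."""


def inject_runtime() -> Callable[[AsyncFunc], AsyncFunc]:
    """Decorator factory whose wrapper supplies `runtime` as the first argument."""

    def decorator(func: AsyncFunc) -> AsyncFunc:
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            runtime = StubRuntime()  # stand-in; the real source of the runtime is assumed
            return await func(runtime, *args, **kwargs)

        return wrapper

    return decorator


@inject_runtime()
async def get_metrics(runtime: StubRuntime, log_dir: str) -> str:
    return f"{type(runtime).__name__}:{log_dir}"
```

Callers pass only the non-runtime argument, for example `asyncio.run(get_metrics("/tmp/logs"))`, and the wrapper fills in `runtime`.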
🧬 Code graph analysis (10)
lib/runtime/src/distributed.rs (1)
lib/bindings/python/rust/lib.rs (1)
  • temp_clear_namespace (363-368)
components/planner/src/dynamo/planner/virtual_connector.py (3)
lib/runtime/src/distributed.rs (2)
  • etcd_client (269-271)
  • runtime (197-199)
lib/bindings/python/rust/lib.rs (1)
  • do_not_use_etcd_client (377-382)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • do_not_use_etcd_client (44-49)
tests/router/test_router_e2e_with_mockers.py (2)
lib/bindings/python/rust/lib.rs (1)
  • do_not_use_etcd_client (377-382)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • do_not_use_etcd_client (44-49)
components/backends/vllm/src/dynamo/vllm/main.py (3)
lib/runtime/src/distributed.rs (2)
  • etcd_client (269-271)
  • runtime (197-199)
lib/bindings/python/rust/lib.rs (1)
  • do_not_use_etcd_client (377-382)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • do_not_use_etcd_client (44-49)
components/backends/sglang/src/dynamo/sglang/utils/clear_namespace.py (4)
lib/runtime/src/distributed.rs (3)
  • runtime (197-199)
  • namespace (216-218)
  • temp_clear_namespace (332-341)
lib/bindings/python/src/dynamo/runtime/__init__.py (1)
  • dynamo_worker (36-62)
components/backends/trtllm/utils/clear_namespace.py (1)
  • clear_namespace (16-18)
lib/bindings/python/rust/lib.rs (2)
  • namespace (370-375)
  • temp_clear_namespace (363-368)
lib/bindings/python/tests/test_etcd_bindings.py (2)
lib/bindings/python/rust/lib.rs (1)
  • do_not_use_etcd_client (377-382)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • do_not_use_etcd_client (44-49)
components/backends/trtllm/utils/clear_namespace.py (2)
lib/runtime/src/distributed.rs (3)
  • runtime (197-199)
  • temp_clear_namespace (332-341)
  • namespace (216-218)
lib/bindings/python/rust/lib.rs (2)
  • temp_clear_namespace (363-368)
  • namespace (370-375)
lib/bindings/python/src/dynamo/_core.pyi (1)
lib/bindings/python/rust/lib.rs (1)
  • do_not_use_etcd_client (377-382)
lib/bindings/python/rust/lib.rs (2)
lib/runtime/src/distributed.rs (2)
  • temp_clear_namespace (332-341)
  • namespace (216-218)
lib/bindings/python/src/dynamo/_core.pyi (3)
  • namespace (38-42)
  • do_not_use_etcd_client (44-49)
  • EtcdClient (57-102)
examples/multimodal/components/worker.py (3)
lib/runtime/src/distributed.rs (2)
  • etcd_client (269-271)
  • runtime (197-199)
lib/bindings/python/rust/lib.rs (1)
  • do_not_use_etcd_client (377-382)
lib/bindings/python/src/dynamo/_core.pyi (1)
  • do_not_use_etcd_client (44-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: pre-merge-rust (lib/bindings/python)
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (.)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
examples/custom_backend/hello_world/hello_world.py (1)

37-37: Good cleanup: removed lease_id from log.

This aligns with discouraging direct etcd binding usage; no functional change.

components/planner/src/dynamo/planner/virtual_connector.py (1)

45-47: Switch to do_not_use_etcd_client is fine; None-path handled.

Constructor guards None with a clear error; later usage is safe.

@grahamking grahamking force-pushed the gk-clear-namespace-binding branch from e5f4a18 to 4206b3c Compare September 17, 2025 20:37
@grahamking grahamking enabled auto-merge (squash) September 17, 2025 21:15
Collaborator

@whoisj whoisj left a comment

Looks good to me. Assuming the tests pass, I'm happy with it.

@grahamking grahamking merged commit b6595e2 into main Sep 18, 2025
13 checks passed
@grahamking grahamking deleted the gk-clear-namespace-binding branch September 18, 2025 15:49
@biswapanda
Contributor

Lgtm


5 participants