@biswapanda
Contributor

@biswapanda biswapanda commented Jul 31, 2025

Overview:

fix: add curl and jq for health checks #2203

Cherry-pick of #2203

closes:

  • nvbug: 5425651
  • linear ticket: DYN-792

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Documentation

    • Improved and reorganized README files for clarity, including enhanced framework support matrix and installation instructions.
    • Fixed and updated documentation links in feature support tables across multiple backend READMEs.
    • Removed redundant sections and clarified framework support in the examples documentation.
  • New Features

    • Added explicit configuration for port range allocation and introduced block port allocation for side channels in distributed environments.
  • Refactor

    • Centralized and modularized port allocation and host IP resolution logic for improved reliability and maintainability.
  • Chores

    • Updated Dockerfiles to include additional utilities (jq, curl) for endpoint polling and health checks.
    • Refined package installation steps and dependencies for SGLang and related environments.

@copy-pr-bot

copy-pr-bot bot commented Jul 31, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@biswapanda biswapanda self-assigned this Jul 31, 2025
@biswapanda biswapanda changed the base branch from main to release/0.4.0 July 31, 2025 01:20
@github-actions github-actions bot added the fix label Jul 31, 2025
@coderabbitai
Contributor

coderabbitai bot commented Jul 31, 2025

Caution

Review failed

Failed to post review comments.

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 62c7898 and 7d6c556.

📒 Files selected for processing (11)
  • README.md (2 hunks)
  • components/backends/sglang/README.md (1 hunks)
  • components/backends/trtllm/README.md (1 hunks)
  • components/backends/vllm/README.md (1 hunks)
  • components/backends/vllm/src/dynamo/vllm/args.py (5 hunks)
  • components/backends/vllm/src/dynamo/vllm/ports.py (1 hunks)
  • container/Dockerfile.sglang (2 hunks)
  • container/Dockerfile.sglang-wideep (2 hunks)
  • container/Dockerfile.tensorrt_llm (1 hunks)
  • container/Dockerfile.vllm (1 hunks)
  • examples/README.md (2 hunks)
🧰 Additional context used
🧠 Learnings (7)
components/backends/trtllm/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

container/Dockerfile.vllm (1)

Learnt from: grahamking
PR: #1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.

components/backends/vllm/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

components/backends/sglang/README.md (1)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

container/Dockerfile.sglang (1)

Learnt from: grahamking
PR: #1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.

examples/README.md (1)

Learnt from: PeaBrane
PR: #1409
File: examples/router_standalone/worker.py:171-186
Timestamp: 2025-06-08T08:30:45.126Z
Learning: Example code in the examples/ directory may intentionally use hard-coded values or simplified implementations that wouldn't be appropriate for production code, but are acceptable for demonstration and testing purposes.

README.md (2)

Learnt from: dmitry-tokarev-nv
PR: #2179
File: docs/support_matrix.md:61-63
Timestamp: 2025-07-30T00:34:35.810Z
Learning: In docs/support_matrix.md, the NIXL version difference between runtime dependencies (0.5.0) and build dependencies (0.4.0) is intentional and expected, not an error that needs to be corrected.

Learnt from: biswapanda
PR: #1412
File: lib/bindings/python/src/dynamo/runtime/logging.py:100-100
Timestamp: 2025-06-06T21:48:35.214Z
Learning: In the Dynamo codebase, BentoML has been completely removed from all executable code, with only documentation and attribution references remaining. The error_loggers configuration in lib/bindings/python/src/dynamo/runtime/logging.py should not include "bentoml" since those modules no longer exist.

🧬 Code Graph Analysis (1)
components/backends/vllm/src/dynamo/vllm/args.py (1)
components/backends/vllm/src/dynamo/vllm/ports.py (7)
  • DynamoPortRange (26-40)
  • EtcdContext (44-53)
  • PortAllocationRequest (78-85)
  • PortMetadata (57-74)
  • allocate_and_reserve_port (229-260)
  • allocate_and_reserve_port_block (142-226)
  • get_host_ip (263-290)
🔇 Additional comments (12)
components/backends/vllm/src/dynamo/vllm/ports.py (3)

25-41: Well-designed port range validation.

The dataclass properly validates that ports are within the registered ports range and ensures min < max. Good defensive programming.
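A minimal sketch of the kind of validation described here, not the actual DynamoPortRange code. The class name `PortRange` and its field names are hypothetical; the bounds 1024 to 49151 are the IANA registered (user) port range.

```python
from dataclasses import dataclass

# IANA registered (user) port range.
REGISTERED_PORT_MIN = 1024
REGISTERED_PORT_MAX = 49151


@dataclass(frozen=True)
class PortRange:
    """Hypothetical stand-in for DynamoPortRange; field names are assumed."""

    min: int
    max: int

    def __post_init__(self) -> None:
        # Both bounds must fall inside the registered ports range.
        for name, port in (("min", self.min), ("max", self.max)):
            if not REGISTERED_PORT_MIN <= port <= REGISTERED_PORT_MAX:
                raise ValueError(
                    f"{name}={port} is outside "
                    f"{REGISTERED_PORT_MIN}-{REGISTERED_PORT_MAX}"
                )
        # The range must be non-empty and correctly ordered.
        if self.min >= self.max:
            raise ValueError(f"min ({self.min}) must be less than max ({self.max})")
```

Validating in `__post_init__` means an invalid range cannot be constructed at all, so later allocation code does not need to re-check the bounds.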


88-114: Robust port holding implementation.

Good use of context manager pattern with proper cleanup in the finally block. The SO_REUSEADDR option is appropriate for this use case.
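The pattern praised here can be sketched roughly as follows. This is an illustration under assumptions, not the code in ports.py: the function name `hold_port` and its parameters are hypothetical.

```python
import socket
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def hold_port(port: int, host: str = "127.0.0.1") -> Iterator[socket.socket]:
    """Keep `port` bound for the duration of the `with` block (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # SO_REUSEADDR lets the real service bind the port soon after release.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        yield sock
    finally:
        # Always release the socket, even if the body raised.
        sock.close()
```

Holding the socket open while a reservation is recorded elsewhere narrows the window in which another process on the same host could grab the port.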


263-291: Excellent host IP detection with comprehensive error handling.

The function properly handles all failure scenarios and tests bindability before returning an IP. The fallback to localhost is appropriate for development/testing scenarios.
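One way such a lookup could be written is shown below. This is a sketch only: the name `resolve_host_ip` is hypothetical, and the real get_host_ip may resolve the address differently.

```python
import socket


def resolve_host_ip(fallback: str = "127.0.0.1") -> str:
    """Return a bindable IPv4 address for this host, or `fallback` (hypothetical helper)."""
    try:
        candidate = socket.gethostbyname(socket.gethostname())
    except OSError:
        # Hostname lookup failed (socket.gaierror is a subclass of OSError).
        return fallback
    try:
        # Confirm the address is actually bindable before handing it out.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            probe.bind((candidate, 0))
    except OSError:
        return fallback
    return candidate
```

The bind probe uses port 0 so the operating system picks any free port; only the address is being tested.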

components/backends/vllm/src/dynamo/vllm/args.py (2)

75-86: Well-designed CLI arguments for port range configuration.

The arguments have clear documentation and sensible defaults that align with the registered ports range requirements.
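For illustration, the port range flags could be declared along these lines. The flag names `--dynamo-port-min` and `--dynamo-port-max` appear in the sequence diagram in this review; the default values and help text below are assumptions, not the values used in args.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two port range flags (defaults are hypothetical)."""
    parser = argparse.ArgumentParser(description="Port range options (sketch)")
    parser.add_argument(
        "--dynamo-port-min",
        type=int,
        default=20000,  # assumed default, inside the registered ports range
        help="Lowest port that may be allocated.",
    )
    parser.add_argument(
        "--dynamo-port-max",
        type=int,
        default=30000,  # assumed default, inside the registered ports range
        help="Highest port that may be allocated.",
    )
    return parser
```

argparse converts the dashes to underscores, so the parsed values are read as `args.dynamo_port_min` and `args.dynamo_port_max`.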


165-201: Correct implementation of NIXL port allocation scheme.

The implementation properly follows NIXL's port calculation formula and allocates the required contiguous block of ports. The validation for negative base ports is a good defensive measure.
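The contiguous-block part of this can be sketched as a linear scan. This is a simplified illustration that leaves out both the NIXL port formula and the ETCD reservation step; the helper names `port_is_free` and `find_free_block` are hypothetical.

```python
import socket
from typing import Callable, List, Optional


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if `port` can be bound on `host` right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def find_free_block(
    start: int,
    end: int,
    size: int,
    is_free: Callable[[int], bool] = port_is_free,
) -> Optional[List[int]]:
    """Find the first run of `size` consecutive free ports in [start, end]."""
    run: List[int] = []
    for port in range(start, end + 1):
        if is_free(port):
            run.append(port)
            if len(run) == size:
                return run
        else:
            # A busy port breaks the run, so start counting again.
            run = []
    return None
```

A check like this is inherently racy on its own, since a port can be taken between the probe and its later use. That is the gap that holding the ports and recording a reservation is meant to close.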

container/Dockerfile.tensorrt_llm (1)

383-386: Addition of jq looks good and keeps runtime parity with other images
Including jq alongside curl in the runtime layer enables JSON-based health-probe parsing without materially increasing image size. Installation is already guarded by --no-install-recommends and followed by an apt-cache purge, so no further action needed.

container/Dockerfile.vllm (1)

440-443: Consistent health-check tooling – LGTM
jq is now co-installed with curl, matching the other back-end images. Thanks for keeping the comment in sync.

container/Dockerfile.sglang-wideep (2)

155-158: Health-check utilities added – OK
jq and the clarifying comment align this Dockerfile with the rest of the fleet.


123-125: Duplicate install of the same wheel inflates build time
"pip install -e ." (editable) is immediately followed by a regular "pip install .". The second call re-copies/compiles everything and negates the benefit of the editable install.

-RUN cd lib/bindings/python && pip install --break-system-packages -e . && cd ../../..
-RUN pip install --break-system-packages .
+RUN cd lib/bindings/python && \
+    pip install --break-system-packages . && cd ../../..

Either keep the editable install only (for live-code tweaks) or the regular install, but not both.

Likely an incorrect or invalid review comment.

components/backends/trtllm/README.md (1)

52-57: Fixed broken documentation links – thank you
Bumping the relative path depth from ../../ to ../../../ correctly resolves all six links from the TRTLLM README to the docs folder.

components/backends/sglang/README.md (1)

37-42: Corrected link depth – all good
The amended paths now point to the right architecture docs. No further issues spotted.

components/backends/vllm/README.md (1)

38-43: Relative-path fix looks correct

Jumping three directories up (../../../) is the right depth from components/backends/vllm/README.md to reach the repo root, so the updated links should now resolve on GitHub and in rendered docs.

Walkthrough

This update reorganizes and enhances documentation for framework support and installation instructions, particularly for SGLang. It introduces a new modular port allocation system for the vLLM backend, centralizing port management and ETCD coordination in a dedicated module. Dockerfiles are updated to install additional utilities and adjust package installation steps.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Documentation Reorganization & Enhancement (README.md, examples/README.md) | Reorganized framework support matrix for higher visibility, improved SGLang installation instructions, removed duplicate sections, and clarified prerequisites. |
| Backend Feature Matrix Link Fixes (components/backends/sglang/README.md, components/backends/trtllm/README.md, components/backends/vllm/README.md) | Fixed relative documentation links in feature support matrices to ensure correct navigation. |
| vLLM Port Allocation Refactor (components/backends/vllm/src/dynamo/vllm/args.py) | Refactored port allocation logic: removed internal functions, adopted new port allocation module, added port range config, and block allocation for side channels. |
| New Port Allocation Utilities (components/backends/vllm/src/dynamo/vllm/ports.py) | Introduced a new module for ETCD-coordinated port allocation, port range management, block reservations, and host IP resolution. |
| SGLang Docker & Install Updates (container/Dockerfile.sglang, container/Dockerfile.sglang-wideep) | Updated to install specific flashinfer-python prerelease, ai-dynamo with SGLang support, and added jq/curl utilities. Adjusted Python install method and removed redundant PYTHONPATH. |
| Dockerfile Utility Additions (container/Dockerfile.tensorrt_llm, container/Dockerfile.vllm) | Added jq alongside curl for endpoint polling and health checks in runtime images. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant Config
    participant PortsModule
    participant ETCD

    User->>CLI: Launch vLLM backend with port range args
    CLI->>Config: Parse --dynamo-port-min/max
    Config->>PortsModule: Request port(s) allocation (with range)
    PortsModule->>PortsModule: Check port availability & hold ports
    PortsModule->>ETCD: Reserve port(s) with metadata
    ETCD-->>PortsModule: Acknowledge reservation
    PortsModule-->>Config: Return allocated port(s)
    Config->>CLI: Complete backend startup with assigned ports

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

In the warren of code, a rabbit hops with cheer,
Ports now reserved—no race conditions here!
Docs are refactored, support tables in view,
With Docker and jq, deployments feel new.
Frameworks aligned, installation’s a breeze—
This bunny’s quite pleased, hopping with ease! 🐇✨

@biswapanda biswapanda enabled auto-merge (squash) July 31, 2025 05:03
@dmitry-tokarev-nv dmitry-tokarev-nv merged commit 54fbff3 into release/0.4.0 Jul 31, 2025
5 checks passed
@dmitry-tokarev-nv dmitry-tokarev-nv deleted the bis/cp-2203-jq-DYN-792 branch July 31, 2025 15:42