Merge main into examples-changes
Resolved conflicts in:
- components/backends/trtllm/README.md: Fixed table of contents anchor and table formatting
- components/backends/vllm/README.md: Updated feature support status from main
- docs/architecture/distributed_runtime.md: Kept backend-agnostic description, used cleaner Python examples text
- examples/deployments/EKS/Deploy_Dynamo_Cloud.md: Removed trailing slash from ECR login command
athreesh committed Jul 29, 2025
commit 14d984712d51dac0976cad754c9389ef935dc774
14 changes: 11 additions & 3 deletions components/backends/trtllm/README.md
@@ -60,9 +60,9 @@ git checkout $(git describe --tags $(git rev-list --tags --max-count=1))

| Feature | TensorRT-LLM | Notes |
|--------------------|--------------|-----------------------------------------------------------------------|
| **WideEP** | ✅ | |
| **DP Rank Routing**| ✅ | |
| **GB200 Support** | ✅ | |
| **WideEP** | ✅ | |
| **DP Rank Routing**| ✅ | |
| **GB200 Support** | ✅ | |

## Quick Start

@@ -222,9 +222,17 @@ The migrated request will continue responding to the original request, allowing

See [client](../llm/README.md#client) section to learn how to send request to the deployment.

<<<<<<< HEAD
NOTE: To send a request to a multi-node deployment, target the node which is running `dynamo-run in=http`.
=======
NOTE: To send a request to a multi-node deployment, target the node which is running `python3 -m dynamo.frontend <args>`.
>>>>>>> origin/main

## Benchmarking

To benchmark your deployment with GenAI-Perf, see this utility script, configuring the
<<<<<<< HEAD
`model` name and `host` based on your deployment: [perf.sh](../../benchmarks/llm/perf.sh)
=======
`model` name and `host` based on your deployment: [perf.sh](../../../benchmarks/llm/perf.sh)
>>>>>>> origin/main
8 changes: 6 additions & 2 deletions components/backends/vllm/README.md
@@ -46,9 +46,9 @@ git checkout $(git describe --tags $(git rev-list --tags --max-count=1))

| Feature | vLLM | Notes |
|--------------------|------|-----------------------------------------------------------------------|
| **WideEP** | 🚧 | Not supported |
| **WideEP** | | Support for PPLX / DeepEP not verified |
| **DP Rank Routing**| ✅ | Supported via external control of DP ranks |
| **GB200 Support** | 🚧 | Not supported |
| **GB200 Support** | 🚧 | Container functional on main |

## Quick Start

@@ -81,7 +81,11 @@ This includes the specific commit [vllm-project/vllm#19790](https://github.com/v
## Run Single Node Examples

> [!IMPORTANT]
<<<<<<< HEAD
> Below we provide simple shell scripts that run the components for each configuration. Each shell script runs `python3 dynamo.frontend` to start the ingress and uses `python3 dynamo.vllm` to start the vLLM workers. You can also run each command in separate terminals for better log visibility.
=======
> Below we provide simple shell scripts that run the components for each configuration. Each shell script runs `python3 -m dynamo.frontend` to start the ingress and uses `python3 -m dynamo.vllm` to start the vLLM workers. You can also run each command in separate terminals for better log visibility.
>>>>>>> origin/main

This figure shows an overview of the major components to deploy:

2 changes: 1 addition & 1 deletion docs/architecture/distributed_runtime.md
@@ -75,6 +75,6 @@ After selecting which endpoint to hit, the `Client` sends the serialized request
We provide native rust and python (through binding) examples for basic usage of `DistributedRuntime`:

- Rust: `/lib/runtime/examples/`
- Python: We also provide complete examples of using `DistributedRuntime` for communication and Dynamo's LLM library for prompt templates and (de)tokenization to deploy Dynamo graphs. Please refer to the directories in `/components/backends`. ` for full implementation details.
- Python: We also provide complete examples of using `DistributedRuntime`. Please refer to the engines in `/components/backends` for full implementation details.


2 changes: 1 addition & 1 deletion examples/deployments/EKS/Deploy_Dynamo_Cloud.md
@@ -24,7 +24,7 @@ Push Image

```
docker tag dynamo:latest-vllm <ECR_REGISTRY>/<ECR_REPOSITORY>:$IMAGE_TAG
aws ecr get-login-password | docker login --username AWS --password-stdin <ECR_REGISTRY>/
aws ecr get-login-password | docker login --username AWS --password-stdin <ECR_REGISTRY>
docker push <ECR_REGISTRY>/<ECR_REPOSITORY>:$IMAGE_TAG
```
