Merged
Changes from 1 commit
21 commits
c83ca40
cp(#2351): Move backend READMEs to docs folder and fix relative path …
rmccorm4 Aug 16, 2025
5bb296a
cp(#2346): Move hello_world example README to docs, swap symlinks, fi…
rmccorm4 Aug 16, 2025
ce14f57
docs: Copy over index.rst and hidden_toctree.rst from v0.4.0 to main
rmccorm4 Aug 17, 2025
5a8d52e
Revert "cp(#2351): Move backend READMEs to docs folder and fix relati…
rmccorm4 Aug 17, 2025
016d2c1
Revert "cp(#2346): Move hello_world example README to docs, swap syml…
rmccorm4 Aug 17, 2025
915847c
Rename multimodal_v1 to multimodal, and fix sglang link
rmccorm4 Aug 17, 2025
201c7f9
Bring back missing benchmark README
rmccorm4 Aug 17, 2025
241c614
Fix all broken links caught by lychee
rmccorm4 Aug 17, 2025
eac4cc5
Add github action for link validation (lychee)
rmccorm4 Aug 17, 2025
fc63b57
Update RELEASE_VERSION to 0.4.0 in dynamo_deploy quickstart
rmccorm4 Aug 17, 2025
ced1a73
Merge branch 'main' into rmccormick/cp_anish_docs_to_main
rmccorm4 Aug 17, 2025
8d6be0a
Remove benchmark README for easier review - will restore in a separat…
rmccorm4 Aug 17, 2025
f7f9350
Merge branch 'rmccormick/cp_anish_docs_to_main' of github.com:ai-dyna…
rmccorm4 Aug 17, 2025
a8e5396
Remove unused env var from link check action
rmccorm4 Aug 17, 2025
8d0e50b
Add WAR for lychee cert error
rmccorm4 Aug 17, 2025
c3ec608
Address CodeRabbit feedback
rmccorm4 Aug 17, 2025
4381aa8
Address CodeRabbit feedback - add TODO in workflow for lychee install
rmccorm4 Aug 17, 2025
ef0b231
Try installing ca-certs for cert errors
rmccorm4 Aug 17, 2025
fdfd807
Set GITHUB_TOKEN to avoid github rate limits on URL checks
rmccorm4 Aug 17, 2025
6b7c690
Add lychee result caching
rmccorm4 Aug 17, 2025
3e60e42
Add lychee result caching docs reference
rmccorm4 Aug 17, 2025
Address CodeRabbit feedback
rmccorm4 committed Aug 17, 2025
commit c3ec608975709e65ec4f297b16da18438588b78f
2 changes: 1 addition & 1 deletion components/backends/sglang/README.md
@@ -52,7 +52,7 @@ git checkout $(git describe --tags $(git rev-list --tags --max-count=1))

## Quick Start

-Below we provide a guide that lets you run all of our the common deployment patterns on a single node.
+Below we provide a guide that lets you run all of our common deployment patterns on a single node.

### Start NATS and ETCD in the background

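For context on the "Start NATS and ETCD in the background" step referenced above, a minimal sketch (assuming the `nats-server` and `etcd` binaries are installed locally; the repo's own scripts may use Docker Compose instead) is:

```bash
# Minimal local setup sketch: run NATS with JetStream and etcd with default settings.
# This is an illustration, not the exact commands from the repository.
nats-server -js > nats.log 2>&1 &
etcd > etcd.log 2>&1 &
```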
2 changes: 1 addition & 1 deletion components/backends/sglang/deploy/README.md
@@ -159,4 +159,4 @@ Common issues and solutions:
3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
4. **Out of memory**: Increase memory limits or reduce model batch size

-For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting).
+For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/quickstart.md).
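As an aside on the troubleshooting items above, reviewing model loading logs in a Kubernetes deployment typically looks something like the following (the namespace and label selector are assumptions for illustration):

```bash
# Hypothetical commands for inspecting a failing worker pod;
# replace the namespace and label selector with your deployment's values.
kubectl -n dynamo get pods -l app=sglang-worker
kubectl -n dynamo logs -l app=sglang-worker --tail=200
kubectl -n dynamo describe pod -l app=sglang-worker
```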
2 changes: 1 addition & 1 deletion components/backends/sglang/docs/dsr1-wideep-h100.md
@@ -5,7 +5,7 @@ SPDX-License-Identifier: Apache-2.0

# Running DeepSeek-R1 Disaggregated with WideEP on H100s

-Dynamo supports SGLang's implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-05-05-large-scale-ep/) for more details. We provide a Dockerfile for this in `container/Dockerfile.sglang-deepep` and configurations to deploy this at scale. In this example, we will run 1 prefill worker on 4 H100 nodes and 1 decode worker on 9 H100 nodes (104 total GPUs).
+Dynamo supports SGLang's implementation of wide expert parallelism and large scale P/D for DeepSeek-R1! You can read their blog post [here](https://lmsys.org/blog/2025-05-05-large-scale-ep/) for more details. We provide a Dockerfile for this in `container/Dockerfile.sglang-wideep` and configurations to deploy this at scale. In this example, we will run 1 prefill worker on 4 H100 nodes and 1 decode worker on 9 H100 nodes (104 total GPUs).

## Instructions

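The Dockerfile path mentioned in the changed line above can be built in the usual way from the repository root; a sketch (the image tag is an arbitrary choice for illustration):

```bash
# Build the WideEP container image; the tag below is illustrative.
docker build -f container/Dockerfile.sglang-wideep -t dynamo-sglang-wideep:latest .
```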
6 changes: 3 additions & 3 deletions components/backends/trtllm/README.md
@@ -193,7 +193,7 @@ For complete Kubernetes deployment instructions, configurations, and troubleshoo

### Client

-See [client](../vllm/README.md#client) section to learn how to send request to the deployment.
+See [client](../sglang/README.md#testing-the-deployment) section to learn how to send request to the deployment.

NOTE: To send a request to a multi-node deployment, target the node which is running `python3 -m dynamo.frontend <args>`.

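For the client pointer above, a request typically goes to the frontend's OpenAI-compatible HTTP endpoint. A hedged example follows; the host, port, and model name are assumptions and depend on your deployment:

```bash
# Illustrative request; point this at the node running `python3 -m dynamo.frontend <args>`.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-0.6B",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
      }'
```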
@@ -218,7 +218,7 @@ DISAGGREGATION_STRATEGY="prefill_first" ./launch/disagg.sh

## KV Cache Transfer in Disaggregated Serving

-Dynamo with TensorRT-LLM supports two methods for transferring KV cache in disaggregated serving: UCX (default) and NIXL (experimental). For detailed information and configuration instructions for each method, see the [KV cache transfer guide](./kv-cache-tranfer.md).
+Dynamo with TensorRT-LLM supports two methods for transferring KV cache in disaggregated serving: UCX (default) and NIXL (experimental). For detailed information and configuration instructions for each method, see the [KV cache transfer guide](./kv-cache-transfer.md).


## Request Migration
@@ -233,7 +233,7 @@ This allows a request to be migrated up to 3 times before failing. See the [Requ

## Client

-See [client](../vllm/README.md#client) section to learn how to send request to the deployment.
+See [client](../sglang/README.md#testing-the-deployment) section to learn how to send request to the deployment.

NOTE: To send a request to a multi-node deployment, target the node which is running `python3 -m dynamo.frontend <args>`.

2 changes: 1 addition & 1 deletion components/backends/trtllm/deploy/README.md
@@ -241,7 +241,7 @@ TensorRT-LLM supports two methods for KV cache transfer in disaggregated serving
- **UCX** (default): Standard method for KV cache transfer
- **NIXL** (experimental): Alternative transfer method

-For detailed configuration instructions, see the [KV cache transfer guide](../kv-cache-tranfer.md).
+For detailed configuration instructions, see the [KV cache transfer guide](../kv-cache-transfer.md).

## Request Migration

2 changes: 1 addition & 1 deletion docs/guides/deploy/k8s_metrics.md
@@ -39,7 +39,7 @@ This will create two components:
- A Worker component exposing metrics on its system port

Both components expose a `/metrics` endpoint following the OpenMetrics format, but with different metrics appropriate to their roles. For details about:
-- Deployment configuration: See the [vLLM README](../../components/backends/vllm/README.md)
+- Deployment configuration: See the [vLLM README](../../../components/backends/vllm/README.md)
- Available metrics: See the [metrics guide](../metrics.md)

### Validate the Deployment
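To validate the `/metrics` endpoints described above, one option is a quick port-forward plus curl; the namespace, service name, and port here are placeholders, not the actual names from the deployment:

```bash
# Hypothetical spot check of a component's OpenMetrics-format /metrics endpoint.
kubectl -n dynamo port-forward svc/vllm-worker-metrics 9090:9090 &
curl -s http://localhost:9090/metrics | head -n 20
```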
2 changes: 1 addition & 1 deletion docs/hidden_toctree.rst
@@ -52,7 +52,7 @@
components/backends/trtllm/deploy/README.md
components/backends/trtllm/llama4_plus_eagle.md
components/backends/trtllm/multinode-examples.md
-components/backends/trtllm/kv-cache-tranfer.md
+components/backends/trtllm/kv-cache-transfer.md
components/backends/vllm/deploy/README.md
components/backends/vllm/multi-node.md
