Merged
Changes from 18 commits
Commits
22 commits
051a8ee
First draft: conf.py greatly simplified (need to add back nv theme, e…
rmccorm4 Aug 19, 2025
9d46868
Bring back some bits from old conf.py
rmccorm4 Aug 19, 2025
80b408d
Try relative paths for component READMEs, undo whitespace changes to …
rmccorm4 Aug 19, 2025
2a14789
Add symlink for sglang README, comment out suppressed myst warning
rmccorm4 Aug 19, 2025
d103271
Add minimal symlinks, and suppress myst.xref_missing warnings
rmccorm4 Aug 19, 2025
efbbfc8
Restore docs/components/router/README.md
rmccorm4 Aug 19, 2025
92cb7ec
Sync with main
rmccorm4 Aug 25, 2025
f5a6e49
Fix new warnings of unused docs after sync with main
rmccorm4 Aug 25, 2025
c18e5b4
docs: Refactor left side table of contents for docs website - v1
rmccorm4 Aug 25, 2025
95ba86e
Merge branch 'main' into rmccormick/docs_build
rmccorm4 Aug 25, 2025
fddd1e3
docs: Apply Harry's feedback, condense overview/quickstart, remove sl…
rmccorm4 Aug 25, 2025
eb6ff81
docs: Address CodeRabbit feedback
rmccorm4 Aug 25, 2025
ba773bf
Replace k8s quickstart with dynamo_deploy/README.md instead of dynamo…
rmccorm4 Aug 25, 2025
97be098
Replace k8s quickstart with dynamo_deploy/README.md instead of dynamo…
rmccorm4 Aug 25, 2025
bc1d940
Anish's feedback - use dynamo kubernetes platform doc as quickstart, …
rmccorm4 Aug 25, 2025
c373219
Anish's feedback - remove duplicate deploy quickstart, we already hav…
rmccorm4 Aug 25, 2025
d6db2fa
fix broken links from deleted dynamo deploy quickstart
rmccorm4 Aug 25, 2025
41ed5c6
CodeRabbit feedback: download docker compose file since no assumed gi…
rmccorm4 Aug 25, 2025
050fdc2
Remove outdated deploy quickstart from hidden_toctree
rmccorm4 Aug 25, 2025
6629466
Merge branch 'main' into rmccormick/docs_build
rmccorm4 Aug 25, 2025
a378217
Fix merge conflicts with main
rmccorm4 Aug 25, 2025
5841ada
Merge branch 'rmccormick/docs_build' of github.com:ai-dynamo/dynamo i…
rmccorm4 Aug 25, 2025
4 changes: 2 additions & 2 deletions components/backends/sglang/deploy/README.md
Original file line number Diff line number Diff line change
@@ -145,7 +145,7 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you
## Further Reading

- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
@@ -159,4 +159,4 @@ Common issues and solutions:
3. **Health check failures**: Review model loading logs and increase `initialDelaySeconds`
4. **Out of memory**: Increase memory limits or reduce model batch size

For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/quickstart.md).
For additional support, refer to the [deployment guide](../../../../docs/guides/dynamo_deploy/README.md).
6 changes: 3 additions & 3 deletions components/backends/trtllm/deploy/README.md
@@ -81,7 +81,7 @@ extraPodSpec:

Before using these templates, ensure you have:

1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/quickstart.md)
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for TensorRT-LLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -257,7 +257,7 @@ Configure the `model` name and `host` based on your deployment.
## Further Reading

- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
- **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
@@ -277,4 +277,4 @@ Common issues and solutions:
6. **Git LFS issues**: Ensure git-lfs is installed before building containers
7. **ARM deployment**: Use `--platform linux/arm64` when building on ARM machines

For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
6 changes: 3 additions & 3 deletions components/backends/vllm/deploy/README.md
@@ -82,7 +82,7 @@ extraPodSpec:

Before using these templates, ensure you have:

1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/quickstart.md)
1. **Dynamo Cloud Platform installed** - See [Quickstart Guide](../../../../docs/guides/dynamo_deploy/README.md)
2. **Kubernetes cluster with GPU support**
3. **Container registry access** for vLLM runtime images
4. **HuggingFace token secret** (referenced as `envFromSecret: hf-token-secret`)
@@ -236,7 +236,7 @@ args:
## Further Reading

- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/guides/dynamo_deploy/create_deployment.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/quickstart.md)
- **Quickstart**: [Deployment Quickstart](../../../../docs/guides/dynamo_deploy/README.md)
- **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
- **SLA Planner**: [SLA Planner Deployment Guide](../../../../docs/guides/dynamo_deploy/sla_planner_deployment.md)
- **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
@@ -252,4 +252,4 @@ Common issues and solutions:
4. **Out of memory**: Increase memory limits or reduce model batch size
5. **Port forwarding issues**: Ensure correct pod UUID in port-forward command

For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/quickstart.md#troubleshooting).
For additional support, refer to the [deployment troubleshooting guide](../../../../docs/guides/dynamo_deploy/README.md).
2 changes: 1 addition & 1 deletion deploy/inference-gateway/README.md
@@ -20,7 +20,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat

1. **Install Dynamo Platform**

[See Quickstart Guide](../../docs/guides/dynamo_deploy/quickstart.md) to install Dynamo Cloud.
[See Quickstart Guide](../../docs/guides/dynamo_deploy/README.md) to install Dynamo Cloud.


2. **Deploy Inference Gateway**
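The link updates above all replace `dynamo_deploy/quickstart.md` with `dynamo_deploy/README.md`. A minimal sketch for checking that no stale references remain, assuming a local checkout and a `grep` that supports `--include` (GNU and BSD both do); the function name `find_stale_links` is hypothetical and not part of the repository:

```shell
# Hypothetical helper: list Markdown/RST files that still reference a given link target.
find_stale_links() {
  # $1: directory to scan (default: current directory)
  # $2: link target to look for (default: the old quickstart path)
  # -r recurses, -n prints line numbers, -F treats the pattern as a fixed string
  grep -rnF --include='*.md' --include='*.rst' -- \
    "${2:-dynamo_deploy/quickstart.md}" "${1:-.}"
}
```

Run from the repository root, `find_stale_links .` prints one `file:line:match` entry per remaining reference and exits non-zero when there are none.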
32 changes: 32 additions & 0 deletions docs/_includes/dive_in_examples.rst
@@ -0,0 +1,32 @@
The examples below assume you build the latest image yourself from source. If using a prebuilt image, follow the examples from the corresponding branch.

.. grid:: 1 2 2 2
    :gutter: 3
    :margin: 0
    :padding: 3 4 0 0

    .. grid-item-card:: :doc:`Hello World <../examples/runtime/hello_world/README>`
        :link: ../examples/runtime/hello_world/README
        :link-type: doc

        Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph.

    .. grid-item-card:: :doc:`vLLM <../components/backends/vllm/README>`
        :link: ../components/backends/vllm/README
        :link-type: doc

        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with vLLM.

    .. grid-item-card:: :doc:`SGLang <../components/backends/sglang/README>`
        :link: ../components/backends/sglang/README
        :link-type: doc

        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with SGLang.

    .. grid-item-card:: :doc:`TensorRT-LLM <../components/backends/trtllm/README>`
        :link: ../components/backends/trtllm/README
        :link-type: doc

        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with TensorRT-LLM.


44 changes: 44 additions & 0 deletions docs/_includes/install.rst
@@ -0,0 +1,44 @@
Pip (PyPI)
----------

Install a pre-built wheel from PyPI.

.. code-block:: bash

    # Create a virtual environment and activate it
    uv venv venv
    source venv/bin/activate

    # Install Dynamo from PyPI (choose one backend extra)
    uv pip install "ai-dynamo[sglang]==0.4.1"  # or [vllm], [trtllm]


Pip from source
---------------

Install directly from a local checkout for development.

.. code-block:: bash

    # Clone the repository
    git clone https://github.com/ai-dynamo/dynamo.git
    cd dynamo

    # Create a virtual environment and activate it
    uv venv venv
    source venv/bin/activate

    # Install Dynamo from the local checkout (choose one backend extra)
    uv pip install ".[sglang]"  # or [vllm], [trtllm]


Docker
------

Pull and run prebuilt images from NVIDIA NGC (``nvcr.io``).

.. code-block:: bash

    # Run a container (mount your workspace if needed)
    docker run --rm -it \
      --gpus all \
      --network host \
      nvcr.io/nvidia/ai-dynamo/sglang-runtime:0.4.1  # or vllm, tensorrtllm
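The Docker example hard-codes one backend and one release tag. A small sketch of how the image reference could be parameterized, assuming the `vllm` and `tensorrtllm` images follow the same `<backend>-runtime:<tag>` naming as the `sglang` one (the comment above suggests this, but only `sglang-runtime` and `vllm-runtime` appear in this diff); the function name `dynamo_image` is hypothetical:

```shell
# Hypothetical helper: build an NGC image reference for a given backend and release tag.
dynamo_image() {
  # $1: backend (sglang, vllm, or tensorrtllm); $2: release tag (default: 0.4.1)
  case "$1" in
    sglang|vllm|tensorrtllm)
      printf 'nvcr.io/nvidia/ai-dynamo/%s-runtime:%s\n' "$1" "${2:-0.4.1}"
      ;;
    *)
      # Reject anything that is not one of the three backends named above
      echo "unknown backend: $1" >&2
      return 1
      ;;
  esac
}
```

It can then be dropped into the original command, for example `docker run --rm -it --gpus all --network host "$(dynamo_image vllm)"`.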
43 changes: 43 additions & 0 deletions docs/_includes/quick_start_local.rst
@@ -0,0 +1,43 @@
Get started with Dynamo locally in just a few commands:

**1. Install Dynamo**

.. code-block:: bash

    # Install uv (recommended Python package manager)
    curl -LsSf https://astral.sh/uv/install.sh | sh

    # Create virtual environment and install Dynamo
    uv venv venv
    source venv/bin/activate
    uv pip install "ai-dynamo[sglang]==0.4.1"  # or [vllm], [trtllm]

**2. Start etcd/NATS**

.. code-block:: bash

    # Fetch and start etcd and NATS using Docker Compose
    curl -fsSL -o docker-compose.yml https://raw.githubusercontent.com/ai-dynamo/dynamo/release/0.4.1/deploy/docker-compose.yml
    docker compose -f docker-compose.yml up -d

**3. Run Dynamo**

.. code-block:: bash

    # Start the OpenAI-compatible frontend (default port is 8080)
    python -m dynamo.frontend

    # In another terminal, start an SGLang worker
    python -m dynamo.sglang --model-path Qwen/Qwen3-0.6B

**4. Test your deployment**

.. code-block:: bash

    curl localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "Qwen/Qwen3-0.6B",
           "messages": [{"role": "user", "content": "Hello!"}],
           "max_tokens": 50}'


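The test request in step 4 can fail if it is sent before the worker has finished loading the model. A minimal sketch of a retry wrapper, assuming the frontend answers with an HTTP error status until a worker is available (with `-f`, `curl` exits non-zero on such responses); the function name `retry` and the attempt and delay values are illustrative only:

```shell
# Hypothetical helper: run a command until it succeeds or the attempt budget is used up.
retry() {
  # $1: maximum attempts; $2: delay between attempts in seconds; rest: command to run
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (not executed here): repeat the request from step 4 until it succeeds.
# retry 30 2 curl -fsS localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d '{"model": "Qwen/Qwen3-0.6B", "messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 50}'
```

The wrapper returns as soon as the command exits with status 0, so a healthy deployment is not delayed by it.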
11 changes: 11 additions & 0 deletions docs/_sections/architecture.rst
@@ -0,0 +1,11 @@
Overview
============

.. include:: ../architecture/architecture.md
    :parser: myst_parser.sphinx_

.. toctree::
    :hidden:

    Overview <self>
    Disaggregated Serving <../architecture/disagg_serving>
42 changes: 42 additions & 0 deletions docs/_sections/backends.rst
@@ -0,0 +1,42 @@
..
    SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    SPDX-License-Identifier: Apache-2.0

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.

Backends
========

NVIDIA Dynamo supports multiple inference backends to provide flexibility and performance optimization for different use cases and model architectures. Backends are the underlying engines that execute AI model inference, each optimized for specific scenarios, hardware configurations, and performance requirements.

Overview
--------

Dynamo's multi-backend architecture allows you to:

* **Choose the optimal engine** for your specific workload and hardware
* **Switch between backends** without changing your application code
* **Leverage specialized optimizations** from each backend
* **Scale flexibly** across different deployment scenarios

Supported Backends
------------------

Dynamo currently supports the following high-performance inference backends:

.. toctree::
    :maxdepth: 1

    vLLM <../components/backends/vllm/README>
    SGLang <../components/backends/sglang/README>
    TensorRT-LLM <../components/backends/trtllm/README>
8 changes: 8 additions & 0 deletions docs/_sections/examples.rst
@@ -0,0 +1,8 @@
..
    Examples Page (left sidebar target)
..

Examples
========

.. include:: ../_includes/dive_in_examples.rst
10 changes: 10 additions & 0 deletions docs/_sections/installation.rst
@@ -0,0 +1,10 @@
..
    Installation Page (left sidebar target)
..

Installation
============

.. include:: ../_includes/install.rst


5 changes: 2 additions & 3 deletions docs/architecture/kvbm_intro.rst
@@ -48,9 +48,6 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes
* -
- ❌
- SGLang
* -
- ❌
- llama.cpp
* - **Serving Type**
- ✅
- Aggregated
@@ -61,7 +58,9 @@ The Dynamo KV Block Manager serves as a reference implementation that emphasizes
.. toctree::
:hidden:

Overview <self>
Motivation <kvbm_motivation.md>
KVBM Architecture <kvbm_architecture.md>
Understanding KVBM components <kvbm_components.md>
KVBM Further Reading <kvbm_reading>
LMCache Integration <../components/backends/vllm/LMCache_Integration.md>
8 changes: 3 additions & 5 deletions docs/architecture/planner_intro.rst
@@ -49,9 +49,6 @@ Key features include:
* -
- ❌
- SGLang
* -
- ❌
- llama.cpp
* - **Serving Type**
- ✅
- Aggregated
@@ -73,6 +70,7 @@ Key features include:
.. toctree::
:hidden:

Overview <self>
Pre-Deployment Profiling <pre_deployment_profiling.md>
Load-based Planner <load_planner.md>
SLA-based Planner <sla_planner.md>
SLA-based Planner <sla_planner.md>
Planner Benchmark <../guides/planner_benchmark/README.md>
2 changes: 1 addition & 1 deletion docs/architecture/pre_deployment_profiling.md
@@ -96,7 +96,7 @@ Use the default pre-built image and inject custom configurations via PVC:

1. **Set the container image:**
```bash
export DOCKER_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.0 # or any existing image tag
export DOCKER_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.1 # or any existing image tag
```

2. **Inject your custom disagg configuration:**
1 change: 0 additions & 1 deletion docs/components/backends/llm/README.md

This file was deleted.

1 change: 1 addition & 0 deletions docs/components/backends/sglang/README.md
1 change: 1 addition & 0 deletions docs/components/backends/vllm/LMCache_Integration.md