From 6530ef17e603822217309a3a8fdebf3f194aea18 Mon Sep 17 00:00:00 2001
From: Anna Tchernych
Date: Thu, 24 Jul 2025 17:43:53 -0700
Subject: [PATCH 1/6] Clean index.rst

---
 docs/index.rst | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/docs/index.rst b/docs/index.rst
index 7000e786e5..0670927388 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -49,22 +49,22 @@ The examples below assume you build the latest image yourself from source. If us
         :link: /examples/hello_world
         :link-type: doc

-        Demonstrates the basic concepts of Dynamo by creating a simple multi-service pipeline.
+        Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph

-    .. grid-item-card:: :doc:`LLM Deployment `
-        :link: /examples/llm_deployment
+    .. grid-item-card:: :doc:`LLM Serving with VLLM `
+        :link: /components/backends/vllm
         :link-type: doc

-        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
+        Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with VLLM.

-    .. grid-item-card:: :doc:`Multinode `
-        :link: /examples/multinode
+    .. grid-item-card:: :doc:`Multinode with SGLang `
+        :link: /components/backends/sglang/docs/multinode-examples
         :link-type: doc

-        Demonstrates deployment for disaggregated serving on 3 nodes using `nvidia/Llama-3.1-405B-Instruct-FP8`.
+        Demonstrates disaggregated serving on several nodes.

-    .. grid-item-card:: :doc:`TensorRT-LLM `
-        :link: /examples/trtllm
+    .. grid-item-card:: :doc:`TensorRT-LLM `
+        :link: /components/backends/trtllm
         :link-type: doc

         Presents TensorRT-LLM examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
@@ -110,7 +110,7 @@ The examples below assume you build the latest image yourself from source. If us

    Dynamo Deploy Quickstart
    Dynamo Cloud Kubernetes Platform
-   Manual Helm Deployment
+   Manual Helm Deployment
    GKE Setup Guide
    Minikube Setup Guide
    Model Caching with Fluid
@@ -126,17 +126,14 @@ The examples below assume you build the latest image yourself from source. If us
    :hidden:
    :caption: API

-   Python API
    NIXL Connect API

 .. toctree::
    :hidden:
    :caption: Examples

-   Aggregated and Disaggregated Deployment
-   LLM Deployment Examples
-   Multinode Examples
-   LLM Deployment Examples using TensorRT-LLM
+   Multinode Examples
+   LLM Deployment Examples using TensorRT-LLM

 .. toctree::
    :hidden:

From 2b8235d4c92624ec23ffaa757f0879cffc750655 Mon Sep 17 00:00:00 2001
From: Anna Tchernych
Date: Thu, 24 Jul 2025 18:51:43 -0700
Subject: [PATCH 2/6] add filename

---
 docs/index.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/index.rst b/docs/index.rst
index 0670927388..2520a3d98e 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -45,14 +45,14 @@ The examples below assume you build the latest image yourself from source. If us
     :margin: 0
     :padding: 3 4 0 0

-    .. grid-item-card:: :doc:`Hello World `
-        :link: /examples/hello_world
+    .. grid-item-card:: :doc:`Hello World `
+        :link: /examples/runtime/hello_world/README.md
         :link-type: doc

         Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph

-    .. grid-item-card:: :doc:`LLM Serving with VLLM `
-        :link: /components/backends/vllm
+    .. grid-item-card:: :doc:`LLM Serving with VLLM `
+        :link: /components/backends/vllm/README.md
         :link-type: doc

         Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with VLLM.

From 19d9211a5a71e1b2f463bb38824b127e0012af1f Mon Sep 17 00:00:00 2001
From: Anna Tchernych
Date: Thu, 24 Jul 2025 19:13:36 -0700
Subject: [PATCH 3/6] Add new links

---
 .../backends/sglang/docs/multinode-examples.md | 1 +
 docs/components/backends/trtllm/README.md | 1 +
 docs/components/backends/vllm/README.md | 1 +
 docs/examples/runtime/hello_world/README.md | 1 +
 docs/index.rst | 16 ++++++++--------
 5 files changed, 12 insertions(+), 8 deletions(-)
 create mode 120000 docs/components/backends/sglang/docs/multinode-examples.md
 create mode 120000 docs/components/backends/trtllm/README.md
 create mode 120000 docs/components/backends/vllm/README.md
 create mode 120000 docs/examples/runtime/hello_world/README.md

diff --git a/docs/components/backends/sglang/docs/multinode-examples.md b/docs/components/backends/sglang/docs/multinode-examples.md
new file mode 120000
index 0000000000..9929f08b4a
--- /dev/null
+++ b/docs/components/backends/sglang/docs/multinode-examples.md
@@ -0,0 +1 @@
+../../../../../components/backends/sglang/docs/multinode-examples.md
\ No newline at end of file
diff --git a/docs/components/backends/trtllm/README.md b/docs/components/backends/trtllm/README.md
new file mode 120000
index 0000000000..15969304d0
--- /dev/null
+++ b/docs/components/backends/trtllm/README.md
@@ -0,0 +1 @@
+../../../../components/backends/trtllm/README.md
\ No newline at end of file
diff --git a/docs/components/backends/vllm/README.md b/docs/components/backends/vllm/README.md
new file mode 120000
index 0000000000..ec40eb5e49
--- /dev/null
+++ b/docs/components/backends/vllm/README.md
@@ -0,0 +1 @@
+../../../../components/backends/vllm/README.md
\ No newline at end of file
diff --git a/docs/examples/runtime/hello_world/README.md b/docs/examples/runtime/hello_world/README.md
new file mode 120000
index 0000000000..aa7e284f34
--- /dev/null
+++ b/docs/examples/runtime/hello_world/README.md
@@ -0,0 +1 @@
+../../../../examples/runtime/hello_world/README.md
\ No newline at end of file
diff --git a/docs/index.rst b/docs/index.rst
index 2520a3d98e..d3a3e82ad6 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -45,26 +45,26 @@ The examples below assume you build the latest image yourself from source. If us
     :margin: 0
     :padding: 3 4 0 0

-    .. grid-item-card:: :doc:`Hello World `
-        :link: /examples/runtime/hello_world/README.md
+    .. grid-item-card:: :doc:`Hello World `
+        :link: examples/runtime/hello_world/README
         :link-type: doc

         Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph

-    .. grid-item-card:: :doc:`LLM Serving with VLLM `
-        :link: /components/backends/vllm/README.md
+    .. grid-item-card:: :doc:`LLM Serving with VLLM `
+        :link: components/backends/vllm/README
         :link-type: doc

         Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with VLLM.

-    .. grid-item-card:: :doc:`Multinode with SGLang `
-        :link: /components/backends/sglang/docs/multinode-examples
+    .. grid-item-card:: :doc:`Multinode with SGLang `
+        :link: components/backends/sglang/docs/multinode-examples
         :link-type: doc

         Demonstrates disaggregated serving on several nodes.

-    .. grid-item-card:: :doc:`TensorRT-LLM `
-        :link: /components/backends/trtllm
+    .. grid-item-card:: :doc:`TensorRT-LLM `
+        :link: components/backends/trtllm/README
         :link-type: doc

         Presents TensorRT-LLM examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.

From e790458f91451895e17862e809682f185b9666e0 Mon Sep 17 00:00:00 2001
From: Anna Tchernych
Date: Thu, 24 Jul 2025 19:32:07 -0700
Subject: [PATCH 4/6] Add docs/deploy/metrics/docker-compose

---
 docs/deploy/metrics/docker-compose.yml | 1 +
 1 file changed, 1 insertion(+)
 create mode 120000 docs/deploy/metrics/docker-compose.yml

diff --git a/docs/deploy/metrics/docker-compose.yml b/docs/deploy/metrics/docker-compose.yml
new file mode 120000
index 0000000000..f7c658ffff
--- /dev/null
+++ b/docs/deploy/metrics/docker-compose.yml
@@ -0,0 +1 @@
+../../../deploy/metrics/docker-compose.yml
\ No newline at end of file

From 36232abaaf43391d488b1bf58411b134a092edaa Mon Sep 17 00:00:00 2001
From: Anna Tchernych
Date: Thu, 24 Jul 2025 19:59:22 -0700
Subject: [PATCH 5/6] Add sym link

---
 docs/components/backends/llm/README.md | 1 +
 docs/index.rst | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)
 create mode 120000 docs/components/backends/llm/README.md

diff --git a/docs/components/backends/llm/README.md b/docs/components/backends/llm/README.md
new file mode 120000
index 0000000000..615da9417b
--- /dev/null
+++ b/docs/components/backends/llm/README.md
@@ -0,0 +1 @@
+../../../../components/backends/llm/README.md
\ No newline at end of file
diff --git a/docs/index.rst b/docs/index.rst
index d3a3e82ad6..c751f0d819 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -132,13 +132,16 @@ The examples below assume you build the latest image yourself from source. If us
    :hidden:
    :caption: Examples

-   Multinode Examples
+   Hello World
+   LLM Deployment Examples using VLLM
+   Multinode Examples using SGLang
    LLM Deployment Examples using TensorRT-LLM

 .. toctree::
    :hidden:
    :caption: Reference

+   Glossary
    KVBM Reading

From 321fc94f204de28a96facc5ba3ca95f9325fccad Mon Sep 17 00:00:00 2001
From: Anna Tchernych
Date: Fri, 25 Jul 2025 10:42:27 -0700
Subject: [PATCH 6/6] fix nvcri reference

---
 docs/guides/dynamo_deploy/operator_deployment.md | 1 +
 docs/guides/dynamo_deploy/quickstart.md | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
 create mode 120000 docs/guides/dynamo_deploy/operator_deployment.md

diff --git a/docs/guides/dynamo_deploy/operator_deployment.md b/docs/guides/dynamo_deploy/operator_deployment.md
new file mode 120000
index 0000000000..80ca4341ee
--- /dev/null
+++ b/docs/guides/dynamo_deploy/operator_deployment.md
@@ -0,0 +1 @@
+../../../guides/dynamo_deploy/operator_deployment.md
\ No newline at end of file
diff --git a/docs/guides/dynamo_deploy/quickstart.md b/docs/guides/dynamo_deploy/quickstart.md
index ebf2f57058..5639b92f87 100644
--- a/docs/guides/dynamo_deploy/quickstart.md
+++ b/docs/guides/dynamo_deploy/quickstart.md
@@ -67,7 +67,7 @@ Ensure you have the source code checked out and are in the `dynamo` directory:

 ### Set Environment Variables

-Our examples use the [`nvcr.io`](nvcr.io/nvidia/ai-dynamo/) but you can setup your own values if you use another docker registry.
+Our examples use the [`nvcr.io`](https://nvcr.io/nvidia/ai-dynamo/) but you can setup your own values if you use another docker registry.

 ```bash
 export NAMESPACE=dynamo-cloud # or whatever you prefer.