diff --git a/docs/components/backends/llm/README.md b/docs/components/backends/llm/README.md
new file mode 120000
index 0000000000..615da9417b
--- /dev/null
+++ b/docs/components/backends/llm/README.md
@@ -0,0 +1 @@
+../../../../components/backends/llm/README.md
\ No newline at end of file
diff --git a/docs/components/backends/sglang/docs/multinode-examples.md b/docs/components/backends/sglang/docs/multinode-examples.md
new file mode 120000
index 0000000000..9929f08b4a
--- /dev/null
+++ b/docs/components/backends/sglang/docs/multinode-examples.md
@@ -0,0 +1 @@
+../../../../../components/backends/sglang/docs/multinode-examples.md
\ No newline at end of file
diff --git a/docs/components/backends/trtllm/README.md b/docs/components/backends/trtllm/README.md
new file mode 120000
index 0000000000..15969304d0
--- /dev/null
+++ b/docs/components/backends/trtllm/README.md
@@ -0,0 +1 @@
+../../../../components/backends/trtllm/README.md
\ No newline at end of file
diff --git a/docs/components/backends/vllm/README.md b/docs/components/backends/vllm/README.md
new file mode 120000
index 0000000000..ec40eb5e49
--- /dev/null
+++ b/docs/components/backends/vllm/README.md
@@ -0,0 +1 @@
+../../../../components/backends/vllm/README.md
\ No newline at end of file
diff --git a/docs/deploy/metrics/docker-compose.yml b/docs/deploy/metrics/docker-compose.yml
new file mode 120000
index 0000000000..f7c658ffff
--- /dev/null
+++ b/docs/deploy/metrics/docker-compose.yml
@@ -0,0 +1 @@
+../../../deploy/metrics/docker-compose.yml
\ No newline at end of file
diff --git a/docs/examples/runtime/hello_world/README.md b/docs/examples/runtime/hello_world/README.md
new file mode 120000
index 0000000000..aa7e284f34
--- /dev/null
+++ b/docs/examples/runtime/hello_world/README.md
@@ -0,0 +1 @@
+../../../../examples/runtime/hello_world/README.md
\ No newline at end of file
diff --git a/docs/guides/dynamo_deploy/operator_deployment.md b/docs/guides/dynamo_deploy/operator_deployment.md
new file mode 120000
index 0000000000..80ca4341ee
--- /dev/null
+++ b/docs/guides/dynamo_deploy/operator_deployment.md
@@ -0,0 +1 @@
+../../../guides/dynamo_deploy/operator_deployment.md
\ No newline at end of file
diff --git a/docs/guides/dynamo_deploy/quickstart.md b/docs/guides/dynamo_deploy/quickstart.md
index ebf2f57058..5639b92f87 100644
--- a/docs/guides/dynamo_deploy/quickstart.md
+++ b/docs/guides/dynamo_deploy/quickstart.md
@@ -67,7 +67,7 @@ Ensure you have the source code checked out and are in the `dynamo` directory:
### Set Environment Variables
-Our examples use the [`nvcr.io`](nvcr.io/nvidia/ai-dynamo/) but you can setup your own values if you use another docker registry.
+Our examples use [`nvcr.io`](https://nvcr.io/nvidia/ai-dynamo/), but you can set up your own values if you use another Docker registry.
```bash
export NAMESPACE=dynamo-cloud # or whatever you prefer.
diff --git a/docs/index.rst b/docs/index.rst
index 7000e786e5..c751f0d819 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -45,26 +45,26 @@ The examples below assume you build the latest image yourself from source. If us
:margin: 0
:padding: 3 4 0 0
- .. grid-item-card:: :doc:`Hello World </examples/hello_world>`
- :link: /examples/hello_world
+ .. grid-item-card:: :doc:`Hello World <examples/runtime/hello_world/README>`
+ :link: examples/runtime/hello_world/README
:link-type: doc
- Demonstrates the basic concepts of Dynamo by creating a simple multi-service pipeline.
+ Demonstrates the basic concepts of Dynamo by creating a simple GPU-unaware graph.
- .. grid-item-card:: :doc:`LLM Deployment </examples/llm_deployment>`
- :link: /examples/llm_deployment
+ .. grid-item-card:: :doc:`LLM Serving with vLLM <components/backends/vllm/README>`
+ :link: components/backends/vllm/README
:link-type: doc
- Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
+ Presents examples and reference implementations for deploying Large Language Models (LLMs) in various configurations with vLLM.
- .. grid-item-card:: :doc:`Multinode </examples/multinode>`
- :link: /examples/multinode
+ .. grid-item-card:: :doc:`Multinode with SGLang <components/backends/sglang/docs/multinode-examples>`
+ :link: components/backends/sglang/docs/multinode-examples
:link-type: doc
- Demonstrates deployment for disaggregated serving on 3 nodes using `nvidia/Llama-3.1-405B-Instruct-FP8`.
+ Demonstrates disaggregated serving on several nodes.
- .. grid-item-card:: :doc:`TensorRT-LLM </examples/trtllm>`
- :link: /examples/trtllm
+ .. grid-item-card:: :doc:`TensorRT-LLM <components/backends/trtllm/README>`
+ :link: components/backends/trtllm/README
:link-type: doc
Presents TensorRT-LLM examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
@@ -110,7 +110,7 @@ The examples below assume you build the latest image yourself from source. If us
Dynamo Deploy Quickstart
Dynamo Cloud Kubernetes Platform
- Manual Helm Deployment
+ Manual Helm Deployment
GKE Setup Guide
Minikube Setup Guide
Model Caching with Fluid
@@ -126,22 +126,22 @@ The examples below assume you build the latest image yourself from source. If us
:hidden:
:caption: API
- Python API
NIXL Connect API
.. toctree::
:hidden:
:caption: Examples
- Aggregated and Disaggregated Deployment
- LLM Deployment Examples
- Multinode Examples
- LLM Deployment Examples using TensorRT-LLM
+   Hello World <examples/runtime/hello_world/README>
+   LLM Deployment Examples using vLLM <components/backends/vllm/README>
+   Multinode Examples using SGLang <components/backends/sglang/docs/multinode-examples>
+   LLM Deployment Examples using TensorRT-LLM <components/backends/trtllm/README>
.. toctree::
:hidden:
:caption: Reference
+
Glossary
KVBM Reading