feat: revamp kubernetes doc

ai-dynamo · julienmancuso · Sep 23, 2025 · Sep 22, 2025 · Sep 22, 2025 · Sep 23, 2025
commit 1df43bec2c607459596104d9292beb5091b0fa62
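
The hunks below replace `../`-style relative Markdown links with repository-root-absolute paths. The following is a minimal sketch of that kind of rewrite, not the tooling used for this commit: `to_root_absolute` and `rewrite_links` are hypothetical names, and the sketch assumes document paths are given relative to the repository root. Sibling links such as `metrics.md` are left untouched, as in the diff. Under plain path resolution, a link like `../../../components/...` from a file in `docs/kubernetes/` climbs above the repository root, so the sketch returns `None` for it and leaves the link for manual review.

```python
import posixpath
import re

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def to_root_absolute(doc_path, target):
    """Resolve a '../' link against doc_path (repo-root-relative).

    Returns the repo-root-absolute path, the unchanged target if it does not
    start with '../', or None if the link climbs above the repository root.
    """
    if not target.startswith("../"):
        return target
    base_dir = posixpath.dirname(doc_path)
    resolved = posixpath.normpath(posixpath.join(base_dir, target))
    # A leading '..' after normalization means the link escapes the repo root.
    if resolved == ".." or resolved.startswith("../"):
        return None
    return "/" + resolved


def rewrite_links(doc_path, text):
    """Rewrite every resolvable '../' link in text; keep the rest as-is."""
    def repl(match):
        new_target = to_root_absolute(doc_path, match.group(2))
        if new_target is None:
            return match.group(0)  # unresolved: leave for manual review
        return f"[{match.group(1)}]({new_target})"

    return LINK_RE.sub(repl, text)
```

For example, calling `rewrite_links("docs/kubernetes/metrics.md", line)` on the old vLLM README line from the `metrics.md` hunk yields the `/components/backends/vllm/README.md` target shown on the corresponding `+` line.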
diff --git a/docs/benchmarks/pre_deployment_profiling.md b/docs/benchmarks/pre_deployment_profiling.md
@@ -89,7 +89,7 @@ SLA planner can work with any interpolation data that follows the above format.
 
 ## Running the Profiling Script in Kubernetes
 
-Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](../../deploy/README.md), then set up profiling resources using [deploy/utils/README](../../deploy/utils/README.md). If your namespace is already set up, skip this step.
+Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](/docs/kubernetes/installation_guide.md), then set up profiling resources using [deploy/utils/README](/deploy/utils/README.md). If your namespace is already set up, skip this step.
 
 **Prerequisites**: Ensure all dependencies are installed. If you ran the setup script above, dependencies are already installed. Otherwise, install them manually:
 ```bash

diff --git a/docs/kubernetes/create_deployment.md b/docs/kubernetes/create_deployment.md
@@ -13,13 +13,13 @@
 For example, when using the `VLLM` inference backend:
 
 - **Development / Testing**
-  Use [`agg.yaml`](../../../components/backends/vllm/deploy/agg.yaml) as the base configuration.
+  Use [`agg.yaml`](/components/backends/vllm/deploy/agg.yaml) as the base configuration.
 
 - **Production with Load Balancing**
-  Use [`agg_router.yaml`](../../../components/backends/vllm/deploy/agg_router.yaml) to enable scalable, load-balanced inference.
+  Use [`agg_router.yaml`](/components/backends/vllm/deploy/agg_router.yaml) to enable scalable, load-balanced inference.
 
 - **High Performance / Disaggregated Deployment**
-  Use [`disagg_router.yaml`](../../../components/backends/vllm/deploy/disagg_router.yaml) for maximum throughput and modular scalability.
+  Use [`disagg_router.yaml`](/components/backends/vllm/deploy/disagg_router.yaml) for maximum throughput and modular scalability.
 
 
 ## Step 2: Customize the Template
@@ -90,7 +90,7 @@

 The front end is launched with `python3 -m dynamo.frontend [--http-port 8000] [--router-mode kv]`.
 Each worker launches the `python -m dynamo.YOUR_INFERENCE_BACKEND --model YOUR_MODEL --your-flags` command.
 If you are a Dynamo contributor, see the [dynamo run guide](../dynamo_run.md) for details on how to run this command.


 ## Step 3: Key Customization Points

diff --git a/docs/kubernetes/fluxcd.md b/docs/kubernetes/fluxcd.md
@@ -1,6 +1,6 @@
 # GitOps Deployment with FluxCD
 
-This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](../../../components/backends/vllm/README.md) to demonstrate the workflow.
+This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](/components/backends/vllm/README.md) to demonstrate the workflow.
 
 ## Prerequisites
 

diff --git a/docs/kubernetes/installation_guide.md b/docs/kubernetes/installation_guide.md
@@ -21,7 +21,7 @@

 ## Quick Start Paths

 The platform is installed using the Dynamo Kubernetes Platform [Helm chart](../../../deploy/cloud/helm/platform/README.md).

 **Path A: Production Install**
 Install from published artifacts on your existing cluster → [Jump to Path A](#path-a-production-install)
@@ -173,9 +173,9 @@
    ```
 
 2. **Explore Backend Guides**
-   - [vLLM Deployments](../../../components/backends/vllm/deploy/README.md)
-   - [SGLang Deployments](../../../components/backends/sglang/deploy/README.md)
-   - [TensorRT-LLM Deployments](../../../components/backends/trtllm/deploy/README.md)
+   - [vLLM Deployments](/components/backends/vllm/deploy/README.md)
+   - [SGLang Deployments](/components/backends/sglang/deploy/README.md)
+   - [TensorRT-LLM Deployments](/components/backends/trtllm/deploy/README.md)
 
 3. **Optional:**
    - [Set up Prometheus & Grafana](metrics.md)
@@ -215,7 +215,7 @@
 
 ## Advanced Options
 
-- [Helm Chart Configuration](../../../deploy/cloud/helm/platform/README.md)
+- [Helm Chart Configuration](/deploy/cloud/helm/platform/README.md)
 - [GKE-specific setup](gke_setup.md)
 - [Create custom deployments](create_deployment.md)
 - [Dynamo Operator details](dynamo_operator.md)

diff --git a/docs/kubernetes/metrics.md b/docs/kubernetes/metrics.md
@@ -64,8 +64,8 @@ This will create two components:
 - A Worker component exposing metrics on its system port
 
 Both components expose a `/metrics` endpoint following the OpenMetrics format, but with different metrics appropriate to their roles. For details about:
-- Deployment configuration: See the [vLLM README](../../components/backends/vllm/README.md)
-- Available metrics: See the [metrics guide](../metrics.md)
+- Deployment configuration: See the [vLLM README](/components/backends/vllm/README.md)
+- Available metrics: See the [metrics guide](/docs/guides/metrics.md)
 
 ### Validate the Deployment