Merged
feat: revamp kubernetes doc
julienmancuso committed Sep 23, 2025
commit 1df43bec2c607459596104d9292beb5091b0fa62
2 changes: 1 addition & 1 deletion docs/benchmarks/pre_deployment_profiling.md
@@ -89,7 +89,7 @@ SLA planner can work with any interpolation data that follows the above format.

## Running the Profiling Script in Kubernetes

- Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](../../deploy/README.md), then set up profiling resources using [deploy/utils/README](../../deploy/utils/README.md). If your namespace is already set up, skip this step.
+ Set up your Kubernetes namespace for profiling (one-time per namespace). First ensure Dynamo Cloud platform is installed by following the [main installation guide](/docs/kubernetes/installation_guide.md), then set up profiling resources using [deploy/utils/README](/deploy/utils/README.md). If your namespace is already set up, skip this step.

**Prerequisites**: Ensure all dependencies are installed. If you ran the setup script above, dependencies are already installed. Otherwise, install them manually:
```bash
6 changes: 3 additions & 3 deletions docs/kubernetes/create_deployment.md
@@ -13,13 +13,13 @@
For example, when using the `VLLM` inference backend:

- **Development / Testing**
- Use [`agg.yaml`](../../../components/backends/vllm/deploy/agg.yaml) as the base configuration.
+ Use [`agg.yaml`](/components/backends/vllm/deploy/agg.yaml) as the base configuration.

- **Production with Load Balancing**
- Use [`agg_router.yaml`](../../../components/backends/vllm/deploy/agg_router.yaml) to enable scalable, load-balanced inference.
+ Use [`agg_router.yaml`](/components/backends/vllm/deploy/agg_router.yaml) to enable scalable, load-balanced inference.

- **High Performance / Disaggregated Deployment**
- Use [`disagg_router.yaml`](../../../components/backends/vllm/deploy/disagg_router.yaml) for maximum throughput and modular scalability.
+ Use [`disagg_router.yaml`](/components/backends/vllm/deploy/disagg_router.yaml) for maximum throughput and modular scalability.


## Step 2: Customize the Template
@@ -90,7 +90,7 @@

The front end is launched with "python3 -m dynamo.frontend [--http-port 8000] [--router-mode kv]"
Each worker will launch `python -m dynamo.YOUR_INFERENCE_BACKEND --model YOUR_MODEL --your-flags `command.
If you are a Dynamo contributor the [dynamo run guide](../dynamo_run.md) for details on how to run this command.

Check failure on line 93 in docs/kubernetes/create_deployment.md (GitHub Actions / Check for broken markdown links)
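For readers unfamiliar with this kind of check, the following is a minimal sketch of how a broken-markdown-link check could work. It is not the repository's actual workflow: the function name `find_broken_links`, the regular expression, and the rule that a leading `/` means "relative to the repository root" (GitHub's rendering convention) are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches inline markdown links (and images) of the form [text](target).
# Reference-style links are out of scope for this sketch.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(repo_root: Path, md_file: Path) -> list[tuple[int, str]]:
    """Return (line_number, target) for local links whose file is missing.

    Hypothetical helper: assumes targets starting with "/" are relative to
    repo_root and all other targets are relative to md_file's directory.
    """
    broken = []
    text = md_file.read_text(encoding="utf-8")
    for lineno, line in enumerate(text.splitlines(), start=1):
        for target in LINK_RE.findall(line):
            # Skip external URLs (scheme:...) and pure in-page anchors.
            if re.match(r"^[a-z][a-z0-9+.-]*:", target) or target.startswith("#"):
                continue
            # Drop any "#fragment" before checking the file on disk.
            path_part = target.split("#", 1)[0]
            if path_part.startswith("/"):
                candidate = repo_root / path_part.lstrip("/")
            else:
                candidate = md_file.parent / path_part
            if not candidate.exists():
                broken.append((lineno, target))
    return broken
```

Under these assumptions, a link such as `../dynamo_run.md` in `docs/kubernetes/create_deployment.md` would be reported whenever `docs/dynamo_run.md` does not exist in the checkout.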



## Step 3: Key Customization Points
2 changes: 1 addition & 1 deletion docs/kubernetes/fluxcd.md
@@ -1,6 +1,6 @@
# GitOps Deployment with FluxCD

- This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](../../../components/backends/vllm/README.md) to demonstrate the workflow.
+ This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](/components/backends/vllm/README.md) to demonstrate the workflow.

## Prerequisites

8 changes: 4 additions & 4 deletions docs/kubernetes/installation_guide.md
@@ -21,7 +21,7 @@

## Quick Start Paths

Platform is installed using Dynamo Kubernetes Platform [helm chart](../../../deploy/cloud/helm/platform/README.md).

Check failure on line 24 in docs/kubernetes/installation_guide.md (GitHub Actions / Check for broken markdown links)

Broken link: [helm chart](../../../deploy/cloud/helm/platform/README.md) - View: https://github.com/ai-dynamo/dynamo/blob/HEAD/docs/kubernetes/installation_guide.md?plain=1#L24
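
A short sketch may help show why this link fails while the rewritten links in this commit resolve. Assuming relative targets are resolved against the directory of the file that contains them, `../../../` from `docs/kubernetes/` climbs one level above the repository root. The helper `to_root_absolute` below is hypothetical (it is not part of the repository) and only illustrates the path arithmetic behind the root-absolute form used elsewhere in this diff.

```python
import posixpath
from typing import Optional


def to_root_absolute(source_file: str, target: str) -> Optional[str]:
    """Rewrite a relative link target as a repo-root-absolute one.

    source_file is the repo-relative path of the markdown file that
    contains the link. Returns None when the target climbs above the
    repository root, which is the case a link checker would flag.
    """
    if target.startswith("/"):
        # Already root-absolute; just normalize it.
        return posixpath.normpath(target)
    base_dir = posixpath.dirname(source_file)
    resolved = posixpath.normpath(posixpath.join(base_dir, target))
    # A leading ".." after normalization means the path escaped the repo.
    if resolved == ".." or resolved.startswith("../"):
        return None
    return "/" + resolved


# The flagged link: three "../" segments from a file two levels deep.
print(to_root_absolute(
    "docs/kubernetes/installation_guide.md",
    "../../../deploy/cloud/helm/platform/README.md",
))  # None

# A two-level relative link from the same directory stays inside the repo.
print(to_root_absolute(
    "docs/kubernetes/metrics.md",
    "../../components/backends/vllm/README.md",
))  # /components/backends/vllm/README.md
```

By this reading, the root-absolute form `/deploy/cloud/helm/platform/README.md`, which the commit already uses under "Advanced Options" in the same file, would be the consistent fix for line 24.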

**Path A: Production Install**
Install from published artifacts on your existing cluster → [Jump to Path A](#path-a-production-install)
@@ -173,9 +173,9 @@
```

2. **Explore Backend Guides**
- - [vLLM Deployments](../../../components/backends/vllm/deploy/README.md)
- - [SGLang Deployments](../../../components/backends/sglang/deploy/README.md)
- - [TensorRT-LLM Deployments](../../../components/backends/trtllm/deploy/README.md)
+ - [vLLM Deployments](/components/backends/vllm/deploy/README.md)
+ - [SGLang Deployments](/components/backends/sglang/deploy/README.md)
+ - [TensorRT-LLM Deployments](/components/backends/trtllm/deploy/README.md)

3. **Optional:**
- [Set up Prometheus & Grafana](metrics.md)
@@ -215,7 +215,7 @@

## Advanced Options

- - [Helm Chart Configuration](../../../deploy/cloud/helm/platform/README.md)
+ - [Helm Chart Configuration](/deploy/cloud/helm/platform/README.md)
- [GKE-specific setup](gke_setup.md)
- [Create custom deployments](create_deployment.md)
- [Dynamo Operator details](dynamo_operator.md)
4 changes: 2 additions & 2 deletions docs/kubernetes/metrics.md
@@ -64,8 +64,8 @@ This will create two components:
- A Worker component exposing metrics on its system port

Both components expose a `/metrics` endpoint following the OpenMetrics format, but with different metrics appropriate to their roles. For details about:
- - Deployment configuration: See the [vLLM README](../../components/backends/vllm/README.md)
- - Available metrics: See the [metrics guide](../metrics.md)
+ - Deployment configuration: See the [vLLM README](/components/backends/vllm/README.md)
+ - Available metrics: See the [metrics guide](/docs/guides/metrics.md)

### Validate the Deployment
