30 changes: 29 additions & 1 deletion docs/guides/dynamo_deploy/README.md
@@ -36,7 +36,7 @@ Each backend has deployment examples and configuration options:

```bash
# Set same namespace from platform install
export NAMESPACE=dynamo-cloud
export NAMESPACE=<your-namespace-name>

# Deploy any example (this uses vLLM with Qwen model using aggregated serving)
kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
@@ -49,6 +49,34 @@ kubectl port-forward svc/agg-vllm-frontend 8000:8000 -n ${NAMESPACE}
curl http://localhost:8000/v1/models
```

### Container Images

Before deploying, specify which container images to use in your deployment YAML (the custom resource you apply). You have two options:

**Option 1: Use Public Images**
The simplest option is to use public images from the [NVIDIA NGC Catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo). Set the `image` field in your deployment YAML to the runtime for your backend:

```yaml
# In your deployment YAML:
image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:<version> # vLLM
image: nvcr.io/nvidia/ai-dynamo/sglang-runtime:<version> # SGLang
image: nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:<version> # TensorRT-LLM
```
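
If every service in a manifest should run the same image, the `image` field update can be scripted. The following is a minimal sketch, not part of the repository: `set_image` is a hypothetical helper name, and it assumes a `sed` that supports `-E` (GNU and BSD both do). It rewrites the value of every `image:` key in the file and drops any trailing comment on those lines, so it does not suit manifests that mix different images.

```bash
# Hypothetical helper: replace the value of every `image:` key in a YAML file.
# Usage: set_image <file> <image-reference>
set_image() {
  local file="$1" image="$2"
  # Capture the indentation (and an optional list dash) plus the key, then
  # swap only the value. `|` is the delimiter because image references contain `/`.
  sed -E "s|^([[:space:]]*(-[[:space:]]+)?image:[[:space:]]*).*$|\1${image}|" "$file" \
    > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Example call (the file path is the example manifest used earlier in this guide):
# set_image components/backends/vllm/deploy/agg.yaml "nvcr.io/nvidia/ai-dynamo/vllm-runtime:<version>"
```

Review the resulting file before running `kubectl apply`, since the substitution is purely textual.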

**Option 2: Build Your Own**
For customization or private deployments, build containers from source:

```bash
# Build from source
# Review comment (Contributor): Should we point to the container args for specifying the framework (vllm, sglang, trtllm) and the target (runtime for prod; I think dev for the dev version)?

./container/build.sh

# Tag and push to your registry
docker tag dynamo-runtime:latest your-registry/dynamo-runtime:latest
docker push your-registry/dynamo-runtime:latest
```
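
The tag and push steps are easier to reuse when the registry, image name, and tag are parameters. The following is a minimal sketch under stated assumptions: the local image is tagged `<name>:latest`, as in the commands above; `image_ref`, `push_image`, and `DRY_RUN` are hypothetical names introduced here; and `your-registry` remains a placeholder. With `DRY_RUN=1` the helper only prints the Docker commands it would run.

```bash
# Hypothetical helper: build a full image reference from its parts.
# Usage: image_ref <registry> <name> [tag]   (tag defaults to "latest")
image_ref() {
  local registry="$1" name="$2" tag="${3:-latest}"
  # Strip one trailing slash from the registry so the result has a single separator.
  printf '%s/%s:%s\n' "${registry%/}" "$name" "$tag"
}

# Hypothetical helper: tag the local <name>:latest image and push it.
# Usage: push_image <registry> <name> [tag]
push_image() {
  local target
  target="$(image_ref "$1" "$2" "${3:-latest}")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: show the commands without contacting Docker or the registry.
    echo "docker tag $2:latest $target"
    echo "docker push $target"
  else
    docker tag "$2:latest" "$target" && docker push "$target"
  fi
}

# Preview the commands for the placeholder registry used above.
DRY_RUN=1 push_image your-registry dynamo-runtime latest
```

Pushing with an explicit version tag rather than `latest` makes it clearer which build a deployment is running.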

> **Note**: We're working to update all example YAMLs to default to public images with `:latest` tags for easier deployment.

## What's a DynamoGraphDeployment?

It's a Kubernetes Custom Resource that defines your inference pipeline: