Merged
Changes from 2 commits
3 changes: 2 additions & 1 deletion docs/guides/dynamo_deploy/README.md
@@ -167,4 +167,5 @@ Key customization points include:
- **[Examples](/examples/README.md)** - Complete working examples
- **[Create Custom Deployments](/docs/guides/dynamo_deploy/create_deployment.md)** - Build your own CRDs
- **[Operator Documentation](/docs/guides/dynamo_deploy/dynamo_operator.md)** - How the platform works
- **[Helm Charts](/deploy/helm/README.md)** - For advanced users
- **[Helm Charts](/deploy/helm/README.md)** - For advanced users
- **[GitOps Deployment with FluxCD](/docs/guides/dynamo_deploy/fluxcd.md)** - For advanced users
111 changes: 2 additions & 109 deletions docs/guides/dynamo_deploy/dynamo_operator.md
@@ -19,122 +19,15 @@ Dynamo operator is a Kubernetes operator that simplifies the deployment, configu
3. Kubernetes resources (Deployments, Services, etc.) are created or updated to match the CR spec.
4. Status fields are updated to reflect the current state.



## Custom Resource Definitions (CRDs)

For the complete technical API reference for Dynamo Custom Resource Definitions, see:

**📖 [Dynamo CRD API Reference](../../../deploy/cloud/operator/docs/api_reference.md)**
**📖 [Dynamo CRD API Reference](/docs/guides/dynamo_deploy/api_reference.md)**

## Installation

[See installation steps](installation_guide.md#overview)


## GitOps Deployment with FluxCD

This section describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps enables you to manage your Dynamo deployments declaratively using Git as the source of truth. We'll use the [aggregated vLLM example](../../../components/backends/vllm/README.md) to demonstrate the workflow.

### Prerequisites

- A Kubernetes cluster with [Dynamo Cloud](installation_guide.md) installed
- [FluxCD](https://fluxcd.io/flux/installation/) installed in your cluster
- A Git repository to store your deployment configurations

### Workflow Overview

The GitOps workflow for Dynamo deployments consists of three main steps:

1. Build and push the Dynamo Operator
2. Create and commit a DynamoGraphDeployment custom resource for initial deployment
3. Update the graph by building a new version and updating the CR for subsequent updates

### Step 1: Build and Push Dynamo Cloud Operator

First, install Dynamo Cloud by following the [installation instructions](README.md).

### Step 2: Create Initial Deployment

Create a new file in your Git repository (e.g., `deployments/llm-agg.yaml`) with the following content:

```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: llm-agg
spec:
  services:
    Frontend:
      replicas: 1
      envs:
      - name: SPECIFIC_ENV_VAR
        value: some_specific_value
    Processor:
      replicas: 1
      envs:
      - name: SPECIFIC_ENV_VAR
        value: some_specific_value
    VllmWorker:
      replicas: 1
      envs:
      - name: SPECIFIC_ENV_VAR
        value: some_specific_value
      # Add PVC for model storage
      pvc:
        name: vllm-model-storage
        mountPath: /models
        size: 100Gi
```

Commit and push this file to your Git repository. FluxCD will detect the new CR and create the initial deployment in your cluster. The operator will:
- Create the specified PVCs
- Build container images for all components
- Deploy the services with the configured resources

### Step 3: Update Existing Deployment

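To update your pipeline, edit the associated DynamoGraphDeployment custom resource in your Git repository, then commit and push the change.

FluxCD applies the updated CR to the cluster, and the Dynamo operator automatically reconciles the deployment to match it.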

### Monitoring the Deployment

You can monitor the deployment status using:

```bash

export NAMESPACE=<namespace-with-the-dynamo-cloud-operator>

# Check the DynamoGraphDeployment status
kubectl get dynamographdeployment llm-agg -n $NAMESPACE
```

## Configuration


- **Environment Variables:**

| Name | Description | Default |
|----------------------------------------------------|--------------------------------------|--------------------------------------------------------|
| `LOG_LEVEL` | Logging verbosity level | `info` |
| `DYNAMO_SYSTEM_NAMESPACE` | System namespace | `dynamo` |

- **Flags:**
| Flag | Description | Default |
|-----------------------|--------------------------------------------|---------|
| `--natsAddr` | Address of NATS server | "" |
| `--etcdAddr` | Address of etcd server | "" |



## Troubleshooting

| Symptom | Possible Cause | Solution |
|------------------------|-------------------------------|-----------------------------------|
| Resource not created | RBAC missing | Ensure correct ClusterRole/Binding|
| Status not updated | CRD schema mismatch | Regenerate CRDs with kubebuilder |
| Image build hangs | Misconfigured DynamoComponent | Check image build logs |
[See installation steps](/docs/guides/dynamo_deploy/installation_guide.md#overview)


## Development
77 changes: 77 additions & 0 deletions docs/guides/dynamo_deploy/fluxcd.md
@@ -0,0 +1,77 @@
# GitOps Deployment with FluxCD

This guide describes how to use FluxCD for GitOps-based deployment of Dynamo inference graphs. GitOps lets you manage your Dynamo deployments declaratively, with Git as the source of truth. We'll use the [aggregated vLLM example](../../../components/backends/vllm/README.md) to demonstrate the workflow.

## Prerequisites

- A Kubernetes cluster with [Dynamo Cloud](/docs/guides/dynamo_deploy/installation_guide.md) installed
- [FluxCD](https://fluxcd.io/flux/installation/) installed in your cluster
- A Git repository to store your deployment configurations

## Workflow Overview

The GitOps workflow for Dynamo deployments consists of three main steps:

1. Build and push the Dynamo Operator
2. Create and commit a DynamoGraphDeployment custom resource for initial deployment
3. Update the graph by building a new version and updating the CR for subsequent updates

## Step 1: Build and Push Dynamo Cloud Operator

First, install Dynamo Cloud by following the [installation guide](/docs/guides/dynamo_deploy/installation_guide.md).

## Step 2: Create Initial Deployment

Create a new file in your Git repository (e.g., `deployments/llm-agg.yaml`) with the following content:

```yaml
apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: llm-agg
spec:
  services:
    Frontend:
      replicas: 1
      envs:
      - name: SPECIFIC_ENV_VAR
        value: some_specific_value
    Processor:
      replicas: 1
      envs:
      - name: SPECIFIC_ENV_VAR
        value: some_specific_value
    VllmWorker:
      replicas: 1
      envs:
      - name: SPECIFIC_ENV_VAR
        value: some_specific_value
      # Add PVC for model storage
      pvc:
        name: vllm-model-storage
        mountPath: /models
        size: 100Gi
```

Commit and push this file to your Git repository. FluxCD will detect the new CR and create the initial deployment in your cluster. The operator will:
- Create the specified PVCs
- Build container images for all components
- Deploy the services with the configured resources
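
For FluxCD to detect the file, it has to be told which repository and path to watch. The manifests below are a minimal sketch of one way to do that, not part of the Dynamo documentation: the resource name `dynamo-deployments`, the repository URL, the branch, the `flux-system` namespace, and the intervals are all placeholder assumptions. The API versions shown are the ones used by Flux 2.x GA releases; older installations may need a `v1beta2` version instead.

```yaml
# Hypothetical Flux source: URL, branch, and names are placeholders.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: dynamo-deployments
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/<your-org>/<your-repo>
  ref:
    branch: main
---
# Hypothetical Flux Kustomization: applies every manifest under ./deployments,
# which is where llm-agg.yaml was committed in the step above.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: dynamo-deployments
  namespace: flux-system
spec:
  interval: 5m
  path: ./deployments
  prune: true
  sourceRef:
    kind: GitRepository
    name: dynamo-deployments
  targetNamespace: <namespace-with-the-dynamo-cloud-operator>
```

With `prune: true`, removing `deployments/llm-agg.yaml` from Git would also remove the DynamoGraphDeployment from the cluster, so set it to `false` if you prefer to delete resources manually.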

## Step 3: Update Existing Deployment

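To update your pipeline, edit the associated DynamoGraphDeployment custom resource in your Git repository, then commit and push the change.

FluxCD applies the updated CR to the cluster, and the Dynamo operator automatically reconciles the deployment to match it.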
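
As an illustration, assume you want to run a second vLLM worker. The change committed to `deployments/llm-agg.yaml` could be as small as the excerpt below; the new replica count is a hypothetical value chosen for this example, and the rest of the file stays unchanged.

```yaml
# Excerpt of deployments/llm-agg.yaml after a hypothetical edit
spec:
  services:
    VllmWorker:
      replicas: 2  # previously 1
```

Because the change goes through Git, reverting the commit is enough to roll the deployment back to a single worker.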

## Monitoring the Deployment

You can monitor the deployment status using:

```bash

export NAMESPACE=<namespace-with-the-dynamo-cloud-operator>

# Check the DynamoGraphDeployment status
kubectl get dynamographdeployment llm-agg -n "$NAMESPACE"
```