feat(bundler): add --dynamic flag for install-time values (#515)
lockwobr committed Apr 14, 2026
commit 6bc90240bf6a436d49a8fda918b4ed9711afecfe
93 changes: 93 additions & 0 deletions docs/user/cli-reference.md
@@ -806,6 +806,7 @@ aicr bundle [flags]
| `--deployer` | `-d` | string | Deployment method: helm (default), argocd |
| `--repo` | | string | Git repository URL for ArgoCD applications (only used with `--deployer argocd`) |
| `--set` | | string[] | Override values in bundle files (repeatable). Use `enabled` key to include/exclude components (e.g., `--set awsebscsidriver:enabled=false`) |
| `--dynamic` | | string[] | Declare value paths as install-time parameters (repeatable, format: `component:path`). See [Dynamic Install-Time Values](#dynamic-install-time-values). |
| `--data` | | string | External data directory to overlay on embedded data (see [External Data](#external-data-directory)) |
| `--system-node-selector` | | string[] | Node selector for system components (format: key=value, repeatable) |
| `--system-node-toleration` | | string[] | Toleration for system components (format: key=value:effect, repeatable) |
@@ -980,6 +981,98 @@ aicr bundle -r recipe.yaml \
-o ./bundles
```

**Dynamic Install-Time Values (`--dynamic`):**

The `--dynamic` flag declares value paths that are cluster-specific and should be provided at install time rather than baked into the bundle at build time. This enables building a single bundle that can be deployed to multiple clusters with different configurations.

Use `--dynamic` for values that genuinely vary per cluster — cluster names, subnet IDs, endpoint URLs, region-specific settings. For values that are static per bundle but differ from the recipe default (e.g., a specific driver version), use `--set` instead.

| Use case | Flag | Example |
|----------|------|---------|
| Cluster-specific value (varies per deployment) | `--dynamic` | `--dynamic alloy:clusterName` |
| Static override (same for all deployments of this bundle) | `--set` | `--set gpuoperator:driver.version=580.105.08` |

```shell
--dynamic component:path.to.field
```

**Format:** `component:path` where:
- `component` - Component name or override key (same keys as `--set`, e.g., `gpuoperator`, `alloy`)
- `path` - Dot-separated path to the value that varies per cluster
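
The `component:path` format above can be sketched in Go as a small parser that groups repeated flag values by component. This is a minimal illustration, not the aicr implementation: `parseDynamicFlags` is a hypothetical helper, and the validation rules (both sides of the colon must be non-empty) are an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDynamicFlags groups "component:path" strings by component key.
// Hypothetical helper for illustration; the real CLI may validate differently.
func parseDynamicFlags(flags []string) (map[string][]string, error) {
	out := make(map[string][]string)
	for _, f := range flags {
		// Split on the first colon only, so the path keeps any later colons.
		component, path, ok := strings.Cut(f, ":")
		if !ok || component == "" || path == "" {
			return nil, fmt.Errorf("invalid --dynamic value %q: expected component:path", f)
		}
		out[component] = append(out[component], path)
	}
	return out, nil
}

func main() {
	parsed, err := parseDynamicFlags([]string{
		"alloy:clusterName",
		"alloy:subnetName",
		"gpuoperator:driver.version",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("alloy:", parsed["alloy"])
	fmt.Println("gpuoperator:", parsed["gpuoperator"])
}
```

The resulting `map[string][]string` has the same shape that `config.WithDynamicValues` accepts in the tests in this change.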

**Helm deployer behavior:**

Dynamic paths are removed from `values.yaml` and written to a separate `cluster-values.yaml` per component. The generated `deploy.sh` passes both files to Helm:

```shell
helm upgrade --install gpu-operator ... \
-f values.yaml \
-f cluster-values.yaml
```

Before deploying, fill in `cluster-values.yaml` with cluster-specific values.
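
The split described above can be sketched in Go. This is a minimal illustration under stated assumptions, not the generator's actual code: `splitDynamic` is a hypothetical helper, and it assumes the stub written to `cluster-values.yaml` is an empty string at each dynamic path.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDynamic removes each dot-separated path from values (in place) and
// returns a second map holding an empty-string stub at the same path.
// Hypothetical helper; the stub shape is an assumption.
func splitDynamic(values map[string]any, paths []string) map[string]any {
	stubs := make(map[string]any)
	for _, p := range paths {
		keys := strings.Split(p, ".")
		src, dst := values, stubs
		for i, k := range keys {
			if i == len(keys)-1 {
				delete(src, k) // no-op if src is nil or the key is absent
				dst[k] = ""
				break
			}
			// Descend into the static values; src becomes nil if the path is missing.
			next, _ := src[k].(map[string]any)
			src = next
			// Create intermediate maps in the stub tree as needed.
			child, ok := dst[k].(map[string]any)
			if !ok {
				child = make(map[string]any)
				dst[k] = child
			}
			dst = child
		}
	}
	return stubs
}

func main() {
	values := map[string]any{
		"clusterName": "dev",
		"driver":      map[string]any{"version": "580.105.08", "enabled": true},
	}
	stubs := splitDynamic(values, []string{"clusterName", "driver.version"})
	fmt.Println("values.yaml:", values)
	fmt.Println("cluster-values.yaml:", stubs)
}
```

Because Helm applies later `-f` files over earlier ones, values filled into `cluster-values.yaml` take effect without touching the static file.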

**ArgoCD deployer behavior:**

When `--dynamic` is used with `--deployer argocd`, the deployer automatically generates a Helm chart app-of-apps instead of flat Application manifests. Each component's Application template injects values via `valuesObject`, and dynamic values are supplied at install time via `helm install --set`:

```shell
helm install aicr-bundle ./bundle \
--set alloy.clusterName=prod-east \
--set alloy.subnetName=subnet-abc123
```

Without `--dynamic`, the ArgoCD deployer produces flat manifests as usual.

**Examples:**
```shell
# Declare cluster name as install-time parameter
aicr bundle -r recipe.yaml \
--dynamic alloy:clusterName \
-o ./bundles

# Multiple dynamic paths on the same component (repeat the flag)
aicr bundle -r recipe.yaml \
--dynamic alloy:clusterName \
--dynamic alloy:subnetName \
-o ./bundles

# Combine with --set (static overrides + dynamic cluster-specific values)
aicr bundle -r recipe.yaml \
--set gpuoperator:driver.version=580.105.08 \
--dynamic alloy:clusterName \
-o ./bundles

# ArgoCD with dynamic values (produces Helm chart app-of-apps)
aicr bundle -r recipe.yaml \
--deployer argocd \
--dynamic alloy:clusterName \
-o ./bundles
```

**Bundle structure with `--dynamic`** (Helm deployer):
```
bundles/
├── alloy/
│ ├── values.yaml # Static values (clusterName removed)
│ └── cluster-values.yaml # Dynamic stubs (fill in before deploying)
├── gpu-operator/
│ └── values.yaml # No dynamic values, no cluster-values.yaml
├── deploy.sh # Passes -f cluster-values.yaml when present
└── ...
```

**ArgoCD Helm chart structure with `--dynamic`:**
```
bundles/
├── Chart.yaml # Helm chart metadata
├── values.yaml # All component values (dynamic paths are empty stubs)
├── templates/
│ ├── alloy.yaml # ArgoCD Application template with valuesObject
│ └── gpu-operator.yaml
└── README.md
```

**Bundle structure** (with default Helm deployer):
```
bundles/
93 changes: 93 additions & 0 deletions pkg/bundler/bundler.go
@@ -30,6 +30,7 @@ import (
"github.com/NVIDIA/aicr/pkg/bundler/checksum"
"github.com/NVIDIA/aicr/pkg/bundler/config"
"github.com/NVIDIA/aicr/pkg/bundler/deployer/argocd"
"github.com/NVIDIA/aicr/pkg/bundler/deployer/argocdhelm"
"github.com/NVIDIA/aicr/pkg/bundler/deployer/helm"
"github.com/NVIDIA/aicr/pkg/bundler/result"
"github.com/NVIDIA/aicr/pkg/bundler/validations"
@@ -254,6 +255,9 @@ func (b *DefaultBundler) Make(ctx context.Context, input recipe.RecipeInput, dir
// Route based on deployer
deployer := b.Config.Deployer()
if deployer == config.DeployerArgoCD {
if b.Config.HasDynamicValues() {
return b.makeArgoCDHelmChart(ctx, recipeResult, componentValues, dir, start)
}
return b.makeArgoCD(ctx, recipeResult, componentValues, dir, start)
}
return b.makeHelmBundle(ctx, recipeResult, componentValues, dir, start)
@@ -280,6 +284,12 @@ func (b *DefaultBundler) makeHelmBundle(ctx context.Context, recipeResult *recip
"failed to copy external data files", err)
}

// Resolve dynamic values for Helm deployer
dynamicValues, dynErr := b.buildDynamicValuesMap()
if dynErr != nil {
return nil, dynErr
}

// Generate per-component bundle
generator := helm.NewGenerator()
generatorInput := &helm.GeneratorInput{
@@ -289,6 +299,7 @@ func (b *DefaultBundler) makeHelmBundle(ctx context.Context, recipeResult *recip
IncludeChecksums: b.Config.IncludeChecksums(),
ComponentManifests: componentManifests,
DataFiles: dataFiles,
DynamicValues: dynamicValues,
}

output, err := generator.Generate(ctx, generatorInput, dir)
@@ -351,6 +362,62 @@ func (b *DefaultBundler) makeHelmBundle(ctx context.Context, recipeResult *recip
return resultOutput, nil
}

// makeArgoCDHelmChart generates a Helm chart app-of-apps for ArgoCD with dynamic install-time values.
func (b *DefaultBundler) makeArgoCDHelmChart(ctx context.Context, recipeResult *recipe.RecipeResult, componentValues map[string]map[string]any, dir string, start time.Time) (*result.Output, error) {
dynamicValues, dynErr := b.buildDynamicValuesMap()
if dynErr != nil {
return nil, dynErr
}

slog.Debug("generating argocd helm chart app-of-apps",
"component_count", len(recipeResult.ComponentRefs),
"dynamic_components", len(dynamicValues),
"output_dir", dir,
)

generator := &argocdhelm.Generator{
RecipeResult: recipeResult,
ComponentValues: componentValues,
Version: b.Config.Version(),
RepoURL: b.Config.RepoURL(),
TargetRevision: b.Config.TargetRevision(),
IncludeChecksums: b.Config.IncludeChecksums(),
DynamicValues: dynamicValues,
}

output, err := generator.Generate(ctx, dir)
if err != nil {
return nil, errors.Wrap(errors.ErrCodeInternal,
"failed to generate argocd helm chart", err)
}

resultOutput := &result.Output{
Results: make([]*result.Result, 0),
Errors: make([]result.BundleError, 0),
TotalDuration: time.Since(start),
TotalSize: output.TotalSize,
TotalFiles: len(output.Files),
OutputDir: dir,
}

argocdResult := &result.Result{
Type: "argocd-helm-chart",
Success: true,
Files: output.Files,
Size: output.TotalSize,
Duration: output.Duration,
}
resultOutput.Results = append(resultOutput.Results, argocdResult)

resultOutput.Deployment = &result.DeploymentInfo{
Type: "ArgoCD Helm chart app-of-apps",
Steps: output.DeploymentSteps,
Notes: b.warnings,
}

return resultOutput, nil
}

// makeArgoCD generates ArgoCD Application manifests.
func (b *DefaultBundler) makeArgoCD(ctx context.Context, recipeResult *recipe.RecipeResult, componentValues map[string]map[string]any, dir string, start time.Time) (*result.Output, error) {
slog.Debug("generating argocd applications",
@@ -909,6 +976,32 @@ func (b *DefaultBundler) writeRecipeFile(recipeResult *recipe.RecipeResult, dir
return int64(len(recipeData)), nil
}

// buildDynamicValuesMap re-keys the config's dynamic values from user override keys
// (e.g., "gpuoperator") to component names (e.g., "gpu-operator") using the registry.
func (b *DefaultBundler) buildDynamicValuesMap() (map[string][]string, error) {
if !b.Config.HasDynamicValues() {
return make(map[string][]string), nil
}

registry, err := recipe.GetComponentRegistry()
if err != nil {
return nil, errors.Wrap(errors.ErrCodeInternal, "failed to load component registry for --dynamic resolution", err)
}

raw := b.Config.DynamicValues()
result := make(map[string][]string, len(raw))
for key, paths := range raw {
comp := registry.GetByOverrideKey(key)
if comp == nil {
return nil, errors.New(errors.ErrCodeInvalidRequest,
fmt.Sprintf("unknown component %q in --dynamic flag: not found in component registry", key))
}
result[comp.Name] = append(result[comp.Name], paths...)
}

return result, nil
}

// removeHyphens removes hyphens from a string.
func removeHyphens(s string) string {
return strings.ReplaceAll(s, "-", "")
141 changes: 141 additions & 0 deletions pkg/bundler/bundler_test.go
@@ -1072,6 +1072,147 @@ func TestMake_Reproducible(t *testing.T) {
t.Logf("Reproducibility verified: both iterations produced %d identical files", len(fileHashes[0]))
}

func TestMake_DynamicValuesUnknownComponent(t *testing.T) {
cfg := config.NewConfig(
config.WithDynamicValues(map[string][]string{
"nonexistent-component": {"some.path"},
}),
)
bundler, err := New(WithConfig(cfg))
if err != nil {
t.Fatalf("New() error = %v", err)
}

recipeResult := &recipe.RecipeResult{
APIVersion: "aicr.nvidia.com/v1alpha1",
Kind: "RecipeResult",
ComponentRefs: []recipe.ComponentRef{
{
Name: "gpu-operator",
Version: "v25.3.3",
Type: "helm",
Source: "https://helm.ngc.nvidia.com/nvidia",
},
},
}

_, err = bundler.Make(context.Background(), recipeResult, t.TempDir())
if err == nil {
t.Fatal("expected error for unknown component in --dynamic, got nil")
}
if !strings.Contains(err.Error(), "nonexistent-component") {
t.Errorf("error should mention the unknown component, got: %v", err)
}
}

func TestMake_DynamicValuesValidComponent(t *testing.T) {
cfg := config.NewConfig(
config.WithDynamicValues(map[string][]string{
"gpu-operator": {"driver.version"},
}),
)
bundler, err := New(WithConfig(cfg))
if err != nil {
t.Fatalf("New() error = %v", err)
}

recipeResult := &recipe.RecipeResult{
APIVersion: "aicr.nvidia.com/v1alpha1",
Kind: "RecipeResult",
ComponentRefs: []recipe.ComponentRef{
{
Name: "gpu-operator",
Namespace: "gpu-operator",
Version: "v25.3.3",
Type: "helm",
Source: "https://helm.ngc.nvidia.com/nvidia",
Chart: "gpu-operator",
},
},
}

out, err := bundler.Make(context.Background(), recipeResult, t.TempDir())
if err != nil {
t.Fatalf("expected success for valid --dynamic component, got: %v", err)
}
if out == nil {
t.Fatal("expected non-nil output")
}
}

func TestMake_DisabledComponentWithDynamic(t *testing.T) {
t.Parallel()

cfg := config.NewConfig(
config.WithValueOverrides(map[string]map[string]string{
"awsebscsidriver": {"enabled": "false"},
}),
config.WithDynamicValues(map[string][]string{
"awsebscsidriver": {"controller.replicaCount"},
}),
)
bundler, err := New(WithConfig(cfg))
if err != nil {
t.Fatalf("New() error = %v", err)
}

recipeResult := &recipe.RecipeResult{
APIVersion: "aicr.nvidia.com/v1alpha1",
Kind: "RecipeResult",
Criteria: &recipe.Criteria{Service: "eks", Accelerator: "h100", Intent: "training"},
ComponentRefs: []recipe.ComponentRef{
{
Name: "gpu-operator",
Namespace: "gpu-operator",
Version: "v25.3.3",
Type: "helm",
Source: "https://helm.ngc.nvidia.com/nvidia",
Chart: "gpu-operator",
},
{
Name: "aws-ebs-csi-driver",
Namespace: "kube-system",
Version: "2.55.0",
Type: "helm",
Source: "https://kubernetes-sigs.github.io/aws-ebs-csi-driver",
Chart: "aws-ebs-csi-driver",
},
},
DeploymentOrder: []string{"gpu-operator", "aws-ebs-csi-driver"},
}

ctx := context.Background()
tmpDir := t.TempDir()
_, makeErr := bundler.Make(ctx, recipeResult, tmpDir)
if makeErr != nil {
t.Fatalf("Make() error = %v", makeErr)
}

// Disabled component should NOT have a directory at all
if _, statErr := os.Stat(filepath.Join(tmpDir, "aws-ebs-csi-driver")); !os.IsNotExist(statErr) {
t.Error("expected aws-ebs-csi-driver directory to NOT be created (component is disabled)")
}

// Disabled component should NOT have cluster-values.yaml
if _, statErr := os.Stat(filepath.Join(tmpDir, "aws-ebs-csi-driver", "cluster-values.yaml")); !os.IsNotExist(statErr) {
t.Error("expected aws-ebs-csi-driver/cluster-values.yaml to NOT exist (component is disabled)")
}

// Enabled component should still exist
if _, statErr := os.Stat(filepath.Join(tmpDir, "gpu-operator", "values.yaml")); os.IsNotExist(statErr) {
t.Error("expected gpu-operator/values.yaml to be created")
}

// deploy.sh should not reference the disabled component
deployScript, readErr := os.ReadFile(filepath.Join(tmpDir, "deploy.sh"))
if readErr != nil {
t.Fatalf("failed to read deploy.sh: %v", readErr)
}
if strings.Contains(string(deployScript), "aws-ebs-csi-driver") {
t.Error("deploy.sh should not contain aws-ebs-csi-driver (disabled component)")
}
}

// computeTestChecksum computes SHA256 hash for test comparison.
func computeTestChecksum(content []byte) string {
hash := make([]byte, 32)