ci: validate dgd in CI and pre-commit #4366

alec-flowers · 2025-11-14T21:48:06Z

Overview:

To avoid DGD's being committed to the repo that will not pass validation, added a pre-commit and CI/CD check.

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

Summary by CodeRabbit

New Features
- Added automated validation for deployment YAML files to catch configuration errors early in the development process.
- Validation now runs in CI workflows and as a pre-commit hook.
Documentation
- Added comprehensive guide for the deployment YAML validation tool, including quickstart instructions and troubleshooting steps.
Chores
- Updated deployment examples to use shared resource volumes instead of per-service configurations.

Signed-off-by: alec-flowers <[email protected]>

coderabbitai · 2025-11-14T21:50:55Z

Walkthrough

The changes introduce a comprehensive deployment validation framework for DynamoGraphDeployment YAML files, including a Python validation utility integrated into CI/CD workflows and pre-commit hooks. Additional updates include sample CRD configurations and refactored example manifests using shared PVC volumes.

Changes

Cohort / File(s)	Summary
Validation Framework & CI/CD Integration `.github/workflows/validate-deployments.yml`, `.pre-commit-config.yaml`	Adds GitHub Actions workflow and pre-commit hook to validate DynamoGraphDeployment YAML files against CRD OpenAPI v3 schema on repository changes, with triggers for recipes, examples, tests, and deploy paths.
Validation Tool Implementation `deploy/utils/validate_deployments.py`, `deploy/utils/README_VALIDATION.md`	Introduces Python validation utility with CRD schema loading, YAML parsing, error reporting with line mapping, structural linting, and file discovery logic; includes comprehensive documentation covering features, usage, integration, examples, and troubleshooting.
Sample & Example Updates `deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeployment.yaml`, `examples/basics/kubernetes/shared_frontend/shared_frontend.yaml`	Updates sample CRD to include services block with namespace configuration; refactors example manifest to replace per-service PVC blocks with shared PVC volume references and centralized PVC declarations.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

deploy/utils/validate_deployments.py: Requires careful review of schema extraction logic, JSON Schema validation, line number mapping heuristics, error formatting, and file discovery mechanisms with git integration fallback.
Error handling paths: Verify robust handling of malformed YAML, missing CRD schemas, and PyYAML/jsonschema import failures.
Public API surface: Confirm DeploymentValidator class, has_dgd_resource(), and find_deploy_files() function signatures and behaviors are well-defined.
YAML example refactoring: Verify volumeMount structural consistency across all service definitions in the shared frontend example.

Poem

🐰 A hop and a skip, validation's afoot!
Schema checks dance in the CI boot,
Pre-commit hooks catch what should not,
Shared volumes simplify, tight as they're ought,
DGD deployments now validated for the route! 🎉

Pre-merge checks

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Description check	❓ Inconclusive	The description is incomplete; it provides an overview but lacks detailed explanation of changes, reviewer guidance, and a valid related issue reference.	Expand the Details section with specific changes (GitHub Actions workflow, pre-commit hook, validation script), and update Related Issues with an actual issue number or remove the placeholder.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: adding pre-commit and CI validation for DGD (DynamoGraphDeployment) YAML files.
Docstring Coverage	✅ Passed	Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Signed-off-by: alec-flowers <[email protected]>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

deploy/utils/README_VALIDATION.md (1)

45-57: Tighten markdown: add fenced code languages and fix “GitHub” capitalization

To keep markdown tooling quiet and improve readability:

Add explicit languages to the non-bash fenced blocks (e.g., ```text for the “Valid files” / “Validation errors” output and the CRD path snippet).

Ensure “GitHub” is capitalized consistently where mentioned in prose.

These are small cleanups but will clear the current markdownlint/LanguageTool warnings.

Also applies to: 61-65, 81-95
.github/workflows/validate-deployments.yml (1)
18-37: Consider aligning CI path filters with the pre-commit hook

The workflow currently triggers only for YAML changes under recipes/** and examples/** (plus the validator script and CRD). The pre-commit hook, however, covers (recipes|examples|tests|deploy)/.*\.(yaml|yml)$.

If someone changes a DynamoGraphDeployment YAML only under deploy/** (or tests/**), CI won’t run this validator, even though the pre-commit hook would. To keep merge-time validation consistent, consider expanding the paths filters to include at least deploy/**/*.yml? (and optionally tests/**/*.yml?), e.g.:
   pull_request:
     paths:
       - 'recipes/**/*.yaml'
       - 'recipes/**/*.yml'
       - 'examples/**/*.yaml'
       - 'examples/**/*.yml'
+      - 'deploy/**/*.yaml'
+      - 'deploy/**/*.yml'
       - 'deploy/utils/validate_deployments.py'
       - 'deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml'
   push:
@@
     paths:
       - 'recipes/**/*.yaml'
       - 'recipes/**/*.yml'
       - 'examples/**/*.yaml'
       - 'examples/**/*.yml'
+      - 'deploy/**/*.yaml'
+      - 'deploy/**/*.yml'
       - 'deploy/utils/validate_deployments.py'
       - 'deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml'
.pre-commit-config.yaml (1)
88-97: Leverage the pre-commit Python environment instead of hard-coding python3

The new hook looks good and matches the directories you care about. One small portability improvement:

Since language: python already runs the hook inside a managed virtualenv, you can rely on that interpreter instead of hard-coding python3 in the entry. For example:
-  - id: validate-deploy-yaml
-    name: Validate DynamoGraphDeployment YAMLs
-    entry: python3 deploy/utils/validate_deployments.py
-    language: python
+  - id: validate-deploy-yaml
+    name: Validate DynamoGraphDeployment YAMLs
+    entry: deploy/utils/validate_deployments.py
+    language: python
     files: '(recipes|examples|tests|deploy)/.*\.(yaml|yml)$'
     pass_filenames: true
     additional_dependencies: [pyyaml, jsonschema]
This keeps the hook from depending on a python3 binary being on PATH and lets pre-commit fully manage the interpreter.

Please double-check this against the current pre-commit docs to ensure it matches their recommended pattern.
deploy/utils/validate_deployments.py (2)

341-349: --strict flag is currently unused

The --strict flag is parsed but never consulted when validating or reporting errors, which can confuse users and linters.

Either wire it into behavior (e.g., to distinguish warnings vs. errors if you later add warnings) or remove the flag for now to keep the CLI surface minimal.

41-58: Broad except Exception blocks are user-friendly but may mask unexpected failures

Both _load_crd_schema and validate_file use except Exception and then print a generic error (or return a generic message). This is convenient for CLI UX but can hide programming or environment errors and makes debugging harder.

If you want to tighten this up without regressing UX, consider:

Narrowing the exceptions to explicit types (FileNotFoundError, yaml.YAMLError, ValueError, OSError) where possible.

For truly unexpected exceptions, letting them propagate so CI fails loudly with a traceback.

This is non-blocking, but will make the validator more maintainable over time.

Also applies to: 71-75

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7702f49 and 7f85a7a.

📒 Files selected for processing (6)

.github/workflows/validate-deployments.yml (1 hunks)
.pre-commit-config.yaml (1 hunks)
deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeployment.yaml (1 hunks)
deploy/utils/README_VALIDATION.md (1 hunks)
deploy/utils/validate_deployments.py (1 hunks)
examples/basics/kubernetes/shared_frontend/shared_frontend.yaml (4 hunks)

🧰 Additional context used

🧠 Learnings (4)

📓 Common learnings

Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml:1233-1235
Timestamp: 2025-07-18T16:04:47.465Z
Learning: The `stopSignal` field in Kubernetes CRDs like DynamoGraphDeployment and DynamoComponentDeployment is autogenerated by controller-gen when upgrading Kubernetes library versions, and represents expected upstream API changes rather than manual code that needs custom validation.

Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:92-98
Timestamp: 2025-07-18T16:04:31.771Z
Learning: CRD schemas in files like deploy/cloud/helm/crds/templates/*.yaml are auto-generated from Kubernetes library upgrades and should not be manually modified as changes would be overwritten during regeneration.

Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:1178-1180
Timestamp: 2025-07-18T16:05:05.534Z
Learning: The stopSignal field under lifecycle in DynamoComponentDeployment CRDs is autogenerated due to Kubernetes library upgrades (k8s.io/api and k8s.io/apimachinery from v0.32.3 to v0.33.1), not a manual design decision by the user.

Learnt from: nv-hwoo
Repo: ai-dynamo/dynamo PR: 4112
File: examples/backends/sglang/deploy/agg.yaml:76-78
Timestamp: 2025-11-05T20:23:29.539Z
Learning: In DynamoGraphDeployment (DGD) YAML manifests, the volumeMounts field uses `mountPoint` (not the standard Kubernetes `mountPath`) to specify where to mount volumes. This is defined in the DGD custom resource definition API.

Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.643Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.

Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 3100
File: deploy/cloud/operator/cmd/main.go:186-190
Timestamp: 2025-09-17T22:35:40.674Z
Learning: The mpiRunSecretName validation in deploy/cloud/operator/cmd/main.go is safe for upgrades because the Helm chart automatically populates dynamo-operator.dynamo.mpiRun.secretName with a default value of "mpi-run-ssh-secret" and includes SSH key generation functionality via sshKeygen.enabled: true.

📚 Learning: 2025-11-05T20:23:29.539Z

Learnt from: nv-hwoo
Repo: ai-dynamo/dynamo PR: 4112
File: examples/backends/sglang/deploy/agg.yaml:76-78
Timestamp: 2025-11-05T20:23:29.539Z
Learning: In DynamoGraphDeployment (DGD) YAML manifests, the volumeMounts field uses `mountPoint` (not the standard Kubernetes `mountPath`) to specify where to mount volumes. This is defined in the DGD custom resource definition API.

Applied to files:

examples/basics/kubernetes/shared_frontend/shared_frontend.yaml

📚 Learning: 2025-09-04T19:03:06.643Z

Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.643Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.

Applied to files:

deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeployment.yaml

📚 Learning: 2025-07-18T16:04:31.771Z

Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:92-98
Timestamp: 2025-07-18T16:04:31.771Z
Learning: CRD schemas in files like deploy/cloud/helm/crds/templates/*.yaml are auto-generated from Kubernetes library upgrades and should not be manually modified as changes would be overwritten during regeneration.

Applied to files:

deploy/utils/README_VALIDATION.md

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/4366/merge) by alec-flowers.

deploy/utils/validate_deployments.py

[error] 422-422: Validation failed: AttributeError: 'list' object has no attribute 'get' (during DynamoGraphDeployment validation)

[error] 398-398: Validation failed while validating YAMLs: is_valid, errors = validator.validate_file(yaml_file)

[error] 80-80: Validation failed: DynamoGraphDeployment doc is a list, not a mapping

🪛 LanguageTool

deploy/utils/README_VALIDATION.md

[uncategorized] ~41-~41: The official name of this software platform is spelled with a capital “H”.
Context: ... ### CI Workflow Runs on every PR via .github/workflows/validate-deployments.yml. #...

(GITHUB)

🪛 markdownlint-cli2 (0.18.1)

deploy/utils/README_VALIDATION.md

46-46: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

52-52: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

62-62: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.4)

deploy/utils/validate_deployments.py

52-52: Abstract raise to an inner function

(TRY301)

52-52: Avoid specifying long messages outside the exception class

(TRY003)

55-55: Abstract raise to an inner function

(TRY301)

55-55: Avoid specifying long messages outside the exception class

(TRY003)

56-56: Do not catch blind exception: Exception

(BLE001)

73-73: Do not catch blind exception: Exception

(BLE001)

88-88: Loop control variable doc_idx not used within loop body

Rename unused doc_idx to _doc_idx

(B007)

167-167: Consider [*data_path, key] instead of concatenation

Replace with [*data_path, key]

(RUF005)

178-178: Consider [*schema_path, key] instead of concatenation

Replace with [*schema_path, key]

(RUF005)

178-178: Consider [*data_path, key] instead of concatenation

Replace with [*data_path, key]

(RUF005)

187-187: Consider [*schema_path, key, idx] instead of concatenation

Replace with [*schema_path, key, idx]

(RUF005)

188-188: Consider [*data_path, key, idx] instead of concatenation

Replace with [*data_path, key, idx]

(RUF005)

267-267: Consider moving this statement to an else block

(TRY300)

268-268: Do not catch blind exception: Exception

(BLE001)

272-272: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

299-299: Starting a process with a partial executable path

(S607)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: sglang (amd64)
GitHub Check: vllm (amd64)
GitHub Check: trtllm (amd64)
GitHub Check: operator (amd64)
GitHub Check: Build and Test - dynamo
GitHub Check: Mirror Repository to GitLab

🔇 Additional comments (2)

deploy/cloud/operator/config/samples/nvidia.com_v1alpha1_dynamographdeployment.yaml (1)

24-29: Ensure sample manifest satisfies CRD required fields

With the new validator in place, this sample DynamoGraphDeployment will now be schema-validated. Right now, the two services only set dynamoNamespace. If the CRD marks additional fields as required for a service (e.g., componentType, maybe replicas), this sample could start failing validation and break CI.

Please run the new validator against this file and, if needed, flesh out the service specs so the sample is fully CRD-compliant.

examples/basics/kubernetes/shared_frontend/shared_frontend.yaml (1)

33-41: Shared PVC + mountPoint usage looks consistent with the DGD API

The switch to a top-level pvcs declaration and per-service volumeMounts referencing dynamo-model-cache looks good, and the use of mountPoint (rather than Kubernetes mountPath) matches the DynamoGraphDeployment CRD semantics for volume mounts. This should play nicely with the new validator.

Based on learnings

Also applies to: 64-71, 89-91, 109-111

coderabbitai · 2025-11-14T22:01:56Z

deploy/utils/validate_deployments.py

+        # Find DynamoGraphDeployment documents
+        dgd_docs = [
+            (i, doc)
+            for i, doc in enumerate(docs)
+            if doc and doc.get("kind") == "DynamoGraphDeployment"
+        ]
+
+        if not dgd_docs:
+            return True, []  # No DGD resources, skip validation
+
+        # Validate each DynamoGraphDeployment
+        errors = []
+        for doc_idx, doc in dgd_docs:
+            self._file_content = content
+            errors.extend(self._validate_document(doc))
+
+        return len(errors) == 0, errors


⚠️ Potential issue | 🔴 Critical

Fix crash on non-mapping YAML documents in validate_file (root cause of CI failure)

The pre-merge pipeline error AttributeError: 'list' object has no attribute 'get' matches this list comprehension: it assumes every YAML document is a mapping and calls .get("kind") unconditionally. For YAML files whose top-level document is a sequence (e.g., a list of resources), doc is a list and this blows up, breaking CI.

Guarding on dict here makes the validator robust while still validating all proper DynamoGraphDeployment objects.

Suggested fix:

- # Find DynamoGraphDeployment documents - dgd_docs = [ - (i, doc) - for i, doc in enumerate(docs) - if doc and doc.get("kind") == "DynamoGraphDeployment" - ] + # Find DynamoGraphDeployment documents; skip non-mapping docs safely + dgd_docs = [ + (i, doc) + for i, doc in enumerate(docs) + if isinstance(doc, dict) + and doc.get("kind") == "DynamoGraphDeployment" + ] @@ - # Validate each DynamoGraphDeployment - errors = [] - for doc_idx, doc in dgd_docs: - self._file_content = content - errors.extend(self._validate_document(doc)) + # Validate each DynamoGraphDeployment + errors: List[str] = [] + for _doc_idx, doc in dgd_docs: + # Store full file content so we can map schema errors back to line numbers + self._file_content = content + errors.extend(self._validate_document(doc))

This should eliminate the list/.get crash while keeping behavior unchanged for valid DGD documents.

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/4366/merge) by alec-flowers.

[error] 80-80: Validation failed: DynamoGraphDeployment doc is a list, not a mapping

🪛 Ruff (0.14.4)

88-88: Loop control variable doc_idx not used within loop body

Rename unused doc_idx to _doc_idx

(B007)

🤖 Prompt for AI Agents

In deploy/utils/validate_deployments.py around lines 76 to 92, the list comprehension assumes every YAML document is a mapping and calls doc.get("kind"), which crashes on non-mapping documents (e.g. lists); change the filter to only consider dicts before accessing .get (e.g. check isinstance(doc, dict) and then doc.get("kind") == "DynamoGraphDeployment") so non-mapping YAML documents are ignored and validation proceeds normally.

coderabbitai · 2025-11-14T22:01:56Z

deploy/utils/validate_deployments.py

+def has_dgd_resource(yaml_file: Path) -> bool:
+    """Check if a YAML file contains DynamoGraphDeployment resources."""
+    try:
+        with open(yaml_file) as f:
+            for doc in yaml.safe_load_all(f):
+                if doc and doc.get("kind") == "DynamoGraphDeployment":
+                    return True
+        return False
+    except Exception:
+        return False


⚠️ Potential issue | 🔴 Critical

Avoid the same list.get crash and silent skipping in has_dgd_resource

has_dgd_resource has the same doc.get("kind") assumption and will also throw AttributeError when a YAML document is a list. Because find_deploy_files relies on this helper to discover candidate files, a crash here can prevent the validator from even starting.

Additionally, catching a blind Exception and returning False means any parse error silently hides a file from validation.

Suggested change to address both:

def has_dgd_resource(yaml_file: Path) -> bool: """Check if a YAML file contains DynamoGraphDeployment resources.""" try: with open(yaml_file) as f: - for doc in yaml.safe_load_all(f): - if doc and doc.get("kind") == "DynamoGraphDeployment": - return True - return False - except Exception: - return False + for doc in yaml.safe_load_all(f): + if isinstance(doc, dict) and doc.get("kind") == "DynamoGraphDeployment": + return True + return False + except yaml.YAMLError as e: + # Surface YAML issues instead of silently skipping potentially relevant files + print(f"YAML parsing error while scanning {yaml_file}: {e}") + return False

This mirrors the validate_file fix (no .get on lists) and at least logs parse errors instead of silently ignoring them. If you’d prefer parse errors to fail the entire run, we can instead have this re-raise and let main() handle it.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def has_dgd_resource(yaml_file: Path) -> bool:

"""Check if a YAML file contains DynamoGraphDeployment resources."""

try:

with open(yaml_file) as f:

for doc in yaml.safe_load_all(f):

if doc and doc.get("kind") == "DynamoGraphDeployment":

return True

return False

except Exception:

return False

def has_dgd_resource(yaml_file: Path) -> bool:

"""Check if a YAML file contains DynamoGraphDeployment resources."""

try:

with open(yaml_file) as f:

for doc in yaml.safe_load_all(f):

if isinstance(doc, dict) and doc.get("kind") == "DynamoGraphDeployment":

return True

return False

except yaml.YAMLError as e:

# Surface YAML issues instead of silently skipping potentially relevant files

print(f"YAML parsing error while scanning {yaml_file}: {e}")

return False

🧰 Tools

🪛 Ruff (0.14.4)

267-267: Consider moving this statement to an else block

(TRY300)

268-268: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents

In deploy/utils/validate_deployments.py around lines 260 to 269, has_dgd_resource currently assumes each YAML doc is a dict and calls doc.get("kind"), which will raise AttributeError for list documents, and it also swallows all exceptions returning False; change the logic to check the document type before accessing "kind" (e.g., ensure doc is a dict) and only examine dicts for kind == "DynamoGraphDeployment", and replace the broad except that silently hides parse errors with explicit handling that logs the parse/error (or re-raises if you prefer to fail the run) so parse errors aren’t silently ignored and candidate files aren’t dropped without notice.

julienmancuso · 2025-11-14T22:12:29Z

deploy/utils/validate_deployments.py

+    # Find CRD schema
+    crd_path = args.crd or (
+        workspace_root
+        / "deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml"


we also have a bunch of DGDR if I'm not mistaken

julienmancuso · 2025-11-14T22:16:17Z

That's great, thanks @alec-flowers .
Personnally, I would have reused existing tools instead of creating a brand new script to maintain ;-)
kubeconform, kube-linter, ...

tmonty12

I pointed out some pre-existing tooling but overall this is great - I like the pre-commit hook and workflow for catching validation errors

tmonty12 · 2025-11-14T23:15:29Z

deploy/utils/validate_deployments.py

+        """Initialize validator with CRD schema."""
+        self.crd_path = crd_path
+        self.schema = self._load_crd_schema()
+        self.validator = Draft7Validator(self.schema)


Draft7Validator is for validation of JSON Schema Draft 7 which OAI v3 is an extension of - so it might miss edge cases (although this definitely suffices for the use case)

There's actually a kubernetes SIG tool built for local validation of CRs against CRDs called kubectl-validate.

An example:

(dynamo) tom:~/Dynamo/repos/dynamo (aflowers/validate-dgd) $ kubectl-validate deploy/discovery/dgd.yaml --local-crds deploy/cloud/helm/crds/templates deploy/discovery/dgd.yaml...OK

Making the dgd.yaml invalid:

(dynamo) tom:~/Dynamo/repos/dynamo (aflowers/validate-dgd) $ kubectl-validate deploy/discovery/dgd.yaml --local-crds deploy/cloud/helm/crds/templates deploy/discovery/dgd.yaml...ERROR DynamoGraphDeployment.nvidia.com "dynamo" is invalid: [spec.backendFramework: Unsupported value: "test": supported values: "sglang", "vllm", "trtllm", <nil>: Invalid value: "null": some validation rules were not checked because the object was invalid; correct the existing errors to complete validation] Error: validation failed (dynamo) tom:~/Dynamo/repos/dynamo (aflowers/validate-dgd) $

I think the logic for finding the manifests would still be needed but just wanted to point this tooling out

alec-flowers added 2 commits November 14, 2025 21:39

validate DGD as pre-commit

0c1c4cb

Signed-off-by: alec-flowers <[email protected]>

add incremental validation

87bbfa7

Signed-off-by: alec-flowers <[email protected]>

alec-flowers requested review from a team as code owners November 14, 2025 21:48

pull-request-size bot added the size/XL label Nov 14, 2025

github-actions bot added the ci Issues/PRs that reference CI build/test label Nov 14, 2025

better naming

7f85a7a

Signed-off-by: alec-flowers <[email protected]>

copy-pr-bot bot temporarily deployed to GITLAB November 14, 2025 21:50 Inactive

skip helm

c38dac4

Signed-off-by: alec-flowers <[email protected]>

copy-pr-bot bot temporarily deployed to GITLAB November 14, 2025 21:55 Inactive

copy-pr-bot bot temporarily deployed to GITLAB November 14, 2025 22:01 Inactive

coderabbitai bot reviewed Nov 14, 2025

View reviewed changes

julienmancuso reviewed Nov 14, 2025

View reviewed changes

tmonty12 approved these changes Nov 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ci: validate dgd in CI and pre-commit #4366

ci: validate dgd in CI and pre-commit #4366

Uh oh!

alec-flowers commented Nov 14, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Nov 14, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 14, 2025

Uh oh!

coderabbitai bot Nov 14, 2025

Uh oh!

julienmancuso Nov 14, 2025

Uh oh!

julienmancuso commented Nov 14, 2025

Uh oh!

tmonty12 left a comment

Uh oh!

tmonty12 Nov 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ci: validate dgd in CI and pre-commit #4366

Are you sure you want to change the base?

ci: validate dgd in CI and pre-commit #4366

Uh oh!

Conversation

alec-flowers commented Nov 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

julienmancuso Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

julienmancuso commented Nov 14, 2025

Uh oh!

tmonty12 left a comment

Choose a reason for hiding this comment

Uh oh!

tmonty12 Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

alec-flowers commented Nov 14, 2025 •

edited

Loading

coderabbitai bot commented Nov 14, 2025 •

edited

Loading