Capture cluster fingerprint in snapshot for recipe-criteria binding #752

@mchmarny

Parent: #750

Summary

Make aicr snapshot capture a structured cluster fingerprint — the detected service, accelerator, OS, Kubernetes version, and any other dimensions a recipe declares in its criteria block. The fingerprint is what evidence bundles use to prove "the cluster the recipe was tested on actually matched the recipe's declared criteria."

Why

Without a deterministic fingerprint in the snapshot, an evidence bundle can claim a recipe was tested but cannot prove it ran on hardware/services matching the recipe's criteria. A reviewer reading the bundle has no way to confirm the H100 / EKS / Ubuntu / Kubeflow recipe was actually validated on H100 / EKS / Ubuntu / Kubeflow versus, say, a kind cluster with mocked GPUs.

This is the first link in the trust chain for #750.

Scope — what the fingerprint must include

At minimum:

  • Service — detected provider (eks / aks / gke / oci / coreweave / kind / on-prem)
  • Accelerator — detected GPU SKU (h100 / gb200 / b200 / mi300 / none)
  • OS — node OS distribution and version (ubuntu, rhcos, bottlerocket, etc.)
  • Kubernetes — server version (e.g., 1.33.4)
  • Node count and topology — total GPU nodes, per-zone breakdown if relevant
  • Detection provenance — which signals each value came from (node label, instance metadata, kubelet, etc.) so a reviewer can audit the inference

Any field a recipe can put in criteria.* should be derivable from the snapshot fingerprint. Extending criteria later means extending the fingerprint in lockstep.
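
As a starting point for discussion, the dimensions above could be sketched as a Go type along these lines. Everything here is hypothetical: the `Detected` wrapper, the field names, the tags, and the `roundTrip` helper are illustrations, not existing code in pkg/snapshotter or pkg/collector. The snapshot itself is YAML; the sketch round-trips through encoding/json only to stay within the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Detected pairs a detected value with the signal it came from, so a
// reviewer can audit the inference. Hypothetical type, not existing code.
type Detected struct {
	Value  string `json:"value"`
	Source string `json:"source"` // e.g. "node-label", "instance-metadata", "kubelet"
}

// Fingerprint is an assumed shape for the top-level fingerprint block.
// Field names are illustrative; the real schema is still to be defined.
type Fingerprint struct {
	Service     Detected       `json:"service"`
	Accelerator Detected       `json:"accelerator"`
	OS          Detected       `json:"os"`
	OSVersion   Detected       `json:"osVersion"`
	Kubernetes  Detected       `json:"kubernetes"`
	GPUNodes    int            `json:"gpuNodes"`
	NodesByZone map[string]int `json:"nodesByZone,omitempty"`
}

// roundTrip serializes a fingerprint and parses it back, which is the
// property the snapshot type needs to hold (shown with JSON, not YAML).
func roundTrip(fp Fingerprint) (Fingerprint, error) {
	raw, err := json.Marshal(fp)
	if err != nil {
		return Fingerprint{}, err
	}
	var out Fingerprint
	if err := json.Unmarshal(raw, &out); err != nil {
		return Fingerprint{}, err
	}
	return out, nil
}

func main() {
	fp := Fingerprint{
		Service:     Detected{Value: "eks", Source: "node-label"},
		Accelerator: Detected{Value: "h100", Source: "node-label"},
		OS:          Detected{Value: "ubuntu", Source: "kubelet"},
		Kubernetes:  Detected{Value: "1.33.4", Source: "kubelet"},
		GPUNodes:    4,
		NodesByZone: map[string]int{"zone-a": 2, "zone-b": 2},
	}
	got, err := roundTrip(fp)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", got)
}
```

Keeping provenance next to each value (rather than in a separate block) means a new criteria dimension only needs one new field to carry both the value and its source.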

Proposed approach

  1. Audit pkg/snapshotter and pkg/collector for what's already captured. Most of these signals likely exist; the gap is structuring them as a top-level fingerprint block in the snapshot YAML rather than scattered through collector outputs.
  2. Define a Fingerprint Go type in a portable location (likely pkg/snapshotter or a new pkg/fingerprint).
  3. Add a Match(recipe) helper that compares a fingerprint against a recipe's criteria and returns a structured diff (matched / mismatched / unknown per dimension). This is the function aicr verify-evidence will call.
  4. Surface the fingerprint as a top-level field in snapshot YAML for human review.
  5. Add table-driven tests covering each detection path and the Match comparison.
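
A minimal sketch of the comparison in step 3, under stated assumptions: the fingerprint and the recipe's criteria are both flattened to dimension-to-value maps, values are compared by exact string equality, and a dimension the fingerprint lacks is reported as unknown rather than mismatched. The `Status`, `Criteria`, and `DimensionResult` names are hypothetical, and the real helper would take the recipe type instead of a map.

```go
package main

import (
	"fmt"
	"sort"
)

// Status is the per-dimension outcome of a comparison.
type Status string

const (
	Matched    Status = "matched"
	Mismatched Status = "mismatched"
	Unknown    Status = "unknown"
)

// Criteria is a hypothetical flattened view of a recipe's criteria block.
type Criteria map[string]string

// Fingerprint is simplified here to dimension -> detected value.
type Fingerprint map[string]string

// DimensionResult is one entry of the structured diff.
type DimensionResult struct {
	Dimension string
	Want      string
	Got       string
	Status    Status
}

// Match compares the fingerprint against every dimension the recipe
// declares. It returns true only if all declared dimensions matched,
// plus one result per dimension, sorted by dimension name.
func (f Fingerprint) Match(c Criteria) (bool, []DimensionResult) {
	dims := make([]string, 0, len(c))
	for d := range c {
		dims = append(dims, d)
	}
	sort.Strings(dims) // deterministic order for review and tests

	all := true
	diff := make([]DimensionResult, 0, len(dims))
	for _, d := range dims {
		got, found := f[d]
		r := DimensionResult{Dimension: d, Want: c[d], Got: got}
		switch {
		case !found || got == "":
			r.Status = Unknown // nothing detected for this dimension
		case got == c[d]:
			r.Status = Matched
		default:
			r.Status = Mismatched
		}
		if r.Status != Matched {
			all = false
		}
		diff = append(diff, r)
	}
	return all, diff
}

func main() {
	fp := Fingerprint{"service": "kind", "accelerator": "h100"}
	recipe := Criteria{"service": "eks", "accelerator": "h100", "os": "ubuntu"}

	ok, diff := fp.Match(recipe)
	fmt.Println("all matched:", ok)
	for _, r := range diff {
		fmt.Printf("%-12s want=%-8s got=%-8s %s\n", r.Dimension, r.Want, r.Got, r.Status)
	}
}
```

Treating unknown as a failure keeps the check conservative: a recipe dimension the snapshot could not detect does not count as evidence of a match. Whether values need normalization (case, version ranges) before comparison is left open here.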

Success criteria

  • aicr snapshot emits a fingerprint: block at a stable schema location.
  • Every criteria.* value a recipe can declare has a corresponding fingerprint field.
  • Fingerprint.Match(recipe) returns (matched, diff) and is exercised by unit tests.
  • Snapshot YAML round-trips cleanly through the type.
  • Documented in docs/contributor/data.md (or another appropriate page) with a worked example.

Out of scope

  • Cryptographic attestation of the fingerprint — that lives in the evidence bundle (parent epic).
  • Deciding what criteria should be — the recipe spec is the source of truth; this issue makes the snapshot side match.
