
[Epic]: Software supply chain security: visibility, reproducibility, verification #739

@mchmarny

Goal

Raise AICR's software supply chain security to a level customers and internal security review can sign off on. Three maturity stages, mapped to the SLSA / NIST SSDF outcomes:

  1. Visibility — every container image AICR can deploy is enumerable, auditable, and consumable by standard tooling (CycloneDX, Trivy, Grype, Cosign).
  2. Reproducibility — the same recipe rendered today and a year from now produces the same artifacts. Recipe → image set is deterministic.
  3. Provenance & verification — every artifact AICR deploys can be cryptographically tied to a known publisher and a known build, with admission-time enforcement where chart internals make in-tree pinning impractical.
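
As a rough illustration of what "consumable by standard tooling" means for stage 1, the sketch below emits a minimal CycloneDX 1.6 JSON document listing container images. The `image`, `component`, `bom`, and `newBOM` names and the sample image are hypothetical, and the output of `make bom` very likely carries more fields (metadata, serial number, package URLs) than shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// image is a hypothetical input type: one container image and its tag.
type image struct {
	Name string
	Tag  string
}

// component mirrors a small subset of a CycloneDX component entry.
type component struct {
	Type    string `json:"type"`
	Name    string `json:"name"`
	Version string `json:"version,omitempty"`
}

// bom mirrors the top-level CycloneDX fields used in this sketch.
type bom struct {
	BOMFormat   string      `json:"bomFormat"`
	SpecVersion string      `json:"specVersion"`
	Version     int         `json:"version"`
	Components  []component `json:"components"`
}

// newBOM builds a CycloneDX 1.6 document with one "container" component per image.
func newBOM(images []image) bom {
	b := bom{BOMFormat: "CycloneDX", SpecVersion: "1.6", Version: 1}
	for _, img := range images {
		b.Components = append(b.Components, component{
			Type:    "container",
			Name:    img.Name,
			Version: img.Tag,
		})
	}
	return b
}

func main() {
	// Illustrative image, not taken from the AICR BOM.
	doc := newBOM([]image{{Name: "quay.io/example/app", Tag: "1.2.3"}})
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```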

Why this matters

  • Customer security review. Enterprise and regulated customers ask for an SBOM, a signing/provenance matrix, and a reproducibility statement before deploying. The SBOM and the pinning story now exist; the signing/provenance matrix is in flight.
  • Air-gap deployment. AICR pulls from 11+ public registries; customers behind a corporate firewall need a complete mirror list.
  • Compromise blast radius. Tag-mutable references mean an upstream account compromise silently changes our deployments. Digest pinning + signature verification caps the blast radius and enables detection.
  • Reproducible builds. "Same recipe → same bytes" is the foundation for diff-based audit, regression analysis, and incident response.
  • SLSA / SSDF alignment. Visibility (level 1), Provenance (level 2), Hermetic builds (level 3) — each stage below corresponds to one of these.
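
The air-gap mirror list mentioned above can be derived from the image set itself. The following is a minimal sketch, assuming image references follow the usual Docker naming convention (a first path component is a registry host only if it contains `.` or `:` or is `localhost`; otherwise `docker.io` is implied). The function names and sample images are hypothetical and not taken from the AICR BOM.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// registryOf returns the registry host implied by an image reference.
func registryOf(ref string) string {
	i := strings.Index(ref, "/")
	if i < 0 {
		// No path separator: a bare name such as "busybox:1.36".
		return "docker.io"
	}
	first := ref[:i]
	if strings.ContainsAny(first, ".:") || first == "localhost" {
		return first
	}
	// First component is a namespace (e.g. "library"), not a host.
	return "docker.io"
}

// mirrorList returns the sorted, de-duplicated registries behind a set of images.
func mirrorList(images []string) []string {
	seen := map[string]bool{}
	for _, img := range images {
		seen[registryOf(img)] = true
	}
	out := make([]string, 0, len(seen))
	for r := range seen {
		out = append(out, r)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Illustrative references only.
	images := []string{
		"registry.k8s.io/pause:3.9",
		"quay.io/example/exporter:v1.0.0",
		"quay.io/example/operator:v2.0.0",
		"busybox:1.36",
	}
	for _, r := range mirrorList(images) {
		fmt.Println(r)
	}
}
```

Running the same function over the full image list from the BOM would give the set of registries a customer has to mirror.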

Current state

From the published BOM (docs/user/container-images.md, generated by make bom):

| Metric | Value |
| --- | --- |
| Components | 22 |
| Unique images | 69 |
| Distinct registries | 11 |
| Helm components without a chart-version pin | 0 ✅ |
| In-tree manifest images digest-pinned | 4 / 11 (the 7 remaining are CRD-triplet refs whose schemas reject digests; tracked via 4 upstream signing requests below) |
| Components with documented signing/SBOM status | 0 |

All chart versions are pinned (PR #777). Every Pod-spec image reference under recipes/components/*/manifests/ that AICR controls in-tree is now digest-pinned (PRs #778, #779); the seven CRD-triplet exemptions are documented in recipes/manifest_images_test.go::imageDigestExemptions with reasons and upstream tracking links. The digest-pin CI gate enforces sha256: specifically (PR #779). Untagged-reference cleanup landed via PR #761; CRD-style sibling triplets (NicClusterPolicy, Skyhook Package) and bare-scalar placeholders (vgpu-manager-style) are now correctly handled end-to-end (PRs #761, #776).
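
To make the gate described above concrete, here is a minimal sketch of a sha256-specific digest check with an exemption list. It is not the code in recipes/manifest_images_test.go, whose structure this issue does not show. The `exemptions` map shape, the `checkPinned` function, and the sample references are assumptions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// sha256Pinned matches references that end in @sha256:<64 lowercase hex chars>.
// Other digest algorithms (e.g. sha512) deliberately do not match.
var sha256Pinned = regexp.MustCompile(`@sha256:[a-f0-9]{64}$`)

// exemptions maps an image reference to the documented reason it cannot be
// digest-pinned. Hypothetical shape and entry.
var exemptions = map[string]string{
	"example.com/crd-style/image": "CRD schema rejects digests; tracked upstream",
}

// checkPinned returns every reference that is neither sha256-pinned nor exempt.
func checkPinned(refs []string) []string {
	var violations []string
	for _, ref := range refs {
		if sha256Pinned.MatchString(ref) {
			continue
		}
		if _, ok := exemptions[ref]; ok {
			continue
		}
		violations = append(violations, ref)
	}
	return violations
}

func main() {
	refs := []string{
		"example.com/app@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef",
		"example.com/app:v1",          // tag only: violation
		"example.com/crd-style/image", // exempt, with a recorded reason
	}
	for _, v := range checkPinned(refs) {
		fmt.Println("not digest-pinned:", v)
	}
}
```

Keeping the reason next to each exemption means the list doubles as documentation of why a reference is still tag-mutable.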

Child issues, by maturity stage

Stage 1 — Visibility

Stage 2 — Reproducibility

Stage 3 — Provenance & verification

Related issues (not children, but adjacent)

Why this scope is finite

In-tree digest-pinning of every chart-default sub-image (gpu-operator's ~14, etc.) is not an end goal of this epic. Most charts don't expose a digest field for sub-images, and forking + maintaining that override matrix is intractable. The right answer for chart-default sub-images is admission-time digest verification (Kyverno / Cosign) plus signed image attestations from upstream — captured in #745 and the four upstream signing requests under it. We pin what we control in-tree (#748 ✅, #749 ✅) and verify what we don't (#745).

Definition of done

  • BOM is generated automatically and consumable as CycloneDX 1.6 by Trivy/Grype/Cosign without conversion.
  • BOM is published as a versioned doc artifact and refreshed weekly to surface upstream drift.
  • Every chart in recipes/registry.yaml has a pinned version; CI fails if not.
  • Every image reference AICR controls in-tree carries an @sha256: digest; CI fails if not. (CRD-triplet exemptions are documented with reasons and upstream tracking; sha256-only enforcement landed in PR #779, a follow-up to #778.)
  • Every component's signing/SBOM/provenance status is documented; gaps have an owner or a deploy-time verification policy.
  • A new component cannot land without satisfying all of the above (enforced in make qualify).
  • Air-gap, security-review, and customer compliance docs reference the BOM and the provenance matrix as authoritative.
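
The chart-version criterion above could be gated along the following lines. This is a sketch under stated assumptions: the `chart` struct is a hypothetical stand-in for whatever recipes/registry.yaml actually decodes into, and "pinned" is taken here to mean an exact semver-style version (optionally `v`-prefixed), with empty values, ranges, and wildcards rejected.

```go
package main

import (
	"fmt"
	"regexp"
)

// chart is a hypothetical, simplified registry entry.
type chart struct {
	Name    string
	Version string
}

// exactVersion accepts MAJOR.MINOR.PATCH with optional pre-release and build
// metadata, and an optional leading "v". Ranges and wildcards do not match.
var exactVersion = regexp.MustCompile(`^v?\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// unpinnedCharts returns the names of charts whose version is missing or not exact.
func unpinnedCharts(charts []chart) []string {
	var bad []string
	for _, c := range charts {
		if !exactVersion.MatchString(c.Version) {
			bad = append(bad, c.Name)
		}
	}
	return bad
}

func main() {
	// Illustrative entries only.
	charts := []chart{
		{Name: "pinned-chart", Version: "1.2.3"},
		{Name: "floating-chart", Version: ">=1.0.0"},
		{Name: "missing-version", Version: ""},
	}
	if bad := unpinnedCharts(charts); len(bad) > 0 {
		fmt.Println("charts without a pinned version:", bad)
	}
}
```

In a Go test, a non-empty result would fail the run, which is one way to get the "CI fails if not" behavior.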

Out of scope

  • AICR's own runtime images (aicr, aicrd, validators) — covered by goreleaser/SLSA tooling.
  • Digest-pinning chart-default sub-images (gpu-operator's ~14 sub-images, kube-prometheus-stack's ~8, etc.). Most charts don't expose a digest field, so the right answer is admission-time digest verification under #745 (Supply-chain provenance audit per component), not in-tree pinning.

Sequencing

Stage 1 (visibility)                   Stage 2 (reproducibility) ✅          Stage 3 (verification)
─────────────────────                  ─────────────────────────              ──────────────────────
#742 ✅ ──┬─► #741 ✅ (a✅, b✅)         #740 ✅ ──┬─► #748 ✅ ──► #749 ✅       #745 (+ 4 upstream issues)
#744 ✅   └─► #743                                └─►
