feat(ci): add aggregate merge-gate workflow#651

Merged: mchmarny merged 3 commits into main from feat/merge-gate on Apr 23, 2026
@mchmarny (Member) commented Apr 23, 2026

Summary

Replace 7 explicit required status checks with a single aggregate merge-gate workflow so every applicable CI check blocks merge and new gating jobs can be added without admin ruleset changes.

Motivation / Context

Today the main branch ruleset gates merges on only 7 named checks. Other checks (GPU tests, Docker builds, vuln-scan, malware-scan, actionlint, verify-licenses) can fail without blocking merge. PR #587 demonstrated this: it merged while non-required GPU checks were still in flight, and those jobs failed later on main.

The required-check list is also fragile — every new workflow needs a manual admin ruleset update, and we've already seen drift.

This PR addresses both problems with a single aggregate gate that becomes the only required check.

Fixes: #605
Related: #559

Type of Change

  • Build/CI/tooling

Component(s) Affected

  • Other: CI workflows (.github/workflows/)

Implementation Notes

Architecture: real + skip job pairs

A check-paths job at the top of the workflow classifies what changed:

  • code — true when any changed file is NOT docs/md/LICENSE (uses predicate-quantifier: 'every' with dorny/paths-filter to detect docs-only PRs, then inverts)
  • actions — true when .github/workflows/** or .github/actions/** changed
  • deps — true when go.mod, go.sum, or vendor/** changed
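
The classification above can be sketched as a small Python model. This is not the workflow's YAML, which uses dorny/paths-filter; the glob patterns and the `classify` helper are assumptions made for illustration, and `fnmatchcase` is used so that `*` also matches across directory separators.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns approximating the three path classes in the description.
DOC_PATTERNS = ("docs/*", "*.md", "LICENSE")
ACTION_PATTERNS = (".github/workflows/*", ".github/actions/*")
DEP_PATTERNS = ("go.mod", "go.sum", "vendor/*")


def matches_any(path, patterns):
    """True when the path matches at least one glob pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def classify(changed_files):
    """Derive the code/actions/deps flags from a list of changed paths."""
    # A PR is docs-only when EVERY changed file is a docs file;
    # "code" is the inverse of that.
    docs_only = all(matches_any(f, DOC_PATTERNS) for f in changed_files)
    return {
        "code": not docs_only,
        "actions": any(matches_any(f, ACTION_PATTERNS) for f in changed_files),
        "deps": any(matches_any(f, DEP_PATTERNS) for f in changed_files),
    }


if __name__ == "__main__":
    print(classify(["README.md", "docs/guide/intro.md"]))
    print(classify(["pkg/main.go", "README.md"]))
    print(classify([".github/workflows/merge-gate.yaml"]))
```

In this model a single non-docs file is enough to set `code` to true, which is why a mixed PR behaves like a code-only PR.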

Each gating check has a real job (runs when its path condition matches) and a companion -skip job (runs when it doesn't). Exactly one of each pair always runs, so the final gate job's needs: always resolves.

Gating set:

| Check | Conditional on |
| --- | --- |
| Test, Lint, CLI E2E, E2E, Security Scan | code == true (via qualification.yaml) |
| CodeQL (analyze) | code == true |
| Vulnerability scan (vuln-scan) | code == true |
| Malware scan (malware-scan) | code == true |
| Actionlint | actions == true |
| Verify licenses | deps == true |
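
As a rough illustration of how the real/skip pairs and the gating set fit together, the Python sketch below maps each check to the flag that controls it and picks which half of each pair would run. The job names (`qualification`, `codeql`, and so on) and the `-skip` suffix convention are assumptions based on the description, not names confirmed from merge-gate.yaml.

```python
# Hypothetical job names, each mapped to the path flag that controls it.
GATING_SET = {
    "qualification": "code",   # Test, Lint, CLI E2E, E2E, Security Scan
    "codeql": "code",
    "vuln-scan": "code",
    "malware-scan": "code",
    "actionlint": "actions",
    "verify-licenses": "deps",
}


def plan_jobs(flags):
    """Return the job from each real/skip pair that would run.

    The real job runs when its flag is true, the companion "-skip" job
    runs otherwise, so exactly one job per pair is always selected.
    """
    return [
        name if flags[flag] else f"{name}-skip"
        for name, flag in GATING_SET.items()
    ]


if __name__ == "__main__":
    docs_only = {"code": False, "actions": False, "deps": False}
    workflow_change = {"code": True, "actions": True, "deps": False}
    print(plan_jobs(docs_only))
    print(plan_jobs(workflow_change))
```

Because the plan always contains one entry per pair, a final job that depends on both halves of every pair always has something to evaluate, regardless of which paths changed.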

Advisory (not gated): GPU smoke, GPU inference, GPU training — per discussion on #605, these stay advisory until flake rate is assessed. They can be added later with the same real/skip pattern.

Deleted files: docs-only.yaml and docs-only-checks.yaml are superseded — the merge-gate handles docs-only PRs natively via skip jobs.

Existing standalone workflows unchanged: on-push.yaml, codeql.yaml, vuln-scan.yaml, actionlint.yaml, verify-licenses.yaml keep their current triggers. They continue running independently for push/schedule/code-scanning purposes.

Path to merge queue (#559): Once this merges, enabling the merge queue requires only adding a merge_group trigger (types: [checks_requested]) to merge-gate.yaml, plus one admin UI setting.

IMPORTANT

Admin action required post-merge: Update ruleset main (id 12304487) to replace the 7 current required checks with a single required check: gate. This can only be done after the first successful run on main.

Testing

CI will validate the workflow runs correctly on this PR. The merge-gate workflow triggers on pull_request to main, so it will execute on this PR itself.

Key scenarios verified by design:

  • Code PR: code=true -> all code checks run, actionlint/licenses skip unless their paths match
  • Docs-only PR: code=false -> all code checks skip, skip jobs report success
  • Mixed PR (code + docs): code=true -> code checks run (same as code-only)
  • Workflow-only PR: code=true (workflow files are not docs), actions=true -> actionlint also runs

Risk Assessment

  • Low — Isolated change, well-tested, easy to revert

Rollout notes: The workflow runs alongside existing checks on this PR. The admin ruleset change (switching required checks to gate) happens post-merge as a separate step. Until then, the current 7 required checks continue gating merges — no disruption during rollout. If issues arise, the workflow can be deleted and the ruleset left unchanged.

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

Replace 7 explicit required status checks with a single aggregate
merge-gate workflow. Each gating check has a real + skip job pair
controlled by path classification, and a final gate job aggregates
all results.

Gating set: qualification (Test/Lint/CLI E2E/E2E/Security Scan),
CodeQL, vuln-scan, malware-scan, actionlint, verify-licenses.
GPU tests remain advisory until flake rate is assessed.

Deletes docs-only.yaml and docs-only-checks.yaml — the merge-gate
handles docs-only PRs natively via skip jobs.

Fixes #605
Related to #559
@mchmarny mchmarny requested a review from a team as a code owner April 23, 2026 13:03
@mchmarny mchmarny self-assigned this Apr 23, 2026
@github-actions bot (Contributor) commented Apr 23, 2026

Coverage Report ✅

| Metric | Value |
| --- | --- |
| Coverage | 74.5% |
| Threshold | 70% |
| Status | Pass |

![Coverage](https://img.shields.io/badge/coverage-74.5%25-green)

No Go source files changed in this PR.

mchmarny and others added 2 commits April 23, 2026 06:26
If check-paths fails, all downstream jobs are skipped (not failed),
which would silently pass the gate. Add check-paths to the gate
needs list and verify it succeeded before checking other results.
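
The failure mode described in this commit message can be modeled in a few lines of Python. This is a sketch of the aggregation logic only: the job names are hypothetical, the result strings follow the GitHub Actions job results (`success`, `failure`, `cancelled`, `skipped`), and the actual gate job may be implemented differently.

```python
ACCEPTABLE = ("success", "skipped")


def naive_gate(results):
    """Gate that ignores check-paths: passes if no other job failed.

    When check-paths itself fails, every downstream job is skipped,
    so this version passes silently.
    """
    return all(
        result in ACCEPTABLE
        for name, result in results.items()
        if name != "check-paths"
    )


def fixed_gate(results):
    """Gate that first requires check-paths to have succeeded."""
    if results.get("check-paths") != "success":
        return False
    return naive_gate(results)


if __name__ == "__main__":
    # check-paths failed, so both halves of the (hypothetical) pair were skipped.
    broken = {"check-paths": "failure", "codeql": "skipped", "codeql-skip": "skipped"}
    print(naive_gate(broken), fixed_gate(broken))
```

With `check-paths` in the gate's dependency list, a classification failure blocks the merge instead of being read as "nothing needed to run".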

@mchmarny mchmarny enabled auto-merge (squash) April 23, 2026 13:48
@ArangoGutierrez (Contributor) left a comment

LGTM - A nice step forward

@@ -0,0 +1,338 @@
# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.

I thought we were living in 2050 already, we are the future!

@mchmarny mchmarny disabled auto-merge April 23, 2026 14:00
@mchmarny mchmarny merged commit 142c0d2 into main Apr 23, 2026
43 checks passed
@mchmarny mchmarny deleted the feat/merge-gate branch April 23, 2026 14:00
Development

Successfully merging this pull request may close these issues.

Require all merge-gating CI to pass

3 participants