@auto-submit auto-submit bot commented Jan 24, 2025

Reverts: #4171

Initiated by: yjbanov

Reason for reverting: the new code seems to be landing on red check runs

Original PR Author: yjbanov

Reviewed By: {jtmcdole}

This change reverts the following previous change:
Implement the new break glass behavior for the monorepo world.

Fixes flutter/flutter#159898
Fixes flutter/flutter#132811

Below is an updated copy of flutter/flutter#159898 (comment):

Pull request

Two-state Merge Queue Guard

The guard can only be in two states:

  • Pending (yellow): this is the initial state, and the guard stays in this state for as long as the pull request is not allowed to enter the merge queue.
  • Success (green): the guard enters this state when the infra decides that the PR is allowed to enter the merge queue.

The state can only go from pending to success. There are no other state transitions.
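As a rough illustration of the two-state rule above, here is a minimal Python sketch. It is not Cocoon's code (Cocoon is not written in Python); `GuardState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical names, and treating a same-state update as a no-op is an assumption.

```python
from enum import Enum


class GuardState(Enum):
    PENDING = "pending"  # yellow: the PR may not enter the merge queue yet
    SUCCESS = "success"  # green: the PR may enter the merge queue


# The only legal state change described above: pending -> success.
ALLOWED_TRANSITIONS = {(GuardState.PENDING, GuardState.SUCCESS)}


def transition(current: GuardState, target: GuardState) -> GuardState:
    """Return the new guard state, or raise if the move is not allowed."""
    if current == target:
        # Assumption: re-asserting the current state is a harmless no-op.
        return current
    if (current, target) not in ALLOWED_TRANSITIONS:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

With this model, `transition(GuardState.SUCCESS, GuardState.PENDING)` raises, which is the point: once green, the guard never goes back.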

Deciding when the guard goes green

Going from pending to green is the only transition we need to worry about. This is how we decide it:

  • Normal case: the normal workflow for landing a PR is for all the checks to go green. Once this happens, Cocoon closes the guard (by making it green). This will allow the autosubmit label to enqueue the PR, and it will allow the author to press "Merge When Ready".
  • Emergency case: a PR must be landed despite failing checks (typically on a red tree status). The author adds the emergency label. Cocoon immediately unlocks the guard, ignoring any pending or failed checks. In conjunction with the autosubmit label, Cocoon will also automatically enqueue the PR after all tests pass, even if the tree status is red. With the emergency label, Cocoon will also let the PR jump the queue. However, if the PR must be landed right away, the author can use the "Merge When Ready" button manually as soon as Cocoon unlocks the merge guard.
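The two cases above boil down to one predicate. The sketch below is a simplified model under stated assumptions, not Cocoon's implementation: `guard_should_go_green` is a hypothetical helper, the label string `"emergency"` is taken from the wording above, check results are modeled as plain strings with `None` for a check that has not finished, and a PR with no reported checks is assumed to stay pending.

```python
from typing import Iterable, Optional

# Assumption: the label name mirrors the wording in the description above.
EMERGENCY_LABEL = "emergency"


def guard_should_go_green(
    check_conclusions: Iterable[Optional[str]],
    labels: Iterable[str],
) -> bool:
    """Decide whether the guard may move from pending to success."""
    # Emergency case: the label unlocks the guard regardless of check results.
    if EMERGENCY_LABEL in set(labels):
        return True
    # Normal case: every check must have finished with "success".
    conclusions = list(check_conclusions)
    # Assumption: with no checks reported yet, the guard stays pending.
    return bool(conclusions) and all(c == "success" for c in conclusions)
```

Note that the function only ever answers "go green now?" and never "go red": a `False` result simply leaves the guard pending.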

Explainer

The above system makes everything simpler. There are fewer states (just pending and success), fewer transitions (just pending => success), and fewer situations to consider (normal and emergency). We don't need to do anything special w.r.t. the guard for the purposes of retrying failed flaky tests. It simply stays pending while the author is doing whatever is necessary to make the PR green enough to land: push fixes, retry check runs, rebase, get approvals, etc.

Importantly, it covers all situations we need to handle in PR management.

Why not have a third "failed" state?

The guard's job is not to tell you whether any tests are failing. You can already tell which tests are failing by looking at the status of the respective check runs (this may change in the future, but when that time comes we'll find a new visual cue). The guard's job is only to tell you whether your PR is allowed to be enqueued. Permission to proceed never "fails". It's either granted (green) or not yet granted (pending). Once granted, the PR is typically enqueued or landed immediately, so any further state transition after green is mostly meaningless. Green is the final state of the guard; that's all.

The other reason for keeping the guard in two states is for simpler state management. Once green, the PR can immediately be enqueued. There's no transition from pending to failed, from failed to pending, or from failed to green. We can remove some of the existing Cocoon code around this, and we don't have to add anything new to support normal and emergency PR landing.

Merge group

There are two possible outcomes for a merge group:

  • Land: all check runs are green => Cocoon completes the merge guard as success and GitHub lands the merge group.
  • Fail: some check runs failed => Cocoon fails the merge guard and GitHub pulls the merge group (and the corresponding PR) out of the queue.
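For contrast with the pull request guard, the merge group decision can be sketched the same way. Again, this is an illustrative model rather than Cocoon's code: `merge_group_outcome` is a hypothetical name, check results are plain strings with `None` for unfinished runs, and returning `None` while runs are still in progress is an assumption about the undecided period before one of the two outcomes is reached.

```python
from typing import Iterable, Optional


def merge_group_outcome(check_conclusions: Iterable[Optional[str]]) -> Optional[str]:
    """Return "land", "fail", or None while the outcome is still undecided."""
    conclusions = list(check_conclusions)
    # Fail: at least one check run failed, so the merge group is pulled.
    if any(c == "failure" for c in conclusions):
        return "fail"
    # Land: every check run finished green.
    if conclusions and all(c == "success" for c in conclusions):
        return "land"
    # Assumption: unfinished (or missing) check runs mean no decision yet.
    return None
```

Unlike the pull request guard, the merge group guard does have a failing outcome, because a failed merge group has to be removed from the queue.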

@auto-submit auto-submit bot added the revert of Bot Only: Tracking label for bot. Tracks new revert of pull requests. label Jan 24, 2025
@auto-submit auto-submit bot merged commit b9bd02b into main Jan 24, 2025
3 checks passed
auto-submit bot pushed a commit that referenced this pull request Jan 24, 2025
This relands commit b9bd02b.

The previous attempt was reverted in #4175 because of the suspicion that it was landing PRs with failed checks. However, that suspicion seems to have been wrong. This is what actually happened:

* 12:25PM: @polina-c creates the [PR](flutter/flutter#162106); Cocoon starts testing it and unlocks the Merge Queue Guard after all engine builds are done (old behavior).
* 1:02PM: I deploy the new Cocoon with the new behavior.
* 1:10PM: @polina-c hits the "Merge When Ready" button, and GitHub allows the PR to enter the queue.

GitHub allowed that last action because the guard was already unlocked by the old behavior. It wasn't the new code misbehaving. In fact, this is exactly the problem that the new code is aiming to fix.
@guidezpl guidezpl deleted the revert_331f9c015f3db9a6c1f424ecc544040a591c9fa5 branch November 18, 2025 13:30
Development

Successfully merging this pull request may close these issues.

  • Merge Queue break glass behavior
  • Remove the need to retrieve pull requests with the Graphql API