-
Notifications
You must be signed in to change notification settings - Fork 65
channels: Add 'feeder' and 'tombstones' properties #75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
channels: Add 'feeder' and 'tombstones' properties #75
Conversation
8c32599 to
0c41d57
Compare
|
Example output (as of 0c41d57): $ hack/graph-util.py stabilization-changes
INFO: considering promotions from candidate-4.2 to fast-4.2 after PT24H
INFO: recommended: 4.1.18 (81 days, 1:33:57.131135)
DEBUG: channels/fast-4: Promote 4.1.18 to fast-4.2
DEBUG: It was promoted the feeder candidate-4.2 by 2c088bc147 (channels: Convert to Eric's proposed channel format, 2019-12-05).
INFO: recommended: 4.1.20 (81 days, 0:53:55.131135)
DEBUG: channels/fast-4: Promote 4.1.20 to fast-4.2
DEBUG: It was promoted the feeder candidate-4.2 by b4db5a619e (channels: Catch up with production, 2019-12-05).
... |
|
Example output (as of dbbbbd7a8fb): $ hack/graph-util.py stabilization-changes
INFO: considering promotions from candidate-4.2 to fast-4.2 after PT24H
INFO: recommended: 4.1.34 (12 days, 13:34:53.517678)
DEBUG: channels/fast-4.2: Promote 4.1.34 to fast-4.2
DEBUG: It was promoted the feeder candidate-4.2 by 8e0a12f267 (candidate-4.1: add 4.1.34, 2020-02-12).
INFO: recommended: 4.2.20 (5 days, 13:04:12.517678)
DEBUG: channels/fast-4.2: Promote 4.2.20 to fast-4.2
DEBUG: It was promoted the feeder candidate-4.2 by 3eb8b2e776 (candidate-4.2: add 4.2.20, 2020-02-19).
... |
0c41d57 to
9cb2c0d
Compare
LalatenduMohanty
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spelling of licensing needs to be fixed in The [contributing documentation](CONTRIBUTING.md) covers licencing and the usual Git flow
| Edit the appropriate file in `channels/`. | ||
| For example, to add a release to stable-4.2 you would edit `channels/stable-4.2.yaml`. | ||
| Channel semantics are documented [here][channel-semantics]. | ||
| For example, to add a release to candidate-4.2 you would edit [`channels/candidate-4.2.yaml`](channels/candidate-4.2.yaml). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/would/need to/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preexisting language; file a separate PR?
| which declares the intention that nodes and edges will be considered for promotion into `fast-4.2` after cooking for 24 hours in `candidate-4.2`. | ||
| The `delay` value is an [ISO 8601][rfc-3339-p13] [duration][iso-8601-durations]. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| which declares the intention that nodes and edges will be considered for promotion into `fast-4.2` after cooking for 24 hours in `candidate-4.2`. | ||
| The `delay` value is an [ISO 8601][rfc-3339-p13] [duration][iso-8601-durations]. | ||
|
|
||
| This is the expected delay, but it does not mean that promotion will happen at that moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By which automation is this delay used and how would one override it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By which automation is this delay used...
By a periodic that runs stabilization-changes and creates PRs for any recommended promotions. Then eventually, when we trust our automatic edge-pulling, we give the robot creds to approve their own PRs so they will auto-merge.
... and how would one override it?
Manually create pull requests if you don't want to wait the configured delay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If / when robots were auto-merging these PRs, how would release admins override the bot from promoting something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most important action is ART publishing the errata, which is when a release is declared supported, regardless of what we do in Cincinnati. So you could tombstone a release if you didn't want it auto-promoted. But if the release is only in candidate, you had better also tell ART to not ship the errata. And if the errata has shipped, tombstoning fast or stable promotions won't drop any supported-ness.
| _Note:_ Once we have phased release rollouts, we will drop the fast/stable distinction from this repository and promote to a unified fast/stable channel with a start time and rollout duration. | ||
| Until then, we are using fast channels to feed stable channels with a delay, just like candidate channels feed fast channels. | ||
|
|
||
| In this repository, the intended promotion flow is reflected by a `feeder` property in the channel declaration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need the feeder property if we reduce the chain to being candidate --> stable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its still useful, just in case the names would be less clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need the feeder property if we reduce the chain to being candidate --> stable?
We also have prerelease -> stable for 4.1. And we could theoretically also make a plural feeders and have multiple candidate-4.* feeding into candidate (#112). I like this being explicit, but don't really care if folks would rather bake the feeder directions into the recommending script (currently the stabilization-changes subcommand) instead of tracking it in the channel YAML.
| For example, [`channels/candidate-4.2.yaml`](channels/candidate-4.2.yaml) has: | ||
|
|
||
| ```yaml | ||
| tombstones: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose naming it frozen instead of tombstones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steveej frozen does not communicate its intent. These are the nodes which can not have any future edge. So tombstones makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that tombstones doesn't communicate it well either, since these versions are still in use somewhere, which is the reason why we need to give them an additional property. Also, It would be nice to stick to a more technical or neutral term here.
How about immobilized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like tombstones more. These builds are similar to ghosts we don't speak of since they are dead to us and we don't want them coming back :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tombstone has an existing technical usage, which is similar to the usage here.
|
@eparis do we have comments on this PR. I am planning to merge this soon. |
There is a change of plan as discussed with OTA team. We are planning to deprecate the python script so not planning to add new functionality to the script. |
I second this. I see the scope of this PR to discuss the data model changes and their intended usage, therefore please see my review comments. |
|
/hold |
We are planning to deprecate the push-to-quay logic, which currently lives in the Python script. This promotion-recommendation logic is distinct; it's about when to create pull requests for this repository. I think we want it, and I don't think we want to include it in Cincinnati. I don't care if folks want to rewrite it in Rust or not. I can also put it in a separate Python script, instead of having it live in a subcommand of the existing script, if that would make folks more comfortable. Thoughts? |
As described in 17232e7 (Blocking edges to 4.3.11, 2020-04-13, openshift#170).
…rc.*, 4.4.0, and 4.4.2
We didn't want the Quay load of many clusters updating to the new release (which usually happens after fast/stable promotion) while the Quay folks were sorting out some capacity issues, so we decided to tombstone these releases and wait for the next week's.
count by (internal,channel) (
id_version_ebs_account_internal:cluster_subscribed
+ on (_id) group_left (channel)
topk by (_id) (1, cluster_version_available_updates)
+ on (_id) group_left (blah)
group by (_id) (cluster_version{type="completed",version="4.7.2",from_version="4.7.1"})
)
and similar for other releases show that we get the bulk of
fast-channel updates within the first 48h of adding a release to the
fast channels. Slow down stable a bit so we get that
Insights/Telemetry feeback before we add a new release to stable.
I forget why, but for some reason we didn't like this one and we recylced its errata for 4.4.23: $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.4' | jq -r '.nodes[] | select(.version == "4.4.22" or .version == "4.4.23") | .version + " " + .metadata.url' 4.4.23 https://access.redhat.com/errata/RHBA-2020:3715 4.4.22 https://access.redhat.com/errata/RHBA-2020:3715
I forget why, but for some reason we didn't like this one and we recylced its errata for 4.4.26: $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.4' | jq -r '.nodes[] | select(.metadata.url == "https://access.redhat.com/errata/RHBA-2020:3764").version' | sort -V 4.4.24 4.4.25 4.4.26
I forget why, but for some reason we didn't like this one and we recylced its errata for 4.4.29: $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.4' | jq -r '.nodes[] | select(.metadata.url == "https://access.redhat.com/errata/RHBA-2020:4224").version' | sort -V 4.4.28 4.4.29
I forget why, but for some reason we didn't like this one and we recylced its errata for 4.5.1: $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.5' | jq -r '.nodes[] | select(.metadata.url == "https://access.redhat.com/errata/RHBA-2020:2409").version' | sort -V 4.5.0 4.5.0-rc.1 ... 4.5.0-rc.7 4.5.1 4.5.1-rc.0 No need to tombstone the RCs, because they don't match the .feeder.filter regexp.
We've been evolving our hotfix process, and this one has no errata and was never promoted into fast. No need to promote it now. $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.5' | jq -r '.nodes[] | select(.version == "4.5.0-0.hotfix-2020-11-28-021842")' { "version": "4.5.0-0.hotfix-2020-11-28-021842", "payload": "quay.io/openshift-release-dev/ocp-release@sha256:d49f94e6c6350b936f073ad91d9121f64012cebded6a9c059184f84997a9e621", "metadata": { "io.openshift.upgrades.graph.release.channels": "candidate-4.5,candidate-4.6", "io.openshift.upgrades.graph.release.manifestref": "sha256:d49f94e6c6350b936f073ad91d9121f64012cebded6a9c059184f84997a9e621" } }
I forget why, but for some reason we didn't like these and recycled their errata: $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.5' | jq -r '.nodes as $n | [$n[].metadata.url] | unique[] | . as $url | ([$n[] | select(.metadata.url == $url).version]) as $versions | select(($versions | length) > 1) | $url + " " + ($versions | join(" "))' | sort 4.5.0-0.hotfix-2020-11-28-021842 4.5.0-0.hotfix-2020-08-24-185832 https://access.redhat.com/errata/RHBA-2020:0581 4.4.0-rc.11 4.4.0-rc.13 4.4.0-rc.12 4.4.0-rc.2 4.4.0-rc.0 4.4.0-rc.1 4.4.0 4.4.0-rc.4 4.4.3 4.4.2 4.4.0-rc.6 4.4.0-rc.7 4.4.0-rc.8 4.4.0-rc.9 4.4.0-rc.10 https://access.redhat.com/errata/RHBA-2020:2409 4.5.0-rc.4 4.5.0-rc.5 4.5.0-rc.2 4.5.0-rc.6 4.5.0-rc.7 4.5.1 4.5.1-rc.0 4.5.0 4.5.0-rc.1 https://access.redhat.com/errata/RHBA-2020:3715 4.4.22 4.4.23 https://access.redhat.com/errata/RHBA-2020:3719 4.5.10 4.5.11 https://access.redhat.com/errata/RHBA-2020:3760 4.5.13 4.5.12 https://access.redhat.com/errata/RHBA-2020:3764 4.4.25 4.4.26 4.4.24 https://access.redhat.com/errata/RHBA-2020:4224 4.4.29 4.4.28 https://access.redhat.com/errata/RHBA-2021:0033 4.5.25 4.5.27 https://access.redhat.com/errata/RHBA-2021:0231 4.5.30 4.5.29
I forget why, but for some reason we didn't like these and recycled their errata: $ curl -sH accept:application/json 'https://api.openshift.com/api/upgrades_info/v1/graph?channel=candidate-4.6' | jq -r '.nodes as $n | [$n[].metadata.url] | unique[] | . as $url | ([$n[] | select(.metadata.url == $url).version] | sort) as $versions | select(($versions | length) > 1) | $url + " " + ($versions | join(" "))' | sort | grep '4\.6\.[02]' 4.5.0-0.hotfix-2020-08-24-185832 4.5.0-0.hotfix-2020-11-28-021842 4.6.0-fc.0 4.6.0-fc.1 4.6.0-fc.2 4.6.0-fc.3 4.6.0-fc.4 4.6.0-fc.5 4.6.0-fc.7 4.6.0-fc.8 4.6.5 4.6.6 https://access.redhat.com/errata/RHBA-2020:4196 4.6.0 4.6.0-rc.0 4.6.0-rc.1 4.6.0-rc.2 4.6.0-rc.3 4.6.0-rc.4 4.6.1 https://access.redhat.com/errata/RHBA-2020:4339 4.6.2 4.6.3
These were supposed to be tombstoned in candidate based on recycled errata [1,2]. But they snuck into fast-4.6 with afa003f (Populate 4.5.z into candidate-4.6 and fast-4.6, 2020-11-05, openshift#526). And then I moved them from there into fast-4.5 with 1d9fe0f (channels: More backfilling (e.g. 4.3.0 in candidate-4.4), 2020-11-12, openshift#549). And they went into stable-4.6 with 1fb8768 (Add 4.5 to 4.6 upgrades to stable-4.6, 2020-12-14, openshift#585). This commit completes their accidental journey, since tombstoning them in fast would violate [3]: While the fast-4.5 channel contains releases as soon as their errata are published, releases are added to the stable-4.5 channel after a delay [1]: openshift#438 (comment) [2]: openshift#456 (comment) [3]: https://docs.openshift.com/container-platform/4.5/updating/updating-cluster-between-minor.html#stable-4-5-channel
| - 4.5.7 | ||
| - 4.5.8 | ||
| - 4.5.9 | ||
| - 4.5.10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're retroactively adding those to stable - is this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, see discussion in 845ac82.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intended - 845ac82
|
/hold Just wanted to make sure I take a final look before we merge this. |
|
/hold cancel |
LalatenduMohanty
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: LalatenduMohanty, vrutkovs, wking The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
So we can automatically propose stabilizing channel promotions.
The implementation uses
git blame --first-parent ...to determine when releases landed in the feeder channel. Picking the versions out of the blame data by using a '- ' prefix is a bit of a hack, but it gets the job done for now.