blocked-edges: Details on bugs for 4.3.2 and 4.3.3 #119
f9b5e72 to
8dcce54
Compare
As described in 8dcce54 (blocked-edges: Details on bugs for 4.3.2 and 4.3.3, 2020-03-14, openshift#119), for the ingress operator metric port removal from 4.1 -> 4.2. Also SemVer sort the RC in candidate-4.2.
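The "SemVer sort" mentioned above means ordering by Semantic Versioning precedence instead of lexically, so that a release candidate sorts before its final release and 4.2.10 sorts after 4.2.2. A minimal sketch, assuming plain `major.minor.patch[-prerelease]` strings; `semver_key` is a hypothetical helper and the version list is illustrative, not the actual channel contents:

```python
def semver_key(version):
    """Return a sort key that follows SemVer 2.0.0 precedence rules."""
    # Build metadata (anything after '+') is ignored for precedence.
    core, _, prerelease = version.split("+", 1)[0].partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not prerelease:
        # A version without a pre-release tag outranks any pre-release of it.
        return (major, minor, patch, 1, ())
    identifiers = []
    for ident in prerelease.split("."):
        if ident.isdigit():
            # Numeric identifiers compare numerically and rank below alphanumeric ones.
            identifiers.append((0, int(ident), ""))
        else:
            identifiers.append((1, 0, ident))
    return (major, minor, patch, 0, tuple(identifiers))


# Illustrative versions only (assumption, not the real candidate-4.2 list).
versions = ["4.2.0", "4.2.10", "4.2.0-rc.5", "4.2.2", "4.2.0-rc.0"]
print(sorted(versions, key=semver_key))
# → ['4.2.0-rc.0', '4.2.0-rc.5', '4.2.0', '4.2.2', '4.2.10']
```

A plain string sort would place 4.2.10 before 4.2.2 and 4.2.0 before its own RCs, which is the kind of ordering this avoids.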
76d1f57 to
0c3d18a
Compare
Also block 4.1 -> 4.2 until 4.2.21, for the ingress operator metric port removal in 4.1 -> 4.2.
0c3d18a to
2f347a8
Compare
2f347a8 to
1b6af73
Compare
@sdodson dropped UpdateBlocker from the memory-leak bug and it seems like a non-regression that is not update-specific, so I've pushed 2f347a8 -> 1b6af73 to remove references to this series from
1b6af73 to
b10bb03
Compare
The Multus series caused only a minute or so of unreachable workloads, after which the issue resolves automatically. I dropped UpgradeBlocker from the 4.3 bug here, because we have been accepting brief workload unreachability since #40. I've updated this PR to remove Multus references with 1b6af73 -> b10bb03. That means that with this PR, the only remaining issue is the container port series.
This is tricky. We found an issue and misdiagnosed it as an upgrade blocker, then blocked the edges based on that wrong diagnosis. Now we want to correct this in the files. I am sure this will happen again in the future.
LalatenduMohanty
left a comment
/lgtm
b10bb03 to
ce7fc11
Compare
ce7fc11 to
d9af1c2
Compare
As described in 3a903e1 (blocked-edges: Details on bugs for 4.3.2 and 4.3.3, 2020-03-14, openshift#119), for the ingress operator metric port removal from 4.1 -> 4.2. Also SemVer sort the RC in candidate-4.2.
d9af1c2 to
5252c62
Compare
This was originally from b1465b7 (Information on why 4.3.2 and 4.3.3 not in fast and stable channels of 4.3, 2020-03-03, #87), but upon further review it is not severe enough to warrant blocked edges or promotions [1]. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1805444#c9
This was originally from b1465b7 (Information on why 4.3.2 and 4.3.3 not in fast and stable channels of 4.3, 2020-03-03, #87), but upon further review it is not a regression and so does not warrant blocked edges or promotions [1]. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1808429#c6
Expanding on the link from b1465b7 (Information on why 4.3.2 and 4.3.3 not in fast and stable channels of 4.3, 2020-03-03, #87). Coarse blocks landed in 1161d00 (Added 4.3.2->4.3.3 to channels, 2020-03-06, openshift#100), but the leak only affects 4.y -> 4.(y+1) updates, since those are the only transitions where manifests dropped container ports [1]. Block 4.2 -> 4.3 for 4.3 before 4.3.5 because of the port bug. 4.2 -> 4.3.1 had been blocked before, although that block was lifted in c641bbd (channels/fast-4.2: Promote 4.2.18 (and 4.2.18+amd64 to fast-4.3), 2020-02-19, #60). I think 1161d00's broader .* blocks (which I'm relaxing to 4.2.* blocks) on the earlier 4.3 releases were because the Multus bug affects all updates, but, as explained in a5f394d (blocked-edges/4.3.*: Drop references to Multus bug 1805444, 2020-03-20, openshift#119), the Multus bug is actually not an update blocker. Also, no need to talk about these in the channel YAML files, since we aren't tombstoning the releases (we discovered the bug after marking the releases supported by tagging them into fast channels). [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1801300#c35
As described in b86ce4d (blocked-edges/4.3.*: Clarify effect of container-port leak (bug 1801300), 2020-03-13, openshift#119) and [1,2], for the ingress operator metric port removal from 4.1 -> 4.2. Also SemVer sort the RC in candidate-4.2. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1801300#c35 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1802248#c3
8be4df4 to
0e97e6a
Compare
Nothing in 4.3 -> 4.3.2 in Telemetry in the past week, so no known problems there. #100 doesn't actually describe 4.3 -> 4.3.2 or 4.3.3 failures that would call for blocking the edges. 4.3.2 was added to candidate in #51, which mentions concerns with 4.2 -> 4.3.2, but nothing about 4.3 -> 4.3.2. 4.3 -> 4.3.2 CI only covered 4.3.1 -> 4.3.2, leaving out 4.3.0 -> 4.3.2:
    $ oc adm release info quay.io/openshift-release-dev/ocp-release:4.3.2-x86_64 | grep Upgrades
    Upgrades: 4.2.19, 4.3.0, 4.3.1
4.3.1 -> 4.3.2 CI had two successes (both AWS) and four failures (all Boskos CI flakes, so no bearing on update success). We could launch more CI coverage, but...
There have been only two 4.3 -> 4.3.3 updates in Telemetry in the past week, both successful (4.3.2->4.3.3 and, !, 4.3.5 -> 4.3.3). Updates to 4.3.3 have good CI coverage, discussed in #57. Lots of bugs linked from discussion there. Going through them again, rhbz#1809296 seems severe enough to warrant blocking 4.3->4.3 until the fix landed in 4.3.5. So I've reshuffled here to:
    $ git --no-pager log --format='%h %s' origin/master..origin/pr/119
    8be4df4 blocked-edges/4.2.*: Clarify effect of container-port leak (bug 1801300)
    fcd2173 blocked-edges/4.3.*: Clarify effect of container-port leak (bug 1801300)
    da5fe88 blocked-edges: Drop references to memory-leak bug 1808429
    c239177 blocked-edges/4.3.*: Drop references to Multus bug 1805444
    d14b4ad blocked-edges/4.3*: Block 4.3 -> 4.3 until 4.3 -> 4.3.5 for bug 1809296
That's enough so that this PR will no longer expose any new edges (it only removes edges).
/hold cancel
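The CI coverage gap described above (release metadata lists 4.3.0 as an update source for 4.3.2, but CI only exercised 4.3.1 -> 4.3.2) can be checked mechanically. A rough sketch, assuming the `Upgrades:` line has the comma-separated shape quoted in the comment; `parse_upgrade_sources`, `minor_of`, and the `ci_tested` set are hypothetical names introduced for illustration:

```python
def parse_upgrade_sources(release_info):
    """Extract source versions from the 'Upgrades:' line of release-info text."""
    for line in release_info.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() == "Upgrades":
            return [v.strip() for v in value.split(",") if v.strip()]
    return []


def minor_of(version):
    """Return the 'major.minor' prefix, e.g. '4.3' for '4.3.2'."""
    return ".".join(version.split(".")[:2])


# The line quoted in the comment above.
sample = "Upgrades: 4.2.19, 4.3.0, 4.3.1"
target = "4.3.2"
ci_tested = {"4.3.1"}  # sources CI exercised, per the comment above

sources = parse_upgrade_sources(sample)
# Keep only same-minor (4.3 -> 4.3.2) sources that CI did not cover.
untested = [s for s in sources if minor_of(s) == minor_of(target) and s not in ci_tested]
print(untested)
# → ['4.3.0']
```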
/approve
LalatenduMohanty
left a comment
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: eparis, LalatenduMohanty, vrutkovs, wking
I'm not clear on when this was introduced, but we saw it in 4.3.2 -> 4.3.3 CI [1] and it wasn't fixed until 4.3.5 [2]. It's not clear to me if this is a regression or not, and we usually don't pull edges/releases unless an issue is a regression. But there is some concern about restoring 4.3 -> 4.3 edges [3], and leaning on this fixed-in-4.3.5 issue gives us some grounds for continuing to block 4.3 -> 4.3.2 and 4.3 -> 4.3.3. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1774212#c7 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1809296#c10 [3]: openshift#119 (review)
Expanding on the link from b1465b7 (Information on why 4.3.2 and 4.3.3 not in fast and stable channels of 4.3, 2020-03-03, openshift#87). Coarse blocks landed in 1161d00 (Added 4.3.2->4.3.3 to channels, 2020-03-06, openshift#100), but the leak only affects 4.y -> 4.(y+1) updates, since those are the only transitions where manifests dropped container ports [1]. Block 4.2 -> 4.3 for 4.3 before 4.3.5 because of the port bug. 4.2 -> 4.3.1 had been blocked before, although that block was lifted in c641bbd (channels/fast-4.2: Promote 4.2.18 (and 4.2.18+amd64 to fast-4.3), 2020-02-19, openshift#60). I think 1161d00's broader .* blocks (which I'm relaxing to 4.2.* blocks) on the earlier 4.3 releases were because the Multus bug affects all updates, but, as explained in a5f394d (blocked-edges/4.3.*: Drop references to Multus bug 1805444, 2020-03-20, openshift#119), the Multus bug is actually not an update blocker. Also, no need to talk about these in the channel YAML files, since we aren't tombstoning the releases (we discovered the bug after marking the releases supported by tagging them into fast channels). [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1801300#c35
As described in b86ce4d (blocked-edges/4.3.*: Clarify effect of container-port leak (bug 1801300), 2020-03-13, openshift#119) and [1,2], for the ingress operator metric port removal from 4.1 -> 4.2. Also SemVer sort the RC in candidate-4.2. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1801300#c35 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=1802248#c3
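The effect of relaxing the broad `.*` blocks to `4.2.*` blocks, as described in the commit messages above, can be illustrated with a small matcher. This is a sketch under assumptions: each blocked-edge rule is modeled as a target version `to` plus a regular expression `from` that must match the whole source version. The field names and sample rules are illustrative and are not copied from the repository files.

```python
import re


def blocked(rules, source, target):
    """Return True if any rule blocks the source -> target edge."""
    for rule in rules:
        # Anchor the pattern to the whole source version string.
        if rule["to"] == target and re.fullmatch(rule["from"], source):
            return True
    return False


# Hypothetical rules: a broad block on every source vs. a block on 4.2.z sources only.
broad = [{"to": "4.3.1", "from": r".*"}]
narrow = [{"to": "4.3.1", "from": r"4\.2\..*"}]

edges = [("4.2.18", "4.3.1"), ("4.3.0", "4.3.1")]
print([e for e in edges if blocked(broad, *e)])   # both edges blocked
print([e for e in edges if blocked(narrow, *e)])  # only the 4.2 -> 4.3 edge blocked
```

With the narrower pattern, 4.3 -> 4.3 edges into the same release stay available while the 4.2 -> 4.3 edge remains blocked.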
Expanding on the links from b1465b7 (#87).
Port bug:
4.4.0: rhbz#1801300, Bug 1801300: lib/resourcemerge: remove ports which are no longer required cluster-version-operator#322 which made it into 4.4.0-rc.0:
4.3.z: rhbz#1802710 Bug 1802710: lib/resourcemerge: remove ports which are no longer required cluster-version-operator#323 which made it into 4.3.3:
4.2.z: rhbz#1802248 Bug 1802248: lib/resourcemerge: remove ports which are no longer required cluster-version-operator#325 which made it into 4.2.21:
autoscaler and machine-API operator both removed their metrics port in 4.2 -> 4.3. So 4.2 clusters which update to 4.3 < 4.3.5 will hit this.
ingress operator removed its metrics port in 4.1 -> 4.2, so 4.1 clusters which update to 4.2 < 4.2.21 will hit this.
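The two affected update ranges listed above can be written as a predicate. A minimal sketch, assuming plain `x.y.z` version strings; the thresholds (4.3.5 and 4.2.21) and the minor-version transitions are taken from the two lines above, while `hits_port_bug` and the lookup table are hypothetical names:

```python
def parse(version):
    """Turn 'x.y.z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# (source minor, target minor) -> first target release treated as unaffected above.
PORT_BUG_FIXED_IN = {
    ((4, 2), (4, 3)): (4, 3, 5),   # autoscaler and machine-API operator ports
    ((4, 1), (4, 2)): (4, 2, 21),  # ingress operator port
}


def hits_port_bug(source, target):
    """Return True if the source -> target update falls in an affected range."""
    src, dst = parse(source), parse(target)
    fixed = PORT_BUG_FIXED_IN.get((src[:2], dst[:2]))
    return fixed is not None and dst < fixed


print(hits_port_bug("4.2.19", "4.3.3"))   # True: 4.2 -> 4.3 before 4.3.5
print(hits_port_bug("4.3.2", "4.3.3"))    # False: same-minor update
print(hits_port_bug("4.1.30", "4.2.21"))  # False: target already has the fix
```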
Multus bug:
4.5.0: rhbz#1805987
4.4.0: rhbz#1805774 Bug 1805774: Use the readiness indicator file option for Multus cluster-network-operator#507 Bug 1805774: Exposes readinessindicatorfile and uses wait.PollImmediate [backport 4.4] multus-cni#49 which made it into 4.4.0-rc.0:
4.3.z: https://bugzilla.redhat.com/show_bug.cgi?id=1805444 Bug 1805444: Uses the readiness indicator file option for Multus [backport 4.3] cluster-network-operator#485 Bug 1805444: Exposes readinessindicatorfile and uses wait.PollImmediate [backport 4.3] multus-cni#48 which made it into 4.3.5 (4.3.4 doesn't exist):
Not backported to 4.2.z. It's not clear to me what the situation on 4.2 is. Also not clear to me what sort of updates would trigger this issue; whether it is all *-> into broken releases? Just 4.2 -> broken 4.3? Broken 4.2 -> any 4.3? For now I'll just assume that 4.2 is not affected and * -> broken 4.3 is the only vulnerable case.
Memory leak:
4.5.0: MCO rhbz#1800319 Bug 1800319: kubelet: add kube reservation for kubernetes components machine-config-operator#1450 Bug 1800319: kubelet: bump CPU reservation back to 500m machine-config-operator#1476 origin rhbz#1802687 Bug 1802687: UPSTREAM: 88251: Partially fix incorrect configuration of kubepods.slice unit by kubelet origin#24568
4.4.0: origin rhbz#1806786 [release-4.4] Bug 1806786: UPSTREAM: 88251: Partially fix incorrect configuration of kubepods.slice unit by kubelet origin#24596 . The 4.5 MCO changes were not backported to 4.4. The origin change made it into 4.4.0-rc.0:
4.3.z: origin rhbz#1808429 [release-4.3] Bug 1808429: UPSTREAM: 88251: Partially fix incorrect configuration of kubepods.slice unit by kubelet origin#24611 rhbz#1801824 ASSIGNED (I'm not sure what this one is; possibly a dup of 1808429). The origin change did not make it into 4.3.5:
4.2.z: rhbz#1801826 Closed WONTFIX. rhbz#1810136 s390x NEW. Not sure if that's "does not affect 4.2.z" or "not important enough to fix on 4.2.z".
It's not clear to me that this is a regression. Does this actually affect update edges at all?
Also, no need to talk about these in the channel YAML files, since we aren't tombstoning the releases (we discovered the bug after marking the releases supported by tagging them into fast channels).
Also block * -> 4.3 for 4.3 before 4.3.5, now that we have the Multus bug to worry about. 4.2 -> 4.3.1 had been blocked before, although that block was lifted in c641bbd (#60). This isolates the two 4.3 RCs, but we don't commit to providing updates out of candidate releases anyway. If we get pushback on that (unlikely now that 4.3 has been GA for so long), we can add rc -> 4.3.6 update edges or some such.