-
Notifications
You must be signed in to change notification settings - Fork 426
OCPBUGS-33896: add alert data to upgrade health in oc adm upgrade status #1740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-33896: add alert data to upgrade health in oc adm upgrade status #1740
Conversation
|
@PratikMahajan: This pull request references OTA-1157 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
7089ac4 to
2a052f4
Compare
| scopeKindClusterOperator allowedScopeKind = "ClusterOperator" | ||
| scopeKindNode allowedScopeKind = "Node" | ||
| scopeKindMachineConfigPool allowedScopeKind = "MachineConfigPool" | ||
| scopeKindAlert allowedScopeKind = "Alert" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we need this. This is used to tie insights to resources in the cluster - either we can do that and we will know the exact kind, or we can and we can have the list of related resources to be empty.
d1f9fe4 to
6349223
Compare
707e13b to
054436e
Compare
| var alertData AlertData | ||
| alertBytes, err := o.getAlerts(ctx) | ||
| if err != nil { | ||
| fmt.Println("Unable to fetch alerts from thanos, ignoring alerts in 'Update Health': ", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "thanos" -> "Thanos", to match the casing preferred by the Thanos folks, for example:
Join users and companies that are using Thanos in production.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
although actually, this call-site has no idea that we're reaching out to Thanos; that's an implementation detail. I'd expect inspectalerts to be providing sufficient context for debugging failed requests if we fail to fetch alerts, and here we can probably just do:
fmt.Fprintf(o.ErrOut, "warning: Unable to fetch alerts, ignoring alerts in 'Update Health': %v\n", err)which will also get us logging this warning to ErrOut, as we already do here.
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| insights := parseAlertDataToInsights(tt.alertData, tt.startedAt) | ||
| if got := len(insights); got != tt.expectedCount { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it's easier to write this kind of test, all you're testing is "are we skipping this alert or not?". It may be worth an expected []updateInsight for comparison with the returned value, so we can also check on things like "does the updateInsightImpact have the property values we expect for this alert?".
3b6bd3d to
d309f99
Compare
0e7e82f to
d5f52ff
Compare
|
/lgtm |
|
/label acknowledge-critical-fixes-only This is a part of a TechPreview feature that we want to deliver in 4.16, all functionality gated behind an opt-in envvar |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: petr-muller, PratikMahajan, wking The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1 similar comment
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: petr-muller, PratikMahajan, wking The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retitle OCPBUGS-33896: add alert data to upgrade health in oc adm upgrade status |
|
@PratikMahajan: This pull request references Jira Issue OCPBUGS-33896, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest |
|
@PratikMahajan: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@PratikMahajan: Jira Issue OCPBUGS-33896: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-33896 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[ART PR BUILD NOTIFIER] This PR has been included in build openshift-enterprise-cli-container-v4.17.0-202405180411.p0.g3214619.assembly.stream.el9 for distgit openshift-enterprise-cli. |
|
/cherry-pick release-4.16 |
|
@petr-muller: new pull request created: #1771 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Catching the docs up with 10c41f5 (add alerts to update health in oc adm upgrade status, 2024-04-25, openshift#1740). Also drop the enumeration that started when the file was created in 57e9844 (`adm upgrade status`: add fixture test, 2023-11-09, are can count for themselves, and this content is easier to maintain without having to bump later entry numbers when inserting an earlier entry.
Catching the docs up with 10c41f5 (add alerts to update health in oc adm upgrade status, 2024-04-25, openshift#1740). Also drop the enumeration that started when the file was created in 57e9844 (`adm upgrade status`: add fixture test, 2023-11-09, openshift#1595). Folks who care how many inputs and outputs there currently are can count for themselves, and this content is easier to maintain without having to bump later entry numbers when inserting an earlier entry.
adds alerts that fire during the upgrade to
upgrade healthsection.by default all the alerts that started firing after initiating the
upgrade will appear in the upgrade health section
we also have allowed alerts that will show alerts that started
firing before the upgrade was started.