@ssierrab
Contributor

No description provided.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 24, 2025
@openshift-ci-robot
Contributor

@ssierrab: This pull request references Jira Issue OCPBUGS-14246, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Oct 24, 2025

Hi @ssierrab. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 24, 2025
@DavidHurta
Contributor

/cc

@openshift-ci openshift-ci bot requested a review from DavidHurta October 25, 2025 00:35
@DavidHurta
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 27, 2025
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 27, 2025
@DavidHurta
Contributor

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 27, 2025
@openshift-ci-robot
Contributor

@DavidHurta: This pull request references Jira Issue OCPBUGS-14246, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @jiajliu

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from jiajliu October 27, 2025 12:13
@openshift-ci
Contributor

openshift-ci bot commented Oct 27, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DavidHurta, ssierrab

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 27, 2025
@DavidHurta
Contributor

DavidHurta commented Oct 27, 2025

@jiajliu, you are currently assigned as the QA contact for the bug. An earlier PR, #1036, partly addressed OCPBUGS-14246.

There is a comment from Petr on that PR:

Although this partially addresses OCPBUGS-14246, this PR does not change any code. It can be tested when OCPBUGS-14246 is fully fixed.

I believe we can now verify the entire "All Critical Alert Rules must have runbook_url annotation" bug with this PR.

This PR arrived mid-sprint without being raised in planning. Its priority is Normal. Feel free to verify it at your discretion, given the priority, Shift Week, and so on. I believe it is not urgent, so it can wait a bit, no worries.

@jiajliu

jiajliu commented Oct 28, 2025

cc @dis016 for the ClusterVersionOperatorDown alert's runbook URL

@jiajliu

jiajliu commented Oct 28, 2025

There is a comment from Petr on that PR:

Although this partially addresses OCPBUGS-14246, this PR does not change any code. It can be tested when OCPBUGS-14246 is fully fixed.

We will check both PRs in this sprint, but maybe after shift week :)

@dis016

dis016 commented Nov 3, 2025

Test Scenario: Verify that a runbook URL is available for the ClusterVersionOperatorDown alert

  1. Install a 4.21 cluster with the PR.
launch 4.21.0-0.nightly-2025-10-30-060549,openshift/cluster-version-operator#1250 gcp
dinesh@Dineshs-MacBook-Pro ~ % oc get clusterversion 
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.21.0-0-2025-11-03-161951-test-ci-ln-44d537b-latest   True        False         4m20s   Cluster version is 4.21.0-0-2025-11-03-161951-test-ci-ln-44d537b-latest
dinesh@Dineshs-MacBook-Pro ~ % 
  2. Scale down the CVO to generate the alert.
dinesh@Dineshs-MacBook-Pro ~ % oc get pods -n openshift-cluster-version
NAME                                       READY   STATUS    RESTARTS   AGE
cluster-version-operator-cf9ffc987-mkxqr   1/1     Running   0          45m
dinesh@Dineshs-MacBook-Pro ~ % 


dinesh@Dineshs-MacBook-Pro ~ % oc scale deployment cluster-version-operator --replicas=0 -n openshift-cluster-version
deployment.apps/cluster-version-operator scaled
dinesh@Dineshs-MacBook-Pro ~ % 

dinesh@Dineshs-MacBook-Pro ~ % oc get pods -n openshift-cluster-version                                              
No resources found in openshift-cluster-version namespace.
dinesh@Dineshs-MacBook-Pro ~ %
  3. Verify that the alert is generated and the runbook URL is available.
dinesh@Dineshs-MacBook-Pro ~ % token=$(oc -n openshift-monitoring create token prometheus-k8s)   
dinesh@Dineshs-MacBook-Pro ~ % route=`oc get route prometheus-k8s -n openshift-monitoring -ojsonpath='{.status.ingress[].host}'`  
dinesh@Dineshs-MacBook-Pro ~ % curl -s -k -H "Authorization: Bearer $token" https://$route/api/v1/alerts | jq -r '.data.alerts[]| select(.labels.alertname == "ClusterVersionOperatorDown")'   
{
  "labels": {
    "alertname": "ClusterVersionOperatorDown",
    "namespace": "openshift-cluster-version",
    "severity": "critical"
  },
  "annotations": {
    "description": "The operator may be down or disabled. The cluster will not be kept up to date and upgrades will not be possible. Inspect the openshift-cluster-version namespace for events or changes to the cluster-version-operator deployment or pods to diagnose and repair.  For more information refer to https://console-openshift-console.apps.ci-ln-44d537b-72292.gcp-2.ci.openshift.org/k8s/cluster/projects/openshift-cluster-version.",
    "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-version-operator/ClusterVersionOperatorDown.md",
    "summary": "Cluster version operator has disappeared from Prometheus target discovery."
  },
  "state": "firing",
  "activeAt": "2025-11-03T17:36:03.011708277Z",
  "value": "1e+00"
}
dinesh@Dineshs-MacBook-Pro ~ % 
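
The manual check above covers a single alert. As a rough sketch only (not part of this PR or of any OpenShift tooling), the same response from /api/v1/alerts could be scanned for every critical alert that lacks a runbook_url annotation, which is what OCPBUGS-14246 asks for. The function name critical_alerts_missing_runbook is hypothetical, and the second alert in the sample payload, ExampleCriticalAlert, is invented purely to illustrate a failing case. Note that this endpoint only lists active alerts, so the sketch would not see rules that are not currently pending or firing.

```python
import json


def critical_alerts_missing_runbook(alerts_response):
    """Return sorted names of critical alerts without a non-empty runbook_url.

    `alerts_response` is the decoded JSON body of Prometheus' /api/v1/alerts.
    """
    missing = set()
    for alert in alerts_response.get("data", {}).get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        if labels.get("severity") != "critical":
            continue  # only critical alerts are in scope
        if not annotations.get("runbook_url", "").strip():
            missing.add(labels.get("alertname", "<unnamed>"))
    return sorted(missing)


# Sample payload: the first alert is trimmed from the output above;
# the second one is HYPOTHETICAL and exists only to show a failure.
sample = json.loads("""
{
  "data": {
    "alerts": [
      {
        "labels": {
          "alertname": "ClusterVersionOperatorDown",
          "namespace": "openshift-cluster-version",
          "severity": "critical"
        },
        "annotations": {
          "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/cluster-version-operator/ClusterVersionOperatorDown.md"
        },
        "state": "firing"
      },
      {
        "labels": {"alertname": "ExampleCriticalAlert", "severity": "critical"},
        "annotations": {"summary": "Hypothetical alert with no runbook."},
        "state": "firing"
      }
    ]
  }
}
""")

print(critical_alerts_missing_runbook(sample))
```

To use it against a live cluster, the curl output from step 3 could be saved to a file and loaded with json.load in place of the inline sample.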

@dis016

dis016 commented Nov 3, 2025

/verified by @dis016

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 3, 2025
@openshift-ci-robot
Contributor

@dis016: This PR has been marked as verified by @dis016.

In response to this:

/verified by @dis016

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Nov 3, 2025

@ssierrab: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit ac84d90 into openshift:main Nov 3, 2025
16 checks passed
@openshift-ci-robot
Contributor

@ssierrab: Jira Issue OCPBUGS-14246: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-14246 has been moved to the MODIFIED state.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

DavidHurta added a commit to DavidHurta/origin that referenced this pull request Nov 4, 2025
The missing ClusterOperatorDown runbook URL was addressed in [1].
The missing ClusterVersionOperatorDown runbook URL was addressed in [2].

[1]: openshift/cluster-version-operator#1036
[2]: openshift/cluster-version-operator#1250
@openshift-merge-robot
Contributor

Fix included in accepted release 4.21.0-0.nightly-2025-11-05-234508

@ssierrab ssierrab deleted the cvo_down_runbook branch November 7, 2025 16:04
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. verified Signifies that the PR passed pre-merge verification criteria

6 participants