OCPBUGS-66145: fix(e2e): Ensure release rollout in NodePool tests #7284

Draft
devguyio wants to merge 1 commit into openshift:main from devguyio:node-pool-tests-stability

Conversation

@devguyio
Contributor

@devguyio devguyio commented Nov 25, 2025

What this PR does / why we need it:

Some NodePool e2e tests modify a NodePool's configuration in Run(), which triggers rollouts that might affect CVO conditions.

These rollouts can cause failures in the e2e framework's after() validations, which expect that a HostedCluster is not progressing, such as this failure.

This PR waits for the HostedCluster to stabilize after running each NodePool test.

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

@openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) on Nov 25, 2025
@openshift-ci-robot

@devguyio: This pull request explicitly references no jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

After running all NodePool test cases for a HostedCluster, the test re-fetches the NodePool and, if replicas indicate workers exist, waits up to 30 minutes for HostedCluster conditions to stabilize by calling e2eutil.ValidateHostedClusterConditions(...) before continuing with NodePool condition validation and final assertions.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Test synchronization & post-test validation: test/e2e/nodepool_test.go | After nodePoolTest.Run(), the code re-fetches the NodePool, determines hasWorkers from nodePool.Spec.Replicas, and when true calls e2eutil.ValidateHostedClusterConditions(..., true, 30*time.Minute) to wait for HostedCluster stabilization; it then proceeds to the existing validateNodePoolConditions and remaining assertions. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify error handling when re-fetching the NodePool.
  • Confirm logic using nodePool.Spec.Replicas correctly indicates presence of workers.
  • Review CI/runtime implications of the 30-minute stabilization timeout.

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Nov 25, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 25, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@devguyio
Contributor Author

/test e2e-aws

@openshift-ci bot added the area/testing label (Indicates the PR includes changes for e2e testing) on Nov 25, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 25, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: devguyio
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@devguyio
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 34cc9c7 and 40979d8.

📒 Files selected for processing (1)
  • test/e2e/nodepool_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/nodepool_test.go
🧬 Code graph analysis (1)
test/e2e/nodepool_test.go (1)
test/e2e/util/util.go (1)
  • WaitForImageRollout (569-610)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Red Hat Konflux / hypershift-cli-mce-211-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-release-mce-211-on-pull-request

@devguyio force-pushed the node-pool-tests-stability branch from 40979d8 to e7209d7 on November 25, 2025 13:28
@devguyio
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/e2e/nodepool_test.go (1)

372-375: Consider deriving hasWorkerNodes from post‑test state instead of hard‑coding true

ValidateHostedClusterConditions treats hasWorkerNodes as a semantic switch: when false, it expects a different ClusterVersion condition set for clusters without workers (see the helper in test/e2e/util/util.go). Hard‑coding true here assumes every NodePool test always leaves the HostedCluster with workers; if any current or future test intentionally ends with zero workers, this wait can sit for the full 30‑minute timeout.

If that assumption isn’t strictly guaranteed, consider:

  • Computing hasWorkerNodes from the current cluster state after Run (e.g., listing worker nodes again), or
  • Threading an explicit “expects workers at end” flag from each NodePoolTest, or
  • At minimum, documenting in a comment that all NodePool tests must restore worker nodes before returning so this call remains valid.
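
As one illustration of the second option above, an explicit flag could be threaded through an optional interface so that existing tests keep the current behavior. This is only a sketch under stated assumptions: nodePoolTest is a trimmed-down stand-in for the real test interface, and workerExpectation, ExpectsWorkersAtEnd, upgradeTest, and scaleToZeroTest are hypothetical names that do not exist in the repository.

```go
package main

// nodePoolTest is a hypothetical, trimmed-down stand-in for the e2e test
// interface; the real interface has more methods.
type nodePoolTest interface {
	Run()
}

// workerExpectation is a hypothetical optional interface. A test implements
// it only when it intentionally ends with zero worker nodes.
type workerExpectation interface {
	ExpectsWorkersAtEnd() bool
}

// expectsWorkersAtEnd returns the test's declared expectation, defaulting to
// true so tests that do not implement the interface keep today's behavior.
func expectsWorkersAtEnd(t nodePoolTest) bool {
	if w, ok := t.(workerExpectation); ok {
		return w.ExpectsWorkersAtEnd()
	}
	return true
}

// upgradeTest is a hypothetical test that leaves workers in place.
type upgradeTest struct{}

func (upgradeTest) Run() {}

// scaleToZeroTest is a hypothetical test that ends with no workers.
type scaleToZeroTest struct{}

func (scaleToZeroTest) Run() {}

func (scaleToZeroTest) ExpectsWorkersAtEnd() bool { return false }
```

The result of expectsWorkersAtEnd could then be passed as the hasWorkerNodes argument instead of a hard-coded true. The trade-off is that each test must declare its end state accurately, whereas deriving the value from the cluster needs no per-test bookkeeping.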
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 40979d8 and e7209d7.

📒 Files selected for processing (1)
  • test/e2e/nodepool_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/nodepool_test.go
🧬 Code graph analysis (1)
test/e2e/nodepool_test.go (1)
test/e2e/util/util.go (1)
  • ValidateHostedClusterConditions (2874-2903)

@devguyio force-pushed the node-pool-tests-stability branch from e7209d7 to fba0b87 on November 25, 2025 13:57
@devguyio
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between e7209d7 and fba0b87.

📒 Files selected for processing (1)
  • test/e2e/nodepool_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/nodepool_test.go
🧬 Code graph analysis (1)
test/e2e/nodepool_test.go (1)
test/e2e/util/util.go (1)
  • ValidateHostedClusterConditions (2874-2903)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Red Hat Konflux / hypershift-cli-mce-211-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-release-mce-211-on-pull-request

@devguyio force-pushed the node-pool-tests-stability branch from fba0b87 to bb7159d on November 25, 2025 14:11
@devguyio
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@devguyio
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@devguyio
Contributor Author

/test verify

@devguyio
Contributor Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Full review triggered.

@devguyio
Contributor Author

@coderabbitai summary

@coderabbitai
Contributor

coderabbitai bot commented Nov 25, 2025

✅ Actions performed

Summary regeneration triggered.

@devguyio
Contributor Author

/test e2e-aws

@devguyio force-pushed the node-pool-tests-stability branch from bb7159d to a820809 on November 26, 2025 13:12
Some NodePool e2e tests modify a NodePool's configuration in Run(),
which triggers rollouts that might affect CVO conditions.

This might cause failures in the after() validations of the e2e
framework, which expect that a HostedCluster is not progressing.

Wait for the HostedCluster to stabilize after running each NodePool test.

Assisted-by: Claude (via Claude Code)
Signed-off-by: Ahmed Abdalla <aabdelre@redhat.com>
@devguyio force-pushed the node-pool-tests-stability branch from a820809 to 2541cc8 on November 26, 2025 13:13
@devguyio
Contributor Author

/test e2e-aws

@openshift-merge-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on Nov 29, 2025
@openshift-merge-robot
Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@csrwng changed the title from "NO-JIRA: fix(e2e): Ensure release rollout in NodePool tests" to "OCPBUGS-66145: fix(e2e): Ensure release rollout in NodePool tests" on Dec 15, 2025
@openshift-ci-robot added the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) on Dec 15, 2025
@openshift-ci-robot

@devguyio: This pull request references Jira Issue OCPBUGS-66145, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@csrwng
Contributor

csrwng commented Dec 15, 2025

/jira refresh

@openshift-ci-robot added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) on Dec 15, 2025
@openshift-ci-robot

@csrwng: This pull request references Jira Issue OCPBUGS-66145, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (yli2@redhat.com), skipping review request.

@openshift-ci
Contributor

openshift-ci bot commented Dec 15, 2025

@devguyio: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws | 2541cc8 | link | true | /test e2e-aws |
| ci/prow/e2e-aws-4-21 | 2541cc8 | link | true | /test e2e-aws-4-21 |
| ci/prow/e2e-aks-4-21 | 2541cc8 | link | true | /test e2e-aks-4-21 |

Full PR test history. Your PR dashboard.

Labels

  • area/testing: Indicates the PR includes changes for e2e testing
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
