@sallyom
Contributor

@sallyom sallyom commented Apr 14, 2020

Adds an e2e test that deletes each clusteroperator's operand namespace and verifies that the cluster recovers,
that is, all COs report Available=True, Degraded=False, Progressing=False.
The exception is "kube-apiserver", which is deemed healthy with Progressing=True so that the test meets the e2e time limit.
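
The recovery rule above can be sketched as a small predicate. This is only an illustration of the stated condition, not the code in this PR: the `Condition` type and the `isHealthy` helper are hypothetical stand-ins, and the real test reads ClusterOperator status conditions from the OpenShift config API.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a ClusterOperator
// status condition; the real type lives in the OpenShift config API.
type Condition struct {
	Type   string // "Available", "Degraded", or "Progressing"
	Status string // "True" or "False"
}

// isHealthy reports whether an operator counts as recovered under the rule
// in the description: Available=True, Degraded=False, Progressing=False,
// with Progressing=True tolerated only for "kube-apiserver".
func isHealthy(name string, conds []Condition) bool {
	status := map[string]string{}
	for _, c := range conds {
		status[c.Type] = c.Status
	}
	if status["Available"] != "True" || status["Degraded"] != "False" {
		return false
	}
	if status["Progressing"] == "False" {
		return true
	}
	// Assumption: the exception is keyed on the operator name alone.
	return name == "kube-apiserver" && status["Progressing"] == "True"
}

func main() {
	rolling := []Condition{
		{Type: "Available", Status: "True"},
		{Type: "Degraded", Status: "False"},
		{Type: "Progressing", Status: "True"},
	}
	fmt.Println(isHealthy("kube-apiserver", rolling)) // exception applies
	fmt.Println(isHealthy("console", rolling))        // still progressing
}
```

A missing condition is treated as unhealthy here, which seems the safer default for a recovery check.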

This test deletes operand namespaces one at a time, waiting for each to terminate before deleting the next. Recovery of the affected operators happens in parallel. The test is meant to check that individual cluster operators recover from deletion of their operand namespace, rather than that the cluster recovers from deletion of all operand namespaces at once.
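
A minimal sketch of the delete-then-wait ordering described above, assuming a hypothetical `namespaceClient` interface and an in-memory fake in place of a real Kubernetes client. The namespace names are illustrative, and a real poll loop would sleep between checks.

```go
package main

import (
	"errors"
	"fmt"
)

// namespaceClient is a hypothetical interface, not the real client-go API.
type namespaceClient interface {
	Delete(name string) error
	Exists(name string) (bool, error)
}

// deleteSequentially deletes each namespace and polls until it is gone
// before moving on. It does not wait for operator recovery in between,
// so recovery can overlap with later deletions.
func deleteSequentially(c namespaceClient, namespaces []string, maxPolls int) error {
	for _, ns := range namespaces {
		if err := c.Delete(ns); err != nil {
			return fmt.Errorf("deleting %s: %w", ns, err)
		}
		gone := false
		for i := 0; i < maxPolls && !gone; i++ {
			exists, err := c.Exists(ns)
			if err != nil {
				return fmt.Errorf("checking %s: %w", ns, err)
			}
			gone = !exists
		}
		if !gone {
			return errors.New("namespace " + ns + " did not terminate")
		}
	}
	return nil
}

// fakeClient simulates namespaces that need a few polls to terminate.
type fakeClient struct {
	remaining map[string]int // polls left before the namespace disappears
	log       []string
}

func (f *fakeClient) Delete(name string) error {
	f.log = append(f.log, "delete "+name)
	return nil
}

func (f *fakeClient) Exists(name string) (bool, error) {
	if f.remaining[name] > 0 {
		f.remaining[name]--
		return true, nil
	}
	f.log = append(f.log, "gone "+name)
	return false, nil
}

func main() {
	c := &fakeClient{remaining: map[string]int{"openshift-service-ca": 2, "openshift-console": 1}}
	err := deleteSequentially(c, []string{"openshift-service-ca", "openshift-console"}, 5)
	fmt.Println(err, c.log)
}
```

The log kept by the fake makes the ordering visible: each namespace is reported gone before the next delete is issued.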

This test is necessary because deleting an operand namespace can affect the health of other
clusteroperators. For example, deleting the openshift-service-ca namespace causes openshift-console
to go Progressing, and then Degraded if service-ca does not recover. The kube-controller-manager CO also affects other COs.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 14, 2020
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch 9 times, most recently from 68816f7 to f448206 Compare April 16, 2020 14:53
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch 2 times, most recently from 17a2448 to 79363b2 Compare April 17, 2020 22:18
@soltysh
Contributor

soltysh commented Apr 23, 2020

/assign

@sallyom sallyom changed the title WIP: add extended clusteroperators delete operand namespace recover test add extended clusteroperators delete operand namespace recover test Apr 23, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 23, 2020
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch 4 times, most recently from 80076a6 to 694fe3a Compare May 4, 2020 15:48
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from 694fe3a to 0fae074 Compare June 4, 2020 20:06
@sallyom
Contributor Author

sallyom commented Jun 18, 2020

/retest

Contributor

@soltysh soltysh left a comment

One small nit and you're good to go.

o.Expect(err).ToNot(o.HaveOccurred(), fmt.Sprintf("Operators never became available: %s\n%s", strings.Join(operatorNames(operators.Items).Difference(operatorNames(healthy)).List(), ", "), buf.String()))
}
})
g.It("when operator-owned operand namespace is deleted [Disruptive]", func() {
Contributor

Tags should be at the top, next to g.Describe. Also add one more unique tag there so that we can select only these tests in the separate test suite we want to add, something like [OperatorRecovery].

Contributor Author

I was following suit with the other test case in this file:

g.It("when operator-owned objects are deleted [Disruptive]", func() {
- I'll move the [Disruptive] tag for both up into the g.Describe, though.
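
To illustrate why moving the tags into g.Describe works: Ginkgo builds a spec's full text from the container text followed by the It text, so a tag on the container appears in the name of every spec beneath it. The sketch below models that with plain strings. The Describe text is hypothetical, and the substring match in `selectByTag` is a simplification of how a suite might select specs, not the actual suite definition.

```go
package main

import (
	"fmt"
	"strings"
)

// fullName mimics how a spec's full text is formed: container text
// followed by the It text, joined with a space.
func fullName(describe, it string) string {
	return describe + " " + it
}

// selectByTag returns the spec names that contain the given tag.
// This substring match is a simplification of suite selection.
func selectByTag(names []string, tag string) []string {
	var out []string
	for _, n := range names {
		if strings.Contains(n, tag) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Hypothetical Describe text; only the It texts and tags come from the thread.
	describe := "[Disruptive][OperatorRecovery] Cluster operators should recover"
	names := []string{
		fullName(describe, "when operator-owned objects are deleted"),
		fullName(describe, "when operator-owned operand namespace is deleted"),
		fullName("Some other suite", "does something unrelated"),
	}
	for _, n := range selectByTag(names, "[OperatorRecovery]") {
		fmt.Println(n)
	}
}
```

With the tags on the container, both recovery specs are selected without repeating the tag in each It text.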

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 22, 2020
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from 0fae074 to 304daed Compare July 6, 2020 15:09
@openshift-ci-robot

@sallyom: The following commands are available to trigger jobs:

  • /test artifacts
  • /test e2e-aws-csi
  • /test e2e-aws-disruptive
  • /test e2e-aws-fips
  • /test e2e-aws-image-registry
  • /test e2e-aws-jenkins
  • /test e2e-aws-multitenant
  • /test e2e-aws-ovn
  • /test e2e-aws-serial
  • /test e2e-azure
  • /test e2e-cmd
  • /test e2e-conformance-k8s
  • /test e2e-gcp
  • /test e2e-gcp-builds
  • /test e2e-gcp-image-ecosystem
  • /test e2e-gcp-upgrade
  • /test e2e-vsphere
  • /test images
  • /test unit
  • /test verify
  • /test extended_gssapi
  • /test extended_ldap_groups
  • /test extended_networking

Use /test all to run the following jobs:

  • pull-ci-openshift-origin-master-e2e-aws-csi
  • pull-ci-openshift-origin-master-e2e-aws-fips
  • pull-ci-openshift-origin-master-e2e-aws-serial
  • pull-ci-openshift-origin-master-e2e-cmd
  • pull-ci-openshift-origin-master-e2e-gcp
  • pull-ci-openshift-origin-master-e2e-gcp-builds
  • pull-ci-openshift-origin-master-e2e-gcp-upgrade
  • pull-ci-openshift-origin-master-images
  • pull-ci-openshift-origin-master-unit
  • pull-ci-openshift-origin-master-verify
Details

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sallyom
Contributor Author

sallyom commented Jul 8, 2020

/retest

@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from a739186 to a35e51c Compare January 27, 2021 23:30
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sallyom, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sallyom
Contributor Author

sallyom commented Jan 27, 2021

/test e2e-aws-disruptive

@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from a35e51c to 54f2df0 Compare January 28, 2021 20:37
@sallyom
Contributor Author

sallyom commented Jan 28, 2021

/test e2e-aws-disruptive

@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from 54f2df0 to 71f24ae Compare January 29, 2021 17:53
@sallyom
Contributor Author

sallyom commented Jan 29, 2021

/test e2e-aws-disruptive

@sallyom
Contributor Author

sallyom commented Jan 29, 2021

These are failing. I think the variable TEST_LIMIT_START_TIME="$(date +%s)" is set for the rehearse job, which means the conformance tests should not fail due to conditions that arise during the disruptive test run.

Failing tests:
[sig-instrumentation] Prometheus when installed on the cluster should have important platform topology metrics [Suite:openshift/conformance/parallel]
[sig-instrumentation] Prometheus when installed on the cluster should provide ingress metrics [Suite:openshift/conformance/parallel]
[sig-instrumentation] Prometheus when installed on the cluster should start and expose a secured proxy and unsecured metrics [Suite:openshift/conformance/parallel]
[sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early] [Suite:openshift/conformance/parallel]

@sallyom
Contributor Author

sallyom commented Feb 2, 2021

/test e2e-aws-disruptive

@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from 71f24ae to 9706bb9 Compare February 2, 2021 16:59
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch 2 times, most recently from 846548c to 7fa1a1a Compare February 16, 2021 22:45
@sallyom
Contributor Author

sallyom commented Feb 17, 2021

/test e2e-gcp-disruptive
/test e2e-aws-disruptive

@sallyom
Contributor Author

sallyom commented Feb 17, 2021

This e2e will be added to the disruptive suite.
#25774 fixes the etcd quorum restore disruptive test, so rather than adding a commit to skip that test, this PR can wait for #25774 to merge.

@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from 7fa1a1a to 7f7bc40 Compare March 4, 2021 21:53
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 3, 2021
sallyom added 2 commits July 14, 2021 15:02
e2e to delete clusteroperators' operand ns, ensure cluster recovery,
that is, all COs are Available=True Degraded=False Progressing=False,
except "kube-apiserver", deemed healthy with Progressing=True to meet
e2e time limit.

This test is necessary because deletion of an operand namespace has an effect on another
clusteroperator's health.  For example, deleting openshift-service-ca namespace causes openshift-console
to progress, degrade (if service-ca not recovered).  kube-controller-manager co also affects other COs.
@sallyom sallyom force-pushed the e2e-degrade-operators-check branch from 7f7bc40 to 66b4b16 Compare July 14, 2021 19:08
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2021
@openshift-ci
Contributor

openshift-ci bot commented Jul 14, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sallyom, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci bot commented Jul 14, 2021

@sallyom: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-disruptive | 7f7bc40 | link | /test e2e-aws-disruptive |
| ci/prow/e2e-gcp-disruptive | 7f7bc40 | link | /test e2e-gcp-disruptive |
| ci/prow/e2e-aws-jenkins | 7f7bc40 | link | /test e2e-aws-jenkins |
| ci/prow/e2e-agnostic-cmd | 66b4b16 | link | /test e2e-agnostic-cmd |
| ci/prow/e2e-gcp-csi | 66b4b16 | link | /test e2e-gcp-csi |
| ci/prow/e2e-gcp-upgrade | 66b4b16 | link | /test e2e-gcp-upgrade |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 66b4b16 | link | /test e2e-metal-ipi-ovn-ipv6 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 14, 2021
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci
Contributor

openshift-ci bot commented Sep 13, 2021

@openshift-bot: Closed this PR.

Details

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot closed this Sep 13, 2021