ci-operator/jobs/openshift/ovn-kubernetes: Move master presubmits to build01 #12386

wking · 2020-10-01T20:35:57Z

We're seeing errors like:

INFO: Unexpected error listing nodes: Get "https://api.ci-op-t47nsmsc-99b10.origin-ci-int-gce.dev.openshift.com:6443/api/v1/nodes?fieldSelector=spec.unschedulable%3Dfalse&resourceVersion=0": dial tcp 35.231.18.254:6443: i/o timeout

in CI jobs running on build02, and are wondering if these are related to the host cluster. Move some highly-impacted jobs over to build01 to see if that impacts the error rate.

Generated with:

$ sed -i 's/cluster: build02/cluster: build01/' ci-operator/jobs/openshift/ovn-kubernetes/openshift-ovn-kubernetes-master-presubmits.yaml
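A quick way to sanity-check that substitution before touching the real file is to run it against a scratch copy. The sketch below is only an illustration: the temporary file and its simplified YAML layout are hypothetical stand-ins, not the actual contents of openshift-ovn-kubernetes-master-presubmits.yaml, and the `-i` form assumes GNU sed.

```shell
# Hypothetical scratch file standing in for the real presubmits YAML.
workdir="$(mktemp -d)"
jobs="$workdir/presubmits.yaml"
cat > "$jobs" <<'EOF'
presubmits:
  openshift/ovn-kubernetes:
  - name: pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn
    cluster: build02
  - name: pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn
    cluster: build02
EOF

# Same substitution as above, applied to the scratch copy (GNU sed syntax).
sed -i 's/cluster: build02/cluster: build01/' "$jobs"

# Count how many jobs point at each cluster after the edit.
# grep -c exits non-zero when the count is 0, hence the "|| true".
moved="$(grep -c 'cluster: build01' "$jobs")"
left="$(grep -c 'cluster: build02' "$jobs" || true)"
echo "moved=$moved left=$left"
```

If `left` is non-zero on the real file, some job stanza spells its cluster differently and the pattern missed it.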

trozet · 2020-10-01T21:53:41Z

e2e-gcp-ovn is the job where we see the i/o timeout the most. I would say 80-90% of the time. So focus on that one.

…build01

We're seeing errors like [1]:

INFO: Unexpected error listing nodes: Get "https://api.ci-op-t47nsmsc-99b10.origin-ci-int-gce.dev.openshift.com:6443/api/v1/nodes?fieldSelector=spec.unschedulable%3Dfalse&resourceVersion=0": dial tcp 35.231.18.254:6443: i/o timeout

in CI jobs running on build02, and are wondering if these are related to the host cluster. Move some highly-impacted jobs over to build01 to see if that impacts the error rate.

Generated by editing core-services/sanitize-prow-jobs/_config.yaml and running 'make jobs'.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/297/pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn/1311628791914696704#1:build-log.txt%3A19

openshift-ci-robot · 2020-10-01T23:24:26Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wking
To complete the pull request process, please assign bbguimaraes, dcbw after the PR has been reviewed.
You can assign the PR to them by writing /assign @bbguimaraes @dcbw in a comment when ready.

wking · 2020-10-01T23:27:13Z

From the first round of rehearsals, gcp-ovn passed. The only failures were ovn-hybrid-step-registry, operator-with-custom-vxlan-port, and openstack, which all died on install. I dunno how reliable those tests are outside of the hypothetical build02 connectivity issue.

I've pushed b2271d3986 -> 2f517e1, pivoting from sed to core-services/sanitize-prow-jobs/_config.yaml so we could land this. As a bonus, that kicks off a fresh round of rehearsals.

trozet · 2020-10-02T01:51:51Z

@wking the ovn-hybrid-step-registry job is AWS and usually stable. openstack almost never passes, and e2e-operator-with-custom-vxlan-port takes a few tries. I wouldn't be concerned about those last 2.

trozet · 2020-10-02T01:53:22Z

The latest GCP run "failed" but it actually looks pretty good. No i/o timeout:
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/12386/rehearse-12386-pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn/1311809513501757440

trozet · 2020-10-02T01:54:21Z

The e2e-aws-ovn failure is odd though. I've never seen this before:
error: stat /tmp/admin.kubeconfig: no such file or directory

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/12386/rehearse-12386-pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn/1311809513090715648

trozet · 2020-10-02T01:54:32Z

/test e2e-gcp-ovn

trozet · 2020-10-02T01:54:41Z

/test e2e-aws-ovn

openshift-ci-robot · 2020-10-02T01:54:47Z

@trozet: The specified target(s) for /test were not found.
The following commands are available to trigger jobs:

/test app-ci-config-dry
/test build01-dry
/test build02-dry
/test ci-operator-config
/test ci-operator-config-metadata
/test ci-operator-registry
/test ci-testgrid-allow-list
/test config
/test core-dry
/test core-valid
/test correctly-sharded-config
/test generated-config
/test generated-dashboards
/test ordered-prow-config
/test owners
/test pj-rehearse
/test prow-config
/test prow-config-filenames
/test prow-config-semantics
/test release-controller-config
/test services-dry
/test services-valid
/test step-registry-metadata
/test step-registry-shellcheck
/test vsphere-dry
/test pylint
/test yamllint

Use /test all to run the following jobs:

pull-ci-openshift-release-master-app-ci-config-dry
pull-ci-openshift-release-master-build01-dry
pull-ci-openshift-release-master-build02-dry
pull-ci-openshift-release-master-ci-operator-config
pull-ci-openshift-release-master-ci-operator-config-metadata
pull-ci-openshift-release-master-ci-operator-registry
pull-ci-openshift-release-master-config
pull-ci-openshift-release-master-core-dry
pull-ci-openshift-release-master-core-valid
pull-ci-openshift-release-master-correctly-sharded-config
pull-ci-openshift-release-master-generated-config
pull-ci-openshift-release-master-generated-dashboards
pull-ci-openshift-release-master-ordered-prow-config
pull-ci-openshift-release-master-owners
pull-ci-openshift-release-master-pj-rehearse
pull-ci-openshift-release-master-prow-config
pull-ci-openshift-release-master-prow-config-filenames
pull-ci-openshift-release-master-prow-config-semantics
pull-ci-openshift-release-master-release-controller-config
pull-ci-openshift-release-master-services-dry
pull-ci-openshift-release-master-services-valid
pull-ci-openshift-release-master-step-registry-metadata
pull-ci-openshift-release-master-step-registry-shellcheck
pull-ci-openshift-release-master-vsphere-dry
pull-ci-openshift-release-yamllint

In response to this:

/test e2e-gcp-ovn

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot · 2020-10-02T01:54:55Z

@trozet: The specified target(s) for /test were not found.

In response to this:

/test e2e-aws-ovn

trozet · 2020-10-02T01:56:58Z

/retest

openshift-ci-robot · 2020-10-02T03:04:38Z

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-aws-ovn | `2f517e1` | link | `/test pj-rehearse` |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-openstack | `2f517e1` | link | `/test pj-rehearse` |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-operator-with-custom-vxlan-port | `2f517e1` | link | `/test pj-rehearse` |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-gcp-ovn | `2f517e1` | link | `/test pj-rehearse` |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-azure-ovn | `2f517e1` | link | `/test pj-rehearse` |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-gcp-ovn-upgrade | `2f517e1` | link | `/test pj-rehearse` |

trozet · 2020-10-02T04:13:51Z

/test pj-rehearse

dcbw · 2020-10-02T04:57:40Z

/retest

wking · 2020-10-02T21:56:47Z

Covered by #12391.

/close

openshift-ci-robot · 2020-10-02T21:57:04Z

@wking: Closed this PR.

openshift-ci-robot requested review from JacobTanenbaum and dcbw October 1, 2020 20:36

wking force-pushed the ovn-kube-build01 branch from 02ed594 to b2271d3 Compare October 1, 2020 20:37

wking force-pushed the ovn-kube-build01 branch from b2271d3 to 2f517e1 Compare October 1, 2020 23:23

wking mentioned this pull request Oct 1, 2020

ci-operator/jobs/openshift/release: Move 4.6 periodics to build02 #12383

Closed

tssurya mentioned this pull request Oct 2, 2020

[release-4.5] Bug 1878624: Invalid egressCIDR value causes sdn pods to fail on startup openshift/sdn#187

Merged

stbenjam mentioned this pull request Oct 2, 2020

Mandate e2e-metal-ipi on cluster-network-operator and ovn-kubernetes #12374

Merged

openshift-ci-robot closed this Oct 2, 2020

wking deleted the ovn-kube-build01 branch October 2, 2020 21:57
