@wking
Member

@wking wking commented Oct 1, 2020

We're seeing errors like:

INFO: Unexpected error listing nodes: Get "https://api.ci-op-t47nsmsc-99b10.origin-ci-int-gce.dev.openshift.com:6443/api/v1/nodes?fieldSelector=spec.unschedulable%3Dfalse&resourceVersion=0": dial tcp 35.231.18.254:6443: i/o timeout

in CI jobs running on build02, and are wondering if these are related to the host cluster. Move some highly-impacted jobs over to build01 to see if that impacts the error rate.

Generated with:

$ sed -i 's/cluster: build02/cluster: build01/' ci-operator/jobs/openshift/ovn-kubernetes/openshift-ovn-kubernetes-master-presubmits.yaml
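A substitution like the one above can be previewed before it is applied in place. The following is a minimal sketch under stated assumptions: the YAML content is a made-up stand-in for the real presubmits file (which has many more fields per job), the temporary paths are hypothetical, and `sed -i` without a suffix argument assumes GNU sed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a generated presubmits file; the real file in
# openshift/release is much larger and has more fields per job.
workdir="$(mktemp -d)"
jobs_file="${workdir}/presubmits.yaml"
cat > "${jobs_file}" <<'EOF'
presubmits:
  openshift/ovn-kubernetes:
  - cluster: build02
    name: pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn
  - cluster: build02
    name: pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn
EOF

# Count how many jobs are pinned to build02 before touching anything.
before="$(grep -c 'cluster: build02' "${jobs_file}")"

# Preview the change as a diff instead of editing in place (no -i).
# diff exits 1 when the inputs differ, so do not let that abort the script.
sed 's/cluster: build02/cluster: build01/' "${jobs_file}" | diff -u "${jobs_file}" - || true

# Apply the same substitution in place (GNU sed syntax assumed).
sed -i 's/cluster: build02/cluster: build01/' "${jobs_file}"

# grep -c prints 0 but exits 1 when nothing matches, hence the "|| true".
after="$(grep -c 'cluster: build01' "${jobs_file}")"
remaining="$(grep -c 'cluster: build02' "${jobs_file}" || true)"
echo "moved ${after} of ${before} jobs; ${remaining} still on build02"
```

Comparing the before and after counts is a cheap guard against the pattern matching more or fewer lines than intended.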

@trozet
Contributor

trozet commented Oct 1, 2020

e2e-gcp-ovn is the job where we see the i/o timeout the most. I would say 80-90% of the time. So focus on that one.
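One rough way to put a number on a failure rate like this is to count how many downloaded build logs contain the timeout signature. This is only a sketch: the function names `has_io_timeout` and `io_timeout_rate` are hypothetical, the two sample logs are made up, and it assumes each run's `build-log.txt` has already been saved locally.

```shell
#!/usr/bin/env bash
set -euo pipefail

# has_io_timeout LOG: succeeds if the log contains the dial-timeout signature.
has_io_timeout() {
  grep -q 'dial tcp .*: i/o timeout' "$1"
}

# io_timeout_rate LOG...: prints "<hits>/<total>" for the given build logs.
io_timeout_rate() {
  local hits=0 total=0 log
  for log in "$@"; do
    total=$((total + 1))
    if has_io_timeout "${log}"; then
      hits=$((hits + 1))
    fi
  done
  echo "${hits}/${total}"
}

# Example with two made-up logs: one with the timeout error, one clean.
logs="$(mktemp -d)"
printf '%s\n' 'INFO: Unexpected error listing nodes: Get "https://api.example.invalid:6443/api/v1/nodes": dial tcp 35.231.18.254:6443: i/o timeout' > "${logs}/run1.txt"
printf '%s\n' 'INFO: all nodes ready' > "${logs}/run2.txt"
io_timeout_rate "${logs}"/run*.txt
```

Running the same count over logs from build01 and build02 runs would give two ratios to compare, though a small sample will be noisy.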

…build01

We're seeing errors like [1]:

  INFO: Unexpected error listing nodes: Get "https://api.ci-op-t47nsmsc-99b10.origin-ci-int-gce.dev.openshift.com:6443/api/v1/nodes?fieldSelector=spec.unschedulable%3Dfalse&resourceVersion=0": dial tcp 35.231.18.254:6443: i/o timeout

in CI jobs running on build02, and are wondering if these are related
to the host cluster.  Move some highly-impacted jobs over to build01
to see if that impacts the error rate.

Generated by editing core-services/sanitize-prow-jobs/_config.yaml and
running 'make jobs'.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/297/pull-ci-openshift-ovn-kubernetes-master-e2e-gcp-ovn/1311628791914696704#1:build-log.txt%3A19

@wking force-pushed the ovn-kube-build01 branch from b2271d3 to 2f517e1 on October 1, 2020 23:23

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wking
To complete the pull request process, please assign bbguimaraes, dcbw after the PR has been reviewed.
You can assign the PR to them by writing /assign @bbguimaraes @dcbw in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking
Member Author

wking commented Oct 1, 2020

From the first round of rehearsals, gcp-ovn passed. The only failures were ovn-hybrid-step-registry, operator-with-custom-vxlan-port, and openstack, which all died on install. I dunno how reliable those tests are outside of the hypothetical build02 connectivity issue.

I've pushed b2271d3986 -> 2f517e1, pivoting from sed to core-services/sanitize-prow-jobs/_config.yaml so we could land this. As a bonus, that kicks off a fresh round of rehearsals.

@trozet
Contributor

trozet commented Oct 2, 2020

@wking the ovn-hybrid-step-registry job is AWS and usually stable. openstack rarely passes, and e2e-operator-with-custom-vxlan-port takes a few tries. I wouldn't be concerned about those last 2.

@trozet
Contributor

trozet commented Oct 2, 2020

The e2e-aws-ovn failure is odd though. I've never seen this before:
error: stat /tmp/admin.kubeconfig: no such file or directory

https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/12386/rehearse-12386-pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn/1311809513090715648

@trozet
Contributor

trozet commented Oct 2, 2020

/test e2e-gcp-ovn

@trozet
Contributor

trozet commented Oct 2, 2020

/test e2e-aws-ovn

@openshift-ci-robot
Contributor

@trozet: The specified target(s) for /test were not found.
The following commands are available to trigger jobs:

  • /test app-ci-config-dry
  • /test build01-dry
  • /test build02-dry
  • /test ci-operator-config
  • /test ci-operator-config-metadata
  • /test ci-operator-registry
  • /test ci-testgrid-allow-list
  • /test config
  • /test core-dry
  • /test core-valid
  • /test correctly-sharded-config
  • /test generated-config
  • /test generated-dashboards
  • /test ordered-prow-config
  • /test owners
  • /test pj-rehearse
  • /test prow-config
  • /test prow-config-filenames
  • /test prow-config-semantics
  • /test release-controller-config
  • /test services-dry
  • /test services-valid
  • /test step-registry-metadata
  • /test step-registry-shellcheck
  • /test vsphere-dry
  • /test pylint
  • /test yamllint

Use /test all to run the following jobs:

  • pull-ci-openshift-release-master-app-ci-config-dry
  • pull-ci-openshift-release-master-build01-dry
  • pull-ci-openshift-release-master-build02-dry
  • pull-ci-openshift-release-master-ci-operator-config
  • pull-ci-openshift-release-master-ci-operator-config-metadata
  • pull-ci-openshift-release-master-ci-operator-registry
  • pull-ci-openshift-release-master-config
  • pull-ci-openshift-release-master-core-dry
  • pull-ci-openshift-release-master-core-valid
  • pull-ci-openshift-release-master-correctly-sharded-config
  • pull-ci-openshift-release-master-generated-config
  • pull-ci-openshift-release-master-generated-dashboards
  • pull-ci-openshift-release-master-ordered-prow-config
  • pull-ci-openshift-release-master-owners
  • pull-ci-openshift-release-master-pj-rehearse
  • pull-ci-openshift-release-master-prow-config
  • pull-ci-openshift-release-master-prow-config-filenames
  • pull-ci-openshift-release-master-prow-config-semantics
  • pull-ci-openshift-release-master-release-controller-config
  • pull-ci-openshift-release-master-services-dry
  • pull-ci-openshift-release-master-services-valid
  • pull-ci-openshift-release-master-step-registry-metadata
  • pull-ci-openshift-release-master-step-registry-shellcheck
  • pull-ci-openshift-release-master-vsphere-dry
  • pull-ci-openshift-release-yamllint

In response to this:

/test e2e-gcp-ovn

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Contributor

@trozet: The specified target(s) for /test were not found.

In response to this:

/test e2e-aws-ovn

@trozet
Contributor

trozet commented Oct 2, 2020

/retest

@openshift-ci-robot
Contributor

openshift-ci-robot commented Oct 2, 2020

@wking: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-aws-ovn | 2f517e1 | link | /test pj-rehearse |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-openstack | 2f517e1 | link | /test pj-rehearse |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-operator-with-custom-vxlan-port | 2f517e1 | link | /test pj-rehearse |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-gcp-ovn | 2f517e1 | link | /test pj-rehearse |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-azure-ovn | 2f517e1 | link | /test pj-rehearse |
| ci/rehearse/openshift/ovn-kubernetes/master/e2e-gcp-ovn-upgrade | 2f517e1 | link | /test pj-rehearse |

Full PR test history. Your PR dashboard.

@trozet
Contributor

trozet commented Oct 2, 2020

/test pj-rehearse

@dcbw
Contributor

dcbw commented Oct 2, 2020

/retest

@wking
Member Author

wking commented Oct 2, 2020

Covered by #12391.

/close

@openshift-ci-robot
Contributor

@wking: Closed this PR.

In response to this:

Covered by #12391.

/close

@wking deleted the ovn-kube-build01 branch on October 2, 2020 21:57