ci-operator/jobs/openshift/cluster-version-operator: Move master presubmits to GCP #10046

wking · 2020-07-02T18:04:23Z

AWS is currently at CI-capacity limits, so move over to GCP to get more elbow room. Generated by adjusting the config/ entry by hand and then running:

$ make update

I've make the names generic e2e and e2e-upgrade, because we don't care (from the CVO side) what platform these run on. Generic names will mean smaller diffs if we pivot the workflow and cluster_profile to a different provider in the future.

I've dropped e2e-aws-upi, because the CVO shouldn't care about that level of provisioning detail. We just need "a cluster". We've had the optional UPI job since ce9361b (#7125), where it was motivated by a desire to test the cluster_installer metric. Now that that metric is well-established, we can drop the option. And we can always regrow it if we need it again later.

LalatenduMohanty · 2020-07-02T18:25:45Z

I've dropped e2e-aws-upi, because the CVO shouldn't care about that level of provisioning detail. We just need "a cluster". We've had the optional UPI job since ce9361b (#7125), where it was motivated by a desire to test the cluster_installer metric. Now that that metric is well-established, we can drop the option. And we can always regrow it if we need it again later.

Can we continue e2e-aws-upi on AWS and move other tests to GCP ? Because I am not sure if we should take Now that that metric is well-established, we can drop the option for granted.

wking · 2020-07-02T20:41:17Z

Because I am not sure if we should take Now that that metric is well-established, we can drop the option for granted.

How often do you use a CVO UPI presubmit to check that metric? I expect anyone concerned about it can use a promotion informer. Or launch a job with cluster-bot. But searching in the CVO repo and popping open the merge-time report on run presubmits, this job seems to have only ever been used in openshift/cluster-version-operator#319 (merged Feb. 12) and openshift/cluster-version-operator#332 (merged Feb. 28). And again, if I'm wrong, it's easy to add it back.

sdodson · 2020-07-06T23:53:15Z

I'm in favor of switching entirely to gcp. I don't have an opinion on whether or not we should keep the optional upi test, i see no jobs in prow history today but that doesn't mean it's without value. I don't think we should change the name to drop the cloud provider, those are uniform across almost all presubmits as far as I'm aware and provides immediate context to anyone looking at job names.

wking · 2020-07-07T14:15:46Z

I'm in favor of switching entirely to gcp.

So for all releases back through 4.1? Can you explain why you want a single platform across all releases?

I don't think we should change the name to drop the cloud provider, those are uniform across almost all presubmits as far as I'm aware and provides immediate context to anyone looking at job names.

I disagree that we need this context, or that if we need it we should get it from the job name. But I'll reroll to punt for this PR and file a follow-up with generic names where we can kick that around.

sdodson · 2020-07-07T14:39:56Z

So for all releases back through 4.1? Can you explain why you want a single platform across all releases?

Sorry, I just meant I'm in favor of the change you're making but I'd also be in favor of switching 4.5 and 4.4 too. The number of clusters we start on AWS is really high and CVO is quite isolated from platform so it seems like we're easy to move. Even over the weekend we had 50 AWS, during the week we're maxed out 12 hours a day.

sdodson · 2020-07-07T14:40:42Z

I disagree that we need this context, or that if we need it we should get it from the job name. But I'll reroll to punt for this PR and file a follow-up with generic names where we can kick that around.

We don't need this context, but it's the norm for all jobs that provision clusters and we should follow the norm.

…er presubmits to GCP AWS is currently at CI-capacity limits, so move over to GCP to get more elbow room. Generated by adjusting the config/ entry by hand and then running: $ make update I'd initially just shifted master, because the bulk of our CI is on master-targeted pulls. But Scott asked for 4.4 and later [1], so that's what I'm doing now. I'd like to move to generic names like "e2e" and "e2e-upgrade", because we don't care (from the CVO side) what platform these run on. Generic names will mean smaller diffs if we pivot the workflow and cluster_profile to a different provider in the future. But Scott is not (yet ;) ready to break with tradition here [2], so punting on that for now. I've dropped "e2e-aws-upi", because the CVO shouldn't care about that level of provisioning detail. We just need "a cluster". We've had the optional UPI job since ce9361b (cluster-version-operator: add aws-upi optional job, 2020-02-11, openshift#7125), where it was motivated [3] by a desire to test the cluster_installer metric [4]. Now that that metric is well-established, we can drop the option. And we can always regrow it if we need it again later. [1]: openshift#10046 (comment) [2]: openshift#10046 (comment) [3]: openshift#7125 (comment) [4]: openshift/cluster-version-operator#319

wking · 2020-07-07T16:17:02Z

Ok, rerolled with 1ea50c8d91 -> 7ee21db to:

Make the same change for 4.4+, as requested here.
Punt on the generic job names, as requested here and here.

openshift-ci-robot · 2020-07-07T20:17:08Z

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name	Commit	Details	Rerun command
ci/rehearse/openshift/cluster-version-operator/master/e2e	1ea50c8d9182968912c17bdd60ded78c2707df24	link	`/test pj-rehearse`
ci/rehearse/openshift/cluster-version-operator/master/e2e-upgrade	1ea50c8d9182968912c17bdd60ded78c2707df24	link	`/test pj-rehearse`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

LalatenduMohanty · 2020-07-08T15:10:56Z

/retest

LalatenduMohanty · 2020-07-08T15:12:27Z

/lgtm

openshift-ci-robot · 2020-07-08T15:12:48Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LalatenduMohanty, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~ci-operator/config/openshift/cluster-version-operator/OWNERS~~ [LalatenduMohanty,wking]
~~ci-operator/jobs/openshift/cluster-version-operator/OWNERS~~ [LalatenduMohanty,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2020-07-08T15:30:48Z

@wking: Updated the following 15 configmaps:

ci-operator-4.4-configs configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.4.yaml using file ci-operator/config/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.4.yaml
job-config-4.4 configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.4-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.4-presubmits.yaml
job-config-4.7 configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.7-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.7-presubmits.yaml
ci-operator-4.6-configs configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.6.yaml using file ci-operator/config/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.6.yaml
job-config-master configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-master-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-master-presubmits.yaml
job-config-4.7 configmap in namespace ci at cluster api.ci using the following files:
- key openshift-cluster-version-operator-release-4.7-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.7-presubmits.yaml
job-config-4.5 configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.5-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.5-presubmits.yaml
job-config-4.6 configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.6-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.6-presubmits.yaml
ci-operator-master-configs configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-master.yaml using file ci-operator/config/openshift/cluster-version-operator/openshift-cluster-version-operator-master.yaml
ci-operator-4.7-configs configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.7.yaml using file ci-operator/config/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.7.yaml
job-config-4.5 configmap in namespace ci at cluster api.ci using the following files:
- key openshift-cluster-version-operator-release-4.5-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.5-presubmits.yaml
job-config-4.6 configmap in namespace ci at cluster api.ci using the following files:
- key openshift-cluster-version-operator-release-4.6-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.6-presubmits.yaml
ci-operator-4.5-configs configmap in namespace ci at cluster app.ci using the following files:
- key openshift-cluster-version-operator-release-4.5.yaml using file ci-operator/config/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.5.yaml
job-config-master configmap in namespace ci at cluster api.ci using the following files:
- key openshift-cluster-version-operator-master-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-master-presubmits.yaml
job-config-4.4 configmap in namespace ci at cluster api.ci using the following files:
- key openshift-cluster-version-operator-release-4.4-presubmits.yaml using file ci-operator/jobs/openshift/cluster-version-operator/openshift-cluster-version-operator-release-4.4-presubmits.yaml

Details

In response to this:

AWS is currently at CI-capacity limits, so move over to GCP to get more elbow room. Generated by adjusting the config/ entry by hand and then running:
$ make update
I've make the names generic e2e and e2e-upgrade, because we don't care (from the CVO side) what platform these run on. Generic names will mean smaller diffs if we pivot the workflow and cluster_profile to a different provider in the future.

I've dropped e2e-aws-upi, because the CVO shouldn't care about that level of provisioning detail. We just need "a cluster". We've had the optional UPI job since ce9361b (#7125), where it was motivated by a desire to test the cluster_installer metric. Now that that metric is well-established, we can drop the option. And we can always regrow it if we need it again later.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

…p -> e2e for 4.4+ In 7ee21db (ci-operator/jobs/openshift/cluster-version-operator: Move 4.4 and later presubmits to GCP, 2020-07-02, openshift#10046), I'd mentioned generic names as a way for component teams to say "we don't care which platform, other folks can pick whatever they want to balance platform volume vs. capacity". Scott pushed back based on lack of precedent [1,2]. And we currently do rely on platforms in job-names for monitored success rates. But these are presubmits, and presubmits are noisy (e.g. if you pull-request some buggy code, your presubmits will fail, but the delivered product is not affected by your in-flight bugs). So in the presubmit case we are not constrained by job-name-based health reporting. And we run a lot of presubmit volume, so even if this approach only works in the presubmit case, it gives us a lot of rebalancing flexibility. Benefits for component teams (the CVO maintainers in the case of this commit) include not having to care about where their jobs get scheduled. And with the job names not changing during rebalances, they don't have to remember to /skip or /refresh or whatever, because Prow would understand that it's the same effective test regardless of the underlying platform. Benefits to rebalancing admins is that they don't have to ask component teams to opt-in at rebalance time. Teams can opt in with this pattern, and rebalancing admins can just search for e2e steps that do not include a platform slug in their 'as' config name. Generated by manually changing ci-operator/config/... and then running: $ make update [1]: openshift#10046 (comment) [2]: openshift#10046 (comment)

openshift-ci-robot requested review from LalatenduMohanty and sdodson July 2, 2020 18:04

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 2, 2020

wking force-pushed the aws-capacity-move-to-gcp branch from 1ea50c8 to 7ee21db Compare July 7, 2020 16:15

openshift-ci-robot assigned LalatenduMohanty Jul 8, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 8, 2020

openshift-merge-robot merged commit ac48d7a into openshift:master Jul 8, 2020

wking deleted the aws-capacity-move-to-gcp branch July 9, 2020 00:08

wking mentioned this pull request Jul 9, 2020

ci-operator/config/openshift/cluster-version-operator: Generic e2e-gcp -> e2e for 4.4+ #10152

Merged

ci-operator/jobs/openshift/cluster-version-operator: Move master presubmits to GCP #10046

ci-operator/jobs/openshift/cluster-version-operator: Move master presubmits to GCP #10046

Uh oh!

Conversation

wking commented Jul 2, 2020

Uh oh!

LalatenduMohanty commented Jul 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wking commented Jul 2, 2020

Uh oh!

sdodson commented Jul 6, 2020

Uh oh!

wking commented Jul 7, 2020

Uh oh!

sdodson commented Jul 7, 2020

Uh oh!

sdodson commented Jul 7, 2020

Uh oh!

wking commented Jul 7, 2020

Uh oh!

openshift-ci-robot commented Jul 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

LalatenduMohanty commented Jul 8, 2020

Uh oh!

LalatenduMohanty commented Jul 8, 2020

Uh oh!

openshift-ci-robot commented Jul 8, 2020

Uh oh!

openshift-ci-robot commented Jul 8, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

LalatenduMohanty commented Jul 2, 2020 •

edited

Loading

openshift-ci-robot commented Jul 7, 2020 •

edited

Loading