Skip to content

Conversation

@vikaschoudhary16
Copy link
Contributor

@vikaschoudhary16 vikaschoudhary16 commented Apr 26, 2019

lib does not remove containers from existing based on Required while merging.
Not sure, if this is by design or it is a bug. This PR is more for a discussion/clarification.

@openshift-ci-robot openshift-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Apr 26, 2019
@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from 1630720 to f698d4f Compare April 26, 2019 10:54
@abhinavdahiya abhinavdahiya added the 4.2 Openshift 4.2 label Apr 29, 2019
@smarterclayton
Copy link
Contributor

This surprises me that we dont do this

@vikaschoudhary16
Copy link
Contributor Author

ping @abhinavdahiya @crawford

@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from f698d4f to b7e52bc Compare June 6, 2019 11:46
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 6, 2019
}
}
}
existing.Containers = requiredContainers
Copy link
Member

@wking wking Jun 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this (and the existing logic) square with the PodSpec docs':

Containers cannot currently be added or removed.

Is there wrapping logic that removes the previous pod if modified and adds a new Pod with the new PodSpec?

Copy link
Contributor Author

@vikaschoudhary16 vikaschoudhary16 Jun 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new pod will get created as part of deployment redeploy. So container will not be deleted from existing pod. Yes, the wrapping logic is deployment rollout.

@vikaschoudhary16 vikaschoudhary16 changed the title lib/resourcemerge: Remove containers from existing based on required Bug 1710172: lib/resourcemerge: Remove containers from existing based on required Jun 7, 2019
@openshift-ci-robot
Copy link
Contributor

@vikaschoudhary16: This pull request references a valid Bugzilla bug.

Details

In response to this:

Bug 1710172: lib/resourcemerge: Remove containers from existing based on required

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Jun 7, 2019
@abhinavdahiya
Copy link
Contributor

abhinavdahiya commented Jun 7, 2019

Can you add a unit test for this new behavior. After that we can merge this

@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from b7e52bc to 8853dd0 Compare June 10, 2019 18:50
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 10, 2019
expected: corev1.PodSpec{},
},
{
name: "remove regular containers from existing but dont remove init containers",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not clear on why we want this. If "modified" means "launch a replacement Pod", then I'd expect we want to set modified and remove entries from InitContainers that weren't in the new required PodSpec. I've floated 8c83b0c with an implementation that you can squash into your branch if that makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was trying to play safe and do minimal changes which are required for the issue that i have been trying to fix.
Updated with changes from your commit. Thanks a lot!

@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from 8853dd0 to 4f4a65a Compare June 11, 2019 04:00
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 11, 2019
@wking
Copy link
Member

wking commented Jun 11, 2019

/approve
/hold

I'll wait for @abhinavdahiya to confirm that we want the InitContainers merging too, but 4f4a65a looks good to me.

@openshift-ci-robot openshift-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 11, 2019

if !equality.Semantic.DeepEqual(test.existing, test.expected) {
t.Errorf("mismatch PodSpec got: %v want: %v", test.existing, test.expected)
t.Errorf("mismatch PodSpec got:\n%s\nwant:\n%s", yamlOrDie(test.existing), yamlOrDie(test.expected))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to using the k8s lib (which we already vendor).

@abhinavdahiya
Copy link
Contributor

/lgtm
/hold cancel

@vikaschoudhary16
Copy link
Contributor Author

@wking @abhinavdahiya Following seems to be the problem:

  "type": "RetrievedUpdates",
        "status": "False",
        "lastTransitionTime": "2019-06-13T14:13:27Z",
        "reason": "RemoteFailed",
        "message": "Unable to retrieve available updates: currently installed version 0.0.1-2019-06-13-135753 not found in the \"stable-4.2\" channel"

Does not seem to be related to the changes. Thoughts?

@vikaschoudhary16
Copy link
Contributor Author

/retest

@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from 4f4a65a to 12b9310 Compare June 21, 2019 10:30
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jun 21, 2019
@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from 12b9310 to 7be5d44 Compare June 21, 2019 13:05
@vikaschoudhary16
Copy link
Contributor Author

/test e2e-aws-upgrade

@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch 3 times, most recently from 8a250cf to 2f055c4 Compare June 22, 2019 04:51
@vikaschoudhary16
Copy link
Contributor Author

/test integration

@vikaschoudhary16
Copy link
Contributor Author

integration job is failing unrelated to the PR changes, #211

}

func yamlOrDie(data interface{}) string {
bytes, err := yaml.Marshal(data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is no longer required. Right?

for j, existingContainer := range *existing {
if existingContainer.Name == requiredContainer.Name {
existingCurr = &(*existing)[j]
ensureContainer(modified, existingCurr, requiredContainer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, Won't these containers already be ensured from previous loop??

Copy link
Contributor Author

@vikaschoudhary16 vikaschoudhary16 Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. But above, it dint work because existingContainer is a local variable and changes to that does not update the slice existing. Will fix, thanks!

@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from 2f055c4 to cfaef5f Compare June 24, 2019 04:17
@vikaschoudhary16 vikaschoudhary16 force-pushed the remove-containers-in-update branch from cfaef5f to 3d1ad76 Compare June 24, 2019 06:33
@abhinavdahiya
Copy link
Contributor

/approve

Working on getting integration tests fixed. so keeping the hold.

@vikaschoudhary16
Copy link
Contributor Author

/test integration

@abhinavdahiya
Copy link
Contributor

/retest

@abhinavdahiya
Copy link
Contributor

/hold cancel

/lgtm

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jun 26, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, vikaschoudhary16, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [abhinavdahiya,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 70c8b8f into openshift:master Jun 26, 2019
wking added a commit to wking/cluster-version-operator that referenced this pull request Dec 17, 2019
Avoid:

  $ go test ./lib/resourcemerge/
  panic: runtime error: index out of range [recovered]
  			 panic: runtime error: index out of range

  goroutine 38 [running]:
  testing.tRunner.func1(0xc0001ab000)
    .../sdk/go1.12.9/src/testing/testing.go:830 +0x392
  panic(0xccb520, 0x163f880)
    .../sdk/go1.12.9/src/runtime/panic.go:522 +0x1b5
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensureContainers(0xc0000bbd57, 0xc0001d4040, 0xc0001cd760, 0x1, 0x1)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:69 +0x840
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensurePodSpec(0xc0001c5d57, 0xc0001d4010, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0001cd760, 0x1, ...)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:28 +0xc6
  github.com/openshift/cluster-version-operator/lib/resourcemerge.TestEnsurePodSpec.func1(0xc0001ab000)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core_test.go:276 +0xc7
  testing.tRunner(0xc0001ab000, 0xc0001d8770)
    .../sdk/go1.12.9/src/testing/testing.go:865 +0xc0
  created by testing.(*T).Run
    .../sdk/go1.12.9/src/testing/testing.go:916 +0x35a
  FAIL		github.com/openshift/cluster-version-operator/lib/resourcemerge	0.010s

(with the core_test.go but the old core.go) when removing an entry
mutated the existing slice without re-entering the:

  for i, whatever := range *existing

With this commit, we iterate from the back of the existing slice, so
any removals affect indexes that we've already covered.  For both
containers and service ports, any appends happen later in the
function, so we don't need to worry about slice expansion at this
point.

The buggy logic was originally from 3d1ad76 (Remove containers if
requested in Update, 2019-04-26, openshift#178).
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/cluster-version-operator that referenced this pull request Feb 4, 2020
Avoid:

  $ go test ./lib/resourcemerge/
  panic: runtime error: index out of range [recovered]
  			 panic: runtime error: index out of range

  goroutine 38 [running]:
  testing.tRunner.func1(0xc0001ab000)
    .../sdk/go1.12.9/src/testing/testing.go:830 +0x392
  panic(0xccb520, 0x163f880)
    .../sdk/go1.12.9/src/runtime/panic.go:522 +0x1b5
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensureContainers(0xc0000bbd57, 0xc0001d4040, 0xc0001cd760, 0x1, 0x1)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:69 +0x840
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensurePodSpec(0xc0001c5d57, 0xc0001d4010, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0001cd760, 0x1, ...)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:28 +0xc6
  github.com/openshift/cluster-version-operator/lib/resourcemerge.TestEnsurePodSpec.func1(0xc0001ab000)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core_test.go:276 +0xc7
  testing.tRunner(0xc0001ab000, 0xc0001d8770)
    .../sdk/go1.12.9/src/testing/testing.go:865 +0xc0
  created by testing.(*T).Run
    .../sdk/go1.12.9/src/testing/testing.go:916 +0x35a
  FAIL		github.com/openshift/cluster-version-operator/lib/resourcemerge	0.010s

(with the core_test.go but the old core.go) when removing an entry
mutated the existing slice without re-entering the:

  for i, whatever := range *existing

With this commit, we iterate from the back of the existing slice, so
any removals affect indexes that we've already covered.  For both
containers and service ports, any appends happen later in the
function, so we don't need to worry about slice expansion at this
point.

The buggy logic was originally from 3d1ad76 (Remove containers if
requested in Update, 2019-04-26, openshift#178).
wking added a commit to wking/cluster-version-operator that referenced this pull request Feb 6, 2020
Avoid:

  $ go test ./lib/resourcemerge/
  panic: runtime error: index out of range [recovered]
  			 panic: runtime error: index out of range

  goroutine 38 [running]:
  testing.tRunner.func1(0xc0001ab000)
    .../sdk/go1.12.9/src/testing/testing.go:830 +0x392
  panic(0xccb520, 0x163f880)
    .../sdk/go1.12.9/src/runtime/panic.go:522 +0x1b5
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensureContainers(0xc0000bbd57, 0xc0001d4040, 0xc0001cd760, 0x1, 0x1)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:69 +0x840
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensurePodSpec(0xc0001c5d57, 0xc0001d4010, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0001cd760, 0x1, ...)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:28 +0xc6
  github.com/openshift/cluster-version-operator/lib/resourcemerge.TestEnsurePodSpec.func1(0xc0001ab000)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core_test.go:276 +0xc7
  testing.tRunner(0xc0001ab000, 0xc0001d8770)
    .../sdk/go1.12.9/src/testing/testing.go:865 +0xc0
  created by testing.(*T).Run
    .../sdk/go1.12.9/src/testing/testing.go:916 +0x35a
  FAIL		github.com/openshift/cluster-version-operator/lib/resourcemerge	0.010s

(with the core_test.go but the old core.go) when removing an entry
mutated the existing slice without re-entering the:

  for i, whatever := range *existing

With this commit, we iterate from the back of the existing slice, so
any removals affect indexes that we've already covered.  For both
containers and service ports, any appends happen later in the
function, so we don't need to worry about slice expansion at this
point.

The buggy logic was originally from 3d1ad76 (Remove containers if
requested in Update, 2019-04-26, openshift#178).

This commit cherry-picks e8e80c2 (lib/resourcemerge/core: Fix panic
on container/port removal, 2019-12-16, openshift#282) back to the 4.2 branch.
But since f268b6d (Add support to EnsureServicePorts, 2019-11-25,
branch, I've only kept the container portion of e8e80c2 in this
commit.
wking added a commit to wking/cluster-version-operator that referenced this pull request Feb 6, 2020
Avoid:

  $ go test ./lib/resourcemerge/
  panic: runtime error: index out of range [recovered]
  			 panic: runtime error: index out of range

  goroutine 38 [running]:
  testing.tRunner.func1(0xc0001ab000)
    .../sdk/go1.12.9/src/testing/testing.go:830 +0x392
  panic(0xccb520, 0x163f880)
    .../sdk/go1.12.9/src/runtime/panic.go:522 +0x1b5
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensureContainers(0xc0000bbd57, 0xc0001d4040, 0xc0001cd760, 0x1, 0x1)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:69 +0x840
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensurePodSpec(0xc0001c5d57, 0xc0001d4010, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0001cd760, 0x1, ...)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:28 +0xc6
  github.com/openshift/cluster-version-operator/lib/resourcemerge.TestEnsurePodSpec.func1(0xc0001ab000)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core_test.go:276 +0xc7
  testing.tRunner(0xc0001ab000, 0xc0001d8770)
    .../sdk/go1.12.9/src/testing/testing.go:865 +0xc0
  created by testing.(*T).Run
    .../sdk/go1.12.9/src/testing/testing.go:916 +0x35a
  FAIL		github.com/openshift/cluster-version-operator/lib/resourcemerge	0.010s

(with the core_test.go but the old core.go) when removing an entry
mutated the existing slice without re-entering the:

  for i, whatever := range *existing

With this commit, we iterate from the back of the existing slice, so
any removals affect indexes that we've already covered.  For both
containers and service ports, any appends happen later in the
function, so we don't need to worry about slice expansion at this
point.

The buggy logic was originally from 3d1ad76 (Remove containers if
requested in Update, 2019-04-26, openshift#178).

This commit cherry-picks e8e80c2 (lib/resourcemerge/core: Fix panic
on container/port removal, 2019-12-16, openshift#282) back to the 4.2 branch.
But since f268b6d (Add support to EnsureServicePorts, 2019-11-25, openshift#272)
and its EnsureServicePorts was never backported to the 4.2 branch,
I've only kept the container portion of e8e80c2 in this commit.
wking added a commit to wking/cluster-version-operator that referenced this pull request Feb 24, 2020
Avoid:

  $ go test ./lib/resourcemerge/
  panic: runtime error: index out of range [recovered]
  			 panic: runtime error: index out of range

  goroutine 38 [running]:
  testing.tRunner.func1(0xc0001ab000)
    .../sdk/go1.12.9/src/testing/testing.go:830 +0x392
  panic(0xccb520, 0x163f880)
    .../sdk/go1.12.9/src/runtime/panic.go:522 +0x1b5
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensureContainers(0xc0000bbd57, 0xc0001d4040, 0xc0001cd760, 0x1, 0x1)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:69 +0x840
  github.com/openshift/cluster-version-operator/lib/resourcemerge.ensurePodSpec(0xc0001c5d57, 0xc0001d4010, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc0001cd760, 0x1, ...)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core.go:28 +0xc6
  github.com/openshift/cluster-version-operator/lib/resourcemerge.TestEnsurePodSpec.func1(0xc0001ab000)
    .../lib/go/src/github.com/openshift/cluster-version-operator/lib/resourcemerge/core_test.go:276 +0xc7
  testing.tRunner(0xc0001ab000, 0xc0001d8770)
    .../sdk/go1.12.9/src/testing/testing.go:865 +0xc0
  created by testing.(*T).Run
    .../sdk/go1.12.9/src/testing/testing.go:916 +0x35a
  FAIL		github.com/openshift/cluster-version-operator/lib/resourcemerge	0.010s

(with the core_test.go but the old core.go) when removing an entry
mutated the existing slice without re-entering the:

  for i, whatever := range *existing

With this commit, we iterate from the back of the existing slice, so
any removals affect indexes that we've already covered.  For both
containers and service ports, any appends happen later in the
function, so we don't need to worry about slice expansion at this
point.

The buggy logic was originally from 3d1ad76 (Remove containers if
requested in Update, 2019-04-26, openshift#178).

This commit cherry-picks e8e80c2 (lib/resourcemerge/core: Fix panic
on container/port removal, 2019-12-16, openshift#282) back to the 4.2 branch.
But since f268b6d (Add support to EnsureServicePorts, 2019-11-25, openshift#272)
and its EnsureServicePorts was never backported to the 4.2 branch,
I've only kept the container portion of e8e80c2 in this commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4.2 Openshift 4.2 approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants