
Conversation

@Colvin-Y

fix #1737
In production scenarios, pod objects usually carry quite complex fields. Testing has shown that in our case the descheduler's memory usage was reduced by 2/5.

@k8s-ci-robot k8s-ci-robot requested a review from damemi August 25, 2025 09:16
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign a7i for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from jklaw90 August 25, 2025 09:16
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 25, 2025
@k8s-ci-robot
Contributor

Welcome @Colvin-Y!

It looks like this is your first PR to kubernetes-sigs/descheduler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/descheduler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 25, 2025
@k8s-ci-robot
Contributor

Hi @Colvin-Y. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@googs1025
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 26, 2025

func newDescheduler(ctx context.Context, rs *options.DeschedulerServer, deschedulerPolicy *api.DeschedulerPolicy, evictionPolicyGroupVersion string, eventRecorder events.EventRecorder, sharedInformerFactory, namespacedSharedInformerFactory informers.SharedInformerFactory) (*descheduler, error) {
podInformer := sharedInformerFactory.Core().V1().Pods().Informer()
podInformer.SetTransform(preserveNeeded)
Member

Using SetTransform seems to be very helpful for memory optimization, especially in large clusters. We can delete the unused pod node field. 🤔
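
For reference, such a transform is just a cache.TransformFunc installed before the informer starts: it runs on every object before it is stored in the cache, so anything it clears never occupies cache memory. A minimal sketch of the idea (not the PR's actual preserveNeeded implementation; which fields are safe to clear is an assumption here) could look like:

import (
	v1 "k8s.io/api/core/v1"
)

// stripUnusedPodFields is a hypothetical transform for illustration only.
func stripUnusedPodFields(obj interface{}) (interface{}, error) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		// Tombstones and other object types pass through untouched.
		return obj, nil
	}
	// ManagedFields are only needed for server-side apply bookkeeping.
	pod.ManagedFields = nil
	for i := range pod.Spec.Containers {
		c := &pod.Spec.Containers[i]
		// Assumption for this sketch: no enabled plugin reads these fields.
		c.Command = nil
		c.Args = nil
		c.Env = nil
	}
	return pod, nil
}

It would then be wired up the same way as in the diff above, i.e. podInformer.SetTransform(stripUnusedPodFields) before the informer factory is started.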

Member

/lgtm

cc @ingvagabund

Contributor

should this move here instead?

Author

> should this move here instead?

ditto

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 26, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 29, 2025
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@Colvin-Y Colvin-Y force-pushed the enhancement/memoryusage branch from 3509437 to fb4bbfb on August 29, 2025 06:32
@a7i
Contributor

a7i commented Aug 29, 2025

Nice change @Colvin-Y - do you have any before/after results on memory footprint you can share?

@ingvagabund How does a global change like this impact a plugin? For example, if I write my custom Descheduler plugin which relies on some of the omitted fields, can I override this behavior?

@Colvin-Y
Author

Colvin-Y commented Aug 29, 2025

> Nice change @Colvin-Y - do you have any before/after results on memory footprint you can share?

hi @a7i ,
In our prod env, it saves 47% memory on average.

@Colvin-Y
Author

Colvin-Y commented Aug 29, 2025

> If I write my custom Descheduler plugin which relies on some of the omitted fields, can I override this behavior?

hi @a7i

You can override it before the cache starts. But the best practice is probably to modify func preserveNeeded, since there are many plugins but they all share the same cache.
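
For illustration, a sketch assuming the same setup as newDescheduler above (myPreserveNeeded is a hypothetical replacement for preserveNeeded): SetTransform only succeeds before the informer has started, so any override has to happen at construction time:

podInformer := sharedInformerFactory.Core().V1().Pods().Informer()
if err := podInformer.SetTransform(myPreserveNeeded); err != nil {
	// SetTransform returns an error once the informer has already started.
	return nil, err
}
sharedInformerFactory.Start(ctx.Done())
sharedInformerFactory.WaitForCacheSync(ctx.Done())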

@ingvagabund
Contributor

ingvagabund commented Sep 15, 2025

Each plugin might declare which fields are necessary, i.e. through a new method in a similar fashion to the scheduler's EventsToRegister. E.g. FieldsToPreserve or similar:

func (d *PodLifeTime) FieldsToPreserve() ([]fwk.ObjectFields, error) {
	return []fwk.ObjectFields{
		{Resource: fwk.Pod, Fields: {...}},
		{Resource: fwk.Node, Fields: {...}},
	}, nil
}

This might be tricky to implement as each plugin will need different fields, so a simple clearing function as in this PR will not suffice.

@Colvin-Y
Author

/retest-required

1 similar comment
@ingvagabund
Contributor

/retest-required

@ingvagabund
Contributor

A naive solution would be to enumerate all fields. E.g.:

c.Command = nil
c.Args = nil
c.WorkingDir = ""
c.Ports = nil
...

into

const (
  POD_COMMAND = iota
  POD_ARGS
  POD_WORKING_DIR
  POD_PORTS
  ...
)

The same for nodes. Then, have each plugin define a list of fields to preserve:

func (pl *Plugin) PreservedField() []uint {
  return []uint{POD_COMMAND, POD_ARGS, ...}
}

Then iterate through all plugins, make a union of all PreservedField() lists and use the list to tell which fields can be cleared:

if _, in := preservedFieldsMap[POD_COMMAND]; !in {
  c.Command = nil
}
if _, in := preservedFieldsMap[POD_ARGS]; !in {
  c.Args = nil
}
...
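
Put together, the whole idea could look roughly like this (a sketch only: the FieldPreserver interface, the field constants and buildTransform are hypothetical names, not code from this PR; v1 is assumed to be k8s.io/api/core/v1):

type PodField int

const (
	PodCommand PodField = iota
	PodArgs
	PodEnv
)

// FieldPreserver would be implemented by plugins that need specific pod fields.
type FieldPreserver interface {
	PreservedFields() []PodField
}

// buildTransform unions the fields preserved by all plugins and returns a
// transform that clears only the fields nobody asked for.
func buildTransform(plugins []FieldPreserver) func(pod *v1.Pod) *v1.Pod {
	preserved := map[PodField]bool{}
	for _, pl := range plugins {
		for _, f := range pl.PreservedFields() {
			preserved[f] = true
		}
	}
	return func(pod *v1.Pod) *v1.Pod {
		for i := range pod.Spec.Containers {
			c := &pod.Spec.Containers[i]
			if !preserved[PodCommand] {
				c.Command = nil
			}
			if !preserved[PodArgs] {
				c.Args = nil
			}
			if !preserved[PodEnv] {
				c.Env = nil
			}
		}
		return pod
	}
}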

@Colvin-Y
Author

Colvin-Y commented Nov 5, 2025

> A naive solution would be to enumerate all fields. [...] Then iterate through all plugins, make a union of all PreservedField() lists and use the list to tell which fields can be cleared.

ditto


Labels

cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
ok-to-test - Indicates a non-member PR verified by an org member that is safe to test.
size/M - Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] Optimize memory usage

5 participants