Add support for DRA devices in kueue workloads #5873
Conversation
Hi @alaypatel07. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
force-pushed from b9b6085 to 891d3c8
/ok-to-test
force-pushed from 6821c91 to e54fb3f
```go
}

func (r *DynamicResourceAllocationConfigReconciler) SetupWithManager(mgr ctrl.Manager) error {
	if !features.Enabled(features.DynamicResourceAllocation) {
```
No need to check the feature gate inside the reconciler since the controller installation is guarded by the feature gate already.
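As a minimal sketch of that pattern, assuming setup wiring along these lines (the function name and reconciler construction are illustrative, not the PR's actual code; `features.Enabled` and the reconciler type are from the diff above):

```go
import (
	ctrl "sigs.k8s.io/controller-runtime"

	"sigs.k8s.io/kueue/pkg/features"
)

// setupControllers is a hypothetical wiring function: the feature gate is
// checked once here, where the controller is installed, so SetupWithManager
// and the reconciler body don't need to re-check it.
func setupControllers(mgr ctrl.Manager) error {
	if features.Enabled(features.DynamicResourceAllocation) {
		r := &DynamicResourceAllocationConfigReconciler{ /* dependencies elided */ }
		if err := r.SetupWithManager(mgr); err != nil {
			return err
		}
	}
	return nil
}
```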
force-pushed from 0bbec13 to 6e6e2cf
force-pushed from bc817d9 to 1e4d9bd
force-pushed from bbae069 to bbeac13
force-pushed from 29d9c09 to a8732e2
force-pushed from a8732e2 to badf257
/test pull-kueue-test-e2e-main-1-32
```go
wl.Spec.PodSets[0].Template.Spec.ResourceClaims = []corev1.PodResourceClaim{
	{Name: "device", ResourceClaimName: ptr.To("test-rc-large")},
}
wl.Spec.PodSets[0].Template.Spec.Containers[0].Resources.Claims = []corev1.ResourceClaim{
	{Name: "device"},
}
```
We could replace this `PodSets[0]... =` assignment with a `ResourceClaim()` wrapper. Also in other places.
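For illustration, such a wrapper might look like the sketch below, in the fluent style of Kueue's testing wrappers (a `WorkloadWrapper` embedding the workload matches the existing utiltesting helpers, but this method itself is hypothetical; the corev1 types are the real API):

```go
import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/utils/ptr"
)

// ResourceClaim is a hypothetical fluent helper: it attaches a named pod-level
// resource claim to the given PodSet and references it from every container.
func (w *WorkloadWrapper) ResourceClaim(podSetIdx int, name, claimName string) *WorkloadWrapper {
	spec := &w.Spec.PodSets[podSetIdx].Template.Spec
	spec.ResourceClaims = append(spec.ResourceClaims, corev1.PodResourceClaim{
		Name:              name,
		ResourceClaimName: ptr.To(claimName),
	})
	for i := range spec.Containers {
		spec.Containers[i].Resources.Claims = append(spec.Containers[i].Resources.Claims,
			corev1.ResourceClaim{Name: name})
	}
	return w
}
```

With that, the snippet above would collapse to something like `wl.ResourceClaim(0, "device", "test-rc-large")` at construction time.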
| "DynamicResourceAllocation=true", | ||
| }, | ||
| APIServerRuntimeConfig: []string{ | ||
| "resource.k8s.io/v1beta2=true", |
Folks, can we please start with supporting v1 only? My thinking:
- we already have k8s 1.34 released with the v1 API, so the path for "early adopters" already exists
- v1beta2 is disabled by default in core k8s as of 1.33, so users on most clouds (like GKE, where the control-plane manifest is locked) would not benefit from this, just on-prem. On-prem users are often early adopters who can afford to run 1.34 (see the k8s feature gate enablement history)
- we will need to migrate to v1 eventually, so starting on v1beta2 just delays the process, and may slow down the integration in Kueue if we need new fields (basically adoption of new DRA features); see the sketch after this list
- I have not yet had an explicit user request for the DRA-Kueue integration, which suggests even the early adopters are probably ok to wait a little bit (and if they really need it, we can suggest upgrading clusters to 1.34, which on clouds would have to be our answer anyway).
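To make that direction concrete, here is an illustrative sketch of what the migration amounts to on the client side; the import paths are the real upstream `k8s.io/api` packages, while the function is a placeholder:

```go
import (
	"fmt"

	resourcev1 "k8s.io/api/resource/v1" // GA group, served by default since k8s 1.34
	// before the migration the integration would import:
	// resourcev1beta2 "k8s.io/api/resource/v1beta2" // beta group, disabled by default in core
)

// printDeviceClasses is a placeholder showing that the change is mostly a
// mechanical swap of the API group for the DRA types the integration reads.
func printDeviceClasses(classes []resourcev1.DeviceClass) {
	for _, dc := range classes {
		fmt.Println(dc.Name)
	}
}
```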
I am fine with supporting 1.34 and on for this feature.
This may mess up our k8s support though.
I think we are saying that DRA + Kueue would only be available on 1.34 and on (so this feature would not be valid for 1.31, 1.32 or 1.33).
> I think we are saying that DRA + Kueue would only be available on 1.34 and on (so this feature would not be valid for 1.31, 1.32 or 1.33).

Yes, we can document that. It would not be valid for 1.31 anyway, as 1.31 only has v1beta1.
For 1.32 and 1.33, yes, but note that while DRA is Beta it remains disabled by default, which makes it effectively unusable on most clouds, where the feature gates are locked: https://github.com/kubernetes/kubernetes/blob/69aca29e6def5873779cbd392bd9a1bad124c586/pkg/features/kube_features.go#L1236-L1240
Ah, after syncing with @alaypatel07 it turns out that some cloud providers (EKS and GKE) have offerings which enable DRA in 1.33, despite the feature being disabled in core OSS Kubernetes.
@johnbelamaric do you know why it was decided to go with this approach - disabled in core, but enabled by providers?
In any case, it remains unclear whether it is supported by Azure, for example.
I think in Kueue we generally take the approach of supporting what is enabled in core. OTOH, if there are some users waiting for the DRA-Kueue integration on 1.33, then I don't want to block anyone's business, so feel free to speak up.
I'm basically concerned about the future maintenance cost if we don't have a clear justification for supporting v1beta2. Suppose a scenario where the Kueue-DRA integration is in Beta and we have adoption, but then we need to support a new field which is only in v1; I'm afraid we would then need to choose between breaking existing users in Beta, or moving forward fast and supporting the new feature (field).
So, I would like to make at least support for "v1 only" a graduation criterion for this feature's Beta in Kueue.
GKE allows it, but the user has to explicitly enable it. EKS, I believe, may enable it by default in order to support the NVIDIA GB200 on 1.33.
All new endpoints in K8s are disabled at beta by default. Feature gates are enabled by default at beta. That means beta fields in existing v1 APIs become available, but v1betaX endpoints are disabled by default, because encoding the version in the API object metadata means massive migrations for customers who write manifests that way.
I think it's OK to do v1 only, unless we have specific use cases like GB200 that people need on 1.33.
I can confirm EKS added it to support GB200 on 1.33
Ok, I have synced with @mwysokin and it seems we don't have users for whom Kueue-DRA on 1.33 is a blocker.
Actually, for now we observe limited interest, as the integration does not yet support TAS (Topology-Aware Scheduling), which is a major investment for Kueue and the recommended solution for consuming new big hardware.
So, it seems that for adoption of the DRA-Kueue integration we need TAS in the picture, which is maybe 0.15 but most likely 0.16+ (depending on how the effort is prioritized).
Let's assume we support TAS+DRA in 0.16; that is Jan'26, by which point k8s 1.34 will be much more widely used and we will have k8s 1.35. So it is unlikely v1 would be a blocker for early adopters.
OTOH, I understand the temptation to support v1beta2 "just in case".
I have two open-ended proposals:
- if you see some potential adoption on 1.33, or just want to "stay safe", feel free to keep v1beta2, but please add a graduation criterion to the KEP update for Beta to "migrate to v1"
- otherwise, let's just migrate to v1 in a follow-up.
I don't see any potential users for v1beta2 yet; just waiting for @jackfrancis to confirm. With that, I would be happy to move to v1 in a follow-up PR.
Thank you @alaypatel07, amazing work, and everyone contributing review comments 👍 While I still have some open comments (this comment), I believe it is ready to go as Alpha; we still have time to address this comment before the release. The PR is getting large, with complex history, so I believe it will be much more productive to merge and address follow-ups, rather than resolving conflicts constantly. I double-checked that the new code is well isolated by the check for feature gate enablement.
/lgtm
LGTM label has been added. Git tree hash: cd83bf17096a6e3315877a204fa11cbc914f1d59
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: alaypatel07, mimowo. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/release-note-edit
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR implements support for DRA devices in Kueue workloads.
Which issue(s) this PR fixes:
Fixes #2941
Special notes for your reviewer:
Work Item Checklist for reviewers:
Nice to haves:
Notes:
Does this PR introduce a user-facing change?