OCPBUGS-50603: Create dedicated identity for KAS bootstrap container#6827
OCPBUGS-50603: Create dedicated identity for KAS bootstrap container#6827smrtrfszm wants to merge 2 commits intoopenshift:mainfrom
Conversation
|
@smrtrfszm: This pull request references Jira Issue OCPBUGS-50603, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis change introduces a new kas-bootstrap-container identity with dedicated RBAC bindings, PKI signers, certificates, and kubeconfig configuration. It extends the control plane operator and kas-bootstrap components to provision the bootstrap container with authentication credentials and appropriate cluster permissions. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @smrtrfszm. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml (1)
115-135: Add HCCO kubeconfig env + mount to kas-bootstrap container (critical)kas-bootstrap reads KUBECONFIG_HCCO (kas-bootstrap/kas_boostrap.go) but the bootstrap container in control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml does not set it or mount the hcco-localhost-kubeconfig secret — add the env var, volumeMount and volume as shown below.
- name: KUBECONFIG value: /var/secrets/localhost-kubeconfig/kubeconfig + - name: KUBECONFIG_HCCO + value: /var/secrets/hcco-localhost-kubeconfig/kubeconfig ... volumeMounts: - mountPath: /work name: bootstrap-manifests - mountPath: /var/secrets/localhost-kubeconfig name: localhost-kubeconfig + - mountPath: /var/secrets/hcco-localhost-kubeconfig + name: hcco-localhost-kubeconfig ... volumes: - emptyDir: {} name: bootstrap-manifests - name: localhost-kubeconfig secret: defaultMode: 0640 secretName: localhost-kubeconfig + - name: hcco-localhost-kubeconfig + secret: + defaultMode: 0640 + secretName: hcco-localhost-kubeconfigcontrol-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (1)
183-207: Wire HCCO kubeconfig into bootstrap for TPNU path too.kas-bootstrap reads KUBECONFIG_HCCO (kas-bootstrap/kas_boostrap.go:59); this fixture only sets KUBECONFIG and mounts localhost-kubeconfig — add the KUBECONFIG_HCCO env var, mount, and volume for hcco-localhost-kubeconfig. File: control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml — env block around lines 187-190, volumeMounts around 200-206, volumes around 479-562.
Apply this diff:
- name: KUBECONFIG value: /var/secrets/localhost-kubeconfig/kubeconfig + - name: KUBECONFIG_HCCO + value: /var/secrets/hcco-localhost-kubeconfig/kubeconfig ... volumeMounts: - mountPath: /work name: bootstrap-manifests - mountPath: /var/secrets/localhost-kubeconfig name: localhost-kubeconfig + - mountPath: /var/secrets/hcco-localhost-kubeconfig + name: hcco-localhost-kubeconfig ... volumes: - name: localhost-kubeconfig secret: defaultMode: 0640 secretName: localhost-kubeconfig + - name: hcco-localhost-kubeconfig + secret: + defaultMode: 0640 + secretName: hcco-localhost-kubeconfigcontrol-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (1)
183-207: Add HCCO kubeconfig mount + KUBECONFIG_HCCO env to bootstrap container (critical)kas-bootstrap reads KUBECONFIG_HCCO (kas-bootstrap/kas_boostrap.go:59); this fixture's bootstrap container neither mounts the hcco-localhost-kubeconfig Secret nor sets KUBECONFIG_HCCO, so the HCCO client cannot be used and FeatureGate updates will fall back to the default client.
File: control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml Lines: 183-207 (also duplicate at 477-583)
Apply this diff to wire it:
- name: KUBECONFIG value: /var/secrets/localhost-kubeconfig/kubeconfig + - name: KUBECONFIG_HCCO + value: /var/secrets/hcco-localhost-kubeconfig/kubeconfig ... volumeMounts: - mountPath: /work name: bootstrap-manifests - mountPath: /var/secrets/localhost-kubeconfig name: localhost-kubeconfig + - mountPath: /var/secrets/hcco-localhost-kubeconfig + name: hcco-localhost-kubeconfig ... volumes: - emptyDir: {} name: bootstrap-manifests - name: localhost-kubeconfig secret: defaultMode: 0640 secretName: localhost-kubeconfig + - name: hcco-localhost-kubeconfig + secret: + defaultMode: 0640 + secretName: hcco-localhost-kubeconfig
🧹 Nitpick comments (4)
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_hcco_localhost_kubeconfig_secret.yaml (1)
1-20: Secret fixture contains fake material; mark as test-only to quiet scanners.Gitleaks flags this. Add a comment/annotation indicating test fixture and fake data, or exclude testdata path in CI scanning rules.
Example annotation:
metadata: creationTimestamp: null + annotations: + security.openshift.io/test-fixture: "true"control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hcco_localhost_kubeconfig_secret.yaml (1)
1-20: Same: add test-only marker or scanner exclusion.Mirror the annotation/exclusion here to avoid false-positive secret leaks in CI.
kas-bootstrap/kas_boostrap_test.go (1)
384-386: Consider simulating server-side apply more accurately in tests.The fake Apply client uses
Createinstead ofPatchwith Apply. While this works for testing, consider usingPatchwithclient.Applyto better simulate the actual server-side apply behavior once controller-runtime is upgraded.func (c *fakeApplyClient) Apply(ctx context.Context, obj *unstructured.Unstructured, opts ...client.PatchOption) error { - return c.client.Create(ctx, obj) + // TODO: Use Patch with client.Apply once controller-runtime is upgraded to >= 0.22 + // For now, Create simulates the apply behavior in tests + return c.client.Create(ctx, obj) }control-plane-operator/controllers/hostedcontrolplane/v2/kas/kubeconfig.go (1)
99-112: Consider refactoring to reduce code duplication.The
adaptHCCOLocalhostKubeconfigSecretfunction is nearly identical toadaptLocalhostKubeconfigSecretexcept for the client certificate secret used. Consider extracting a common helper function to reduce duplication.+func adaptLocalhostKubeconfigSecretWithClientCert(cpContext component.WorkloadContext, secret *corev1.Secret, clientCertSecret *corev1.Secret) error { + apiServerPort := util.KASPodPort(cpContext.HCP) + localhostURL := fmt.Sprintf("https://localhost:%d", apiServerPort) + kubeconfig, err := GenerateKubeConfig(cpContext, clientCertSecret, localhostURL) + if err != nil { + return fmt.Errorf("failed to generate kubeconfig: %w", err) + } + + if secret.Data == nil { + secret.Data = map[string][]byte{} + } + secret.Data[KubeconfigKey] = kubeconfig + return nil +} + func adaptLocalhostKubeconfigSecret(cpContext component.WorkloadContext, secret *corev1.Secret) error { - apiServerPort := util.KASPodPort(cpContext.HCP) - localhostURL := fmt.Sprintf("https://localhost:%d", apiServerPort) - kubeconfig, err := GenerateKubeConfig(cpContext, manifests.SystemAdminClientCertSecret(cpContext.HCP.Namespace), localhostURL) - if err != nil { - return fmt.Errorf("failed to generate kubeconfig: %w", err) - } - - if secret.Data == nil { - secret.Data = map[string][]byte{} - } - secret.Data[KubeconfigKey] = kubeconfig - return nil + return adaptLocalhostKubeconfigSecretWithClientCert(cpContext, secret, manifests.SystemAdminClientCertSecret(cpContext.HCP.Namespace)) } func adaptHCCOLocalhostKubeconfigSecret(cpContext component.WorkloadContext, secret *corev1.Secret) error { - apiServerPort := util.KASPodPort(cpContext.HCP) - localhostURL := fmt.Sprintf("https://localhost:%d", apiServerPort) - kubeconfig, err := GenerateKubeConfig(cpContext, manifests.HCCOClientCertSecret(cpContext.HCP.Namespace), localhostURL) - if err != nil { - return fmt.Errorf("failed to generate kubeconfig: %w", err) - } - - if secret.Data == nil { - secret.Data = map[string][]byte{} - } - secret.Data[KubeconfigKey] = kubeconfig - return nil + return adaptLocalhostKubeconfigSecretWithClientCert(cpContext, secret, manifests.HCCOClientCertSecret(cpContext.HCP.Namespace)) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (12)
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hcco_localhost_kubeconfig_secret.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_controlplanecomponent.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_hcco_localhost_kubeconfig_secret.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_controlplanecomponent.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/hcco-localhost-kubeconfig.yaml(1 hunks)control-plane-operator/controllers/hostedcontrolplane/v2/kas/component.go(1 hunks)control-plane-operator/controllers/hostedcontrolplane/v2/kas/kubeconfig.go(1 hunks)kas-bootstrap/kas_boostrap.go(4 hunks)kas-bootstrap/kas_boostrap_test.go(5 hunks)
🧰 Additional context used
🪛 Gitleaks (8.27.2)
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_hcco_localhost_kubeconfig_secret.yaml
[high] 2-4: Possible Kubernetes Secret detected, posing a risk of leaking credentials/tokens from your deployments
(kubernetes-secret-yaml)
🔇 Additional comments (14)
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (1)
357-371: Filename prefix change looks good; ensures RBAC precedes FeatureGate.Using 98_ for the CRB and 99_ for FeatureGate provides deterministic ordering. LGTM.
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (1)
357-371: Consistent 98_ CRB ordering for TechPreviewNoUpgrade.Same rationale as non-TPNU path. Looks good.
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_controlplanecomponent.yaml (1)
50-52: New resource reference is correct and in expected position.Adding hcco-localhost-kubeconfig after hcco-kubeconfig matches generation flow.
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_controlplanecomponent.yaml (1)
50-52: TPNU resource list updated consistently.Entry for hcco-localhost-kubeconfig is correct.
control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml (1)
244-258: Good: CRB file renamed to 98_ to enforce ordering.Matches test fixtures and intended RBAC-before-FeatureGate sequencing.
control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/hcco-localhost-kubeconfig.yaml (1)
1-10: Asset stub looks correct for controller fill-in.Empty data with label “kubeconfig: local” is appropriate; controller populates bytes.
kas-bootstrap/kas_boostrap.go (4)
59-67: LGTM! Proper HCCO kubeconfig setup for authentication.The addition of HCCO client configuration properly addresses the authentication issue where the Validating Admission Policy only permits changes from username
system:hosted-cluster-config.
201-257: Well-implemented client routing for FeatureGate resources.The implementation correctly:
- Routes FeatureGate resources to the HCCO client for proper authentication
- Uses server-side apply with appropriate field ownership tracking
- Converts objects to unstructured format for apply operations
186-191: Add/link a tracker issue to upgrade controller-runtime to >= v0.22 and remove this temporary Apply workaround.go.mod shows sigs.k8s.io/controller-runtime v0.20.4 with a replace -> v0.19.7, so the workaround in kas-bootstrap/kas_boostrap.go:186-191 remains necessary; add a tracking issue/PR (or link an existing one) with an ETA and remove this interface when upgraded.
59-59: Validate KUBECONFIG_HCCO at runtime and ensure deployments set it.kas-bootstrap/kas_boostrap.go:59 reads os.Getenv("KUBECONFIG_HCCO") without validation — add an explicit empty-check (fail fast or provide a safe default) and add KUBECONFIG_HCCO to the relevant deployment/CI manifests (repo search found no other occurrences).
kas-bootstrap/kas_boostrap_test.go (2)
375-394: Good test implementation with fake Apply client.The test properly simulates the dual-client setup with separate fake clients for the default and HCCO paths, which aligns well with the production implementation.
439-445: Excellent test coverage for client isolation.The test correctly verifies that:
- FeatureGate is created via the HCCO client
- The same resource is not present on the primary client
This ensures proper isolation between the two client paths.control-plane-operator/controllers/hostedcontrolplane/v2/kas/component.go (1)
57-60: LGTM! Proper placement of HCCO localhost kubeconfig adapter.The new adapter is correctly positioned after the local kubeconfig adapter and uses the appropriate adaptation function.
control-plane-operator/controllers/hostedcontrolplane/v2/kas/kubeconfig.go (1)
99-112: HCCO client cert secret is reconciled — no action required.manifests.HCCOClientCertSecret is defined in control-plane-operator/controllers/hostedcontrolplane/manifests/pki.go and is reconciled in control-plane-operator/controllers/hostedcontrolplane/kas_pki_setup.go (reconcileSub(..., pki.ReconcileHCCOClientCertSecret)); GenerateKubeConfig will fail if the secret is missing at runtime.
|
/jira refresh |
|
@jeffnowicki: This pull request references Jira Issue OCPBUGS-50603, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (xiuwang+1@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/ok-to-test |
a51dbe9 to
962cf62
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
kas-bootstrap/kas_boostrap.go (2)
209-255: Deterministic apply order required (ensure 98_ applies before 99_feature-gate.yaml).*filepath.Walk does not guarantee lexical order; if 99_feature-gate.yaml is visited before 98_hcco-rolebinding.yaml, the FeatureGate apply will fail and prevent the RBAC from ever being created, looping indefinitely. Collect, sort, then apply.
Apply this diff:
- // Walk the filesPath and apply the resources. - err := filepath.Walk(filesPath, func(path string, info os.FileInfo, err error) error { - if err != nil { - return fmt.Errorf("error accessing path %s: %w", path, err) - } - if info.IsDir() { - logger.Info("Skipping dir", "path", path) - return nil - } - if filepath.Ext(path) != ".yaml" { - logger.Info("Skipping non-yaml file", "path", path) - return nil - } - logger.Info("Processing file", "path", path) - content, err := os.ReadFile(path) - if err != nil { - return fmt.Errorf("failed to read file %s: %w", path, err) - } - obj, _, err := configCodecs.UniversalDeserializer().Decode(content, nil, nil) - if err != nil { - return fmt.Errorf("failed to decode file %s: %w", path, err) - } - unstructuredObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(obj) - if err != nil { - return fmt.Errorf("failed to convert to unstructured: %w", err) - } - clientToUse := c - if _, ok := obj.(*configv1.FeatureGate); ok { - logger.Info("Using HCCO kubeconfig") - clientToUse = hccoClient - } - fieldOwner := filepath.Base(os.Args[0]) - if err := clientToUse.Apply(ctx, &unstructured.Unstructured{Object: unstructuredObj}, client.FieldOwner(fieldOwner), client.ForceOwnership); err != nil { - return fmt.Errorf("failed to apply file %s: %w", path, err) - } - return nil - }) - return err + // Collect candidate files, sort, then apply in lexical order. + var files []string + err := filepath.WalkDir(filesPath, func(path string, d fs.DirEntry, err error) error { + if err != nil { + return fmt.Errorf("error accessing path %s: %w", path, err) + } + if d.IsDir() { + return nil + } + ext := filepath.Ext(path) + if ext == ".yaml" || ext == ".yml" { + files = append(files, path) + } + return nil + }) + if err != nil { + return err + } + sort.Strings(files) + for _, path := range files { + logger.Info("Processing file", "path", path) + content, err := os.ReadFile(path) + if err != nil { + return fmt.Errorf("failed to read file %s: %w", path, err) + } + obj, _, err := configCodecs.UniversalDeserializer().Decode(content, nil, nil) + if err != nil { + return fmt.Errorf("failed to decode file %s: %w", path, err) + } + unstructuredObj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(obj) + if err != nil { + return fmt.Errorf("failed to convert to unstructured: %w", err) + } + clientToUse := c + if _, ok := obj.(*configv1.FeatureGate); ok { + logger.Info("Using HCCO kubeconfig") + clientToUse = hccoClient + } + if err := clientToUse.Apply(ctx, &unstructured.Unstructured{Object: unstructuredObj}, client.FieldOwner(fieldManagerName), client.ForceOwnership); err != nil { + return fmt.Errorf("failed to apply file %s: %w", path, err) + } + } + return nilAdd missing imports:
// imports (add) import ( "io/fs" "sort" )
151-158: Guard against empty renderedFeatureGate.Status.FeatureGates.Indexing position 0 panics when the rendered manifest lacks status entries (several fixtures show status: null). Guard and short‑circuit.
Apply this diff:
- desiredFeatureGates := renderedFeatureGate.Status.FeatureGates - currentVersion := renderedFeatureGate.Status.FeatureGates[0].Version + desiredFeatureGates := renderedFeatureGate.Status.FeatureGates + if len(renderedFeatureGate.Status.FeatureGates) == 0 { + logger.Info("Rendered FeatureGate has no status.featureGates; skipping status reconcile") + return nil + } + currentVersion := renderedFeatureGate.Status.FeatureGates[0].Version
🧹 Nitpick comments (9)
kas-bootstrap/kas_boostrap.go (3)
248-251: Stabilize SSA field manager name.Using os.Args[0] can vary across invocations; prefer a constant to keep ownership stable.
Apply this diff:
- fieldOwner := filepath.Base(os.Args[0]) - if err := clientToUse.Apply(ctx, &unstructured.Unstructured{Object: unstructuredObj}, client.FieldOwner(fieldOwner), client.ForceOwnership); err != nil { + if err := clientToUse.Apply(ctx, &unstructured.Unstructured{Object: unstructuredObj}, client.FieldOwner(fieldManagerName), client.ForceOwnership); err != nil {And add near the top-level declarations:
const fieldManagerName = "kas-bootstrap"
232-235: Multi‑document YAML not supported (only first doc decoded).If any manifest contains multiple docs (---), only the first is applied. Consider streaming decode via util/yaml.NewYAMLOrJSONDecoder and applying each doc.
Would you like a follow‑up patch to stream-decode multi-doc manifests?
186-189: Temporary SSA workaround: add TODO for upgrade tracking.Good stopgap. Add a TODO with an issue link to remove Apply shim once controller-runtime ≥ 0.22 is adopted.
kas-bootstrap/kas_boostrap_test.go (3)
384-387: Test Apply uses Create; acceptable here but note semantic differences.Create is not idempotent like SSA; fine for single-pass tests. Add a brief comment to avoid future confusion.
442-446: Assert NotFound explicitly for clarity.Strengthen the expectation to check apierrors.IsNotFound.
Apply this diff:
- err = c.Get(t.Context(), client.ObjectKey{Name: "cluster"}, featureGate2) - g.Expect(err).To(HaveOccurred()) + err = c.Get(t.Context(), client.ObjectKey{Name: "cluster"}, featureGate2) + g.Expect(apierrors.IsNotFound(err)).To(BeTrue())Add import:
import apierrors "k8s.io/apimachinery/pkg/api/errors"
396-453: Add a unit test to enforce deterministic apply order.Introduce a test that simulates unsorted enumeration (FeatureGate before CRB) and proves ordering logic applies 98_* before 99_*, preventing a permanent loop.
I can add a test that collects files, shuffles order, and asserts success only when sorted apply is used. Want me to push that?
control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml (1)
137-139: Harden secret mount with readOnly: true.Mount the kubeconfig secret read-only.
Apply this diff:
- - mountPath: /var/secrets/hcco-localhost-kubeconfig - name: hcco-localhost-kubeconfig + - mountPath: /var/secrets/hcco-localhost-kubeconfig + name: hcco-localhost-kubeconfig + readOnly: truecontrol-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (1)
207-209: Align fixture with readOnly mount.Update to reflect readOnly: true for the secret mount.
Apply this diff:
- - mountPath: /var/secrets/hcco-localhost-kubeconfig - name: hcco-localhost-kubeconfig + - mountPath: /var/secrets/hcco-localhost-kubeconfig + name: hcco-localhost-kubeconfig + readOnly: truecontrol-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (1)
207-209: Align TechPreview fixture with readOnly mount.Apply this diff:
- - mountPath: /var/secrets/hcco-localhost-kubeconfig - name: hcco-localhost-kubeconfig + - mountPath: /var/secrets/hcco-localhost-kubeconfig + name: hcco-localhost-kubeconfig + readOnly: true
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (5)
control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml(5 hunks)control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml(5 hunks)control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml(4 hunks)kas-bootstrap/kas_boostrap.go(4 hunks)kas-bootstrap/kas_boostrap_test.go(5 hunks)
🔇 Additional comments (9)
control-plane-operator/controllers/hostedcontrolplane/v2/assets/kube-apiserver/deployment.yaml (3)
123-125: LGTM: KUBECONFIG_HCCO wired to secret path.Please confirm the hcco-localhost-kubeconfig Secret data points to https://localhost:6443 and uses the HCCO client certs.
248-250: Renamed to 98_hcco-rolebinding.yaml — good.Couple this with deterministic sorting in kas-bootstrap to guarantee RBAC precedes FeatureGate.
340-343: LGTM: hcco-localhost-kubeconfig volume added.control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (3)
190-192: Fixture updated with KUBECONFIG_HCCO — good.
361-376: Fixture rename to 98_hcco-rolebinding.yaml — good.
487-491: Fixture volume for hcco-localhost-kubeconfig — good.control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_deployment.yaml (3)
190-192: TechPreview fixture updated with KUBECONFIG_HCCO — good.
361-376: TechPreview fixture rename to 98_hcco-rolebinding.yaml — good.
490-493: TechPreview fixture volume for hcco-localhost-kubeconfig — good.
kas-bootstrap/kas_boostrap.go
Outdated
| hccoCFG, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG_HCCO")) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to get HCCO config: %w", err) | ||
| } | ||
| hccoClient, err := client.New(hccoCFG, client.Options{Scheme: configScheme}) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to create client with HCCO kubeconfig: %w", err) | ||
| } |
There was a problem hiding this comment.
Prevent unintended fallback to KUBECONFIG; require KUBECONFIG_HCCO explicitly.
If KUBECONFIG_HCCO is unset/empty, BuildConfigFromFlags may load the default KUBECONFIG, causing the HCCO client to use system:admin and fail Admission on FeatureGate forever. Enforce presence and readability of KUBECONFIG_HCCO.
Apply this diff:
- hccoCFG, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG_HCCO"))
- if err != nil {
- return fmt.Errorf("failed to get HCCO config: %w", err)
- }
+ hccoPath, ok := os.LookupEnv("KUBECONFIG_HCCO")
+ if !ok || strings.TrimSpace(hccoPath) == "" {
+ return fmt.Errorf("KUBECONFIG_HCCO not set; required to apply FeatureGate as system:hosted-cluster-config")
+ }
+ if _, err := os.Stat(hccoPath); err != nil {
+ return fmt.Errorf("HCCO kubeconfig path %q not accessible: %w", hccoPath, err)
+ }
+ hccoCFG, err := clientcmd.BuildConfigFromFlags("", hccoPath)
+ if err != nil {
+ return fmt.Errorf("failed to get HCCO config from %q: %w", hccoPath, err)
+ }Add missing imports:
// imports (add)
import (
"strings"
)🤖 Prompt for AI Agents
In kas-bootstrap/kas_boostrap.go around lines 59-66, enforce that
KUBECONFIG_HCCO is explicitly provided and readable (do not allow
clientcmd.BuildConfigFromFlags to fall back to default KUBECONFIG); check
os.Getenv("KUBECONFIG_HCCO") for non-empty (using strings.TrimSpace) and verify
the path is accessible (os.Stat or ioutil.ReadFile) before calling
clientcmd.BuildConfigFromFlags with that path, returning a clear error if
missing/unreadable, and add the requested import for "strings".
Test Resultse2e-aws
e2e-aks
|
|
/ok-to-test |
|
/retest |
|
@smrtrfszm: This pull request references Jira Issue OCPBUGS-50603, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (xiuwang+1@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| env: | ||
| - name: KUBECONFIG | ||
| value: /var/secrets/localhost-kubeconfig/kubeconfig | ||
| - name: KUBECONFIG_HCCO |
There was a problem hiding this comment.
Is there any particular reason why we don't always use HCCO identity for bootstrapping? I could very easily see us hitting this problem for other resources in the future and it seems wise for us to ensure that bootstrapping is not blocked by the validating admission policy.
There was a problem hiding this comment.
Because this same program gives the permissions to the hcco identity. We would have to apply the 98_hcco-rolebinding.yaml somewhere else.
There was a problem hiding this comment.
Have Red Hat engineers had a chance to look at this PR and provide their opinion?
There was a problem hiding this comment.
My preference would be to consider the rolebinding for the hcco a special thing (and even exclude it from the list of manifests). So invert what this PR does... use the old kubeconfig only for the hcco rolebinding (and only create it if not needed). The bootstrap program could just create/reconcile it inline. Once the HCCO has the power to do anything, then apply all bootstrap manifests using the hcco kubeconfig. That way future bootstrap manifests that we may disallow via VAP can still be applied.
There was a problem hiding this comment.
@csrwng I am a bit confused that you approved the PR yet would prefer things to be done differently. Can you please explain?
There was a problem hiding this comment.
per @sjenning, let's create a separate identity that is allowed to make the same updates as the HCCO. That way we can keep track of who makes the updates.
There was a problem hiding this comment.
Can you please explain?
I removed the approved label b/c I thought of a better way to handle this ^^.
There was a problem hiding this comment.
I've updated the PR with the new identity.
|
/retest-required |
|
/ok-to-test |
Add a dedicated client certificate and kubeconfig for the KAS bootstrap container. The kubeconfig uses localhost as the server URL so the bootstrap container can connect to the kube-apiserver within the same pod without going through the service. This is needed because the bootstrap container must apply manifests before the kube-apiserver is fully ready, and the service endpoint is not yet available at that point.
962cf62 to
62d8fca
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: smrtrfszm The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@smrtrfszm: This pull request references Jira Issue OCPBUGS-50603, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@TwoDCube: This pull request references Jira Issue OCPBUGS-50603, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Replace CreateOrUpdate with server-side apply for applying bootstrap manifests. CreateOrUpdate cannot apply resources generically as all resource types need to be special cased with a callback function. Introduce a two-client architecture: an admin (localhost) client applies the kas-bootstrap-container ClusterRoleBinding first to grant cluster-admin, then a second client using the kas-bootstrap-container identity applies the remaining manifests. The FeatureGate CR has a ValidatingAdmissionPolicy that restricts spec changes to the system:hosted-cluster-config user, so reconcileFeatureGate now uses the kas-bootstrap-container client and this user is also added to the ValidatingAdmissionPolicy so it can apply the FeatureGate spec.
62d8fca to
40fb921
Compare
|
/ok-to-test |
|
@smrtrfszm: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/retest |
|
/test all |
|
/test e2e-aws |
|
/test e2e-aks |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@kas-bootstrap/kas_boostrap_test.go`:
- Around line 376-391: Update the workaround comment above
fakeApplyClient/newFakeApplyClient to accurately reflect that the fake client's
rejection of ApplyPatchType still exists in controller-runtime v0.22.4; change
the version note from "once controller-runtime is upgraded to >= 0.22" to
reference the correct future version (e.g., v0.23) or a specific patch that
resolves the fake client's "apply patches are not supported in the fake client"
error, and optionally include that error message to clarify why the Create-based
Apply shim is required until that version.
| // fakeApplyClient wraps a fake client.Client and implements the Apply interface | ||
| // by using Create instead of Patch with client.Apply, because the current | ||
| // controller-runtime version has a bug that prevents server-side apply with fake.Client. | ||
| // Once controller-runtime is upgraded to >= 0.22, this can be replaced. | ||
| type fakeApplyClient struct { | ||
| client.Client | ||
| } | ||
|
|
||
| func newFakeApplyClient() *fakeApplyClient { | ||
| c := fake.NewClientBuilder().WithScheme(configScheme).Build() | ||
| return &fakeApplyClient{Client: c} | ||
| } | ||
|
|
||
| func (c *fakeApplyClient) Apply(ctx context.Context, obj *unstructured.Unstructured, opts ...client.PatchOption) error { | ||
| return c.Client.Create(ctx, obj) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the actual controller-runtime version in go.mod
echo "=== Controller-runtime version ==="
rg 'sigs.k8s.io/controller-runtime' go.mod -A 1
echo ""
echo "=== Checking for fake.Client Apply method ==="
# Look for Apply-related methods in fake client
fd 'fake' -t f --max-depth 3 | head -20Repository: openshift/hypershift
Length of output: 412
🏁 Script executed:
#!/bin/bash
# Check support/testutil/fake.go to see if there's custom fake client code
echo "=== support/testutil/fake.go ==="
wc -l support/testutil/fake.go
head -50 support/testutil/fake.go
echo ""
echo "=== kas_boostrap_test.go context around lines 376-391 ==="
sed -n '370,400p' kas-bootstrap/kas_boostrap_test.go
echo ""
echo "=== Search for Apply method in test file ==="
rg 'func.*Apply' kas-bootstrap/kas_boostrap_test.go -A 3Repository: openshift/hypershift
Length of output: 3951
🏁 Script executed:
#!/bin/bash
# Search for any comments or issues about fake.Client Apply bug
echo "=== Search for fake.Client Apply in codebase ==="
rg 'fake\.Client.*Apply|Apply.*fake' --type go -B 2 -A 2 | head -40
echo ""
echo "=== Check if there are any TODO/FIXME comments about this ==="
rg 'TODO|FIXME' kas-bootstrap/kas_boostrap_test.go
echo ""
echo "=== Look for when this file was last modified ==="
git log --oneline --all -- kas-bootstrap/kas_boostrap_test.go | head -10
echo ""
echo "=== Check git blame for the fakeApplyClient comment ==="
git blame kas-bootstrap/kas_boostrap_test.go | sed -n '376,379p'Repository: openshift/hypershift
Length of output: 5518
🏁 Script executed:
#!/bin/bash
# Search for any issues or discussions about fake.Client.Apply in controller-runtime
echo "=== Check go.mod for any comments about the replace directive ==="
sed -n '1,50p' go.mod | grep -A 5 -B 5 'controller-runtime'
echo ""
echo "=== Search for any other references to this bug in comments ==="
rg 'fake.*Client|controller-runtime.*bug' --type go -i | head -20
echo ""
echo "=== Check the commit message for more context ==="
git show --no-patch 40fb92107 | head -30Repository: openshift/hypershift
Length of output: 3858
🏁 Script executed:
#!/bin/bash
# Look deeper into the fake client implementation to understand the Apply limitations
echo "=== Check fake.Client.Apply implementation in vendor ==="
fd 'fake' vendor/sigs.k8s.io/controller-runtime -type f -name '*.go' | head -10
echo ""
echo "=== Search for Apply method in fake client ==="
rg 'func.*Apply' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/ -A 2
echo ""
echo "=== Check doc.go for known issues ==="
cat vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/doc.goRepository: openshift/hypershift
Length of output: 2019
🏁 Script executed:
#!/bin/bash
# List files in fake package
echo "=== Files in fake package ==="
find vendor/sigs.k8s.io/controller-runtime/pkg/client/fake -type f -name '*.go'
echo ""
echo "=== Check client.go for Apply ==="
grep -n 'Apply' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.go | head -20
echo ""
echo "=== Check if there's a Patch method which Apply depends on ==="
grep -n 'func.*Patch' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.go -A 5 | head -30Repository: openshift/hypershift
Length of output: 2062
🏁 Script executed:
#!/bin/bash
# Check if Apply method exists in the fake.Client interface at all
echo "=== Search for Apply in entire fake client file ==="
wc -l vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.go
echo ""
echo "=== Check lines around 1000-1050 to see Apply implementation ==="
sed -n '1000,1050p' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.go
echo ""
echo "=== Check if Apply is mentioned anywhere in fake client ==="
grep -i 'apply' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.go | grep -i 'func.*apply' -i
echo ""
echo "=== Look at client.Client interface definition in controller-runtime ==="
find vendor/sigs.k8s.io/controller-runtime/pkg/client -name '*.go' -exec grep -l 'type Client interface' {} \;
</parameter>Repository: openshift/hypershift
Length of output: 2048
🏁 Script executed:
#!/bin/bash
# Check the interfaces.go for Apply method
echo "=== Search for Apply in client interfaces ==="
grep -n 'Apply' vendor/sigs.k8s.io/controller-runtime/pkg/client/interfaces.go | head -20
echo ""
echo "=== Check the specific Apply method signature ==="
sed -n '1,150p' vendor/sigs.k8s.io/controller-runtime/pkg/client/interfaces.go | grep -A 10 'Apply'
echo ""
echo "=== Now check the error message more carefully ==="
grep -B 5 -A 5 'apply patches are not supported' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.goRepository: openshift/hypershift
Length of output: 602
🏁 Script executed:
#!/bin/bash
# Check GitHub issue to understand the status
echo "=== Look for any mentions of the GitHub issue in code ==="
rg 'kubernetes/kubernetes/issues/115598' --type go
echo ""
echo "=== Check what version of controller-runtime this error message was added in ==="
git log --all --grep='apply patches' -- vendor/ 2>/dev/null | head -20 || echo "Git log not available for vendor/"
echo ""
echo "=== Check if there's a newer v0.22.x version that might have fixed this ==="
rg 'v0\.22' go.mod
echo ""
echo "=== Check the actual Patch method signature to understand the workaround better ==="
grep -n 'func (c \*fakeClient) Patch' vendor/sigs.k8s.io/controller-runtime/pkg/client/fake/client.go -A 20Repository: openshift/hypershift
Length of output: 1434
🏁 Script executed:
#!/bin/bash
# Final verification: Check the actual Apply method implementation in interfaces.go
echo "=== Find Apply method in interfaces ==="
grep -n 'Apply' vendor/sigs.k8s.io/controller-runtime/pkg/client/interfaces.go
echo ""
echo "=== Look at the Client interface structure ==="
sed -n '1,200p' vendor/sigs.k8s.io/controller-runtime/pkg/client/interfaces.go | tail -50Repository: openshift/hypershift
Length of output: 2280
Update workaround comment — bug still exists in v0.22.4, reference needs correction.
The comment states the fakeApplyClient workaround can be removed "once controller-runtime is upgraded to >= 0.22", but the project already uses v0.22.4. The fake client in v0.22.4 still explicitly rejects ApplyPatchType with the error message: "apply patches are not supported in the fake client." Update the comment to reference the actual version where this limitation was or will be resolved (e.g., "v0.23" or a specific patch).
🤖 Prompt for AI Agents
In `@kas-bootstrap/kas_boostrap_test.go` around lines 376 - 391, Update the
workaround comment above fakeApplyClient/newFakeApplyClient to accurately
reflect that the fake client's rejection of ApplyPatchType still exists in
controller-runtime v0.22.4; change the version note from "once
controller-runtime is upgraded to >= 0.22" to reference the correct future
version (e.g., v0.23) or a specific patch that resolves the fake client's "apply
patches are not supported in the fake client" error, and optionally include that
error message to clarify why the Create-based Apply shim is required until that
version.
There was a problem hiding this comment.
@coderabbitai have you considered that the controller-runtime is actually being replaced with a different version in go.mod?
replace sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.19.7
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the replace directive in go.mod
echo "=== Checking for replace directive for controller-runtime ==="
grep -n 'replace.*controller-runtime' go.mod
echo ""
echo "=== Full context of controller-runtime in go.mod ==="
grep -B 2 -A 2 'controller-runtime' go.mod
Length of output: 681
@TwoDCube You're absolutely right, and I apologize for missing that. I should have checked for replace directives in go.mod.
The replace directive at line 323:
replace sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.19.7overrides the v0.22.4 requirement, meaning the actual version in use is v0.19.7. The comment in the code stating "once controller-runtime is upgraded to >= 0.22" is therefore accurate and appropriate.
My original review comment suggesting the comment needed updating was incorrect. The workaround is indeed still necessary with v0.19.7, and the comment correctly indicates the version target for when it can be removed. You can disregard my earlier suggestion.
Thank you for catching this!
🧬 Code Graph Analysis Results
[kas-bootstrap/kas_boostrap.go:211-213]
Apply interface {
Apply(ctx context.Context, obj *unstructured.Unstructured, opts ...client.PatchOption) error
}
📝 Learnings were identified and not saved (knowledge base disabled). Enable
|
As discussed, the new identity is definitely the right direction for new releases. However, this would be a breaking change for older releases so this can't be backported as-is. The backports should impersonate the HCCO to fix the issue instead of requiring everyone who disabled pki reconciliation to create these new identities. |
What this PR does / why we need it:
Currently it's not possible to change the feature gates for a CustomNoUpgrade feature set cluster, because there's a Validating Admission Policy that only allows the change if the username is
system:hosted-cluster-configand the KAS bootstrap container tries to change it withsystem:admin. Also it uses theCreateOrUpdatefunction with an empty callback function, which causes any update to fail silently.This PR creates a dedicated client certificate and kubeconfig for the KAS bootstrap container and adds this user to the Validating Admission Policy. The kubeconfig uses localhost as the server URL so the container can connect to the kube-apiserver within the same pod without going through the service, which is not yet available at that point. Secondly, it replaces the
CreateOrUpdatewith server-side apply, so updates will work as expected and every resource type won't need to be special cased with a callback function.Which issue(s) this PR fixes (optional, use
fixes #<issue_number>(, fixes #<issue_number>, ...)format, where issue_number might be a GitHub issue, or a Jira story:Fixes #https://issues.redhat.com/browse/OCPBUGS-50603
Checklist