@Aakcht Aakcht commented Nov 5, 2024

Purpose of this PR

Currently, the spark-operator entrypoint.sh contains logic carried over from the entrypoint.sh of older Spark images. It is intended for OpenShift and modifies /etc/passwd. It worked in the past because the older base Spark images changed the permissions on /etc/passwd.

Newer Spark images do not modify the permissions on /etc/passwd, so this entrypoint.sh logic no longer works. When spark-operator runs as a random user, Spark application submissions fail with the following error:

 org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name

This PR replaces that logic with the logic used in newer Spark images.

Proposed changes:

  • Modify how entrypoint.sh handles /etc/passwd and NSS_WRAPPER_PASSWD, following the newer Spark image.
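
For readers unfamiliar with the approach, here is a minimal sketch of the nss_wrapper technique described above. It is not the exact code merged in this PR. It assumes libnss_wrapper.so is installed under one of the listed library paths and that SPARK_HOME is set. The function names write_nss_passwd and setup_nss_wrapper are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: write a single passwd entry for the given UID/GID into
# a private file, instead of appending to the read-only /etc/passwd.
write_nss_passwd() {
  uid="$1"; gid="$2"; home="$3"; out="$4"
  # The user name is passed as an argument so the shell expands it;
  # it falls back to "anonymous uid" when SPARK_USER_NAME is unset.
  printf 'spark:x:%s:%s:%s:%s:/bin/false\n' \
    "$uid" "$gid" "${SPARK_USER_NAME:-anonymous uid}" "$home" > "$out"
}

# Hypothetical wrapper: only act when the current UID has no passwd entry.
setup_nss_wrapper() {
  myuid="$(id -u)"
  mygid="$(id -g)"
  # Nothing to do if the UID already resolves to a user name.
  if getent passwd "$myuid" > /dev/null 2>&1; then
    return 0
  fi
  # Candidate library locations are an assumption; adjust for the base image.
  for wrapper in /usr/lib/*/libnss_wrapper.so /usr/lib/libnss_wrapper.so \
                 /lib/*/libnss_wrapper.so /lib/libnss_wrapper.so; do
    if [ -s "$wrapper" ]; then
      NSS_WRAPPER_PASSWD="$(mktemp)"
      NSS_WRAPPER_GROUP="$(mktemp)"
      # Preloading the wrapper makes user lookups read the files above.
      export LD_PRELOAD="$wrapper" NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
      write_nss_passwd "$myuid" "$mygid" "$SPARK_HOME" "$NSS_WRAPPER_PASSWD"
      printf 'spark:x:%s:\n' "$mygid" > "$NSS_WRAPPER_GROUP"
      break
    fi
  done
}
```

An entrypoint would call setup_nss_wrapper once before launching the operator. Because the passwd and group files come from mktemp, this works even when /etc/passwd is not writable by the random UID.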

Change Category

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Rationale

Checklist

  • I have conducted a self-review of my own code.
  • I have updated documentation accordingly.
  • I have added tests that prove my changes are effective or that my feature works.
  • Existing unit tests pass locally with my changes.

Additional Notes

I validated this PR by adding the new entrypoint.sh to the spark-operator Docker image and checking that spark-operator can now submit applications when running as a random user.

@Aakcht Aakcht force-pushed the fix_openshift_run_as_any_user branch from 711339e to 4c9c397 Compare November 5, 2024 13:21
Aakcht commented Nov 12, 2024

Hi @ChenYi015 @ImpSy @jacobsalway, any chance of a review for this PR?

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ChenYi015

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ChenYi015

@Aakcht Thanks for updating the entrypoint so we can run the Spark operator as a random user.
/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Dec 4, 2024
@google-oss-prow google-oss-prow bot merged commit 5dd91c4 into kubeflow:master Dec 4, 2024
1 check passed
@Aakcht Aakcht deleted the fix_openshift_run_as_any_user branch December 4, 2024 13:34
ChenYi015 pushed a commit to ChenYi015/spark-operator that referenced this pull request Dec 10, 2024
…age entrypoint.sh (kubeflow#2312)

Signed-off-by: Aakcht <[email protected]>
(cherry picked from commit 5dd91c4)
@ChenYi015 ChenYi015 mentioned this pull request Dec 10, 2024
google-oss-prow bot pushed a commit that referenced this pull request Dec 11, 2024
* Allow setting automountServiceAccountToken (#2298)

* Allow setting automountServiceAccountToken on workloads and serviceAccounts

Signed-off-by: Aran Shavit <[email protected]>

* update helm docs

Signed-off-by: Aran Shavit <[email protected]>

---------

Signed-off-by: Aran Shavit <[email protected]>
(cherry picked from commit 515d805)

* Fix: executor container security context does not work (#2306)

Signed-off-by: Yi Chen <[email protected]>
(cherry picked from commit 171e429)

* Fix: should not add emptyDir sizeLimit conf if it is nil (#2305)

Signed-off-by: Yi Chen <[email protected]>
(cherry picked from commit 763682d)

* Allow the Controller and Webhook Containers to run with the securityContext: readOnlyRootfilesystem: true (#2282)

* create a tmp dir for the controller to write Spark artifacts to and set the controller to readOnlyRootFilesystem

Signed-off-by: Nick Gretzon <[email protected]>

* mount a dir for the webhook container to generate its certificates in and set readOnlyRootFilesystem: true for the webhook pod

Signed-off-by: Nick Gretzon <[email protected]>

* update the securityContext in the controller deployment test

Signed-off-by: Nick Gretzon <[email protected]>

* update securityContext of the webhook container in the deployment_test

Signed-off-by: Nick Gretzon <[email protected]>

* update README

Signed-off-by: Nick Gretzon <[email protected]>

* remove -- so comments are not rendered in the README.md

Signed-off-by: Nick Gretzon <[email protected]>

* recreate README.md after removal of comments for volumes and volumeMounts

Signed-off-by: Nick Gretzon <[email protected]>

* make indentation for volumes and volumeMounts consistent with rest of values.yaml

Signed-off-by: Nick Gretzon <[email protected]>

* Revert "make indentation for volumes and volumeMounts consistent with rest of values.yaml"

This reverts commit dba97fc.

Signed-off-by: Nick Gretzon <[email protected]>

* fix indentation in webhook and controller deployment templates for volumes and volumeMounts

Signed-off-by: Nick Gretzon <[email protected]>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/values.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/templates/controller/deployment.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/templates/controller/deployment.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/templates/webhook/deployment.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* Update charts/spark-operator-chart/templates/webhook/deployment.yaml

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>

* add additional securityContext to the controller deployment_test.yaml

Signed-off-by: Nick Gretzon <[email protected]>

---------

Signed-off-by: Nick Gretzon <[email protected]>
Signed-off-by: Nicholas Gretzon <[email protected]>
Co-authored-by: Yi Chen <[email protected]>
(cherry picked from commit 72107fd)

* Fix: should not add emptyDir sizeLimit conf on executor pods if it is nil (#2316)

Signed-off-by: Cian Gallagher <[email protected]>
(cherry picked from commit 2999546)

* Bump `volcano.sh/apis` to 1.10.0 (#2320)

Signed-off-by: Jacob Salway <[email protected]>
(cherry picked from commit 22e4fb8)

* Truncate UI service name if over 63 characters (#2311)

* Truncate UI service name if over 63 characters

Signed-off-by: Jacob Salway <[email protected]>

* Also truncate ingress name

Signed-off-by: Jacob Salway <[email protected]>

---------

Signed-off-by: Jacob Salway <[email protected]>
(cherry picked from commit 43c1888)

* Bump aquasecurity/trivy-action from 0.28.0 to 0.29.0 (#2332)

Bumps [aquasecurity/trivy-action](https://github.com/aquasecurity/trivy-action) from 0.28.0 to 0.29.0.
- [Release notes](https://github.com/aquasecurity/trivy-action/releases)
- [Commits](aquasecurity/trivy-action@0.28.0...0.29.0)

---
updated-dependencies:
- dependency-name: aquasecurity/trivy-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 270b09e)

* Bump github.com/onsi/ginkgo/v2 from 2.20.2 to 2.22.0 (#2335)

Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.20.2 to 2.22.0.
- [Release notes](https://github.com/onsi/ginkgo/releases)
- [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
- [Commits](onsi/ginkgo@v2.20.2...v2.22.0)

---
updated-dependencies:
- dependency-name: github.com/onsi/ginkgo/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 40423d5)

* The webhook-key-name command-line param isn't taking effect (#2344)

Signed-off-by: C. H. Afzal <[email protected]>
(cherry picked from commit a261523)

* Robustness to driver pod taking time to create (#2315)

* Retry after driver pod now found if recent submission

Signed-off-by: Thomas Newton <[email protected]>

* Add a test

Signed-off-by: Thomas Newton <[email protected]>

* Make grace period configurable

Signed-off-by: Thomas Newton <[email protected]>

* Update test

Signed-off-by: Thomas Newton <[email protected]>

* Add an extra test with the driver pod

Signed-off-by: Thomas Newton <[email protected]>

* Separate context to create and delete the driver pod

Signed-off-by: Thomas Newton <[email protected]>

* Tidy

Signed-off-by: Thomas Newton <[email protected]>

* Autoformat

Signed-off-by: Thomas Newton <[email protected]>

* Update error message

Signed-off-by: Thomas Newton <[email protected]>

* Add helm parameter

Signed-off-by: Thomas Newton <[email protected]>

* Update internal/controller/sparkapplication/controller.go

Co-authored-by: Yi Chen <[email protected]>
Signed-off-by: Thomas Newton <[email protected]>

* Newlines between helm tests

Signed-off-by: Thomas Newton <[email protected]>

---------

Signed-off-by: Thomas Newton <[email protected]>
Co-authored-by: Yi Chen <[email protected]>
(cherry picked from commit d815e78)

* Use NSS_WRAPPER_PASSWD instead of /etc/passwd as in spark-operator image entrypoint.sh (#2312)

Signed-off-by: Aakcht <[email protected]>
(cherry picked from commit 5dd91c4)

* Move sparkctl to cmd directory (#2347)

* Move spark-operator

Signed-off-by: Yi Chen <[email protected]>

* Move sparkctl to cmd directory

Signed-off-by: Yi Chen <[email protected]>

* Remove unnecessary app package/directory

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
(cherry picked from commit 2375a30)

* Spark Operator Official Release v2.1.0

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
Co-authored-by: Aran Shavit <[email protected]>
Co-authored-by: Nicholas Gretzon <[email protected]>
Co-authored-by: Cian (Keen) Gallagher <[email protected]>
Co-authored-by: Jacob Salway <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: C. H. Afzal <[email protected]>
Co-authored-by: Thomas Newton <[email protected]>
Co-authored-by: Aakcht <[email protected]>
NSS_WRAPPER_GROUP="$(mktemp)"
export LD_PRELOAD="$wrapper" NSS_WRAPPER_PASSWD NSS_WRAPPER_GROUP
mygid="$(id -g)"
printf 'spark:x:%s:%s:${SPARK_USER_NAME:-anonymous uid}:%s:/bin/false\n' "$myuid" "$mygid" "$SPARK_HOME" > "$NSS_WRAPPER_PASSWD"
I think this line was supposed to look something like this:

printf 'spark:x:%s:%s:%s:%s:/bin/false\n' "$myuid" "$mygid" "${SPARK_USER_NAME:-anonymous uid}" "$SPARK_HOME" > "$NSS_WRAPPER_PASSWD"

Because currently the nss wrapper passwd looks something like:

spark:x:1000:1000:${SPARK_USER_NAME:-anonymous uid}:/foobar:/bin/false

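The single quotes around the printf format string prevent bash from expanding ${SPARK_USER_NAME:-anonymous uid}, so the literal text is written to the file. Passing the value as a separate double-quoted argument, matched by its own %s, lets the shell expand it.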
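
A small reproduction of the quoting issue, using made-up values (UID and GID 1000, home directory /foobar, user name alice) rather than anything taken from a real deployment:

```shell
# Illustrative value only; in the entrypoint this would come from the environment.
SPARK_USER_NAME="alice"

# Current form: the ${...} sits inside the single-quoted format string,
# so the shell never expands it and printf emits it literally.
broken="$(printf 'spark:x:%s:%s:${SPARK_USER_NAME:-anonymous uid}:%s:/bin/false\n' \
  1000 1000 /foobar)"

# Suggested form: the value is a separate, double-quoted argument matched by %s.
fixed="$(printf 'spark:x:%s:%s:%s:%s:/bin/false\n' \
  1000 1000 "${SPARK_USER_NAME:-anonymous uid}" /foobar)"

echo "$broken"
echo "$fixed"
```

The first echo prints spark:x:1000:1000:${SPARK_USER_NAME:-anonymous uid}:/foobar:/bin/false, and the second prints spark:x:1000:1000:alice:/foobar:/bin/false.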
