[DCOS-49020] User propagation and override support for Spark Dispatcher #492
Conversation
samvantran left a comment
Changes look good and I like the breadth of tests. I had a few comments, none blocking, but it looks like test_users failed. I see this error log where it attempts to clean up drivers: https://teamcity.mesosphere.io/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=1661915&_focus=85531
fi

if [ "${SPARK_DOCKER_USER}" != "" ]; then
  echo "spark.mesos.dispatcher.driverDefault.spark.mesos.executor.docker.parameters=user=${SPARK_DOCKER_USER}" >> ${SPARK_HOME}/conf/mesos-cluster-dispatcher.properties
Wow, what a config. +1 for taking care of this for the user.
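For context, the spark.mesos.dispatcher.driverDefault.[PropertyName] prefix tells the Dispatcher to apply the remainder of the key as a default Spark property on every submitted driver. A minimal sketch (in Python rather than the shell of the actual startup script) of the line that gets appended, assuming SPARK_DOCKER_USER is set to the illustrative value "nobody":

```python
# Illustrative reconstruction of the property line appended to
# mesos-cluster-dispatcher.properties when SPARK_DOCKER_USER is set.
spark_docker_user = "nobody"  # hypothetical value for illustration

prefix = "spark.mesos.dispatcher.driverDefault."
prop = "spark.mesos.executor.docker.parameters"
line = f"{prefix}{prop}=user={spark_docker_user}"

print(line)
# spark.mesos.dispatcher.driverDefault.spark.mesos.executor.docker.parameters=user=nobody
```

Every driver submitted through this Dispatcher then inherits the docker user parameter without the submitter having to pass it explicitly.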
@pytest.mark.parametrize('user,use_ucr_containerizer,use_ucr_for_spark_submit', [
    ("nobody", True, True),
    ("nobody", True, False),
    ("nobody", False, True),
Do we want to test scenarios where users launch Dispatchers with Docker but submit jobs with UCR? I guess it doesn't hurt, but is this realistic?
By default, the Dispatcher is deployed using the Docker containerizer. From here we can think of the following scenarios:
- a user submits a job with all defaults (False, False)
- a user discovers the UCR option and updates the Dispatcher config, but keeps submitting jobs as before (True, False)
- a user discovers the spark-submit option spark.mesos.containerizer=mesos but runs the Dispatcher with its default config (False, True)
- an experienced user enables UCR in both the Dispatcher and jobs (True, True)
The whole goal of this exhaustive testing is to make sure we don't miss a not-so-obvious combination of containerizers in which user propagation breaks.
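The four scenarios above can be enumerated mechanically. A hedged sketch of the full matrix (the real parametrization and fixture wiring live in test_spark_users.py; this is just an illustration):

```python
import itertools

# Illustrative reconstruction of the test matrix: every combination of
# (use_ucr_containerizer, use_ucr_for_spark_submit) for user "nobody".
combinations = [
    ("nobody", dispatcher_ucr, submit_ucr)
    for dispatcher_ucr, submit_ucr in itertools.product([True, False], repeat=2)
]

for user, dispatcher_ucr, submit_ucr in combinations:
    print(user, dispatcher_ucr, submit_ucr)
```

Generating the matrix with itertools.product guarantees no combination is silently dropped if another containerizer dimension is added later.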
    args=submit_args)
try:
    sdk_tasks.check_running(app_name, 1, timeout_seconds=300)
    driver_task = shakedown.get_task(driver_task_id, completed=False)
We ought to move away from shakedown, but it doesn't have to be in this PR.
I created a Jira ticket for it; it probably makes sense to wait until we have proper dependency management for the testing package from dcos-commons.
Force-pushed from 574ebf2 to bf694e1
Force-pushed from bf694e1 to 9f3ceae
What changes were proposed in this pull request?
Resolves https://jira.mesosphere.com/browse/DCOS-49020
- The Dispatcher now runs as {{service.user}} by default
- The user is propagated to submitted jobs via spark.mesos.executor.docker.parameters=user=<user> and spark.mesos.driverEnv.SPARK_USER=<user>

How were these changes tested?
- test_spark_users.py, run against both CoreOS and CentOS clusters

Release Notes
- The Dispatcher runs as {{service.user}} by default for both Docker and Mesos containerizers
- docker_user UID override for CentOS/RHEL (no need to specify a UID at the time of job submit)
- Jobs can run as nobody while the Dispatcher is running as root