
Conversation

@skonto (Contributor) commented Mar 30, 2018

What changes were proposed in this pull request?

  • Adds delegation tokens to the UGI if a proxy user exists. We run this early, when the Mesos backend starts (a sketch of the general approach is shown below).
  • Adds support for a proxy user in Mesos client and cluster mode.
  • Fixes the HMS (Hive Metastore) connection issue.
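
For context, a minimal sketch of the general approach described above, not the PR's actual code: the real (kerberos-authenticated) user obtains HDFS delegation tokens and hands them to a proxy-user UGI, which then runs the application code. The renewer, user name, and path below are placeholders.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

object ProxyUserTokenSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()

    // The real (super) user is assumed to be logged in already (keytab or ticket cache).
    val realUser = UserGroupInformation.getCurrentUser

    // Obtain HDFS delegation tokens as the real user; the renewer is a placeholder.
    val creds = new Credentials()
    FileSystem.get(conf).addDelegationTokens("placeholder-renewer", creds)

    // Add the tokens to the proxy user's UGI and run the job code as that user.
    val proxyUgi = UserGroupInformation.createProxyUser("nobody", realUser)
    proxyUgi.addCredentials(creds)
    proxyUgi.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // The application's main would run here, authenticating to HDFS with the tokens.
        FileSystem.get(conf).listStatus(new Path("/tmp")).foreach(s => println(s.getPath))
      }
    })
  }
}
```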

How was this patch tested?

This was tested manually against a secured HDFS by running the Spark Hive examples, both in client and cluster mode.

In cluster mode this was tested with a ticket cache by passing the following args:

--proxy-user nobody
--conf spark.mesos.driver.labels=DCOS_SPACE:/kerberized-spark
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.driverEnv.SPARK_USER=nobody
--conf spark.mesos.driver.secret.names=/kerberized-spark/krb5cc_65534
--conf spark.mesos.driver.secret.filenames=krb5cc_65534
--conf spark.mesos.driverEnv.KRB5CCNAME=/mnt/mesos/sandbox/krb5cc_65534 \

@skonto (Contributor Author) commented Mar 30, 2018

@vanzin @susanxhuynh please review. This probably needs to be backported to 2.3, since that is where a customer hit the issue. YARN follows a different approach: it adds the tokens to the UGI early, so no TGT is needed later on. Still, when I tried the same approach with Mesos I hit the issue described in the ticket with the HadoopRDD (that RDD seems to be a permanent integration pain point). I am not sure whether this patch affects YARN at all.

@SparkQA commented Mar 30, 2018

Test build #88751 has finished for PR 20945 at commit 5aa7231.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 30, 2018

Test build #88752 has finished for PR 20945 at commit 5f7851c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented Mar 30, 2018

I don't think this is right. You do not want to start the session as the real user. That's why you're using a proxy user in the first place - to identify as someone else to external services.

Aren't you just missing the delegation token for the proxy user?

@skonto (Contributor Author) commented Mar 31, 2018

@vanzin OK, let me check that I understand correctly: the Spark job's main is run as the proxy user if one exists, and the real user is used for HiveDelegationTokenProvider only because Hive needs the real user to create the delegation token correctly; it cannot use the proxy user for that. I guess I can still use the proxy user for the session state, though.
Thank you for the feedback. I followed YARN's approach, adapted the code to use the delegation tokens already obtained, and updated this PR. Please review.

skonto (Contributor Author) commented:

This will only happen if security is enabled and a proxy user exists.

@skonto (Contributor Author) commented Mar 31, 2018

I have attached the log files from the last run with the updated PR, and also updated the description.
proxy_client_mode.log
proxy_cluster_stderr.log

@SparkQA commented Mar 31, 2018

Test build #88776 has finished for PR 20945 at commit 76330eb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

desc.conf.getOption("spark.mesos.proxyUser").foreach { v =>
options ++= Seq("--proxy-user", v)
}

skonto (Contributor Author) commented:

In cluster mode we need to pass the proxy user to the dispatcher.

@skonto (Contributor Author) commented Apr 1, 2018

@vanzin @susanxhuynh I think it's on the right path now.

@SparkQA commented Apr 1, 2018

Test build #88788 has finished for PR 20945 at commit 1060405.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@skonto (Contributor Author) commented Apr 1, 2018

Failed unit test: org.apache.spark.launcher.LauncherServerSuite.testAppHandleDisconnect
Will re-test; this failure is not due to this patch.

@skonto (Contributor Author) commented Apr 1, 2018

retest this please

@SparkQA commented Apr 1, 2018

Test build #88794 has finished for PR 20945 at commit 1060405.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val jobCreds = conf.getCredentials()
jobCreds.mergeAll(UserGroupInformation.getCurrentUser().getCredentials())
val userCreds = UserGroupInformation.getCurrentUser().getCredentials()
logInfo(s"Adding user credentials: ${SparkHadoopUtil.get.dumpTokens(userCreds)}")
Contributor commented:

Don't use dumpTokens in an info message. It's OK to add it at debug level if you want. But I'm not really sure why you're adding this in this PR.

@skonto (Contributor Author) commented Apr 3, 2018:

OK, I saw it being used here, so I thought it would be helpful at the info level. The reason I added it there is that I would like to see which credentials the HadoopRDD uses. Credentials are added in several different places in the code base, and understanding what is happening can be confusing when looking at a job's logs. It is not clear to people that HadoopRDD fetches tokens on its own.

val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext initialized
SparkHadoopUtil.get.addCredentials(jobConf)
logInfo(s"HadoopRDD credentials: ${SparkHadoopUtil.get.dumpTokens(jobConf.getCredentials)}")
Contributor commented:

Same here.

@skonto (Contributor Author) commented Apr 3, 2018:

Commented above.

}

desc.conf.getOption("spark.mesos.proxyUser").foreach { v =>
options ++= Seq("--proxy-user", v)
Contributor commented:

This looks a little odd. How's a cluster mode app run in Mesos?

Basically what I want to know:

  • which process starts the driver
  • what user that process is running as, and which user will the driver process run as
  • what kerberos credentials does it have and how are they managed

The gist is that running the Spark driver in client mode (which I think is how the driver in cluster mode is eventually started?) with a proxy user is a weird combination. It means the user code running in that driver has access to the credentials of the more privileged user - and could in turn use those to run anything as any other user...

In comparison, YARN + cluster mode + proxy user starts the YARN application as the proxy user. So the user code, which only runs in a YARN container, has no access to the privileged credentials, which only exist in the launcher.

@skonto (Contributor Author) commented Apr 3, 2018:

@vanzin

On DC/OS the Spark DC/OS CLI, which supports kerberos and keytab paths, submits jobs directly to the Mesos REST API on the Mesos dispatcher side. The keytabs are stored in the DC/OS secret store before the job is launched and are mounted into the container before the container is launched.
Thus, the idea here is to store the superuser's keytab in the secret store so that the Spark driver, which is eventually launched in client mode within the cluster, can log in to kerberos and impersonate another user. The driver starts the Spark job's main as the proxy user (as usual) and uses the superuser credentials to impersonate the passed proxy user.
The driver is started by the Mesos dispatcher, and the dispatcher does not have any access to keytabs; it just passes the Spark config options along. The driver can access a secret only if it is allowed to (this is controlled by DC/OS labels).
The OS user used by the container depends on the setup, but it should be the appropriate one.
Right now DC/OS has switched back to root for Spark containers; previously it used nobody, but users can customize the image to add their own users anyway.
You can change the user by passing --conf spark.mesos.driverEnv.SPARK_USER=.
Spark on Mesos uses that value, if defined, when setting up Mesos tasks for the executors, for example.
In containerized environments this adds extra headaches.
As a whole this is not that different from running in client mode, because in client mode I also need to access the superuser's credentials somehow. The whole setup is moved inside a container, and then the environment (DC/OS) should make sure that the same user is used consistently from the submit side all the way into the container, and enforce restrictions. That is the intention here.

The other option, mimicking YARN, would be for spark-submit to upload a locally created DT (to the secret store) in the cluster and for the driver to use that for impersonation. But this is not how things work on DC/OS deployments; as Michael mentioned in the past in https://issues.apache.org/jira/browse/SPARK-16742, you may not even have access to the keytab on the launcher side. YARN has a different approach for that, as you mentioned.
At the end of the day, if impersonation also includes launching the driver container as the proxy user, then this can be supported with this PR by setting the appropriate user, but that user will have access to the superuser's credentials, which is not OK. On the other hand, if impersonation for Mesos starts within Spark at the Hadoop-ecosystem integration level (actually it starts with launching the user's main as that user), then I don't see how this PR differs from Mesos client mode with impersonation enabled.

skonto (Contributor Author) commented:

@susanxhuynh feel free to add more on how DC/OS (mesos) handles the multi-tenancy story in general and user identity management.

Contributor commented:

> The driver will start the SparkJob's main as a proxy user (as usual) and will use the superuser credentials to impersonate the passed proxy user.

That's a big problem, because, as I said, that makes the super user credentials available to untrusted user code. How do you prevent the user's Spark app from using those credentials?

On YARN cluster mode the super user's credentials are never available to the user application. (On client mode they are, but really, if you're using --proxy-user in client mode you're missing the point, or I hope you know what you're doing.)

Contributor commented:

Basically, you have a problem here you need to solve.

You either have to require kerberos creds on the launcher side, so you can upload DTs in cluster mode, or you need some level of separation between the code that launches the driver and the driver itself. The current system you have here is not secure at all - any user can just impersonate any other user, since they have access to the super user's credentials.

@skonto (Contributor Author) commented Apr 3, 2018:

> My problem here is that you're making spark-submit + proxy user + client mode the official way to run Spark on Mesos in cluster mode, and now you're basically exposing everyone to that security issue.

Yes, because the assumption was that client mode was safe. There is no warning about this, especially for end users, and I just started looking into the Hadoop security part.
Anyway, good to know. I will get back with an update; thanks for the comments. Discussing via comments is hard sometimes...

Contributor commented:

> Yes because the assumption was client mode was safe. There is no warning about this

Could probably use something in the documentation - warnings printed to logs are easily ignored. Still, there are legitimate uses for client mode + proxy user, but I don't think this is one of them.

@skonto (Contributor Author) commented Apr 4, 2018:

What are the legitimate uses if it is not safe? Like knowing what your code does, so it's OK, e.g. the spark-shell?

> require the launcher to have a kerberos login, and send DTs to the application. a.k.a. what Spark-on-YARN does.
> in the code that launches the driver on the Mesos side, create the DTs in a safe context (e.g. not as part of the spark-submit invocation) and provide them to the Spark driver using the HADOOP_TOKEN_FILE_LOCATION env var.

Regarding the first option: when I ran the Hive examples with YARN (EMR) in cluster mode without a TGT, it did fail, but no Spark code required any credentials (no Spark code does that; it's Hadoop code). I got:

(Mechanism level: Failed to find any Kerberos tgt)

Coming from this line:

at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:550)
at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:156)

So I am not sure what you mean here, unless you mean exactly that: perform that check and, if it passes, create the DTs at the launcher anyway.

Contributor commented:

> What are legitimate uses if it is not safe? Like knowing what your code does so its ok

Yes. For example you can run some trusted code as a less-privileged user, so that you don't accidentally do something stupid as a super user.

> (Mechanism level: Failed to find any Kerberos tgt)

That means you don't have any credentials (neither a tgt nor a dt). I don't know EMR so I don't know how to use it (with or without kerberos).

@skonto (Contributor Author) commented Apr 4, 2018:

> That means you don't have any credentials (neither a tgt nor a dt). I don't know EMR so I don't know how to use it (with or without kerberos).

Yes, that was the intention: to check where the Spark on YARN code fails when there is no TGT (I removed it with kdestroy). I am using it with kerberos.

@susanxhuynh (Contributor) commented:

@skonto Basic question: in your example above, which user does the "krb5cc_65534" ticket cache belong to? The superuser or the proxy-user ("nobody")?

@skonto (Contributor Author) commented Apr 3, 2018

@susanxhuynh AFAIK the cache holds the ticket for the superuser, since the superuser needs to create a DT from its TGT in order to impersonate nobody. The superuser has the right to impersonate. The ticket cache replaces the need to kinit with the superuser's keytab. I had to rename it because I am running within the container as user nobody anyway (I didn't want to add a superuser to the container just for testing). My superuser is hive, which does not exist on the DC/OS Spark container or on the DC/OS nodes.
For the cache to be picked up, its filename depends on the OS user, not the UGI current user:

[hadoop@ip-10-0-9-161 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_498
Default principal: nobody@LOCAL
Valid starting Expires Service principal
03/04/2018 11:15:37 03/04/2018 21:15:37 krbtgt/LOCAL@LOCAL
renew until 04/04/2018 11:15:37
[hadoop@ip-10-0-9-161 ~]$ id -u hadoop
498

In the above example the hadoop user has a ticket cache whose filename is suffixed with its uid, while the cache itself contains a principal for nobody (it could be anything). As long as the ticket cache has a valid principal for user X and kerberos is used, the Hadoop libraries will see user X as the authenticated one.

If I were to use a TGT for the nobody user, then I would get:

18/04/03 15:52:08 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: nobody@LOCAL is not allowed to impersonate nobody
at

nobody is just an example here; you can use any other user, as long as you have a superuser allowed to impersonate that user.
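
To illustrate the impersonation path being described (this is not code from this PR): the superuser identity is loaded from the mounted ticket cache, a proxy UGI wraps the target user, and the first HDFS RPC made inside doAs is where the NameNode enforces the hadoop.proxyuser.* rules, which is exactly where the "is not allowed to impersonate" error above comes from. The cache path and principal are placeholders taken from the example configuration.

```scala
import java.security.PrivilegedExceptionAction

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

object ImpersonationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    UserGroupInformation.setConfiguration(conf)

    // Superuser identity from the ticket cache mounted into the container (placeholders).
    val superUser = UserGroupInformation.getUGIFromTicketCache(
      "/mnt/mesos/sandbox/krb5cc_65534", "hive@LOCAL")

    // Impersonate "nobody"; the NameNode checks the proxy-user rules on the first RPC.
    val proxy = UserGroupInformation.createProxyUser("nobody", superUser)
    proxy.doAs(new PrivilegedExceptionAction[Unit] {
      override def run(): Unit = {
        // Throws AuthorizationException ("... is not allowed to impersonate nobody")
        // if the real user has no impersonation privileges.
        FileSystem.get(conf).getFileStatus(new Path("/user/nobody"))
      }
    })
  }
}
```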

@skonto (Contributor Author) commented Apr 4, 2018

@vanzin @susanxhuynh

  • require the launcher to have a kerberos login, and send DTs to the application. a.k.a. what Spark-on-YARN does.
  • in the code that launches the driver on the Mesos side, create the DTs in a safe context (e.g. not as part of the spark-submit invocation) and provide them to the Spark driver using the HADOOP_TOKEN_FILE_LOCATION env var.

Option 1 above: spark-submit (the launcher) could create the DTs locally as a utility function. The user could then upload them to the secret store and run spark-submit as usual, referencing the DTs secret.
Due to https://issues.apache.org/jira/browse/SPARK-20982 we cannot pass any info from the launcher side to the Mesos dispatcher directly. We also cannot write code in spark-submit to integrate with a secret store, because that depends on the environment: DC/OS has its own implementation and libs/API to integrate with it, and native Mesos does not specify a secret store; you need to bring your own: http://mesos.apache.org/documentation/latest/secrets/

Option 2 above means that the dispatcher should create the DTs and pass them to the driver's container as secrets. That means it should be able to fetch the superuser's TGT from the secret store, create the delegation tokens in an isolated manner (other drivers could be launched in parallel), and store them back in the secret store so that the driver to be launched can use them. Again, this would require the Mesos dispatcher to be integrated with DC/OS APIs; for example, to access the secret store you need to pass an auth token and call a specific API: https://docs.mesosphere.com/1.11.0/administration/secrets/secrets-api/.

Option 3: the second spark-submit (client mode, inside the container), before it runs the user's main, could create the DTs, save them to the local filesystem, point to them with HADOOP_TOKEN_FILE_LOCATION, and then remove the TGT (/tmp/krb5cc_uid), like kdestroy does, so user code cannot use it to impersonate anyone (a sketch of this follows below). Could the user code fetch the TGT secret again from the secret store? It could if it has access to the /spark service's secrets: https://docs.mesosphere.com/services/spark/2.3.0-2.2.1-2/limitations/. @susanxhuynh would it be possible to constrain this, or can all OS users within a driver's container access all secrets given an auth token?

Option 4: fix SPARK-20982 and pass the DTs to the dispatcher in binary format, then store them in the secret store. The driver can then pick them up at launch time.

Thoughts? I am inclined to do 3 here if it is safe (minimal work). 1 is better, but the UX is ruined. 3 and 4 would bring unwanted dependencies, unless we fix this only at the DC/OS level. I checked but didn't see a Mesos HTTP API for the secret store.
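
A rough sketch of what option 3 could look like inside the driver container, before the user's main runs. This is only an illustration under assumed paths, not part of this PR: the tokens are written in the Hadoop token-storage format that UserGroupInformation reads from HADOOP_TOKEN_FILE_LOCATION at login time, and the TGT cache file is then removed (the kdestroy step).

```scala
import java.io.File

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.Credentials

object TokenFileSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()

    // 1. Obtain delegation tokens while the superuser TGT is still usable.
    val creds = new Credentials()
    FileSystem.get(conf).addDelegationTokens("placeholder-renewer", creds)

    // 2. Persist them in the token-storage format that UGI understands. The driver JVM
    //    would then be launched with HADOOP_TOKEN_FILE_LOCATION pointing at this file
    //    (the sandbox path is an assumption).
    creds.writeTokenStorageFile(new Path("file:///mnt/mesos/sandbox/hadoop.tokens"), conf)

    // 3. Remove the TGT so user code cannot use it to impersonate anyone (like kdestroy).
    new File("/tmp/krb5cc_65534").delete()   // assumed ticket-cache path
  }
}
```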

@susanxhuynh (Contributor) commented:

(1) seems the most secure. How do we handle keytabs today in cluster mode in pure Mesos? Is it the same situation -- the keytab gets sent over an HTTP connection to the dispatcher?

(3) Yes, the TGT secret would still be available from the secret store. There's currently no constraint based on an OS user.

@skonto (Contributor Author) commented Apr 4, 2018

@susanxhuynh @vanzin

(1) I agree we have a problem here, and yes, it is more secure not to have the TGT anywhere near the user's code.

(3) The proxy-user doAs in spark-submit goes through the Java security machinery and, at the end of the day, calls java.security.AccessController.doPrivileged.

I think (a wild guess) we could restrict access to both the /tmp/... file holding the TGT and the URL pointing to the secret store; a sketch of the file-blocking part follows at the end of this comment.
https://www.techrepublic.com/article/java-security-policies-and-permission-management/
https://stackoverflow.com/questions/38974784/can-somebody-explain-what-is-the-role-of-urlpermission-class-in-java-1-8-in-clie
https://docs.oracle.com/javase/7/docs/api/java/io/FilePermission.html

Of course there is always JNI and native code which could bypass this, I guess, or maybe Runtime.exec(), or not? Can the security manager sandbox such cases? It seems yes for the latter:
https://stackoverflow.com/questions/29457939/java-block-runtime-exec

PS. Right now this can only be solved easily on the DC/OS side, I guess.
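
To make the security-manager idea concrete, here is a minimal and admittedly incomplete sketch that blocks reads of the TGT cache file from user code. The path is a placeholder, everything else is deliberately left permissive, and, as noted above, JNI or native code could still bypass it; this is not part of the PR.

```scala
import java.security.Permission

// Illustrative only: allow everything except reading the TGT ticket cache.
class TgtBlockingSecurityManager(tgtPath: String) extends SecurityManager {
  // Deliberately permissive: do not restrict anything else.
  override def checkPermission(perm: Permission): Unit = ()
  override def checkPermission(perm: Permission, context: Object): Unit = ()

  override def checkRead(file: String): Unit =
    if (file == tgtPath) {
      throw new SecurityException(s"Read access to $file is blocked for user code")
    }

  override def checkRead(file: String, context: Object): Unit = checkRead(file)
}

object InstallTgtGuard {
  def install(): Unit =
    // Assumed ticket-cache location; would run before handing control to user code.
    System.setSecurityManager(new TgtBlockingSecurityManager("/tmp/krb5cc_65534"))
}
```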

@vanzin (Contributor) commented Apr 4, 2018

SPARK-20982 doesn't look particularly hard to fix.

I don't understand the differences between plain Mesos and DC/OS so a lot of the things you're saying are over my head. I'm just concerned with the code that is present here in the Spark repo doing the right thing w.r.t. security, assuming whatever service it's talking to is secure.

@skonto (Contributor Author) commented Apr 5, 2018

@vanzin Sure, we will try to comply. The thing is that pure Mesos does not have an API for secrets; only DC/OS has one, and we cannot bring that API into the Spark project. Otherwise I would just implement option 1 as with YARN, and everyone would be happy and secure ;)
I will check SPARK-20982 as well. The issue there is that the dispatcher would need access to the secret store via some API, which again does not exist in pure Mesos, so any such solution would need to use DC/OS libs or an API on the dispatcher side.

@skonto (Contributor Author) commented Apr 6, 2018

@susanxhuynh @vanzin
From what I can see, all the secret stores I searched for provide an HTTP API:
https://github.com/kubernetes/kubernetes/blob/09f321c80bfc9bca63a5530b56d7a1a3ba80ba9b/pkg/kubectl/cmd/util/factory_client_access.go#L473
https://www.vaultproject.io/api/index.html
https://docs.openshift.org/latest/rest_api/api/v1.Secret.html
https://v1-9.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#secret-v1-core
https://docs.mesosphere.com/1.11/security/ent/secrets/secrets-api/

So generating the DTs at the first spark-submit and then using an HTTP API should be good enough, although environments like k8s or DC/OS usually have a CLI utility to do the job. That means only a few configuration options need to be passed, like the API URI and some token for authentication (I assume), so there are no real dependencies. This would require spark-submit to be able to access the secret store's API (which depends on the environment); a rough sketch of such a call follows below.
The alternative implementation would be (assuming SPARK-20982 is fixed) to pass the DTs to the dispatcher (which is always guaranteed to be accessible somehow) and have the dispatcher access the secret store API so that it can store the DTs for the driver it is going to launch. The assumption here is that on pure Mesos the dispatcher can access the secret store (not sure).
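
As an illustration of the kind of call being discussed, a hypothetical sketch that serializes delegation tokens and PUTs them to a generic secret-store HTTP endpoint. The endpoint URL, the Authorization header format, and the JSON payload shape are all assumptions; every real store (DC/OS, Vault, k8s) has its own API.

```scala
import java.net.{HttpURLConnection, URL}
import java.util.Base64

import org.apache.hadoop.io.DataOutputBuffer
import org.apache.hadoop.security.Credentials

object SecretStoreUploadSketch {
  /** PUT the serialized delegation tokens to a hypothetical secret-store endpoint. */
  def uploadTokens(creds: Credentials, endpoint: String, authToken: String): Int = {
    // Serialize the tokens using Hadoop's token-storage stream format.
    val buf = new DataOutputBuffer()
    creds.writeTokenStorageToStream(buf)
    val payload = Base64.getEncoder.encodeToString(buf.getData.take(buf.getLength))

    // Assumed REST shape: PUT {"value": "<base64 tokens>"} with a token auth header.
    val conn = new URL(endpoint).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("PUT")
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setRequestProperty("Authorization", s"token=$authToken")
    conn.setDoOutput(true)
    conn.getOutputStream.write(s"""{"value":"$payload"}""".getBytes("UTF-8"))
    conn.getOutputStream.close()
    conn.getResponseCode
  }
}
```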

@skonto (Contributor Author) commented Apr 8, 2018

@susanxhuynh Unfortunately I cannot unify the APIs even for DC/OS: 1.10.x is different from 1.11.x (https://docs.mesosphere.com/services/spark/2.3.0-2.2.1-2/security/) and code depends on this (I played a bit with the DC/OS secret store API), not to mention the other APIs out there. This would require a generic secrets API at the pure Mesos level (like in k8s), so I don't see a viable solution for now, unless I manage to restrict access to the TGT in client mode and essentially make it safe.

@skonto (Contributor Author) commented May 17, 2018

@vanzin here is the fix that works for DC/OS: d2iq-archive#26. It implements YARN's approach.
Unfortunately I cannot bring it back here due to dependencies on the secret store. I will try to fix SPARK-20982. So feel free to close this.

@vanzin (Contributor) commented May 17, 2018

I actually can't close this, only you can.

If the DC/OS libraries are open source and something people can pull in by changing mesos.version or some other build-time parameter, you could potentially use reflection; we've done that many times for YARN.

But otherwise it'd be a little awkward to add the code to Spark.

@skonto (Contributor Author) commented May 19, 2018

@vanzin Correct, I will close it. The dependency is on a specific secret store API, so it's mostly HTTP calls which are DC/OS specific...

@skonto closed this May 19, 2018