@@ -42,148 +42,10 @@ import org.apache.spark.util.Utils
4242 * should access it from that. There are some cases where the SparkEnv hasn't been
4343 * initialized yet and this class must be instantiated directly.
4444 *
45- * Spark currently supports authentication via a shared secret.
46- * Authentication can be turned on via the 'spark.authenticate' configuration
47- * parameter. This parameter controls whether the Spark communication protocols do
48- * authentication using the shared secret. This authentication is a basic handshake to
49- * make sure both sides have the same shared secret and are allowed to communicate.
50- * If the shared secret is not identical they will not be allowed to communicate.
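 *
 * For illustration, a minimal sketch of turning this on from application code (the app name
 * and secret value are placeholders; outside of YARN the same secret must be configured on
 * every node, as described further below):
 *
 * {{{
 * import org.apache.spark.{SparkConf, SparkContext}
 *
 * val conf = new SparkConf()
 *   .setAppName("secured-app")                       // placeholder application name
 *   .set("spark.authenticate", "true")               // require the shared-secret handshake
 *   .set("spark.authenticate.secret", "change-me")   // placeholder secret
 * val sc = new SparkContext(conf)
 * }}}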
51- *
52- * The Spark UI can also be secured by using javax servlet filters. A user may want to
53- * secure the UI if it has data that other users should not be allowed to see. The javax
54- * servlet filter specified by the user can authenticate the user, and once the user
55- * is logged in, Spark can compare that user against the view acls to make sure they are
56- * authorized to view the UI. The configs 'spark.acls.enable', 'spark.ui.view.acls' and
57- * 'spark.ui.view.acls.groups' control the behavior of the acls. Note that the person who
58- * started the application always has view access to the UI.
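 *
 * A sketch of the view acl settings, continuing the `conf` from the sketch above (the user
 * and group names are placeholders; the values are comma-separated lists):
 *
 * {{{
 * conf.set("spark.acls.enable", "true")
 *     .set("spark.ui.view.acls", "alice,bob")         // extra users allowed to view the UI
 *     .set("spark.ui.view.acls.groups", "analytics")  // groups allowed to view the UI
 * }}}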
59- *
60- * Spark has a set of individual and group modify acls (`spark.modify.acls` and
61- * `spark.modify.acls.groups`) that control which users and groups have permission to
62- * modify a single application. This would include things like killing the application.
63- * By default the person who started the application has modify access. For modify access
64- * through the UI, you must have a filter that does authentication in place for the modify
65- * acls to work properly.
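 *
 * Continuing the same sketch, modify access could be granted like this (placeholder names;
 * as noted above, an authentication filter must be in place for this to be enforced in the UI):
 *
 * {{{
 * conf.set("spark.modify.acls", "carol")              // users allowed to e.g. kill the app
 *     .set("spark.modify.acls.groups", "operations")  // groups with the same permission
 * }}}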
66- *
67- * Spark also has a set of individual and group admin acls (`spark.admin.acls` and
68- * `spark.admin.acls.groups`) listing users/administrators and admin groups
69- * who always have permission to view or modify the Spark application.
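 *
 * And, in the same sketch, administrators who always get both view and modify access
 * (placeholder names):
 *
 * {{{
 * conf.set("spark.admin.acls", "admin1,admin2")
 *     .set("spark.admin.acls.groups", "sre")
 * }}}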
70- *
71- * Starting from version 1.3, Spark has partial support for encrypted connections with SSL.
72- *
73- * At this point Spark has multiple communication protocols that need to be secured, and
74- * different underlying mechanisms are used depending on the protocol:
75- *
76- * - HTTP for broadcast and file server (via HttpServer) -> Spark currently uses Jetty
77- * for the HttpServer. Jetty supports multiple authentication mechanisms -
78- * Basic, Digest, Form, Spnego, etc. It also supports multiple different login
79- * services - Hash, JAAS, Spnego, JDBC, etc. Spark currently uses the HashLoginService
80- * to authenticate using DIGEST-MD5 via a single user and the shared secret.
81- * Since we are using DIGEST-MD5, the shared secret is not passed on the wire
82- * in plaintext.
83- *
84- * We currently support SSL (https) for this communication protocol (see the details
85- * below).
86- *
87- * The Spark HttpServer installs the HashLoginService and configures it to use DIGEST-MD5.
88- * Any clients must specify the user and password. A default
89- * Authenticator installed in the SecurityManager controls how the authentication is done;
90- * in this case it gets the user name and password from the request.
91- *
92- * - BlockTransferService -> The Spark BlockTransferService uses Java NIO to asynchronously
93- * exchange messages. For this we use the Java SASL
94- * (Simple Authentication and Security Layer) API with DIGEST-MD5
95- * as the authentication mechanism. This means the shared secret is not passed
96- * over the wire in plaintext (a minimal client-side sketch appears after this list).
97- * Note that SASL is pluggable as to what mechanism it uses. We currently use
98- * DIGEST-MD5 but this could be changed to use Kerberos or other in the future.
99- * Spark currently supports "auth" for the quality of protection, which means
100- * the connection does not support integrity or privacy protection (encryption)
101- * after authentication. SASL also supports "auth-int" and "auth-conf" which
102- * Spark could support in the future to allow the user to specify the quality
103- * of protection they want. If we support those, the messages will also have to
104- * be wrapped and unwrapped via the SaslServer/SaslClient.wrap/unwrap APIs.
105- *
106- * Since the NioBlockTransferService does asynchronous message passing, the SASL
107- * authentication is a bit more complex. A ConnectionManager can be both a client
108- * and a Server, so for a particular connection it has to determine what to do.
109- * A ConnectionId was added to be able to track connections and is used to
110- * match up incoming messages with connections waiting for authentication.
111- * The ConnectionManager tracks all the sendingConnections using the ConnectionId,
112- * waits for the response from the server, and does the handshake before sending
113- * the real message.
114- *
115- * The NettyBlockTransferService ensures that SASL authentication is performed
116- * synchronously prior to any other communication on a connection. This is done in
117- * SaslClientBootstrap on the client side and SaslRpcHandler on the server side.
118- *
119- * - HTTP for the Spark UI -> the UI was changed to use servlets so that javax servlet filters
120- * can be used. YARN requires a specific AmIpFilter to be installed for security to work
121- * properly. For non-YARN deployments, users can write a filter to go through their
122- * organization's normal login service. If an authentication filter is in place then the
123- * SparkUI can be configured to check the logged-in user against the list of users who
124- * have view acls to see if that user is authorized (see the filter sketch after this list).
125- * The filters can also be used for many different purposes. For instance filters
126- * could be used for logging, encryption, or compression.
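 *
 * As a rough illustration of such an authentication filter (the class name, the header name,
 * and the assumption that a trusted login service or proxy sets that header are hypothetical):
 *
 * {{{
 * import javax.servlet._
 * import javax.servlet.http.{HttpServletRequest, HttpServletRequestWrapper}
 *
 * // Hypothetical filter: trusts an upstream login service / reverse proxy to place the
 * // authenticated user in a request header, and exposes it via getRemoteUser so the UI
 * // can check it against the view acls.
 * class ProxyAuthFilter extends Filter {
 *   override def init(config: FilterConfig): Unit = {}
 *   override def destroy(): Unit = {}
 *
 *   override def doFilter(req: ServletRequest, res: ServletResponse, chain: FilterChain): Unit = {
 *     val http = req.asInstanceOf[HttpServletRequest]
 *     val user = http.getHeader("X-Authenticated-User")   // header name is an assumption
 *     val wrapped = new HttpServletRequestWrapper(http) {
 *       override def getRemoteUser(): String = user
 *     }
 *     chain.doFilter(wrapped, res)
 *   }
 * }
 * }}}
 *
 * Such a filter would typically be registered with the UI via the `spark.ui.filters`
 * configuration.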
127- *
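 * To make the client side of the SASL handshake mentioned above concrete, here is a rough
 * sketch using the plain javax.security.sasl API (the user name, secret, and server name are
 * placeholder values; Spark's actual implementation lives in the network-common module):
 *
 * {{{
 * import javax.security.auth.callback.{Callback, CallbackHandler, NameCallback, PasswordCallback}
 * import javax.security.sasl.{RealmCallback, Sasl}
 *
 * val user = "sparkSaslUser"                // placeholder identity
 * val secret = "change-me".toCharArray      // the shared secret
 *
 * val handler = new CallbackHandler {
 *   override def handle(callbacks: Array[Callback]): Unit = callbacks.foreach {
 *     case nc: NameCallback     => nc.setName(user)
 *     case pc: PasswordCallback => pc.setPassword(secret)
 *     case rc: RealmCallback    => rc.setText(rc.getDefaultText)
 *     case _                    => // ignore anything else
 *   }
 * }
 *
 * // "auth" = authentication only; "auth-int"/"auth-conf" would add integrity/encryption.
 * val props = new java.util.HashMap[String, String]()
 * props.put(Sasl.QOP, "auth")
 *
 * val client = Sasl.createSaslClient(
 *   Array("DIGEST-MD5"), null, null, "default", props, handler)
 * // The handshake then exchanges tokens via client.evaluateChallenge(...) until
 * // client.isComplete() returns true.
 * }}}
 *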
128- * The exact mechanisms used to generate/distribute the shared secret are deployment-specific.
129- *
130- * For YARN deployments, the secret is automatically generated. The secret is placed in the Hadoop
131- * UGI which gets passed around via the Hadoop RPC mechanism. Hadoop RPC can be configured to
132- * support different levels of protection. See the Hadoop documentation for more details. Each
133- * Spark application on YARN gets a different shared secret.
134- *
135- * On YARN, the Spark UI gets configured to use the Hadoop YARN AmIpFilter which requires the user
136- * to go through the ResourceManager Proxy. That proxy is there to reduce the possibility of web
137- * based attacks through YARN. Hadoop can be configured to use filters to do authentication. That
138- * authentication then happens via the ResourceManager Proxy and Spark will use that to do
139- * authorization against the view acls.
140- *
141- * For other Spark deployments, the shared secret must be specified via the
142- * spark.authenticate.secret config.
143- * All the nodes (Master and Workers) and the applications need to have the same shared secret.
144- * This again is not ideal, as one user could potentially affect another user's application.
145- * This should be enhanced in the future to provide better protection.
146- * If the UI needs to be secure, the user needs to install a javax servlet filter to do the
147- * authentication. Spark will then use that user to compare against the view acls to do
148- * authorization. If no filter is in place, the user is generally null and no authorization
149- * can take place.
150- *
151- * When authentication is being used, encryption can also be enabled by setting the option
152- * spark.authenticate.enableSaslEncryption to true. This is only supported by communication
153- * channels that use the network-common library, and can be used as an alternative to SSL in those
154- * cases.
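 *
 * For example, building on the SparkConf sketch above:
 *
 * {{{
 * conf.set("spark.authenticate", "true")
 *     .set("spark.authenticate.enableSaslEncryption", "true")  // encrypt SASL-authenticated channels
 * }}}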
155- *
156- * SSL can be used for encryption for certain communication channels. The user can configure the
157- * default SSL settings which will be used for all the supported communication protocols unless
158- * they are overwritten by protocol-specific settings. This way the user can easily provide the
159- * common settings for all the protocols without disabling the ability to configure each one
160- * individually.
161- *
162- * All the SSL settings like `spark.ssl.xxx` where `xxx` is a particular configuration property,
163- * denote the global configuration for all the supported protocols. In order to override the global
164- * configuration for the particular protocol, the properties must be overwritten in the
165- * protocol-specific namespace. Use `spark.ssl.yyy.xxx` settings to overwrite the global
166- * configuration for the particular protocol denoted by `yyy`. Currently `yyy` can only be `fs`, for
167- * broadcast and file server.
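 *
 * For illustration (paths and passwords are placeholders, and the key-store property names are
 * just the common ones; see the SSLOptions documentation referenced below for the full list):
 *
 * {{{
 * conf.set("spark.ssl.enabled", "true")                          // global default for all protocols
 *     .set("spark.ssl.keyStore", "/path/to/keystore.jks")
 *     .set("spark.ssl.keyStorePassword", "********")
 *     .set("spark.ssl.fs.keyStore", "/path/to/fs-keystore.jks")  // override only for `fs`
 * }}}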
168- *
169- * Refer to the [[org.apache.spark.SSLOptions]] documentation for the list of
170- * options that can be specified.
171- *
172- * SecurityManager initializes SSLOptions objects for different protocols separately. An SSLOptions
173- * object parses the Spark configuration at a given namespace and builds the common representation
174- * of SSL settings. SSLOptions is then used to provide a protocol-specific SSLContextFactory for
175- * Jetty.
176- *
177- * SSL must be configured on each node and configured for each component involved in
178- * communication using the particular protocol. In YARN clusters, the key-store can be prepared on
179- * the client side, then distributed and used by the executors as part of the application
180- * (YARN allows the user to deploy files before the application is started).
181- * In a standalone deployment, the user needs to provide key-stores and configuration
182- * options for the master and workers. In this mode, the user may allow the executors to use the SSL
183- * settings inherited from the worker which spawned that executor. This can be accomplished by
184- * setting `spark.ssl.useNodeLocalConf` to `true`.
45+ * This class implements all of the configuration related to security features described
46+ * in the "Security" document. Please refer to that document for specific features implemented
47+ * here.
18548 */
186-
18749 private[spark] class SecurityManager(
18850     sparkConf: SparkConf,
18951     val ioEncryptionKey: Option[Array[Byte]] = None)