2 changes: 1 addition & 1 deletion R/pkg/DESCRIPTION
@@ -57,6 +57,6 @@ Collate:
'types.R'
'utils.R'
'window.R'
-RoxygenNote: 5.0.1
+RoxygenNote: 6.0.1
Member

pls revert this

Contributor Author

@vanzin Mar 6, 2018


Argh, I thought I had removed this one. (It's really annoying when the build changes your working file set.)

VignetteBuilder: knitr
NeedsCompilation: no
144 changes: 3 additions & 141 deletions core/src/main/scala/org/apache/spark/SecurityManager.scala
@@ -42,148 +42,10 @@ import org.apache.spark.util.Utils
* should access it from that. There are some cases where the SparkEnv hasn't been
* initialized yet and this class must be instantiated directly.
*
* Spark currently supports authentication via a shared secret.
* Authentication can be configured to be on via the 'spark.authenticate' configuration
* parameter. This parameter controls whether the Spark communication protocols do
* authentication using the shared secret. This authentication is a basic handshake to
* make sure both sides have the same shared secret and are allowed to communicate.
* If the shared secret is not identical they will not be allowed to communicate.
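As a minimal sketch of the handshake configuration described above (the secret value below is only a placeholder; outside of YARN it must be agreed upon out of band):

    import org.apache.spark.SparkConf

    // Enable shared-secret authentication for Spark's communication protocols.
    val conf = new SparkConf()
      .set("spark.authenticate", "true")
      .set("spark.authenticate.secret", "some-shared-secret")  // placeholder value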
*
* The Spark UI can also be secured by using javax servlet filters. A user may want to
* secure the UI if it has data that other users should not be allowed to see. The javax
* servlet filter specified by the user can authenticate the user and then once the user
* is logged in, Spark can compare that user versus the view acls to make sure they are
* authorized to view the UI. The configs 'spark.acls.enable', 'spark.ui.view.acls' and
* 'spark.ui.view.acls.groups' control the behavior of the acls. Note that the person who
* started the application always has view access to the UI.
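A minimal sketch of the view acls named above; the user and group names are placeholders:

    import org.apache.spark.SparkConf

    // Only the listed users/groups (plus the application submitter) may view the UI.
    val conf = new SparkConf()
      .set("spark.acls.enable", "true")
      .set("spark.ui.view.acls", "alice,bob")          // placeholder users
      .set("spark.ui.view.acls.groups", "data-eng")    // placeholder group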
*
* Spark has a set of individual and group modify acls (`spark.modify.acls`) and
* (`spark.modify.acls.groups`) that controls which users and groups have permission to
* modify a single application. This would include things like killing the application.
* By default the person who started the application has modify access. For modify access
* through the UI, you must have a filter that does authentication in place for the modify
* acls to work properly.
*
* Spark also has a set of individual and group admin acls (`spark.admin.acls`) and
* (`spark.admin.acls.groups`) which is a set of users/administrators and admin groups
* who always have permission to view or modify the Spark application.
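Continuing the same sketch for the modify and admin acls just described (names are again placeholders):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.modify.acls", "alice")               // may e.g. kill the application
      .set("spark.modify.acls.groups", "oncall")
      .set("spark.admin.acls", "spark-admin")          // always has view + modify access
      .set("spark.admin.acls.groups", "cluster-admins")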
*
* Starting from version 1.3, Spark has partial support for encrypted connections with SSL.
*
* At this point Spark has multiple communication protocols that need to be secured and
* different underlying mechanisms are used depending on the protocol:
*
* - HTTP for broadcast and file server (via HttpServer) -> Spark currently uses Jetty
* for the HttpServer. Jetty supports multiple authentication mechanisms -
* Basic, Digest, Form, Spnego, etc. It also supports multiple different login
* services - Hash, JAAS, Spnego, JDBC, etc. Spark currently uses the HashLoginService
* to authenticate using DIGEST-MD5 via a single user and the shared secret.
* Since we are using DIGEST-MD5, the shared secret is not passed on the wire
* in plaintext.
*
* We currently support SSL (https) for this communication protocol (see the details
* below).
*
* The Spark HttpServer installs the HashLoginService and configures it to use DIGEST-MD5.
* Any clients must specify the user and password. There is a default
* Authenticator installed in the SecurityManager that controls how the authentication
* is done, and in this case it gets the user name and password from the request.
*
* - BlockTransferService -> The Spark BlockTransferService uses Java NIO to asynchronously
* exchange messages. For this we use the Java SASL
* (Simple Authentication and Security Layer) API and again use DIGEST-MD5
* as the authentication mechanism. This means the shared secret is not passed
* over the wire in plaintext.
* Note that SASL is pluggable as to what mechanism it uses. We currently use
* DIGEST-MD5 but this could be changed to use Kerberos or other in the future.
* Spark currently supports "auth" for the quality of protection, which means
* the connection does not support integrity or privacy protection (encryption)
* after authentication. SASL also supports "auth-int" and "auth-conf" which
* Spark could support in the future to allow the user to specify the quality
* of protection they want. If we support those, the messages will also have to
* be wrapped and unwrapped via the SaslServer/SaslClient.wrap/unwrap APIs.
*
* Since the NioBlockTransferService does asynchronous message passing, the SASL
* authentication is a bit more complex. A ConnectionManager can be both a client
* and a Server, so for a particular connection it has to determine what to do.
* A ConnectionId was added to be able to track connections and is used to
* match up incoming messages with connections waiting for authentication.
* The ConnectionManager tracks all the sendingConnections using the ConnectionId,
* waits for the response from the server, and does the handshake before sending
* the real message.
*
* The NettyBlockTransferService ensures that SASL authentication is performed
* synchronously prior to any other communication on a connection. This is done in
* SaslClientBootstrap on the client side and SaslRpcHandler on the server side.
*
* - HTTP for the Spark UI -> the UI was changed to use servlets so that javax servlet filters
* can be used. YARN requires a specific AmIpFilter to be installed for security to work
* properly. For non-Yarn deployments, users can write a filter to go through their
* organization's normal login service. If an authentication filter is in place then the
* SparkUI can be configured to check the logged-in user against the list of users who
* have view acls to see if that user is authorized (an illustrative filter sketch follows
* this list).
* The filters can also be used for many different purposes. For instance filters
* could be used for logging, encryption, or compression.
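As an illustration of the last item, a servlet filter along the following lines could authenticate UI users; the X-Remote-User header and the filter class are purely hypothetical, and registering the filter (for instance through a spark.ui.filters-style setting) is assumed rather than shown:

    import javax.servlet.{Filter, FilterChain, FilterConfig, ServletRequest, ServletResponse}
    import javax.servlet.http.{HttpServletRequest, HttpServletRequestWrapper}

    // Hypothetical filter that trusts an authenticating reverse proxy and exposes the
    // user name via getRemoteUser, which Spark can then compare against the view acls.
    class HeaderAuthFilter extends Filter {
      override def init(config: FilterConfig): Unit = ()
      override def destroy(): Unit = ()
      override def doFilter(req: ServletRequest, res: ServletResponse, chain: FilterChain): Unit = {
        val http = req.asInstanceOf[HttpServletRequest]
        val user = http.getHeader("X-Remote-User")     // assumed proxy-set header
        val wrapped = new HttpServletRequestWrapper(http) {
          override def getRemoteUser(): String = user
        }
        chain.doFilter(wrapped, res)
      }
    }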
*
* The exact mechanisms used to generate/distribute the shared secret are deployment-specific.
*
* For YARN deployments, the secret is automatically generated. The secret is placed in the Hadoop
* UGI which gets passed around via the Hadoop RPC mechanism. Hadoop RPC can be configured to
* support different levels of protection. See the Hadoop documentation for more details. Each
* Spark application on YARN gets a different shared secret.
*
* On YARN, the Spark UI gets configured to use the Hadoop YARN AmIpFilter which requires the user
* to go through the ResourceManager Proxy. That proxy is there to reduce the possibility of web
* based attacks through YARN. Hadoop can be configured to use filters to do authentication. That
* authentication then happens via the ResourceManager Proxy and Spark will use that to do
* authorization against the view acls.
*
* For other Spark deployments, the shared secret must be specified via the
* spark.authenticate.secret config.
* All the nodes (Master and Workers) and the applications need to have the same shared secret.
* This again is not ideal as one user could potentially affect another user's application.
* This should be enhanced in the future to provide better protection.
* If the UI needs to be secure, the user needs to install a javax servlet filter to do the
* authentication. Spark will then use that user to compare against the view acls to do
* authorization. If no filter is in place, the user is generally null and no authorization
* can take place.
*
* When authentication is being used, encryption can also be enabled by setting the option
* spark.authenticate.enableSaslEncryption to true. This is only supported by communication
* channels that use the network-common library, and can be used as an alternative to SSL in those
* cases.
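A one-line illustration of the option just mentioned, continuing the earlier SparkConf sketch:

    import org.apache.spark.SparkConf

    // Encrypt channels that use the network-common library once SASL auth succeeds.
    val conf = new SparkConf()
      .set("spark.authenticate", "true")
      .set("spark.authenticate.enableSaslEncryption", "true")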
*
* SSL can be used for encryption for certain communication channels. The user can configure the
* default SSL settings which will be used for all the supported communication protocols unless
* they are overwritten by protocol specific settings. This way the user can easily provide the
* common settings for all the protocols without disabling the ability to configure each one
* individually.
*
* All the SSL settings like `spark.ssl.xxx` where `xxx` is a particular configuration property,
* denote the global configuration for all the supported protocols. In order to override the global
* configuration for the particular protocol, the properties must be overwritten in the
* protocol-specific namespace. Use `spark.ssl.yyy.xxx` settings to overwrite the global
* configuration for the particular protocol denoted by `yyy`. Currently `yyy` can only be `fs` for
* broadcast and file server.
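A sketch of that layering; the concrete global keys shown (enabled, keyStore, keyStorePassword) are assumptions based on typical SSLOptions properties, while the `fs` override namespace comes from the paragraph above:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // Global SSL defaults shared by all supported protocols (assumed key names).
      .set("spark.ssl.enabled", "true")
      .set("spark.ssl.keyStore", "/path/to/keystore.jks")
      .set("spark.ssl.keyStorePassword", "********")
      // Override a single setting for the broadcast/file server protocol only.
      .set("spark.ssl.fs.keyStore", "/path/to/fs-keystore.jks")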
*
* Refer to [[org.apache.spark.SSLOptions]] documentation for the list of
* options that can be specified.
*
* SecurityManager initializes SSLOptions objects for different protocols separately. SSLOptions
* object parses Spark configuration at a given namespace and builds the common representation
* of SSL settings. SSLOptions is then used to provide protocol-specific SSLContextFactory for
* Jetty.
*
* SSL must be configured on each node and configured for each component involved in
* communication using the particular protocol. In YARN clusters, the key-store can be prepared on
* the client side then distributed and used by the executors as the part of the application
* (YARN allows the user to deploy files before the application is started).
* In standalone deployment, the user needs to provide key-stores and configuration
* options for master and workers. In this mode, the user may allow the executors to use the SSL
* settings inherited from the worker which spawned that executor. It can be accomplished by
* setting `spark.ssl.useNodeLocalConf` to `true`.
* This class implements all of the configuration related to security features described
* in the "Security" document. Please refer to that document for specific features implemented
* here.
*/

private[spark] class SecurityManager(
sparkConf: SparkConf,
val ioEncryptionKey: Option[Array[Byte]] = None)