Merged
Commits (52)
82fd38d
[SPARK-5200] Disable web UI in Hive ThriftServer tests
JoshRosen Jan 12, 2015
ef9224e
[SPARK-5102][Core]subclass of MapStatus needs to be registered with Kryo
lianhuiwang Jan 12, 2015
13e610b
SPARK-4159 [BUILD] Addendum: improve running of single test after ena…
srowen Jan 12, 2015
a3978f3
[SPARK-5078] Optionally read from SPARK_LOCAL_HOSTNAME
marmbrus Jan 12, 2015
aff49a3
SPARK-5172 [BUILD] spark-examples-***.jar shades a wrong Hadoop distr…
srowen Jan 12, 2015
3aed305
[SPARK-4999][Streaming] Change storeInBlockManager to false by default
jerryshao Jan 12, 2015
5d9fa55
[SPARK-5049][SQL] Fix ordering of partition columns in ParquetTableScan
marmbrus Jan 12, 2015
1e42e96
[SPARK-5138][SQL] Ensure schema can be inferred from a namedtuple
mulby Jan 13, 2015
f7741a9
[SPARK-5006][Deploy]spark.port.maxRetries doesn't work
WangTaoTheTonic Jan 13, 2015
9dea64e
[SPARK-4697][YARN]System properties should override environment varia…
WangTaoTheTonic Jan 13, 2015
39e333e
[SPARK-5131][Streaming][DOC]: There is a discrepancy in WAL implement…
uncleGen Jan 13, 2015
8ead999
[SPARK-5223] [MLlib] [PySpark] fix MapConverter and ListConverter in …
Jan 13, 2015
6463e0b
[SPARK-4912][SQL] Persistent tables for the Spark SQL data sources api
yhuai Jan 13, 2015
14e3f11
[SPARK-5168] Make SQLConf a field rather than mixin in SQLContext
rxin Jan 13, 2015
f996909
[SPARK-5123][SQL] Reconcile Java/Scala API for data types.
rxin Jan 14, 2015
d5eeb35
[SPARK-5167][SQL] Move Row into sql package and make it usable for Java.
rxin Jan 14, 2015
a3f7421
[SPARK-5248] [SQL] move sql.types.decimal.Decimal to sql.types.Decimal
adrian-wang Jan 14, 2015
81f72a0
[SPARK-5211][SQL]Restore HiveMetastoreTypes.toDataType
yhuai Jan 14, 2015
38bdc99
[SQL] some comments fix for GROUPING SETS
adrian-wang Jan 14, 2015
5840f54
[SPARK-2909] [MLlib] [PySpark] SparseVector in pyspark now supports i…
MechCoder Jan 14, 2015
9d4449c
[SPARK-5228][WebUI] Hide tables for "Active Jobs/Completed Jobs/Faile…
sarutak Jan 14, 2015
259936b
[SPARK-4014] Add TaskContext.attemptNumber and deprecate TaskContext.…
JoshRosen Jan 14, 2015
2fd7f72
[SPARK-5235] Make SQLConf Serializable
alexbaretta Jan 14, 2015
76389c5
[SPARK-5234][ml]examples for ml don't have sparkContext.stop
Jan 14, 2015
13d2406
[SPARK-5254][MLLIB] Update the user guide to position spark.ml better
mengxr Jan 15, 2015
cfa397c
[SPARK-5193][SQL] Tighten up SQLContext API
rxin Jan 15, 2015
6abc45e
[SPARK-5254][MLLIB] remove developers section from spark.ml guide
mengxr Jan 15, 2015
4b325c7
[SPARK-5193][SQL] Tighten up HiveContext API
rxin Jan 15, 2015
3c8650c
[SPARK-5224] [PySpark] improve performance of parallelize list/ndarray
Jan 15, 2015
1881431
[SPARK-5274][SQL] Reconcile Java and Scala UDFRegistration.
rxin Jan 16, 2015
65858ba
[Minor] Fix tiny typo in BlockManager
sarutak Jan 16, 2015
96c2c71
[SPARK-4857] [CORE] Adds Executor membership events to SparkListener
Jan 16, 2015
a79a9f9
[SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
Jan 16, 2015
2be82b1
[SPARK-1507][YARN]specify # cores for ApplicationMaster
WangTaoTheTonic Jan 16, 2015
e200ac8
[SPARK-5201][CORE] deal with int overflow in the ParallelCollectionRD…
advancedxy Jan 16, 2015
f6b852a
[DOCS] Fix typo in return type of cogroup
srowen Jan 16, 2015
e8422c5
[SPARK-5231][WebUI] History Server shows wrong job submission time.
sarutak Jan 16, 2015
ecf943d
[WebUI] Fix collapse of WebUI layout
sarutak Jan 16, 2015
d05c9ee
[SPARK-4923][REPL] Add Developer API to REPL to allow re-publishing t…
Jan 16, 2015
fd3a8a1
[SPARK-733] Add documentation on use of accumulators in lazy transfor…
Jan 16, 2015
ee1c1f3
[SPARK-4937][SQL] Adding optimization to simplify the And, Or condit…
scwf Jan 16, 2015
61b427d
[SPARK-5193][SQL] Remove Spark SQL Java-specific API.
rxin Jan 17, 2015
f3bfc76
[SQL][minor] Improved Row documentation.
rxin Jan 17, 2015
c1f3c27
[SPARK-4937][SQL] Comment for the newly optimization rules in `Boolea…
scwf Jan 17, 2015
6999910
[SPARK-5096] Use sbt tasks instead of vals to get hadoop version
marmbrus Jan 18, 2015
e7884bc
[SQL][Minor] Added comments and examples to explain BooleanSimplifica…
rxin Jan 18, 2015
e12b5b6
MAINTENANCE: Automated closing of pull requests.
pwendell Jan 18, 2015
ad16da1
[HOTFIX]: Minor clean up regarding skipped artifacts in build files.
pwendell Jan 18, 2015
1727e08
[SPARK-5279][SQL] Use java.math.BigDecimal as the exposed Decimal type.
rxin Jan 18, 2015
1a200a3
[SQL][Minor] Update sql doc according to data type APIs changes
scwf Jan 18, 2015
1955645
[SQL][minor] Put DataTypes.java in java dir.
rxin Jan 19, 2015
7dbf1fd
[SQL] fix typo in class description
Jan 19, 2015
[SPARK-4923][REPL] Add Developer API to REPL to allow re-publishing the REPL jar

As requested in [SPARK-4923](https://issues.apache.org/jira/browse/SPARK-4923), I've provided a rough DeveloperApi for the repl. I've only done this for Scala 2.10 because the Scala 2.11 support does not appear to be implemented yet: the Scala 2.11 repl still has the old `scala.tools.nsc` package, and its SparkIMain does not appear to have the class server needed for shipping code over (unless this functionality has been moved elsewhere?). I also left the `ExecutorClassLoader` and `ConstructorCleaner` alone, as I have no experience working with those classes.

This marks the majority of methods in `SparkIMain` as _private_, with a few special cases being _private[repl]_ where other classes within the same package access them. Every public method has been marked with `DeveloperApi`, as suggested by pwendell, and I took the liberty of writing a Scaladoc for each one to further explain its usage.

As the Scala 2.11 REPL [conforms](scala/scala#2206) to [JSR-223](http://docs.oracle.com/javase/8/docs/technotes/guides/scripting/), and the [Spark Kernel](https://github.com/ibm-et/spark-kernel) uses the SparkIMain of Scala 2.10 in the same manner, I've taken care to expose the methods predominantly needed by a JSR-223 scripting engine implementation (a rough usage sketch follows the list):

1. The ability to _get_ variables from the interpreter (and other information like class/symbol/type)
2. The ability to _put_ variables into the interpreter
3. The ability to _compile_ code
4. The ability to _execute_ code
5. The ability to get contextual information regarding the scripting environment
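
A rough sketch of that scripting-engine-style usage. This is a sketch only: `initializeSynchronous`, `bind`, `interpret`, `valueOfTerm`, and `compileString` come from the Scala 2.10 `IMain` surface that `SparkIMain` mirrors, so check the new Scaladocs for the exact signatures.

    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.Results
    import org.apache.spark.repl.SparkIMain

    val intp = new SparkIMain(new Settings)
    intp.initializeSynchronous()                       // block until the compiler is ready

    intp.bind("answer", "Int", 42)                     // (2) put a variable in
    intp.interpret("val doubled = answer * 2") match { // (4) execute code
      case Results.Success    => println(intp.valueOfTerm("doubled")) // (1) get it back
      case Results.Error      => println("compile or runtime error")
      case Results.Incomplete => println("incomplete input")
    }

    val compiled = intp.compileString("object Probe { def ping = true }") // (3) compile only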

Additional functionality that I exposed includes the following (a minimal lifecycle sketch follows the list):

1. The blocking initialization method (needed to actually start SparkIMain instance)
2. The class server URI (needed to set the _spark.repl.class.uri_ property after initialization), exposed instead of the entire class server
3. The class output directory (beneficial for tools like ours that need to inspect and use the directory where class files are served)
4. Suppression (quiet/silence) mechanics for output
5. Ability to add a jar to the compile/runtime classpath
6. The reset/close functionality
7. Metric information (last variable assignment, "needed" for extracting results from last execution, real variable name for better debugging)
8. Execution wrapper (useful to have, but debatable)
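
A minimal lifecycle sketch tying several of these together. The names are assumptions drawn from the list above: `classServerUri`, `beQuietDuring`, and `close` should match what this PR exposes, and the system-property plumbing mirrors how the shell wires up the driver.

    import scala.tools.nsc.Settings
    import org.apache.spark.repl.SparkIMain

    val intp = new SparkIMain(new Settings)
    intp.initializeSynchronous()                   // (1) blocking initialization

    // (2) tell executors where to fetch REPL-generated classes from
    System.setProperty("spark.repl.class.uri", intp.classServerUri)

    // (4) run code without echoing compiler chatter
    intp.beQuietDuring { intp.interpret("""val greeting = "hi"""") }

    intp.close()                                   // (6) tear down when finished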

Aside from `SparkIMain`, I updated the other classes/traits and their methods in the _repl_ package to be private/package protected where possible. A few odd cases (like `SparkHelper` living in the `scala.tools.nsc` package to expose a private variable) still exist, but I did my best at labelling them.

`SparkCommandLine` has proven useful for extracting settings, and `SparkJLineCompletion` has proven useful for implementing auto-completion, in the [Spark Kernel](https://github.com/ibm-et/spark-kernel) project. Beyond those and `SparkIMain`, my experience suggests the remaining classes/methods are not necessary for interactive applications built on the REPL API.
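
For example, a settings-extraction sketch using only the constructor visible in the diff below (the completion line is hypothetical and assumes `SparkJLineCompletion` wraps an initialized `SparkIMain`, as the Spark Kernel does):

    import org.apache.spark.repl.SparkCommandLine

    // Parse REPL-style arguments into compiler Settings, reporting bad flags.
    val cmdLine = new SparkCommandLine(List("-deprecation"), err => Console.err.println(err))
    val settings = cmdLine.settings

    // Hypothetical completion wiring, given an initialized SparkIMain `intp`:
    // val completion = new SparkJLineCompletion(intp)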

Tested via the following:

    $ export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
    $ mvn -Phadoop-2.3 -DskipTests clean package && mvn -Phadoop-2.3 test

Also did a quick verification that I could start the shell and execute some code:

    $ ./bin/spark-shell
    ...

    scala> val x = 3
    x: Int = 3

    scala> sc.parallelize(1 to 10).reduce(_+_)
    ...
    res1: Int = 55

Author: Chip Senkbeil <rcsenkbe@us.ibm.com>
Author: Chip Senkbeil <chip.senkbeil@gmail.com>

Closes apache#4034 from rcsenkbeil/AddDeveloperApiToRepl and squashes the following commits:

053ca75 [Chip Senkbeil] Fixed failed build by adding missing DeveloperApi import
c1b88aa [Chip Senkbeil] Added DeveloperApi to public classes in repl
6dc1ee2 [Chip Senkbeil] Added missing method to expose error reporting flag
26fd286 [Chip Senkbeil] Refactored other Scala 2.10 classes and methods to be private/package protected where possible
925c112 [Chip Senkbeil] Added DeveloperApi and Scaladocs to SparkIMain for Scala 2.10
Chip Senkbeil authored and pwendell committed Jan 16, 2015
commit d05c9ee6e8441e54732e40de45d1d2311307908f
@@ -19,14 +19,21 @@ package org.apache.spark.repl

import scala.tools.nsc.{Settings, CompilerCommand}
import scala.Predef._
import org.apache.spark.annotation.DeveloperApi

/**
* Command class enabling Spark-specific command line options (provided by
* <i>org.apache.spark.repl.SparkRunnerSettings</i>).
*
* @example new SparkCommandLine(Nil).settings
*
* @param args The list of command line arguments
* @param settings The underlying settings to associate with this set of
* command-line options
*/
@DeveloperApi
class SparkCommandLine(args: List[String], override val settings: Settings)
extends CompilerCommand(args, settings) {

def this(args: List[String], error: String => Unit) {
this(args, new SparkRunnerSettings(error))
}
@@ -15,7 +15,7 @@ import scala.tools.nsc.ast.parser.Tokens.EOF

import org.apache.spark.Logging

-trait SparkExprTyper extends Logging {
+private[repl] trait SparkExprTyper extends Logging {
val repl: SparkIMain

import repl._
@@ -17,6 +17,23 @@

package scala.tools.nsc

import org.apache.spark.annotation.DeveloperApi

// NOTE: Forced to be public (and in scala.tools.nsc package) to access the
// settings "explicitParentLoader" method

/**
* Provides exposure for the explicitParentLoader method on settings instances.
*/
@DeveloperApi
object SparkHelper {
/**
* Retrieves the explicit parent loader for the provided settings.
*
* @param settings The settings whose explicit parent loader to retrieve
*
* @return The Optional classloader representing the explicit parent loader
*/
@DeveloperApi
def explicitParentLoader(settings: Settings) = settings.explicitParentLoader
}
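
A usage sketch for the helper above (the `Option[ClassLoader]` return type follows the Scaladoc's "Optional classloader"):

    import scala.tools.nsc.{Settings, SparkHelper}

    val settings = new Settings
    // Defined only when an explicit parent loader was configured on the settings.
    val parent: Option[ClassLoader] = SparkHelper.explicitParentLoader(settings)
    val loader = parent.getOrElse(getClass.getClassLoader) // fall back to our own loader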