Merged
Commits
113 commits
446063e
[SPARK-3777] Display "Executor ID" for Tasks in Stage page
zsxwing Oct 7, 2014
3d7b36e
[SPARK-3790][MLlib] CosineSimilarity Example
rezazadeh Oct 7, 2014
098c734
[SPARK-3486][MLlib][PySpark] PySpark support for Word2Vec
Ishiihara Oct 7, 2014
b32bb72
[SPARK-3832][MLlib] Upgrade Breeze dependency to 0.10
dbtsai Oct 7, 2014
5912ca6
[SPARK-3398] [EC2] Have spark-ec2 intelligently wait for specific clu…
nchammas Oct 7, 2014
b69c9fb
[SPARK-3829] Make Spark logo image on the header of HistoryPage as a …
sarutak Oct 7, 2014
798ed22
[SPARK-3412] [PySpark] Replace Epydoc with Sphinx to generate Python …
davies Oct 8, 2014
c781843
[SPARK-3836] [REPL] Spark REPL optionally propagate internal exceptions
ahirreddy Oct 8, 2014
35afdfd
[SPARK-3710] Fix Yarn integration tests on Hadoop 2.2.
Oct 8, 2014
7fca8f4
[SPARK-3788] [yarn] Fix compareFs to do the right thing for HDFS name…
Oct 8, 2014
f18dd59
[SPARK-3848] yarn alpha doesn't build on master
sarutak Oct 8, 2014
bc44187
HOTFIX: Use correct Hadoop profile in build
pwendell Oct 8, 2014
b92bd5a
[SPARK-3841] [mllib] Pretty-print params for ML examples
jkbradley Oct 8, 2014
add174a
[SPARK-3843][Minor] Cleanup scalastyle.txt at the end of running dev/…
sarutak Oct 8, 2014
a85f24a
[SPARK-3831] [SQL] Filter rule Improvement and bool expression optimi…
sarutak Oct 9, 2014
a42cc08
[SPARK-3713][SQL] Uses JSON to serialize DataType objects
liancheng Oct 9, 2014
00b7791
[SQL][Doc] Keep Spark SQL README.md up to date
Ishiihara Oct 9, 2014
4ec9319
[SPARK-3707] [SQL] Fix bug of type coercion in DIV
chenghao-intel Oct 9, 2014
e703357
[SPARK-3810][SQL] Makes PreInsertionCasts handle partitions properly
liancheng Oct 9, 2014
3e4f09d
[SQL] Prevents per row dynamic dispatching and pattern matching when …
liancheng Oct 9, 2014
bcb1ae0
[SPARK-3857] Create joins package for various join operators.
rxin Oct 9, 2014
f706823
Fetch from branch v4 in Spark EC2 script.
JoshRosen Oct 9, 2014
9c439d3
[SPARK-3856][MLLIB] use norm operator after breeze 0.10 upgrade
mengxr Oct 9, 2014
b9df8af
[SPARK-2805] Upgrade to akka 2.3.4
avati Oct 9, 2014
86b3929
[SPARK-3844][UI] Truncate appName in WebUI if it is too long
mengxr Oct 9, 2014
13cab5b
add spark.driver.memory to config docs
nartz Oct 9, 2014
14f222f
[SPARK-3158][MLLIB]Avoid 1 extra aggregation for DecisionTree training
chouqin Oct 9, 2014
1e0aa4d
[Minor] use norm operator after breeze 0.10 upgrade
witgo Oct 9, 2014
73bf3f2
[SPARK-3741] Make ConnectionManager propagate errors properly and add…
zsxwing Oct 9, 2014
b77a02f
[SPARK-3752][SQL]: Add tests for different UDF's
vidaha Oct 9, 2014
752e90f
[SPARK-3711][SQL] Optimize where in clause filter queries
Oct 9, 2014
2c88513
[SPARK-3806][SQL] Minor fix for CliSuite
scwf Oct 9, 2014
e7edb72
[SPARK-3868][PySpark] Hard to recognize which module is tested from u…
cocoatomo Oct 9, 2014
ec4d40e
[SPARK-3853][SQL] JSON Schema support for Timestamp fields
Oct 9, 2014
1faa113
Revert "[SPARK-2805] Upgrade to akka 2.3.4"
pwendell Oct 9, 2014
1c7f0ab
[SPARK-3339][SQL] Support for skipping json lines that fail to parse
yhuai Oct 9, 2014
0c0e09f
[SPARK-3412][SQL]add missing row api
adrian-wang Oct 9, 2014
bc3b6cb
[SPARK-3858][SQL] Pass the generator alias into logical plan node
Oct 9, 2014
ac30205
[SPARK-3813][SQL] Support "case when" conditional functions in Spark …
ravipesala Oct 9, 2014
4e9b551
[SPARK-3772] Allow `ipython` to be used by Pyspark workers; IPython s…
JoshRosen Oct 9, 2014
2837bf8
[SPARK-3798][SQL] Store the output of a generator in a val
marmbrus Oct 10, 2014
363baac
SPARK-3811 [CORE] More robust / standard Utils.deleteRecursively, Uti…
srowen Oct 10, 2014
edf02da
[SPARK-3654][SQL] Unifies SQL and HiveQL parsers
liancheng Oct 10, 2014
421382d
[SPARK-3824][SQL] Sets in-memory table default storage level to MEMOR…
liancheng Oct 10, 2014
6f98902
[SPARK-3834][SQL] Backticks not correctly handled in subquery aliases
ravipesala Oct 10, 2014
411cf29
[SPARK-2805] Upgrade Akka to 2.3.4
avati Oct 10, 2014
90f73fc
[SPARK-3889] Attempt to avoid SIGBUS by not mmapping files in Connect…
aarondav Oct 10, 2014
72f36ee
[SPARK-3886] [PySpark] use AutoBatchedSerializer by default
davies Oct 10, 2014
1d72a30
HOTFIX: Fix build issue with Akka 2.3.4 upgrade.
pwendell Oct 10, 2014
0e8203f
[SPARK-2924] Required by scala 2.11, only one fun/ctor amongst overri…
ScrapCodes Oct 11, 2014
81015a2
[SPARK-3867][PySpark] ./python/run-tests failed when it run with Pyth…
cocoatomo Oct 11, 2014
7a3f589
[SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and…
cocoatomo Oct 11, 2014
69c67ab
[SPARK-2377] Python API for Streaming
giwa Oct 12, 2014
18bd67c
[SPARK-3887] Send stracktrace in ConnectionManager error replies
JoshRosen Oct 12, 2014
e5be4de
SPARK-3716 [GraphX] Update Analytics.scala for partitionStrategy assi…
NamelessAnalyst Oct 12, 2014
c86c976
[HOTFIX] Fix compilation error for Yarn 2.0.*-alpha
andrewor14 Oct 12, 2014
fc616d5
[SPARK-3121] Wrong implementation of implicit bytesWritableConverter
Oct 13, 2014
b4a7fa7
[SPARK-3905][Web UI]The keys for sorting the columns of Executor page…
witgo Oct 13, 2014
d8b8c21
Add echo "Run streaming tests ..."
giwa Oct 13, 2014
92e017f
[SPARK-3899][Doc]fix wrong links in streaming doc
scwf Oct 13, 2014
942847f
Bug Fix: without unpersist method in RandomForest.scala
omgteam Oct 13, 2014
39ccaba
[SPARK-3861][SQL] Avoid rebuilding hash tables for broadcast joins on…
rxin Oct 13, 2014
49bbdcb
[Spark] RDD take() method: overestimate too much
yingjieMiao Oct 13, 2014
46db277
[SPARK-3892][SQL] remove redundant type name
adrian-wang Oct 13, 2014
2ac40da
[SPARK-3407][SQL]Add Date type support
adrian-wang Oct 13, 2014
56102dc
[SPARK-2066][SQL] Adds checks for non-aggregate attributes with aggre…
liancheng Oct 13, 2014
d3cdf91
[SPARK-3529] [SQL] Delete the temp files after test exit
chenghao-intel Oct 13, 2014
73da9c2
[SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection …
ueshin Oct 13, 2014
e10d71e
[SPARK-3559][SQL] Remove unnecessary columns from List of needed Colu…
gvramana Oct 13, 2014
371321c
[SQL] Add type checking debugging functions
marmbrus Oct 13, 2014
e6e3770
SPARK-3807: SparkSql does not work for tables created using custom serde
chiragaggarwal Oct 13, 2014
9d9ca91
[SQL]Small bug in unresolved.scala
Ishiihara Oct 13, 2014
9eb49d4
[SPARK-3809][SQL] Fixes test suites in hive-thriftserver
liancheng Oct 13, 2014
4d26aca
[SPARK-3912][Streaming] Fixed flakyFlumeStreamSuite
tdas Oct 14, 2014
186b497
[SPARK-3921] Fix CoarseGrainedExecutorBackend's arguments for Standal…
aarondav Oct 14, 2014
9b6de6f
SPARK-3178 setting SPARK_WORKER_MEMORY to a value without a label (m…
bbejeck Oct 14, 2014
7ced88b
[SPARK-3946] gitignore in /python includes wrong directory
tsudukim Oct 14, 2014
24b818b
[SPARK-3944][Core] Using Option[String] where value of String can be …
Shiti Oct 14, 2014
56096db
SPARK-3803 [MLLIB] ArrayIndexOutOfBoundsException found in executing …
srowen Oct 14, 2014
7b4f39f
[SPARK-3869] ./bin/spark-class miss Java version with _JAVA_OPTIONS set
cocoatomo Oct 14, 2014
66af8e2
[SPARK-3943] Some scripts bin\*.cmd pollutes environment variables in…
tsudukim Oct 15, 2014
18ab6bd
SPARK-1307 [DOCS] Don't use term 'standalone' to refer to a Spark App…
srowen Oct 15, 2014
293a0b5
[SPARK-2098] All Spark processes should support spark-defaults.conf, …
witgo Oct 15, 2014
044583a
[Core] Upgrading ScalaStyle version to 0.5 and removing SparkSpaceAft…
prudhvi953 Oct 16, 2014
4c589ca
[SPARK-3944][Core] Code re-factored as suggested
Shiti Oct 16, 2014
091d32c
[SPARK-3971] [MLLib] [PySpark] hotfix: Customized pickler should work…
davies Oct 16, 2014
99e416b
[SQL] Fixes the race condition that may cause test failure
liancheng Oct 16, 2014
2fe0ba9
SPARK-3874: Provide stable TaskContext API
ScrapCodes Oct 17, 2014
7f7b50e
[SPARK-3923] Increase Akka heartbeat pause above heartbeat interval
aarondav Oct 17, 2014
be2ec4a
[SQL]typo in HiveFromSpark
Oct 17, 2014
642b246
[SPARK-3941][CORE] _remainingmem should not increase twice when updat…
liyezhang556520 Oct 17, 2014
e7f4ea8
[SPARK-3890][Docs]remove redundant spark.executor.memory in doc
WangTaoTheTonic Oct 17, 2014
56fd34a
[SPARK-3741] Add afterExecute for handleConnectExecutor
zsxwing Oct 17, 2014
dedace8
[SPARK-3067] JobProgressPage could not show Fair Scheduler Pools sect…
YanTangZhai Oct 17, 2014
e678b9f
[SPARK-3973] Print call site information for broadcasts
shivaram Oct 17, 2014
c351862
[SPARK-3935][Core] log the number of records that has been written
Oct 17, 2014
803e7f0
[SPARK-3979] [yarn] Use fs's default replication.
Oct 17, 2014
adcb7d3
[SPARK-3855][SQL] Preserve the result attribute of python UDFs though…
marmbrus Oct 17, 2014
23f6171
[SPARK-3985] [Examples] fix file path using os.path.join
adrian-wang Oct 17, 2014
477c648
[SPARK-3934] [SPARK-3918] [mllib] Bug fixes for RandomForest, Decisi…
jkbradley Oct 17, 2014
f406a83
SPARK-3926 [CORE] Result of JavaRDD.collectAsMap() is not Serializable
srowen Oct 18, 2014
05db2da
[SPARK-3952] [Streaming] [PySpark] add Python examples in Streaming P…
davies Oct 19, 2014
7e63bb4
[SPARK-2546] Clone JobConf for each task (branch-1.0 / 1.1 backport)
JoshRosen Oct 19, 2014
d1966f3
[SPARK-3902] [SPARK-3590] Stabilize AsynRDDActions and add Java API
JoshRosen Oct 20, 2014
c7aeecd
[SPARK-3948][Shuffle]Fix stream corruption bug in sort-based shuffle
jerryshao Oct 20, 2014
51afde9
[SPARK-4010][Web UI]Spark UI returns 500 in yarn-client mode
witgo Oct 20, 2014
ea054e1
[SPARK-3986][SQL] Fix package names to fit their directory names.
ueshin Oct 20, 2014
4afe9a4
[SPARK-3736] Workers reconnect when disassociated from the master.
mccheah Oct 20, 2014
eadc4c5
[SPARK-3207][MLLIB]Choose splits for continuous features in DecisionT…
chouqin Oct 20, 2014
7f2f032
code format and minor fix
scwf Oct 20, 2014
04385a7
update with apache master and fix confilict
scwf Oct 20, 2014
3cd96cf
scala style fix
scwf Oct 20, 2014
a10f270
revert some diffs with apache master
scwf Oct 21, 2014
20 changes: 10 additions & 10 deletions assembly/pom.xml
@@ -204,16 +204,16 @@
</dependency>
</dependencies>
</profile>
<profile>
<id>hbase</id>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hbase_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>hbase</id>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hbase_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>spark-ganglia-lgpl</id>
<dependencies>
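The hbase profile block above appears unchanged apart from indentation. As a hedged aside (not part of this PR), a build that wants the spark-hbase module bundled into the assembly would activate that optional profile explicitly, for example:

# Illustrative Maven invocation (flags are assumptions, not taken from this PR):
# -Phbase enables the optional hbase profile so the spark-hbase dependency above
# is pulled into the assembly jar.
mvn -Phbase -DskipTests clean package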
9 changes: 2 additions & 7 deletions bin/compute-classpath.sh
@@ -114,10 +114,6 @@ fi
datanucleus_jars="$(find "$datanucleus_dir" 2>/dev/null | grep "datanucleus-.*\\.jar")"
datanucleus_jars="$(echo "$datanucleus_jars" | tr "\n" : | sed s/:$//g)"

hive_files=$("$JAR_CMD" -tf "$ASSEMBLY_JAR" org/apache/hadoop/hive/ql/exec 2>/dev/null)

hive_files=$("$JAR_CMD" -tf "$ASSEMBLY_JAR" org/apache/hadoop/hive/ql/exec 2>/dev/null)

if [ -n "$datanucleus_jars" ]; then
hive_files=$("$JAR_CMD" -tf "$ASSEMBLY_JAR" org/apache/hadoop/hive/ql/exec 2>/dev/null)
if [ -n "$hive_files" ]; then
@@ -126,7 +122,6 @@ fi
fi
fi


# Add test classes if we're running from SBT or Maven with SPARK_TESTING set to 1
if [[ $SPARK_TESTING == 1 ]]; then
CLASSPATH="$CLASSPATH:$FWDIR/core/target/scala-$SCALA_VERSION/test-classes"
@@ -137,8 +132,8 @@ if [[ $SPARK_TESTING == 1 ]]; then
CLASSPATH="$CLASSPATH:$FWDIR/streaming/target/scala-$SCALA_VERSION/test-classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/catalyst/target/scala-$SCALA_VERSION/test-classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/core/target/scala-$SCALA_VERSION/test-classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/hive/target/scala-$SCALA_VERSION/test-classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/hbase/target/scala-$SCALA_VERSION/test-classes"
CLASSPATH="$CLASSPATH:$FWDIR/sql/hive/target/scala-$SCALA_VERSION/test-classes"
fi

# Add hadoop conf dir if given -- otherwise FileSystem.*, etc fail !
@@ -150,5 +145,5 @@ fi
if [ -n "$YARN_CONF_DIR" ]; then
CLASSPATH="$CLASSPATH:$YARN_CONF_DIR"
fi
echo "$CLASSPATH"

echo "$CLASSPATH"
51 changes: 38 additions & 13 deletions bin/pyspark
@@ -50,22 +50,47 @@ fi

. "$FWDIR"/bin/load-spark-env.sh

# Figure out which Python executable to use
# In Spark <= 1.1, setting IPYTHON=1 would cause the driver to be launched using the `ipython`
# executable, while the worker would still be launched using PYSPARK_PYTHON.
#
# In Spark 1.2, we removed the documentation of the IPYTHON and IPYTHON_OPTS variables and added
# PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS to allow IPython to be used for the driver.
# Now, users can simply set PYSPARK_DRIVER_PYTHON=ipython to use IPython and set
# PYSPARK_DRIVER_PYTHON_OPTS to pass options when starting the Python driver
# (e.g. PYSPARK_DRIVER_PYTHON_OPTS='notebook'). This supports full customization of the IPython
# and executor Python executables.
#
# For backwards-compatibility, we retain the old IPYTHON and IPYTHON_OPTS variables.

# Determine the Python executable to use if PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON isn't set:
if hash python2.7 2>/dev/null; then
# Attempt to use Python 2.7, if installed:
DEFAULT_PYTHON="python2.7"
else
DEFAULT_PYTHON="python"
fi

# Determine the Python executable to use for the driver:
if [[ -n "$IPYTHON_OPTS" || "$IPYTHON" == "1" ]]; then
# If IPython options are specified, assume user wants to run IPython
# (for backwards-compatibility)
PYSPARK_DRIVER_PYTHON_OPTS="$PYSPARK_DRIVER_PYTHON_OPTS $IPYTHON_OPTS"
PYSPARK_DRIVER_PYTHON="ipython"
elif [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
PYSPARK_DRIVER_PYTHON="${PYSPARK_PYTHON:-"$DEFAULT_PYTHON"}"
fi

# Determine the Python executable to use for the executors:
if [[ -z "$PYSPARK_PYTHON" ]]; then
if [[ "$IPYTHON" = "1" || -n "$IPYTHON_OPTS" ]]; then
# for backward compatibility
PYSPARK_PYTHON="ipython"
if [[ $PYSPARK_DRIVER_PYTHON == *ipython* && $DEFAULT_PYTHON != "python2.7" ]]; then
echo "IPython requires Python 2.7+; please install python2.7 or set PYSPARK_PYTHON" 1>&2
exit 1
else
PYSPARK_PYTHON="python"
PYSPARK_PYTHON="$DEFAULT_PYTHON"
fi
fi
export PYSPARK_PYTHON

if [[ -z "$PYSPARK_PYTHON_OPTS" && -n "$IPYTHON_OPTS" ]]; then
# for backward compatibility
PYSPARK_PYTHON_OPTS="$IPYTHON_OPTS"
fi

# Add the PySpark classes to the Python path:
export PYTHONPATH="$SPARK_HOME/python/:$PYTHONPATH"
export PYTHONPATH="$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH"
@@ -93,9 +118,9 @@ if [[ -n "$SPARK_TESTING" ]]; then
unset YARN_CONF_DIR
unset HADOOP_CONF_DIR
if [[ -n "$PYSPARK_DOC_TEST" ]]; then
exec "$PYSPARK_PYTHON" -m doctest $1
exec "$PYSPARK_DRIVER_PYTHON" -m doctest $1
else
exec "$PYSPARK_PYTHON" $1
exec "$PYSPARK_DRIVER_PYTHON" $1
fi
exit
fi
@@ -111,5 +136,5 @@ if [[ "$1" =~ \.py$ ]]; then
else
# PySpark shell requires special handling downstream
export PYSPARK_SHELL=1
exec "$PYSPARK_PYTHON" $PYSPARK_PYTHON_OPTS
exec "$PYSPARK_DRIVER_PYTHON" $PYSPARK_DRIVER_PYTHON_OPTS
fi
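As the comments added to bin/pyspark explain, IPython is now selected through PYSPARK_DRIVER_PYTHON rather than IPYTHON=1, with PYSPARK_DRIVER_PYTHON_OPTS carrying any driver-side options. A minimal usage sketch based on those comments (the 'notebook' option value is illustrative):

# Run the PySpark shell with an IPython notebook driver; executors keep using
# PYSPARK_PYTHON (or the detected default Python). The old IPYTHON/IPYTHON_OPTS
# variables still work but are retained only for backwards compatibility.
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
./bin/pyspark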
2 changes: 1 addition & 1 deletion bin/spark-class
@@ -105,7 +105,7 @@ else
exit 1
fi
fi
JAVA_VERSION=$("$RUNNER" -version 2>&1 | sed 's/.* version "\(.*\)\.\(.*\)\..*"/\1\2/; 1q')
JAVA_VERSION=$("$RUNNER" -version 2>&1 | grep 'version' | sed 's/.* version "\(.*\)\.\(.*\)\..*"/\1\2/; 1q')

# Set JAVA_OPTS to be able to load native libraries and to set heap size
if [ "$JAVA_VERSION" -ge 18 ]; then
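The grep 'version' added above keeps the version parsing anchored to the right line of the java -version output; per SPARK-3869 in the commit list, a set _JAVA_OPTIONS otherwise adds an extra line. A rough illustration (output abridged, option value hypothetical):

# With _JAVA_OPTIONS set, the JVM first prints "Picked up _JAVA_OPTIONS: ...",
# so filtering on 'version' before sed avoids parsing that line by mistake.
_JAVA_OPTIONS=-Xmx512m java -version 2>&1 | grep 'version' | sed 's/.* version "\(.*\)\.\(.*\)\..*"/\1\2/; 1q'
# prints 17 for a "1.7.0_71" JVM, which the script then compares against 18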
5 changes: 3 additions & 2 deletions bin/spark-shell.cmd
@@ -17,6 +17,7 @@ rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

set SPARK_HOME=%~dp0..
rem This is the entry point for running Spark shell. To avoid polluting the
rem environment, it just launches a new cmd to do the real work.

cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell
cmd /V /E /C %~dp0spark-shell2.cmd %*
22 changes: 22 additions & 0 deletions bin/spark-shell2.cmd
@@ -0,0 +1,22 @@
@echo off

rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

set SPARK_HOME=%~dp0..

cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell
51 changes: 3 additions & 48 deletions bin/spark-submit.cmd
@@ -17,52 +17,7 @@ rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

rem NOTE: Any changes in this file must be reflected in SparkSubmitDriverBootstrapper.scala!
rem This is the entry point for running Spark submit. To avoid polluting the
rem environment, it just launches a new cmd to do the real work.

set SPARK_HOME=%~dp0..
set ORIG_ARGS=%*

rem Reset the values of all variables used
set SPARK_SUBMIT_DEPLOY_MODE=client
set SPARK_SUBMIT_PROPERTIES_FILE=%SPARK_HOME%\conf\spark-defaults.conf
set SPARK_SUBMIT_DRIVER_MEMORY=
set SPARK_SUBMIT_LIBRARY_PATH=
set SPARK_SUBMIT_CLASSPATH=
set SPARK_SUBMIT_OPTS=
set SPARK_SUBMIT_BOOTSTRAP_DRIVER=

:loop
if [%1] == [] goto continue
if [%1] == [--deploy-mode] (
set SPARK_SUBMIT_DEPLOY_MODE=%2
) else if [%1] == [--properties-file] (
set SPARK_SUBMIT_PROPERTIES_FILE=%2
) else if [%1] == [--driver-memory] (
set SPARK_SUBMIT_DRIVER_MEMORY=%2
) else if [%1] == [--driver-library-path] (
set SPARK_SUBMIT_LIBRARY_PATH=%2
) else if [%1] == [--driver-class-path] (
set SPARK_SUBMIT_CLASSPATH=%2
) else if [%1] == [--driver-java-options] (
set SPARK_SUBMIT_OPTS=%2
)
shift
goto loop
:continue

rem For client mode, the driver will be launched in the same JVM that launches
rem SparkSubmit, so we may need to read the properties file for any extra class
rem paths, library paths, java options and memory early on. Otherwise, it will
rem be too late by the time the driver JVM has started.

if [%SPARK_SUBMIT_DEPLOY_MODE%] == [client] (
if exist %SPARK_SUBMIT_PROPERTIES_FILE% (
rem Parse the properties file only if the special configs exist
for /f %%i in ('findstr /r /c:"^[\t ]*spark.driver.memory" /c:"^[\t ]*spark.driver.extra" ^
%SPARK_SUBMIT_PROPERTIES_FILE%') do (
set SPARK_SUBMIT_BOOTSTRAP_DRIVER=1
)
)
)

cmd /V /E /C %SPARK_HOME%\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit %ORIG_ARGS%
cmd /V /E /C %~dp0spark-submit2.cmd %*
68 changes: 68 additions & 0 deletions bin/spark-submit2.cmd
@@ -0,0 +1,68 @@
@echo off

rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

rem NOTE: Any changes in this file must be reflected in SparkSubmitDriverBootstrapper.scala!

set SPARK_HOME=%~dp0..
set ORIG_ARGS=%*

rem Reset the values of all variables used
set SPARK_SUBMIT_DEPLOY_MODE=client
set SPARK_SUBMIT_PROPERTIES_FILE=%SPARK_HOME%\conf\spark-defaults.conf
set SPARK_SUBMIT_DRIVER_MEMORY=
set SPARK_SUBMIT_LIBRARY_PATH=
set SPARK_SUBMIT_CLASSPATH=
set SPARK_SUBMIT_OPTS=
set SPARK_SUBMIT_BOOTSTRAP_DRIVER=

:loop
if [%1] == [] goto continue
if [%1] == [--deploy-mode] (
set SPARK_SUBMIT_DEPLOY_MODE=%2
) else if [%1] == [--properties-file] (
set SPARK_SUBMIT_PROPERTIES_FILE=%2
) else if [%1] == [--driver-memory] (
set SPARK_SUBMIT_DRIVER_MEMORY=%2
) else if [%1] == [--driver-library-path] (
set SPARK_SUBMIT_LIBRARY_PATH=%2
) else if [%1] == [--driver-class-path] (
set SPARK_SUBMIT_CLASSPATH=%2
) else if [%1] == [--driver-java-options] (
set SPARK_SUBMIT_OPTS=%2
)
shift
goto loop
:continue

rem For client mode, the driver will be launched in the same JVM that launches
rem SparkSubmit, so we may need to read the properties file for any extra class
rem paths, library paths, java options and memory early on. Otherwise, it will
rem be too late by the time the driver JVM has started.

if [%SPARK_SUBMIT_DEPLOY_MODE%] == [client] (
if exist %SPARK_SUBMIT_PROPERTIES_FILE% (
rem Parse the properties file only if the special configs exist
for /f %%i in ('findstr /r /c:"^[\t ]*spark.driver.memory" /c:"^[\t ]*spark.driver.extra" ^
%SPARK_SUBMIT_PROPERTIES_FILE%') do (
set SPARK_SUBMIT_BOOTSTRAP_DRIVER=1
)
)
)

cmd /V /E /C %SPARK_HOME%\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit %ORIG_ARGS%
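The findstr check above sets SPARK_SUBMIT_BOOTSTRAP_DRIVER only when the properties file contains driver memory or spark.driver.extra* settings, which must be read before the client-mode driver JVM starts. A hypothetical conf\spark-defaults.conf that would match the pattern (keys are real Spark settings, values illustrative):

spark.driver.memory            2g
spark.driver.extraClassPath    C:\libs\custom.jar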