Closed
63 commits
250cb95
Do not ignore spark.driver.extra* for client mode
andrewor14 Aug 4, 2014
a2ab1b0
Parse spark.driver.extra* in bash
andrewor14 Aug 6, 2014
0025474
Revert SparkSubmit handling of --driver-* options for only cluster mode
andrewor14 Aug 6, 2014
63ed2e9
Merge branch 'master' of github.com:apache/spark into submit-driver-e…
andrewor14 Aug 6, 2014
75ee6b4
Remove accidentally added file
andrewor14 Aug 6, 2014
8843562
Fix compilation issues...
andrewor14 Aug 6, 2014
98dd8e3
Add warning if properties file does not exist
andrewor14 Aug 6, 2014
130f295
Handle spark.driver.memory too
andrewor14 Aug 6, 2014
4edcaa8
Redirect stdout to stderr for python
andrewor14 Aug 6, 2014
e5cfb46
Collapse duplicate code + fix potential whitespace issues
andrewor14 Aug 6, 2014
4ec22a1
Merge branch 'master' of github.com:apache/spark into submit-driver-e…
andrewor14 Aug 6, 2014
ef12f74
Minor formatting
andrewor14 Aug 6, 2014
fa2136e
Escape Java options + parse java properties files properly
andrewor14 Aug 7, 2014
dec2343
Only export variables if they exist
andrewor14 Aug 7, 2014
a4df3c4
Move parsing and escaping logic to utils.sh
andrewor14 Aug 7, 2014
de765c9
Print spark-class command properly
andrewor14 Aug 7, 2014
8e552b7
Include an example of spark.*.extraJavaOptions
andrewor14 Aug 7, 2014
c13a2cb
Merge branch 'master' of github.com:apache/spark into submit-driver-e…
andrewor14 Aug 7, 2014
c854859
Add small comment
andrewor14 Aug 7, 2014
1cdc6b1
Fix bug: escape escaped double quotes properly
andrewor14 Aug 7, 2014
45a1eb9
Fix bug: escape escaped backslashes and quotes properly...
andrewor14 Aug 7, 2014
aabfc7e
escape -> split (minor)
andrewor14 Aug 7, 2014
a992ae2
Escape spark.*.extraJavaOptions correctly
andrewor14 Aug 7, 2014
c7b9926
Minor changes to spark-defaults.conf.template
andrewor14 Aug 7, 2014
5d8f8c4
Merge branch 'master' of github.com:apache/spark into submit-driver-e…
andrewor14 Aug 7, 2014
e793e5f
Handle multi-line arguments
andrewor14 Aug 8, 2014
c2273fc
Fix typo (minor)
andrewor14 Aug 8, 2014
b3c4cd5
Fix bug: count the number of quotes instead of detecting presence
andrewor14 Aug 8, 2014
4ae24c3
Fix bug: escape properly in quote_java_property
andrewor14 Aug 8, 2014
8d26a5c
Add tests for bash/utils.sh
andrewor14 Aug 8, 2014
2732ac0
Integrate BASH tests into dev/run-tests + log error properly
andrewor14 Aug 9, 2014
aeb79c7
Merge branch 'master' of github.com:apache/spark into handle-configs-…
andrewor14 Aug 9, 2014
8d4614c
Merge branch 'master' of github.com:apache/spark into handle-configs-…
andrewor14 Aug 16, 2014
56ac247
Use eval and set to simplify splitting
andrewor14 Aug 16, 2014
bd0d468
Simplify parsing config file by ignoring multi-line arguments
andrewor14 Aug 16, 2014
be99eb3
Fix tests to not include multi-line configs
andrewor14 Aug 16, 2014
371cac4
Add function prefix (minor)
andrewor14 Aug 16, 2014
fa11ef8
Parse the properties file only if the special configs exist
andrewor14 Aug 16, 2014
7396be2
Explicitly comment that multi-line properties are not supported
andrewor14 Aug 16, 2014
7a4190a
Merge branch 'master' of github.com:apache/spark into handle-configs-…
andrewor14 Aug 16, 2014
c886568
Fix lines too long + a few comments / style (minor)
andrewor14 Aug 16, 2014
0effa1e
Add code in Scala that handles special configs
andrewor14 Aug 19, 2014
a396eda
Nullify my own hard work to simplify bash
andrewor14 Aug 19, 2014
c37e08d
Revert a few more changes
andrewor14 Aug 19, 2014
3a8235d
Only parse the properties file if special configs exist
andrewor14 Aug 19, 2014
7d94a8d
Merge branch 'master' of github.com:apache/spark into handle-configs-…
andrewor14 Aug 19, 2014
b71f52b
Revert a few more changes (minor)
andrewor14 Aug 19, 2014
c84f5c8
Remove debug print statement (minor)
andrewor14 Aug 19, 2014
158f813
Remove "client mode" boolean argument
andrewor14 Aug 19, 2014
a91ea19
Fix precedence of library paths, classpath, java opts and memory
andrewor14 Aug 19, 2014
1ea6bbe
SparkClassLauncher -> SparkSubmitDriverBootstrapper
andrewor14 Aug 19, 2014
d6488f9
Merge branch 'master' of github.com:apache/spark into handle-configs-…
andrewor14 Aug 19, 2014
19464ad
SPARK_SUBMIT_JAVA_OPTS -> SPARK_SUBMIT_OPTS
andrewor14 Aug 19, 2014
8867a09
A few more naming things (minor)
andrewor14 Aug 19, 2014
9ba37e2
Don't barf when the properties file does not exist
andrewor14 Aug 19, 2014
a78cb26
Revert a few changes in utils.sh (minor)
andrewor14 Aug 19, 2014
d0f20db
Don't pass empty library paths, classpath, java opts etc.
andrewor14 Aug 19, 2014
9a778f6
Fix PySpark: actually kill driver on termination
andrewor14 Aug 20, 2014
51aeb01
Filter out JVM memory in Scala rather than Bash (minor)
andrewor14 Aug 20, 2014
ff34728
Minor comments
andrewor14 Aug 20, 2014
08fd788
Warn against external usages of SparkSubmitDriverBootstrapper
andrewor14 Aug 20, 2014
24dba60
Merge branch 'master' of github.com:apache/spark into handle-configs-…
andrewor14 Aug 20, 2014
bed4bdf
Change a few comments / messages (minor)
andrewor14 Aug 20, 2014
Move parsing and escaping logic to utils.sh
This commit also fixes a deadly typo.
andrewor14 committed Aug 7, 2014
commit a4df3c4165ce4546742fbd0b9d92ea612973bb2e
43 changes: 10 additions & 33 deletions bin/spark-class
@@ -30,6 +30,9 @@ FWDIR="$(cd `dirname $0`/..; pwd)"
 # Export this as SPARK_HOME
 export SPARK_HOME="$FWDIR"

+# Load utility functions
+. "$SPARK_HOME/bin/utils.sh"
+
 . $FWDIR/bin/load-spark-env.sh

 if [ -z "$1" ]; then
@@ -77,7 +80,7 @@ case "$1" in
   'org.apache.spark.deploy.SparkSubmit')
     OUR_JAVA_OPTS="$SPARK_JAVA_OPTS $SPARK_SUBMIT_OPTS"
     if [ -n "$SPARK_SUBMIT_LIBRARY_PATH" ]; then
-      OUR_JAVA_OPTS="$OUT_JAVA_OPTS -Djava.library.path=$SPARK_SUBMIT_LIBRARY_PATH"
+      OUR_JAVA_OPTS="$OUR_JAVA_OPTS -Djava.library.path=$SPARK_SUBMIT_LIBRARY_PATH"
     fi
     OUR_JAVA_MEM=${SPARK_DRIVER_MEMORY:-$DEFAULT_MEM}
     ;;
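
Note on the typo fixed in this hunk: in bash an unset variable such as `$OUT_JAVA_OPTS` expands to an empty string by default, so the old line silently dropped the options assembled on the line above it. The following is a minimal sketch, using made-up values that do not come from the Spark scripts, of that effect and of how `set -u` would have reported the mistake instead:

```bash
#!/usr/bin/env bash
# Sketch only: the option values below are invented for illustration.
set +u                    # default bash behaviour: unset variables expand to ""
unset OUT_JAVA_OPTS       # make sure the misspelled name really is unset

OUR_JAVA_OPTS="-Dfoo=bar"
SPARK_SUBMIT_LIBRARY_PATH="/opt/native"

# With the typo, the unset variable expands to nothing and -Dfoo=bar is lost.
TYPO_RESULT="$OUT_JAVA_OPTS -Djava.library.path=$SPARK_SUBMIT_LIBRARY_PATH"

# With the fix, the earlier options are preserved.
FIXED_RESULT="$OUR_JAVA_OPTS -Djava.library.path=$SPARK_SUBMIT_LIBRARY_PATH"

# Under `set -u`, expanding an unset variable is an error. The check runs in
# a subshell so that the failure does not terminate this script.
if (set -u; : "$OUT_JAVA_OPTS") 2>/dev/null; then
  NOUNSET_CAUGHT_TYPO="no"
else
  NOUNSET_CAUGHT_TYPO="yes"
fi

echo "typo:  [$TYPO_RESULT]"
echo "fixed: [$FIXED_RESULT]"
echo "set -u caught the typo: $NOUNSET_CAUGHT_TYPO"
```

Whether `set -u` is appropriate for the real launch scripts is a separate question, since they deliberately read many optional environment variables.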
@@ -103,11 +106,16 @@ fi
 # Set JAVA_OPTS to be able to load native libraries and to set heap size
 JAVA_OPTS="-XX:MaxPermSize=128m $OUR_JAVA_OPTS"
 JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"
+
 # Load extra JAVA_OPTS from conf/java-opts, if it exists

Review comment (Contributor): Do you know why this was exported before? I looked and couldn't find anything downstream that consumes it.

Reply (Contributor Author): It's not consumed, as far as I can tell.

 if [ -e "$FWDIR/conf/java-opts" ] ; then
   JAVA_OPTS="$JAVA_OPTS `cat $FWDIR/conf/java-opts`"
 fi
-export JAVA_OPTS
+
+# Escape JAVA_OPTS properly to handle whitespace, double quotes and backslashes
+# This exports the escaped java options into ESCAPED_JAVA_OPTS
+escape_java_options "$JAVA_OPTS"
+
 # Attention: when changing the way the JAVA_OPTS are assembled, the change must be reflected in CommandUtils.scala!

 TOOLS_DIR="$FWDIR"/tools
@@ -148,37 +156,6 @@ if $cygwin; then
 fi
 export CLASSPATH

-# Properly escape java options, dealing with whitespace, double quotes and backslashes
-# This accepts a string, and returns the escaped list through ESCAPED_JAVA_OPTS
-escape_java_options() {
-  ESCAPED_JAVA_OPTS=() # return value
-  option_buffer="" # buffer for collecting parts of an option
-  opened_quotes=0 # whether we are expecting a closing double quotes
-  for word in $1; do
-    contains_quote=$(echo "$word" | grep \" | grep -v \\\\\")
-    if [ -n "$contains_quote" ]; then
-      # Flip the bit
-      opened_quotes=$(((opened_quotes + 1) % 2))
-    fi
-    if [[ $opened_quotes == 0 ]]; then
-      ESCAPED_JAVA_OPTS+=("$(echo "$option_buffer $word" | sed "s/^[[:space:]]*//" | sed "s/\([^\\]\)\"/\1/g")")
-      option_buffer=""
-    else
-      option_buffer="$option_buffer $word"
-    fi
-  done
-  # Something is wrong if we ended with open double quotes
-  if [[ $opened_quotes == 1 ]]; then
-    echo "Java options parse error! Expecting closing double quotes." 1>&2
-    exit 1
-  fi
-}
-
-escape_java_options "$JAVA_OPTS"
-for option in "${ESCAPED_JAVA_OPTS[@]}"; do
-  echo "$option"
-done
-
 if [ "$SPARK_PRINT_LAUNCH_COMMAND" == "1" ]; then
   echo -n "Spark Command: " 1>&2
   echo "$RUNNER" -cp "$CLASSPATH" "${ESCAPED_JAVA_OPTS[@]}" "$@" 1>&2
39 changes: 13 additions & 26 deletions bin/spark-submit
@@ -20,6 +20,14 @@
 export SPARK_HOME="$(cd `dirname $0`/..; pwd)"
 ORIG_ARGS=("$@")

+# Load utility functions
+. "$SPARK_HOME/bin/utils.sh"
+
+# For client mode, the driver will be launched in the JVM that launches
+# SparkSubmit, so we need to handle the class paths, java options, and
+# memory pre-emptively in bash. Otherwise, it will be too late by the
+# time the JVM has started.
+
 while (($#)); do
   if [ "$1" = "--deploy-mode" ]; then
     DEPLOY_MODE=$2
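
Note on the argument scan: a `while (($#))` loop presumably consumes the positional parameters with `shift`, which would explain why the script saves `ORIG_ARGS` before scanning. The rest of the loop is not visible in this diff, so the following is only a sketch of the pattern. `scan_submit_args` is a hypothetical helper, and the set of flags it recognises is an assumption for illustration:

```bash
#!/usr/bin/env bash
# Sketch only: scan_submit_args is a hypothetical helper, not part of the
# commit, and the flags it handles are chosen for illustration.
scan_submit_args() {
  DEPLOY_MODE=""
  DRIVER_MEMORY=""
  # (($#)) is true while positional parameters remain; shift drops one.
  while (($#)); do
    case "$1" in
      --deploy-mode)   DEPLOY_MODE="$2" ;;
      --driver-memory) DRIVER_MEMORY="$2" ;;
    esac
    shift
  done
  # Default to client mode when the flag is absent.
  DEPLOY_MODE=${DEPLOY_MODE:-"client"}
}

# Keep the full argument list in an array so that it survives the scan,
# including arguments that contain whitespace.
ORIG_ARGS=(--driver-memory 2g --class "my app.Main" app.jar)
scan_submit_args "${ORIG_ARGS[@]}"

echo "deploy mode: $DEPLOY_MODE, driver memory: $DRIVER_MEMORY"
echo "original argument count: ${#ORIG_ARGS[@]}"
```

Running the scan inside a function is a design choice of this sketch: the function receives its own copy of the arguments, so the caller's list is untouched either way.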
@@ -40,34 +48,14 @@ done
 DEPLOY_MODE=${DEPLOY_MODE:-"client"}
 PROPERTIES_FILE=${PROPERTIES_FILE:-"$SPARK_HOME/conf/spark-defaults.conf"}

-# For client mode, the driver will be launched in the JVM that launches
-# SparkSubmit, so we need to handle the class paths, java options, and
-# memory pre-emptively in bash. Otherwise, it will be too late by the
-# time the JVM has started.
-
 if [ $DEPLOY_MODE == "client" ]; then
-  # We parse the default properties file here, assuming each line is
-  # a key value pair delimited either by white space or "=" sign. All
-  # spark.driver.* configs must be processed now before it's too late.
+  # Parse the default properties file here for spark.driver.* configs
   if [ -f "$PROPERTIES_FILE" ]; then
     echo "Using properties file $PROPERTIES_FILE." 1>&2
-
-    # Parse the value of the given config according to the specifications outlined in
-    # http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html#load(java.io.Reader)
-    parse_config() {
-      result=$( \
-        sed "/^[#!]/ d" "conf/spark-defaults.conf" | \
-        grep "$1" | \
-        sed "s/$1//" | \
-        sed "s/^[[:space:]]*[:=]\{0,1\}//" | \
-        sed "s/^[[:space:]]*\(.*\)[[:space:]]*$/\1/g" \
-      )
-    }
-    parse_config "spark.driver.memory"; DRIVER_MEMORY_CONF="$result"
-    parse_config "spark.driver.extraJavaOptions"; DRIVER_EXTRA_JAVA_OPTS="$result"
-    parse_config "spark.driver.extraClassPath"; DRIVER_EXTRA_CLASSPATH="$result"
-    parse_config "spark.driver.extraLibraryPath"; DRIVER_EXTRA_LIBRARY_PATH="$result"
-
+    parse_java_property "spark.driver.memory"; DRIVER_MEMORY_CONF="$JAVA_PROPERTY_VALUE"
+    parse_java_property "spark.driver.extraJavaOptions"; DRIVER_EXTRA_JAVA_OPTS="$JAVA_PROPERTY_VALUE"
+    parse_java_property "spark.driver.extraClassPath"; DRIVER_EXTRA_CLASSPATH="$JAVA_PROPERTY_VALUE"
+    parse_java_property "spark.driver.extraLibraryPath"; DRIVER_EXTRA_LIBRARY_PATH="$JAVA_PROPERTY_VALUE"
     if [ -n "$DRIVER_EXTRA_JAVA_OPTS" ]; then
       export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS $DRIVER_EXTRA_JAVA_OPTS"
     fi
@@ -80,7 +68,6 @@ if [ $DEPLOY_MODE == "client" ]; then
   else
     echo "Warning: properties file $PROPERTIES_FILE does not exist!" 1>&2
   fi
-
   # Favor command line memory over config memory
   DRIVER_MEMORY=${DRIVER_MEMORY:-"$DRIVER_MEMORY_CONF"}
   if [ -n "$DRIVER_MEMORY" ]; then
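
Note on the precedence rule: `${DRIVER_MEMORY:-"$DRIVER_MEMORY_CONF"}` uses the `:-` form of parameter expansion, which falls back to the second value when the first is unset or empty. A minimal sketch of that behaviour follows. `resolve_driver_memory` is a hypothetical helper introduced only for illustration:

```bash
#!/usr/bin/env bash
# Sketch only: resolve_driver_memory is a hypothetical helper, not part of
# the commit.
resolve_driver_memory() {
  local cli_value="$1" conf_value="$2"
  # ${a:-b} yields b when a is unset OR empty, so an empty command line
  # value falls back to the value from the properties file.
  echo "${cli_value:-$conf_value}"
}

resolve_driver_memory "4g" "2g"   # command line value wins
resolve_driver_memory ""   "2g"   # falls back to the config value
resolve_driver_memory ""   ""     # nothing set: prints an empty line
```

The final case is why the script guards the result with `[ -n "$DRIVER_MEMORY" ]` before using it.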
59 changes: 59 additions & 0 deletions bin/utils.sh
@@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Parse the value of a config from a java properties file according to the specifications in
+# http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html#load(java.io.Reader).
+# This accepts the name of the config and returns the value through JAVA_PROPERTY_VALUE.
+parse_java_property() {
+  JAVA_PROPERTY_VALUE=$( \
+    sed "/^[#!]/ d" "conf/spark-defaults.conf" | \
+    grep "$1" | \
+    sed "s/$1//" | \
+    sed "s/^[[:space:]]*[:=]\{0,1\}//" | \
+    sed "s/^[[:space:]]*\(.*\)[[:space:]]*$/\1/g" \
+  )
+  export JAVA_PROPERTY_VALUE
+}
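
Note on `parse_java_property`: as written, it reads the hard-coded relative path `conf/spark-defaults.conf` rather than the `$PROPERTIES_FILE` that `bin/spark-submit` computes, and `grep "$1"` treats the key as a regular expression that can match anywhere on a line. The following is a minimal sketch of one alternative, not the commit's implementation. `read_property` is a hypothetical name, and passing the file as a second argument is an assumption. Like the original, it ignores multi-line values and escape sequences:

```bash
#!/usr/bin/env bash
# Sketch only: read_property is a hypothetical name, and passing the file as
# the second argument is an assumption, not what the commit does.
# Supports single-line "key value", "key=value" and "key: value" entries.
read_property() {
  local key="$1" file="$2" line rest value=""
  while IFS= read -r line || [ -n "$line" ]; do
    line="${line#"${line%%[![:space:]]*}"}"      # drop leading whitespace
    case "$line" in
      '#'*|'!'*) continue ;;                     # comment line
      "$key"*) rest="${line#"$key"}" ;;          # literal prefix match
      *) continue ;;
    esac
    # A separator must follow the key; otherwise this is a longer key.
    case "$rest" in
      [[:space:]:=]*) ;;
      *) continue ;;
    esac
    rest="${rest#"${rest%%[![:space:]]*}"}"      # whitespace before separator
    case "$rest" in [:=]*) rest="${rest#?}" ;; esac
    rest="${rest#"${rest%%[![:space:]]*}"}"      # whitespace after separator
    rest="${rest%"${rest##*[![:space:]]}"}"      # trailing whitespace
    value="$rest"                                # the last definition wins
  done < "$file"
  printf '%s\n' "$value"
}

# Made-up sample contents, written to a temporary file for the demo.
conf="$(mktemp)"
printf '%s\n' \
  '# made-up sample contents' \
  'spark.driver.memoryOverhead 384' \
  'spark.driver.memory = 2g' \
  'spark.driver.extraJavaOptions -Dfoo="a b" -Dbar=baz' > "$conf"

read_property "spark.driver.memory" "$conf"
read_property "spark.driver.extraJavaOptions" "$conf"
rm -f "$conf"
```

Printing the value on stdout, instead of setting a global such as `JAVA_PROPERTY_VALUE`, is a design choice of this sketch. It lets callers write `DRIVER_MEMORY_CONF="$(read_property ...)"` at the cost of one subshell per lookup.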
+
+# Properly escape java options, dealing with whitespace, double quotes and backslashes
+# This accepts a string, and returns the escaped list through ESCAPED_JAVA_OPTS.
+escape_java_options() {
+  ESCAPED_JAVA_OPTS=() # return value
+  option_buffer="" # buffer for collecting parts of an option
+  opened_quotes=0 # whether we are expecting a closing double quotes
+  for word in $1; do
+    contains_quote=$(echo "$word" | grep \" | grep -v \\\\\")
+    if [ -n "$contains_quote" ]; then
+      # Flip the bit
+      opened_quotes=$(((opened_quotes + 1) % 2))
+    fi
+    if [[ $opened_quotes == 0 ]]; then
+      ESCAPED_JAVA_OPTS+=("$(echo "$option_buffer $word" | sed "s/^[[:space:]]*//" | sed "s/\([^\\]\)\"/\1/g")")
+      option_buffer=""
+    else
+      option_buffer="$option_buffer $word"
+    fi
+  done
+  # Something is wrong if we ended with open double quotes
+  if [[ $opened_quotes == 1 ]]; then
+    echo "Java options parse error! Expecting closing double quotes." 1>&2
+    exit 1
+  fi
+  export ESCAPED_JAVA_OPTS
+}
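
Note on returning the result through `ESCAPED_JAVA_OPTS`: because `utils.sh` is sourced, the function assigns the array in the caller's own shell, which is where `bin/spark-class` expands it as `"${ESCAPED_JAVA_OPTS[@]}"`. The following is a minimal sketch, using made-up option values, of two properties of bash arrays that are relevant here. A quoted array expansion keeps an option containing whitespace as one argument, and bash does not pass array variables to child processes, so `export` has no effect across a process boundary:

```bash
#!/usr/bin/env bash
# Sketch only: the option values are made up for illustration.
OPTS=("-Dfoo=a b" "-Xmx1g")

count_args() { echo "$#"; }

# Quoted array expansion keeps "a b" inside a single argument.
QUOTED_COUNT=$(count_args "${OPTS[@]}")       # 2

# Flattening to a string and re-splitting breaks that option in two.
# The missing quotes around $FLAT are deliberate.
FLAT="${OPTS[*]}"
SPLIT_COUNT=$(count_args $FLAT)               # 3

# Bash does not pass array variables to child processes, even after export.
export OPTS
CHILD_COUNT=$(bash -c 'echo "${#OPTS[@]}"')   # 0

echo "quoted: $QUOTED_COUNT, re-split: $SPLIT_COUNT, child sees: $CHILD_COUNT"
```

This suggests that the `export` calls in `utils.sh` are harmless for callers that source the file, while a script that executes `utils.sh` as a separate process would not see the array.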