# SPARK-1565 (Addendum): Replace run-example with spark-submit (#704)
Changes from 2 commits
```diff
@@ -39,17 +39,22 @@ And run the following command, which should also return 1000:
 ## Example Programs
 
 Spark also comes with several sample programs in the `examples` directory.
-To run one of them, use `./bin/run-example <class> <params>`. For example:
+To run one of them, use `./bin/run-example <class> [<params>]`. For example:
 
-    ./bin/run-example org.apache.spark.examples.SparkLR local[2]
+    ./bin/run-example org.apache.spark.examples.SparkLR
 
-will run the Logistic Regression example locally on 2 CPUs.
+will run the Logistic Regression example locally.
 
-Each of the example programs prints usage help if no params are given.
+You can set the MASTER environment variable when running examples to submit
+examples to a cluster. This can be a mesos:// or spark:// URL,
+"yarn-cluster" or "yarn-client" to run on YARN, and "local" to run
+locally with one thread, or "local[N]" to run locally with N thread. You
+can also use an abbreviated class name if the class is in the `examples`
+package. For instance:
 
-All of the Spark samples take a `<master>` parameter that is the cluster URL
-to connect to. This can be a mesos:// or spark:// URL, or "local" to run
-locally with one thread, or "local[N]" to run locally with N threads.
+    MASTER=spark://host:7077 ./bin/run-example SparkPi
+
+Many of the example programs print usage help if no params are given.
 
 ## Running Tests
```

> **Contributor** (on `…or "local[N]" to run locally with N thread.`): N threads
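The `MASTER` override described in the new docs text relies on the shell's `${VAR:-default}` expansion. A minimal standalone sketch of that behavior (no Spark needed; the `pick_master` helper is hypothetical, not part of the patch):

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring how an exported MASTER would be resolved:
# honor the caller's MASTER if set, otherwise fall back to a local default.
pick_master() {
  echo "${MASTER:-local[2]}"
}

unset MASTER
pick_master               # prints: local[2]

MASTER=spark://host:7077
pick_master               # prints: spark://host:7077
```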
```diff
@@ -17,28 +17,10 @@
 # limitations under the License.
 #
 
-cygwin=false
-case "`uname`" in
-    CYGWIN*) cygwin=true;;
-esac
-
 SCALA_VERSION=2.10
 
 # Figure out where the Scala framework is installed
 FWDIR="$(cd `dirname $0`/..; pwd)"
 
 # Export this as SPARK_HOME
 export SPARK_HOME="$FWDIR"
 
 . $FWDIR/bin/load-spark-env.sh
 
-if [ -z "$1" ]; then
-  echo "Usage: run-example <example-class> [<args>]" >&2
-  exit 1
-fi
-
 # Figure out the JAR file that our examples were packaged into. This includes a bit of a hack
 # to avoid the -sources and -doc packages that are built by publish-local.
 EXAMPLES_DIR="$FWDIR"/examples
 
 if [ -f "$FWDIR/RELEASE" ]; then
@@ -49,46 +31,29 @@ fi
 
 if [[ -z $SPARK_EXAMPLES_JAR ]]; then
   echo "Failed to find Spark examples assembly in $FWDIR/lib or $FWDIR/examples/target" >&2
-  echo "You need to build Spark with sbt/sbt assembly before running this program" >&2
+  echo "You need to build Spark before running this program" >&2
   exit 1
 fi
 
+EXAMPLE_MASTER=${MASTER:-"local[2]"}
+
-# Since the examples JAR ideally shouldn't include spark-core (that dependency should be
-# "provided"), also add our standard Spark classpath, built using compute-classpath.sh.
-CLASSPATH=`$FWDIR/bin/compute-classpath.sh`
-CLASSPATH="$SPARK_EXAMPLES_JAR:$CLASSPATH"
-
-if $cygwin; then
-  CLASSPATH=`cygpath -wp $CLASSPATH`
-  export SPARK_EXAMPLES_JAR=`cygpath -w $SPARK_EXAMPLES_JAR`
-fi
-
-# Find java binary
-if [ -n "${JAVA_HOME}" ]; then
-  RUNNER="${JAVA_HOME}/bin/java"
-else
-  if [ `command -v java` ]; then
-    RUNNER="java"
-  else
-    echo "JAVA_HOME is not set" >&2
-    exit 1
-  fi
-fi
-
-# Set JAVA_OPTS to be able to load native libraries and to set heap size
-JAVA_OPTS="$SPARK_JAVA_OPTS"
-# Load extra JAVA_OPTS from conf/java-opts, if it exists
-if [ -e "$FWDIR/conf/java-opts" ] ; then
-  JAVA_OPTS="$JAVA_OPTS `cat $FWDIR/conf/java-opts`"
+if [ -n "$1" ]; then
+  EXAMPLE_CLASS="$1"
+  shift
+else
+  echo "usage: ./bin/run-example <example-class> [<example-args>]"
+  echo "  - set MASTER=XX to use a specific master"
+  echo "  - can use abbreviated example class name (e.g. SparkPi)"
+  echo
+  exit -1
 fi
-export JAVA_OPTS
 
-if [ "$SPARK_PRINT_LAUNCH_COMMAND" == "1" ]; then
-  echo -n "Spark Command: "
-  echo "$RUNNER" -cp "$CLASSPATH" $JAVA_OPTS "$@"
-  echo "========================================"
-  echo
+if [[ ! $EXAMPLE_CLASS == org.apache.spark.examples* ]]; then
+  EXAMPLE_CLASS="org.apache.spark.examples.$EXAMPLE_CLASS"
 fi
 
-exec "$RUNNER" -cp "$CLASSPATH" $JAVA_OPTS "$@"
+./bin/spark-submit \
+  --master $EXAMPLE_MASTER \
+  --class $EXAMPLE_CLASS \
+  $SPARK_EXAMPLES_JAR \
+  "$@"
```

> **Contributor** (on `EXAMPLE_MASTER=${MASTER:-"local[2]"}`): This is not a consistent default. Should we use …

> **Contributor** (on `echo "usage: ./bin/run-example <example-class> [<example-args>]"`): oops sorry there's another one

> **Contributor** (on `echo "  - can use abbreviated example class name (e.g. SparkPi)"`): maybe we should also add an example for say MLlib or Sql examples; people might try to run …

> (on `[<params>]` in the docs): nit: What does the notation `[< ... >]` mean? I think it's clearer if it's just `[params]`
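The abbreviated-class-name support comes from the prefix check added to the script. A standalone sketch of that behavior (the `expand_example_class` wrapper is hypothetical; the pattern test inside it is the one from the diff):

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the new run-example prefix check:
# a bare name like SparkPi is qualified with the examples package,
# while an already-qualified name passes through unchanged.
expand_example_class() {
  local cls="$1"
  if [[ ! $cls == org.apache.spark.examples* ]]; then
    cls="org.apache.spark.examples.$cls"
  fi
  echo "$cls"
}

expand_example_class SparkPi
# prints: org.apache.spark.examples.SparkPi
expand_example_class org.apache.spark.examples.SparkLR
# prints: org.apache.spark.examples.SparkLR
```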