 Spark is a fast and general cluster computing system for Big Data. It provides
 high-level APIs in Scala, Java, and Python, and an optimized engine that
 supports general computation graphs for data analysis. It also supports a
-rich set of higher-level tools including Spark SQL for SQL and structured
-data processing, MLlib for machine learning, GraphX for graph processing,
+rich set of higher-level tools including Spark SQL for SQL and DataFrames,
+MLlib for machine learning, GraphX for graph processing,
 and Spark Streaming for stream processing.
 
 <http://spark.apache.org/>
@@ -22,7 +22,7 @@ This README file only contains basic setup instructions.
 Spark is built using [Apache Maven](http://maven.apache.org/).
 To build Spark and its example programs, run:
 
-    mvn -DskipTests clean package
+    build/mvn -DskipTests clean package
 
 (You do not need to do this if you downloaded a pre-built package.)
 More detailed documentation is available from the project site, at
@@ -43,7 +43,7 @@ Try the following command, which should return 1000:
 Alternatively, if you prefer Python, you can use the Python shell:
 
     ./bin/pyspark
-    
+
 And run the following command, which should also return 1000:
 
     >>> sc.parallelize(range(1000)).count()
@@ -58,9 +58,9 @@ To run one of them, use `./bin/run-example <class> [params]`. For example:
 will run the Pi example locally.
 
 You can set the MASTER environment variable when running examples to submit
-examples to a cluster. This can be a mesos:// or spark:// URL, 
-"yarn-cluster" or "yarn-client" to run on YARN, and "local" to run 
-locally with one thread, or "local[N]" to run locally with N threads. You 
+examples to a cluster. This can be a mesos:// or spark:// URL,
+"yarn-cluster" or "yarn-client" to run on YARN, and "local" to run
+locally with one thread, or "local[N]" to run locally with N threads. You
 can also use an abbreviated class name if the class is in the `examples`
 package. For instance:
 
@@ -75,7 +75,7 @@ can be run using:
 
     ./dev/run-tests
 
-Please see the guidance on how to 
+Please see the guidance on how to
 [run tests for a module, or individual tests](https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools).
 
 ## A Note About Hadoop Versions