[SPARK-1267][SPARK-18129] Allow PySpark to be pip installed #15659
Closed
109 commits
7763f3c
Adds setup.py
30debc7
Fix spacing.
5155531
Update py4j dependency. Add mllib to extras_require, fix some inden…
2f0bf9b
Adds MANIFEST.in file.
4c00b98
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 7ff8d0f
Start working towards post-2.0 pip installable PySpark (so including…
holdenk 610b975
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk cb2e06d
So MANIFEST and setup can't refer to things above the root of the pro…
holdenk 01f791d
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk e2e4d1c
Keep the symlink
holdenk fb15d7e
Some progress we need to use SDIST but is ok
holdenk aab7ee4
Reenable cleanup
holdenk 5a57620
Try and provide a clear error message when pip installed directly, fi…
holdenk 646aa23
Add two scripts
holdenk 36c9d45
package_data doesn't work so well with nested directories so instead …
holdenk a78754b
Use copyfile also check for jars dir too
holdenk 955e92b
Check if pip installed when finding the shell file
holdenk 2d88a40
Check if jars dir exists rather than release file
holdenk 9e5c532
Start working a bit on the docs
holdenk be7eadd
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 07d3849
Try and include pyspark zip file for yarn use
holdenk 11b5fa8
Copy pyspark zip for use in yarn cluster mode
holdenk 8791f82
Start adding scripts to test pip installability
holdenk 92837a3
Works on yarn, works with spark submit, still need to fix import base…
holdenk 6947a85
Start updating find-spark-home to be available in many cases.
holdenk 944160c
Use Switch to find_spark_home.py
holdenk 5bf0746
Move to under pyspark
holdenk 435f842
Update to py4j 0.10.4 in the deps, also switch how we are copying fin…
holdenk 27ca27e
Update java gateway to use _find_spark_home function, add quick sanit…
holdenk df126cf
Lint fixes
holdenk 70a78a0
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 555d443
More progress on running the pip installability tests
holdenk 051abe5
Try and unify path used for shell script file, add a README.md file f…
holdenk b345bdb
Add README file
holdenk 28da44b
Switch version to a PEP440 version otherwise it can't go on PyPiTest,…
holdenk 0f16c08
More notes
holdenk 574c1f0
Add pip-sanity-check.py to the linter list and add a note that we sho…
holdenk 6299744
Fix handling of long_description, add check for existing artifacts in…
holdenk 17104c1
Fix check for number of sdists
holdenk 0447ea2
Typo fixes, make sure SPARK_HOME isn't being set based on PWD during …
holdenk c335c80
More typo fixes
holdenk 146567b
We are python 2 and 3 compat :)
holdenk 0e2223d
Use more standard version.py file, check sys version is greater than …
holdenk 849ded0
First pass at updating the release-build script
holdenk cf5ab7e
consider handling being inside a release
holdenk 4b69871
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 3788bfb
Fix up make-distribution to build the python artifacts, update releas…
holdenk 308a168
Fix python lint errors and add linting to setup.py
holdenk 74b79c4
Add python packaging tests to run-tests script
holdenk 3056553
Add license header to setup.cfg
holdenk 125ae2a
Fix typo PyPi to PyPI
holdenk d2da8b0
Fix typo PyPi to PyPI (2)
holdenk 595409f
Use copytree and rmtree on windows - note: still not explicitly teste…
holdenk cf421b0
Fix style issues
holdenk 31ac8e2
Add license header to version.py and manifest.in
holdenk 0e9cb8d
newer version of numpy are fine
holdenk 264b253
Add BLOCK_PYSPARK_PIP_TESTS to jenkins test error codes
holdenk 802f682
Add README.md as description file to metadata in setup.cfg
holdenk fba37a0
We store version in a different file now
holdenk 8ba499f
Early PR feedback, switch to os.path.join rather than strings, add a …
holdenk 1c177f3
Add BLOCK_PYSPARK_PIP_TESTS to error code set
holdenk 6ace070
Fix path used to run the pip tests in jenkins
holdenk ab8ca53
Fix typo
holdenk 77f8eca
Show how to build the sdist in building-spark.md
holdenk f590898
Have clearer messages (as suggested by @viirya)
holdenk f956a5d
Try and improve the wording a little bit
holdenk 489d4e3
Fix typo
holdenk 9e4fdb5
Drop extra .gz
holdenk e668af6
Drop '
holdenk 3bf961e
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk c9d48d3
Make packaging PySpark as pip optional part of make-distribution the …
holdenk e9f1e8e
Fix indentation and clarify error message (since we still technically…
holdenk 1cdcf61
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 7af912a
Move Python version check up earlier.
holdenk c77d9fd
Fix python3 setup
holdenk 7b1d8b7
test both python/python3 if they are installed on the system for pip …
holdenk 298bda6
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 9770260
Actually run the python3 packaging tests and fix path finding
holdenk 99940ee
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk f6806b2
Break up sentence in setup.py error message, drop 3.0-3.3 tags from s…
holdenk b0cd655
Just copy shell in advance because the setup time copy has issues wit…
holdenk 6bb422e
Change shell symlink
holdenk b5b4713
Move the copy up earlier for python3 venv install issue
holdenk 2b808dc
Fix normalization of paths
holdenk b958f7e
Handle edit mode based installations
holdenk 577554b
Just skip caching rather than cleaning up the wheels
holdenk 9cf2ec9
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 154a287
Remove % formatting and replace with format and os.path.join
holdenk b478bdf
s/True/pass/ in the places where it makes sense, fix a formatting issue
holdenk fb62a8a
Test both edit mode and regular installs
holdenk 6540964
Add exit(-1)
holdenk d2389ed
CR feedback - switch symlink support checking into a function and use…
holdenk 48cd1ad
Add a docstring comment just cause
holdenk 23109a4
Fix support_symlinks / docstring
holdenk 49fc6db
use update to usr bin env python
holdenk 7001f90
s/deps/TEMP_PATH/ in case we change it later
holdenk 8d74672
Merge branch 'master' into SPARK-1267-pip-install-pyspark
holdenk 210c9d4
drop usr/bin/env python since we don't want MANIFEST to run as a script
holdenk 9efca67
Use python2 if available and fallback to python
holdenk fd3e89c
Fix more shell check issues
holdenk 587c0eb
Fix shellcheck issues - note most of these were preexisting but since …
holdenk 2904998
Move pip tests into a self cleaning up script instead of 2
holdenk 3345eb9
Clarify what is required to build the PySpark pip installable artifacts.
holdenk f86574a
Make messaging more consistent
holdenk 05fc25f
Switch to "s cause its easier to do that with sed rewrites
holdenk dd243a2
Update release tagging script
holdenk df5a3f9
Drop the notice since the script does it now
holdenk d753d80
Fix the next version output and update the comment to be more precise
holdenk e139855
Add a global-exclude and add a format to the setup.py for multiple as…
holdenk
Lint fixes
commit df126cf219b9367792e9a25b7d3493b7a060daee
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -21,10 +21,13 @@` |
| | | ` # that Spark may have been installed on the system with pip.` |
| | | ` from __future__ import print_function` |
| | | `-import os, sys` |
| | | `+import os` |
| | | `+import sys` |
| | | ` def _find_spark_home():` |
| | | `     """Find the SPARK_HOME."""` |
| | | `     # If the environment has SPARK_HOME set trust it.` |
| | | `     if "SPARK_HOME" in os.environ:` |
| | | `         return os.environ["SPARK_HOME"]` |
| | | `@@ -51,7 +54,7 @@ def is_spark_home(path):` |
| | | ` True` |
| | | `     # Normalize the paths` |
| | | `-    paths = map(lambda path:os.path.abspath(path), paths)` |
| | | `+    paths = map(lambda path: os.path.abspath(path), paths)` |
| | | `     try:` |
| | | `         return next(path for path in paths if is_spark_home(path))` |
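
The two hunks show only fragments of `_find_spark_home`, so here is a minimal, self-contained sketch of the lookup order they imply: trust `SPARK_HOME` if it is set, otherwise normalize the candidate paths and return the first one that passes the check. Several parts are assumptions rather than the PR's code: the `candidates` parameter, the `looks_like_spark_home` helper (which only tests for a `jars` directory, loosely based on the "Check if jars dir exists" commit), and the `ValueError` raised when nothing matches.

```python
from __future__ import print_function
import os


def looks_like_spark_home(path):
    """Hypothetical stand-in for is_spark_home: only checks for a jars dir."""
    return os.path.isdir(os.path.join(path, "jars"))


def find_spark_home(candidates):
    """Return SPARK_HOME from the environment, else the first valid candidate.

    `candidates` is an assumed parameter; the diff does not show how the
    real function builds its list of paths.
    """
    # If the environment has SPARK_HOME set, trust it.
    if "SPARK_HOME" in os.environ:
        return os.environ["SPARK_HOME"]

    # Normalize the paths. A list comprehension behaves the same on
    # Python 2 and 3, whereas map() returns a one-shot iterator on Python 3.
    paths = [os.path.abspath(path) for path in candidates]

    try:
        return next(path for path in paths if looks_like_spark_home(path))
    except StopIteration:
        # Assumption: the failure handling is not visible in the diff.
        raise ValueError(
            "Could not find a valid SPARK_HOME in {0}".format(paths))
```

A caller would pass something like `find_spark_home([os.getcwd(), ".."])`. The `map(lambda ...)` form in the diff also works with `next(...)`, since the generator consumes the iterator only once.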
Same nit about `pass`.