Commits
18 commits
9d99de0
[SPARK-32245][INFRA] Run Spark tests in Github Actions
HyukjinKwon Jul 11, 2020
dc96207
[SPARK-32245][INFRA][FOLLOWUP] Reenable Github Actions on commit
dongjoon-hyun Jul 12, 2020
d4726dd
[SPARK-32292][SPARK-32252][INFRA] Run the relevant tests only in GitH…
HyukjinKwon Jul 13, 2020
3845400
[SPARK-32316][TESTS][INFRA] Test PySpark with Python 3.8 in Github Ac…
HyukjinKwon Jul 15, 2020
4083d5e
[SPARK-32408][BUILD] Enable crossPaths back to prevent side effects
HyukjinKwon Jul 24, 2020
5624db3
[SPARK-32303][PYTHON][TESTS] Remove leftover from editable mode insta…
HyukjinKwon Jul 14, 2020
b8fc422
[SPARK-32363][PYTHON][BUILD] Fix flakiness in pip package testing in …
HyukjinKwon Jul 21, 2020
130a9d0
[SPARK-32419][PYTHON][BUILD] Avoid using subshell for Conda env (de)a…
HyukjinKwon Jul 25, 2020
3f1acea
[SPARK-32422][SQL][TESTS] Use python3 executable instead of python3.6…
HyukjinKwon Jul 25, 2020
78b5833
[SPARK-32491][INFRA] Do not install SparkR in test-only mode in testi…
HyukjinKwon Jul 30, 2020
32a5fee
[SPARK-32493][INFRA] Manually install R instead of using setup-r in G…
HyukjinKwon Jul 30, 2020
5c23116
[SPARK-32496][INFRA] Include GitHub Action file as the changes in tes…
HyukjinKwon Jul 30, 2020
a278c5e
[SPARK-32497][INFRA] Installs qpdf package for CRAN check in GitHub A…
HyukjinKwon Jul 30, 2020
11d170f
[SPARK-32357][INFRA] Publish failed and succeeded test reports in Git…
HyukjinKwon Aug 14, 2020
2fc2b82
[SPARK-32606][SPARK-32605][INFRA] Remove the forks of action-surefire…
HyukjinKwon Aug 17, 2020
2366f37
[SPARK-32248][BUILD] Recover Java 11 build in Github Actions
dongjoon-hyun Jul 30, 2020
fbb4ac8
[MINOR][INFRA] Rename master.yml to build_and_test.yml
HyukjinKwon Aug 18, 2020
d029dba
[SPARK-32645][INFRA] Upload unit-tests.log as an artifact
HyukjinKwon Aug 19, 2020
[SPARK-32316][TESTS][INFRA] Test PySpark with Python 3.8 in Github Actions

This PR aims to test PySpark with Python 3.8 in GitHub Actions. On the script side, it is already supported:

https://github.com/apache/spark/blob/4ad9bfd53b84a6d2497668c73af6899bae14c187/python/run-tests.py#L161

This PR also includes several small related fixes:

1. Install Python 3.8.
2. Install only one Python implementation, instead of many, for the SQL and Yarn test cases, because their test cases only need a single Python executable newer than Python 2.
3. Do not install Python 2, which is no longer needed after Python 2 support was dropped in SPARK-32138.
4. Remove a comment about installing PyPy3 on Jenkins (SPARK-32278); it is already installed.

Currently, only PyPy3 and Python 3.6 are tested with PySpark in GitHub Actions. We should test the latest version of Python as well, because some optimizations can be enabled only with Python 3.8+. See also #29114.
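As one concrete (illustrative, not Spark-specific) example of an optimization that exists only on newer interpreters: pickle protocol 5 (PEP 574, out-of-band buffers) ships with Python 3.8+, so a version gate like the hypothetical helper below is the usual pattern for enabling it:

```python
import sys

# Illustrative sketch, not Spark's actual code: gate a feature on the
# running interpreter version. Pickle protocol 5 (PEP 574) is available
# from Python 3.8 onward.
def supports_pickle_protocol_5(version_info=sys.version_info):
    return tuple(version_info[:2]) >= (3, 8)

print(supports_pickle_protocol_5((3, 8, 0)))   # True
print(supports_pickle_protocol_5((3, 6, 9)))   # False
```

Testing against the newest CPython in CI is what catches regressions in such version-gated paths.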

Does this PR introduce any user-facing change? No, dev-only.

How was this patch tested? Not tested manually; the GitHub Actions build in this PR will test it out.

Closes #29116 from HyukjinKwon/test-python3.8-togehter.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
HyukjinKwon committed Aug 19, 2020
commit 38454008b8d24ae99987316bc80c85576938bee2
29 changes: 17 additions & 12 deletions .github/workflows/master.yml
@@ -119,37 +119,42 @@ jobs:
         java-version: ${{ matrix.java }}
     # PySpark
     - name: Install PyPy3
+      # SQL component also has Python related tests, for example, IntegratedUDFTestUtils.
+      # Note that order of Python installations here matters because default python3 is
+      # overridden by pypy3.
       uses: actions/setup-python@v2
-      if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
+      if: contains(matrix.modules, 'pyspark')
       with:
         python-version: pypy3
         architecture: x64
     - name: Install Python 2.7
       uses: actions/setup-python@v2
-      if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
+      if: contains(matrix.modules, 'pyspark')
       with:
         python-version: 2.7
         architecture: x64
-    - name: Install Python 3.6
+    - name: Install Python 3.8
       uses: actions/setup-python@v2
-      if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
+      # We should install one Python that is higher than 3+ for SQL and Yarn because:
+      # - SQL component also has Python related tests, for example, IntegratedUDFTestUtils.
+      # - Yarn has a Python specific test too, for example, YarnClusterSuite.
+      if: contains(matrix.modules, 'yarn') || contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
       with:
-        python-version: 3.6
+        python-version: 3.8
         architecture: x64
-    - name: Install Python packages
-      if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
+    - name: Install Python packages (Python 2.7 and PyPy3)
+      if: contains(matrix.modules, 'pyspark')
       # PyArrow is not supported in PyPy yet, see ARROW-2651.
       # TODO(SPARK-32247): scipy installation with PyPy fails for an unknown reason.
       run: |
-        python3 -m pip install numpy pyarrow pandas scipy
-        python3 -m pip list
-        python2 -m pip install numpy pyarrow pandas scipy
-        python2 -m pip list
+        python2.7 -m pip install numpy pyarrow pandas scipy
+        python2.7 -m pip list
        pypy3 -m pip install numpy pandas
        pypy3 -m pip list
+    - name: Install Python packages (Python 3.8)
+      if: contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
+      run: |
+        python3.8 -m pip install numpy pyarrow pandas scipy
+        python3.8 -m pip list
     # SparkR
     - name: Install R 3.6
       uses: r-lib/actions/setup-r@v1

Review comment (HyukjinKwon, Member Author, on the "Install Python 2.7" step): Spark 3.0 did not drop Python 2 yet. In master, we test Python 3.6 and 3.8. In branch-3.0, we test Python 3.8 and 2.7.
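The `if:` expressions in the workflow gate each step with GitHub Actions' `contains()` over the matrix module name. A small Python model of the Python 3.8 condition (an illustration of the gating logic, not the Actions runtime) behaves like this:

```python
def needs_python38(modules: str) -> bool:
    # Mirrors the workflow's condition: install Python 3.8 for yarn,
    # pyspark, and plain sql modules, but skip sql submodules such as
    # 'sql-kafka-0-10' (they match the 'sql-' exclusion).
    return (
        "yarn" in modules
        or "pyspark" in modules
        or ("sql" in modules and "sql-" not in modules)
    )

for m in ["pyspark-core", "sql", "sql-kafka-0-10", "yarn", "core"]:
    print(m, needs_python38(m))
```

The `'sql-'` exclusion is why `sql` itself triggers the install while `sql-kafka-0-10` does not.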
2 changes: 1 addition & 1 deletion python/run-tests.py
@@ -161,7 +161,7 @@ def run_individual_python_test(target_dir, test_name, pyspark_python):


 def get_default_python_executables():
-    python_execs = [x for x in ["python3.6", "python2.7", "pypy3", "pypy"] if which(x)]
+    python_execs = [x for x in ["python3.8", "python2.7", "pypy3", "pypy"] if which(x)]

     if "python3.6" not in python_execs:
         p = which("python3")
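The changed function keeps whichever pinned interpreters exist on PATH. A self-contained sketch of that selection logic follows; it substitutes `shutil.which` for the repo's `which` helper, and (as an assumption) uses `python3.8` in the fallback check, whereas the context lines above show this commit still checking `python3.6` there:

```python
import shutil

def get_default_python_executables(
    candidates=("python3.8", "python2.7", "pypy3", "pypy")
):
    # Keep only the interpreters actually installed on PATH.
    python_execs = [x for x in candidates if shutil.which(x)]
    # Fall back to plain python3 when the pinned CPython is absent,
    # in the spirit of the fallback shown in the diff's context lines.
    if "python3.8" not in python_execs and shutil.which("python3"):
        python_execs.insert(0, "python3")
    return python_execs

print(get_default_python_executables())
```

The output depends on which interpreters the machine has, which is exactly why CI must install the versions it intends to test before this selection runs.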