Skip to content

Conversation

@HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Aug 18, 2020

What changes were proposed in this pull request?

This PR proposes to backport the following JIRAs:

in order to enable GitHub Actions in branch-3.0.

Why are the changes needed?

To be able to run the tests in branch-3.0. Jenkins jobs are unstable.

Does this PR introduce any user-facing change?

No, dev-only.

How was this patch tested?

Build in this PR will test.

mllib-local, mllib,
yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl
- >-
pyspark-sql, pyspark-mllib
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pyspark-resource modules does not exist in Spark 3.0. It was added in SPARK-29641.

- uses: actions/setup-java@v1
python-version: pypy3
architecture: x64
- name: Install Python 2.7
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spark 3.0 did not drop Python 2 yet. In master, we test Python 3.6 and 3.8. In branch-3.0, we test Python 3.8 and 2.7.

@HyukjinKwon
Copy link
Member Author

Otherwise, it's a clean porting back. I will pick and merge each commit once the tests pass. cc @dongjoon-hyun, @viirya, @gengliangwang.

cc @ScrapCodes and @zhengruifeng FYI. This is a huge PR but none of them are in main codes. Should be okay.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

java:
- 1.8
hadoop:
- hadoop2.7
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hadoop is 2.7 by default in 3.0.

@HyukjinKwon
Copy link
Member Author

I will port these back once Jenkins tests pass.

@SparkQA

This comment has been minimized.

@HyukjinKwon
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Aug 18, 2020

Test build #127579 has finished for PR 29460 at commit a5fd498.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 18, 2020

Test build #127587 has finished for PR 29460 at commit a5fd498.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@TagAnnotation
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
public @interface GitHubActionsUnstableTest { }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I temporarily added this tag to exclude some tests being failed in GitHub Actions. We should ideally remove this tag and reenable the tests skipped.

FWIW, the failed tests are:

  • ThriftServerQueryTestSuite
  • ThriftServerWithSparkContextSuite
  • VersionSuite

these do not happen in the master branch, and Jenkins still tests them so would not be a big deal.

ThriftServerQueryTestSuite and ThriftServerWithSparkContextSuite were fixed, IIRC, with an invasive fix that only landed to the master.

VersionSuite failure, I remember this but I could not figure out the root cause at that time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HyukjinKwon
Copy link
Member Author

Now it should pass all tests. I will port back once I get a green. (except SparkR tests fixed at #29462).

@dongjoon-hyun
Copy link
Member

This goes as a single commit. Did I understand correctly?

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Aug 19, 2020

Oh no. I will add one by one. I intentionally made corresponding changes to each appropriate commit (by git commit --amend). Does it make sense?
If you prefer single commit way, I don't mind. I just thought pushing it as a single commit is less manageable so took this path but maybe too much in a way.

This PR aims to run the Spark tests in Github Actions.

To briefly explain the main idea:

- Reuse `dev/run-tests.py` with SBT build
- Reuse the modules in `dev/sparktestsupport/modules.py` to test each module
- Pass the modules to test into `dev/run-tests.py` directly via `TEST_ONLY_MODULES` environment variable. For example, `pyspark-sql,core,sql,hive`.
- `dev/run-tests.py` _does not_ take the dependent modules into account but solely the specified modules to test.

Another thing to note might be `SlowHiveTest` annotation. Running the tests in Hive modules takes too much so the slow tests are extracted and it runs as a separate job. It was extracted from the actual elapsed time in Jenkins:

![Screen Shot 2020-07-09 at 7 48 13 PM](https://user-images.githubusercontent.com/6477701/87050238-f6098e80-c238-11ea-9c4a-ab505af61381.png)

So, Hive tests are separated into to jobs. One is slow test cases, and the other one is the other test cases.

_Note that_ the current GitHub Actions build virtually copies what the default PR builder on Jenkins does (without other profiles such as JDK 11, Hadoop 2, etc.). The only exception is Kinesis https://github.com/apache/spark/pull/29057/files#diff-04eb107ee163a50b61281ca08f4e4c7bR23

Last week and onwards, the Jenkins machines became very unstable for many reasons:
  - Apparently, the machines became extremely slow. Almost all tests can't pass.
  - One machine (worker 4) started to have the corrupt `.m2` which fails the build.
  - Documentation build fails time to time for an unknown reason in Jenkins machine specifically. This is disabled for now at apache#29017.
  - Almost all PRs are basically blocked by this instability currently.

The advantages of using Github Actions:
  - To avoid depending on few persons who can access to the cluster.
  - To reduce the elapsed time in the build - we could split the tests (e.g., SQL, ML, CORE), and run them in parallel so the total build time will significantly reduce.
  - To control the environment more flexibly.
  - Other contributors can test and propose to fix Github Actions configurations so we can distribute this build management cost.

Note that:
- The current build in Jenkins takes _more than 7 hours_. With Github actions it takes _less than 2 hours_
- We can now control the environments especially for Python easily.
- The test and build look more stable than the Jenkins'.

No, dev-only change.

Tested at #4

Closes apache#29057 from HyukjinKwon/migrate-to-github-actions.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
HyukjinKwon and others added 9 commits August 19, 2020 12:22
…ctivation in pip packaging test

### What changes were proposed in this pull request?

This PR proposes to avoid using subshell when it activates Conda environment. Looks like it ends up with activating the env within the subshell even if you use `conda` command.

### Why are the changes needed?

If you take a close look for GitHub Actions log:

```
 Installing dist into virtual env
Processing ./python/dist/pyspark-3.1.0.dev0.tar.gz
Collecting py4j==0.10.9
 Downloading py4j-0.10.9-py2.py3-none-any.whl (198 kB)
Using legacy setup.py install for pyspark, since package 'wheel' is not installed.
Installing collected packages: py4j, pyspark
 Running setup.py install for pyspark: started
 Running setup.py install for pyspark: finished with status 'done'
Successfully installed py4j-0.10.9 pyspark-3.1.0.dev0

...

Installing dist into virtual env
Obtaining file:///home/runner/work/spark/spark/python
Collecting py4j==0.10.9
 Downloading py4j-0.10.9-py2.py3-none-any.whl (198 kB)
Installing collected packages: py4j, pyspark
 Attempting uninstall: py4j
 Found existing installation: py4j 0.10.9
 Uninstalling py4j-0.10.9:
 Successfully uninstalled py4j-0.10.9
 Attempting uninstall: pyspark
 Found existing installation: pyspark 3.1.0.dev0
 Uninstalling pyspark-3.1.0.dev0:
 Successfully uninstalled pyspark-3.1.0.dev0
 Running setup.py develop for pyspark
Successfully installed py4j-0.10.9 pyspark
```

It looks not properly using Conda as it removes the previously installed one when it reinstalls again.
We should ideally test it with Conda environment as it's intended.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions will test. I also manually tested in my local.

Closes apache#29212 from HyukjinKwon/SPARK-32419.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
… in IntegratedUDFTestUtils

### What changes were proposed in this pull request?

This PR uses `python3` instead of `python3.6` executable as a fallback in `IntegratedUDFTestUtils`.

### Why are the changes needed?

Currently, GitHub Actions skips pandas UDFs. Python 3.8 is installed explicitly but somehow `python3.6` looks available in GitHub Actions build environment by default.

```
[info] - udf/postgreSQL/udf-case.sql - Scalar Pandas UDF is skipped because pyspark,pandas and/or pyarrow were not available in [python3.6]. !!! IGNORED !!!
...
[info] - udf/postgreSQL/udf-select_having.sql - Scalar Pandas UDF is skipped because pyspark,pandas and/or pyarrow were not available in [python3.6]. !!! IGNORED !!!
...
```

It was chosen as `python3.6` for Jenkins to pick one Python explicitly; however, looks we're already using `python3` here and there.

It will also reduce the overhead to fix when we deprecate or drop Python versions.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

It should be tested in Jenkins and GitHub Actions environments here.

Closes apache#29217 from HyukjinKwon/SPARK-32422.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…ng script

### What changes were proposed in this pull request?

This PR proposes to skip SparkR installation that is to run R linters (see SPARK-8505) in the test-only mode at `dev/run-tests.py` script.

As of SPARK-32292, the test-only mode in `dev/run-tests.py` was introduced, for example:

```
dev/run-tests.py --modules sql,core
```

which only runs the relevant tests and does not run other tests such as linters. Therefore, we don't need to install SparkR when `--modules` are specified.

### Why are the changes needed?

GitHub Actions build is currently failed as below:

```
ERROR: this R is version 3.4.4, package 'SparkR' requires R >= 3.5
[error] running /home/runner/work/spark/spark/R/install-dev.sh ; received return code 1
##[error]Process completed with exit code 10.
```

For some reasons, looks GitHub Actions started to have R 3.4.4 installed by default; however, R 3.4 was dropped as of SPARK-32073.  When SparkR tests are not needed, GitHub Actions still builds SparkR with a low R version and it causes the test failure.

This PR partially fixes it by avoid the installation of SparkR.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions tests should run to confirm this fix is correct.

Closes apache#29300 from HyukjinKwon/install-r.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
…itHub Actions

This PR proposes to manually install R instead of using `setup-r` which seems broken. Currently, GitHub Actions uses its default R 3.4.4 installed, which we dropped as of SPARK-32073.

While I am here, I am also upgrading R version to 4.0. Jenkins will test the old version and GitHub Actions tests the new version. AppVeyor uses R 4.0 but it does not check CRAN which is important when we make a release.

To recover GitHub Actions build.

No, dev-only

Manually tested at #15

Closes apache#29302 from HyukjinKwon/SPARK-32493.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
…ting

### What changes were proposed in this pull request?

apache#26556 excluded `.github/workflows/master.yml`. So tests are skipped if the GitHub Actions configuration file is changed.

As of SPARK-32245, we now run the regular tests via the testing script. We should include it to test to make sure GitHub Actions build does not break due to some changes such as Python versions.

### Why are the changes needed?

For better test coverage in GitHub Actions build.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions in this PR will test.

Closes apache#29305 from HyukjinKwon/SPARK-32496.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
…ctions

### What changes were proposed in this pull request?

CRAN check fails due to the size of the generated PDF docs as below:

```
...
 WARNING
‘qpdf’ is needed for checks on size reduction of PDFs
...
Status: 1 WARNING, 1 NOTE
See
  ‘/home/runner/work/spark/spark/R/SparkR.Rcheck/00check.log’
for details.
```

This PR proposes to install `qpdf` in GitHub Actions.

Note that I cannot reproduce in my local with the same R version so I am not documenting it for now.

Also, while I am here, I piggyback to install SparkR when the module includes `sparkr`. it is rather a followup of SPARK-32491.

### Why are the changes needed?

To fix SparkR CRAN check failure.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions will test it out.

Closes apache#29306 from HyukjinKwon/SPARK-32497.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
…Hub Actions

This PR proposes to report the failed and succeeded tests in GitHub Actions in order to improve the development velocity by leveraging [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report). See the example below:

![Screen Shot 2020-08-13 at 8 17 52 PM](https://user-images.githubusercontent.com/6477701/90128649-28f7f280-dda2-11ea-9211-e98e34332f6b.png)

Note that we cannot just use [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report) in Apache Spark because PRs are from the forked repository, and GitHub secrets are unavailable for the security reason. This plugin and all similar plugins require to have the GitHub token that has the write access in order to post test results but it is unavailable in PRs.

To work around this limitation, I took this approach:

1. In workflow A, run the tests and upload the JUnit XML test results. GitHub provides to upload and download some files.
2. GitHub introduced new event type [`workflow_run`](https://github.blog/2020-08-03-github-actions-improvements-for-fork-and-pull-request-workflows/) 10 days ago. By leveraging this, it triggers another workflow B.
3. Workflow B is in the main repo instead of fork repo, and has the write access the plugin needs. In workflow B, it downloads the artifact uploaded from workflow A (from the forked repository).
4. Workflow B generates the test reports to port from JUnit xml files.
5. Workflow B looks up the PR and posts the test reports.

The `workflow_run` event is very new feature, and looks not so many GitHub Actions plugins support. In order to make this working with [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report), I had to fork two GitHub Actions plugins to use:
 - [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report) to have this custom fix: HyukjinKwon/action-surefire-report@c96094c
    It added `commit` argument to specify the commit to post the test reports. With `workflow_run`, it can access, in workflow B, to the commit from workflow A.

 - [dawidd6/action-download-artifact](https://github.com/dawidd6/action-download-artifact) to have this custom fix: HyukjinKwon/action-download-artifact@750b71a
    It added the support of downloading all artifacts from workflow A, in workflow B. By default, it only supports to specify the name of artifact.

    Note that I was not able to use the official [actions/download-artifact](https://github.com/actions/download-artifact) because:
      - It does not support to download artifacts between different workflows, see also actions/download-artifact#3. Once this issue is resolved, we can switch it back to [actions/download-artifact](https://github.com/actions/download-artifact).

I plan to make a pull request for both repositories so we don't have to rely on forks.

Currently, it's difficult to check the failed tests. You should scroll down long logs from GitHub Actions logs.

No, dev-only.

Manually tested at: #17, #18, #19, #20, and master branch of my forked repository.

Closes apache#29333 from HyukjinKwon/SPARK-32357-fix.

Lead-authored-by: Hyukjin Kwon <[email protected]>
Co-authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…-report and action-download-artifact in test_report.yml

### What changes were proposed in this pull request?

This PR proposes to remove the usage of my own forks and use the original plugins in GitHub Actions testing report.

SPARK-32357 introduced the GitHub Actions test reporting by leveraging two plugins:
 - [ScaCap/action-surefire-report](https://github.com/ScaCap/action-surefire-report)
 - [dawidd6/action-download-artifact](https://github.com/dawidd6/action-download-artifact)

In order to make it working, it had to fork two repositories with custom fixes:
  - HyukjinKwon/action-surefire-reportc96094c
  - HyukjinKwon/action-download-artifact@f86c565

The two custom fixes are thankfully merged at ScaCap/action-surefire-report#14 and dawidd6/action-download-artifact#24, and they released new ones to use at [ScaCap/action-surefire-report/commits/v1](https://github.com/ScaCap/action-surefire-report/commits/v1) and [dawidd6/action-download-artifact/commits/v2](https://github.com/dawidd6/action-download-artifact/commits/v2)  - thanks jmisur and dawidd6 again.

### Why are the changes needed?

To avoid relying on forks and code duplications.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Logically there is no diff. I tested it at https://github.com/HyukjinKwon/spark/runs/992824229 for doubly sure.

NOTE that this PR cannot be tested here within the workflow triggered by this PR without merging the changes in `test_report.yml` into the master.

Closes apache#29449 from HyukjinKwon/SPARK-32606-SPARK-32605.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?

This PR aims to recover Java 11 build in `GitHub Action`.

### Why are the changes needed?

This test coverage is removed before. Now, it's time to recover it.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the GitHub Action.

Closes apache#29295 from dongjoon-hyun/SPARK-32248.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@HyukjinKwon HyukjinKwon changed the title [DO-NOT-MERGE][SPARK-32249][3.0] Run Github Actions builds in other branches as well [SPARK-32249][3.0] Run Github Actions builds in other branches as well Aug 19, 2020
@HyukjinKwon
Copy link
Member Author

I just turned this as single PR.

### What changes were proposed in this pull request?

This PR renames `master.yml` to `build_and_test.yml` to indicate this is the workflow that builds and runs the tests.

### Why are the changes needed?

Just for readability. `master.yml` looks like the name of the branch (to me).

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions build in this PR will test it out.

Closes apache#29459 from HyukjinKwon/minor-rename.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
### What changes were proposed in this pull request?

This PR proposes to upload `target/unit-tests.log` into the artifact so it will be able to download here:
![Screen Shot 2020-08-18 at 2 23 18 PM](https://user-images.githubusercontent.com/6477701/90474095-789e3b80-e15f-11ea-87f8-e7da3df3c03e.png)

### Why are the changes needed?

Jenkins has this feature. It should be best to have the same dev functionalities with it.
Also, note that this was pointed out apache#29225 (comment).

### Does this PR introduce _any_ user-facing change?

No, dev-only

### How was this patch tested?

https://github.com/apache/spark/actions/runs/213000777 should demonstrate it

Closes apache#29454 from HyukjinKwon/SPARK-32645.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Aug 19, 2020

Should be ready to be reviewed or go merging.
Arrow related SparkR tests which will be fixed at #29462.

@SparkQA

This comment has been minimized.

@HyukjinKwon HyukjinKwon changed the title [SPARK-32249][3.0] Run Github Actions builds in other branches as well [SPARK-32249][3.0] Run Github Actions builds in branch-3.0 Aug 19, 2020
@HyukjinKwon HyukjinKwon changed the title [SPARK-32249][3.0] Run Github Actions builds in branch-3.0 [SPARK-32249][INFRA][3.0] Run Github Actions builds in branch-3.0 Aug 19, 2020
@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@HyukjinKwon
Copy link
Member Author

retest this please

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you so much always.
Yep. We can proceed R testing separately.

dongjoon-hyun added a commit that referenced this pull request Aug 19, 2020
### What changes were proposed in this pull request?

This PR proposes to backport the following JIRAs:

- SPARK-32245
- SPARK-32292
- SPARK-32252
- SPARK-32316
- SPARK-32408
- SPARK-32303
- SPARK-32363
- SPARK-32419
- SPARK-32422
- SPARK-32491
- SPARK-32493
- SPARK-32496
- SPARK-32497
- SPARK-32357
- SPARK-32606
- SPARK-32605
- SPARK-32248
- SPARK-32645
- Minor renaming d0dfe49#diff-02d9c370a663741451423342d5869b21

in order to enable GitHub Actions in branch-3.0.

### Why are the changes needed?

To be able to run the tests in branch-3.0. Jenkins jobs are unstable.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Build in this PR will test.

Closes #29460 from HyukjinKwon/SPARK-32249.

Lead-authored-by: HyukjinKwon <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
Co-authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Merged to branch-3.0.

@HyukjinKwon
Copy link
Member Author

Thanks @dongjoon-hyun!

@SparkQA
Copy link

SparkQA commented Aug 19, 2020

Test build #127647 has finished for PR 29460 at commit d029dba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon deleted the SPARK-32249 branch December 7, 2020 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants