[SPARK-24653][tests] Avoid cross-job pollution in TestUtils / SpillListener. #21639

vanzin · 2018-06-25T22:06:08Z

There is a narrow race in this code that is caused when the code being
run in assertSpilled / assertNotSpilled runs more than a single job.

SpillListener assumed that only a single job was run, and so would only
block waiting for that single job to finish when numSpilledStages was
called. But some tests (like SQL tests that call checkAnswer) run more
than one job, and so that wait was basically a no-op.

As a result, a listener installed by the next test could receive events
from the previous test's jobs, which could cause test failures in certain cases.

The change fixes that race, and also uninstalls listeners after the
test runs, so they don't accumulate when the SparkContext is shared
among multiple tests.
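
The install / run / drain / remove pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this patch: `ToyBus`, `Listener`, and `CountingListener` are hypothetical stand-ins for Spark's listener bus and `SparkListener`, and the 10-second timeout is an assumed value. Only the name and shape of `withListener` are taken from the diff under review.

```scala
import java.util.concurrent.{CopyOnWriteArrayList, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object ListenerSketch {
  // Hypothetical stand-in for SparkListener.
  trait Listener { def onEvent(event: String): Unit }

  // Hypothetical stand-in for an asynchronous listener bus.
  final class ToyBus {
    private val listeners = new CopyOnWriteArrayList[Listener]()
    private val pending = new AtomicInteger(0)
    private val worker = Executors.newSingleThreadExecutor()

    def add(l: Listener): Unit = listeners.add(l)
    def remove(l: Listener): Unit = listeners.remove(l)

    // Events are delivered later on a worker thread, so a caller can
    // return before its events have reached the listeners.
    def post(event: String): Unit = {
      pending.incrementAndGet()
      worker.execute(new Runnable {
        def run(): Unit = {
          try {
            val it = listeners.iterator()
            while (it.hasNext) it.next().onEvent(event)
          } finally {
            pending.decrementAndGet()
          }
        }
      })
    }

    // Block until every posted event has been delivered, or time out.
    def waitUntilEmpty(timeoutMs: Long): Unit = {
      val deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs)
      while (pending.get() > 0) {
        if (System.nanoTime() > deadline) {
          throw new RuntimeException("bus did not drain in time")
        }
        Thread.sleep(1)
      }
    }

    def stop(): Unit = worker.shutdown()
  }

  // Install the listener, run the body, then drain the bus and remove the
  // listener even if the body throws. Draining before removal covers every
  // job the body ran, not just one.
  def withListener[L <: Listener](bus: ToyBus, listener: L)(body: L => Unit): Unit = {
    bus.add(listener)
    try {
      body(listener)
    } finally {
      bus.waitUntilEmpty(10000L) // assumed timeout
      bus.remove(listener)
    }
  }

  final class CountingListener extends Listener {
    val seen = new AtomicInteger(0)
    def onEvent(event: String): Unit = seen.incrementAndGet()
  }

  def main(args: Array[String]): Unit = {
    val bus = new ToyBus
    val first = new CountingListener
    // Two "jobs" inside one body: both events are drained before removal.
    withListener(bus, first) { _ => bus.post("job-1"); bus.post("job-2") }
    val second = new CountingListener
    withListener(bus, second) { _ => bus.post("job-3") }
    assert(first.seen.get() == 2)
    assert(second.seen.get() == 1) // no leftover events from the first body
    bus.stop()
  }
}
```

Because the drain happens in the `finally` block before the listener is removed, a later listener on the same shared bus should not observe events left over from an earlier body, which is the cross-test pollution this change is meant to avoid.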

SparkQA · 2018-06-26T02:35:56Z

Test build #92312 has finished for PR 21639 at commit 9f0a9c4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

jiangxb1987 · 2018-06-26T15:10:59Z

Seems the JIRA number is not related?

jiangxb1987 · 2018-06-26T15:20:48Z

core/src/main/scala/org/apache/spark/TestUtils.scala

-    body
-    assert(spillListener.numSpilledStages > 0, s"expected $identifier to spill, but did not")
+    withListener(sc, new SpillListener) { listener =>
+      val ret = body


Maybe I'm missing something obvious, but why do we need the return value here?

I saw the return type in the closure, but the method itself returns Unit, so all that can be cleaned up.

vanzin · 2018-06-26T16:07:53Z

Oops, no idea how I got the wrong bug.

SparkQA · 2018-06-26T20:37:05Z

Test build #92345 has finished for PR 21639 at commit 18d5ebf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

squito · 2018-07-02T18:40:21Z

lgtm

HyukjinKwon · 2018-07-16T02:41:38Z

retest this please

SparkQA · 2018-07-16T07:05:01Z

Test build #93063 has finished for PR 21639 at commit 18d5ebf.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-07-16T07:15:28Z

retest this please

SparkQA · 2018-07-16T10:39:37Z

Test build #93091 has finished for PR 21639 at commit 18d5ebf.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

squito · 2018-07-17T17:06:18Z

retest this please

SparkQA · 2018-07-17T17:08:41Z

Test build #93185 has started for PR 21639 at commit 18d5ebf.

srowen · 2018-07-31T22:00:41Z

core/src/main/scala/org/apache/spark/TestUtils.scala

+   * this method will wait until all events posted to the listener bus are processed, and then
+   * remove the listener from the bus.
+   */
+  def withListener[L <: SparkListener](sc: SparkContext, listener: L) (body: L => Unit): Unit = {


private? hardly matters.

SparkQA · 2018-08-01T01:52:07Z

Test build #4226 has finished for PR 21639 at commit 18d5ebf.

This patch fails Spark unit tests.
This patch does not merge cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-08-01T02:50:24Z

retest this please

SparkQA · 2018-08-01T07:30:26Z

Test build #93861 has finished for PR 21639 at commit 18d5ebf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2018-08-01T07:47:28Z

Merged to master

jiangxb1987 reviewed Jun 26, 2018

vanzin changed the title ~~[SPARK-24631][tests] Avoid cross-job pollution in TestUtils / SpillListener.~~ [SSPARK-24653][tests] Avoid cross-job pollution in TestUtils / SpillListener. Jun 26, 2018

Clean up return types.

18d5ebf

vanzin changed the title ~~[SSPARK-24653][tests] Avoid cross-job pollution in TestUtils / SpillListener.~~ [SPARK-24653][tests] Avoid cross-job pollution in TestUtils / SpillListener. Jun 26, 2018

srowen approved these changes Jul 31, 2018

asfgit closed this in 1122754 Aug 1, 2018

vanzin deleted the SPARK-24653 branch August 24, 2018 19:54
