[SPARK-28247][SS][TEST] Fix flaky test "query without test harness" on ContinuousSuite #32316
cc @jose-torres
Kubernetes integration test unable to build dist. exiting with code: 1
Test build #137872 has finished for PR 32316 at commit
It seems to fail the Scala 2.13 build.
viirya
left a comment
Looks reasonable.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #137896 has finished for PR 32316 at commit
cc @HeartSaVioR FYI
HeartSaVioR
left a comment
LGTM, thanks for the fix!
Would we like to wait for @jose-torres to do the final review (and probably sign off), or is it OK to go ahead and merge?
jose-torres
left a comment
LGTM
OK, thanks everyone for reviewing. I'm going to merge this.
…n ContinuousSuite

### What changes were proposed in this pull request?

This is another attempt to fix the flaky test "query without test harness" on ContinuousSuite.

`query without test harness` is flaky because it starts a continuous query with two partitions but assumes they will run at the same speed.

In this test, 0 and 2 will be written to partition 0, 1 and 3 will be written to partition 1. It assumes when we see 3, 2 should be written to the memory sink. But this is not guaranteed. We can add `if (currentValue == 2) Thread.sleep(5000)` at this line https://github.com/apache/spark/blob/b2a2b5d8206b7c09b180b8b6363f73c6c3fdb1d8/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala#L135 to reproduce the failure: `Result set Set([0], [1], [3]) are not a superset of Set(0, 1, 2, 3)!`

The fix is changing `waitForRateSourceCommittedValue` to wait until all partitions reach the desired values before stopping the query.

### Why are the changes needed?

Fix a flaky test.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests. Manually verify the reproduction I mentioned above doesn't fail after this change.

Closes #32316 from zsxwing/SPARK-28247-fix.

Authored-by: Shixiong Zhu <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
(cherry picked from commit 0df3b50)
Signed-off-by: Jungtaek Lim <[email protected]>
Thanks @zsxwing for the fix! I merged this into master/3.1/3.0. I skipped 2.4, as it's unlikely that we'll want to maintain the 2.4 version line further.
What changes were proposed in this pull request?
This is another attempt to fix the flaky test "query without test harness" on ContinuousSuite.
`query without test harness` is flaky because it starts a continuous query with two partitions but assumes they will run at the same speed.
In this test, 0 and 2 are written to partition 0, and 1 and 3 are written to partition 1. The test assumes that by the time we see 3, 2 has already been written to the memory sink, but this is not guaranteed. We can add `if (currentValue == 2) Thread.sleep(5000)` at line 135 of spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala (commit b2a2b5d) to reproduce the failure:
`Result set Set([0], [1], [3]) are not a superset of Set(0, 1, 2, 3)!`
The fix is changing `waitForRateSourceCommittedValue` to wait until all partitions reach the desired values before stopping the query.
Why are the changes needed?
Fix a flaky test.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests. I also manually verified that the reproduction described above no longer fails after this change.
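To illustrate the race described above, here is a minimal, Spark-free sketch. It is not the actual `waitForRateSourceCommittedValue` implementation; the object name `PartitionWaitSketch` and the helpers `sawValue`, `allPartitionsReached`, and `waitUntil` are hypothetical names introduced only for this example. It models the committed value of each partition as a `Map[Int, Long]` and contrasts a check that is satisfied as soon as one value is observed with a check that requires every partition to reach its own desired value.

```scala
object PartitionWaitSketch {
  // Hypothetical "old style" check: satisfied as soon as ANY partition has
  // committed the desired value, regardless of how far the others have got.
  def sawValue(committed: Map[Int, Long], desired: Long): Boolean =
    committed.values.exists(_ >= desired)

  // Hypothetical "fixed style" check: every partition listed in `desired`
  // must have committed at least its own desired value.
  def allPartitionsReached(
      committed: Map[Int, Long],
      desired: Map[Int, Long]): Boolean =
    desired.forall { case (partition, value) =>
      committed.get(partition).exists(_ >= value)
    }

  // Simple polling helper (an assumption for this sketch, not a Spark API):
  // re-evaluates `condition` until it holds or the timeout expires.
  def waitUntil(timeoutMs: Long, pollMs: Long = 10L)(condition: => Boolean): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var ok = condition
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(pollMs)
      ok = condition
    }
    ok
  }

  def main(args: Array[String]): Unit = {
    // Partition 0 is slow and has only committed 0; partition 1 has committed 3.
    val committed = Map(0 -> 0L, 1 -> 3L)
    val desired = Map(0 -> 2L, 1 -> 3L)

    println(sawValue(committed, 3L))                  // true: 3 was seen, but 2 is missing
    println(allPartitionsReached(committed, desired)) // false: partition 0 has not reached 2
  }
}
```

With the first check, the query could be stopped while partition 0 is still behind, which matches the `Set([0], [1], [3])` failure above. The second check only passes once both partitions have caught up.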