[SPARK-32452][R][SQL] Bump up the minimum Arrow version as 1.0.0 in SparkR #29253
Conversation
Test build #126601 has finished for PR 29253 at commit
appveyor.yml (Outdated)

```yaml
# This environment variable works around to test SparkR against a higher version.
R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
# AppVeyor does not have python3 yet which is used by default.
PYSPARK_PYTHON: python
```
I am adding this change together to fix AppVeyor build.
Test build #126604 has finished for PR 29253 at commit
```diff
  e1071,
  survival,
- arrow (>= 0.15.1)
+ arrow (>= 1.0.0)
```
Oh, nice. Are we going to drop Arrow 0.x officially?
Yup, I think at least it's okay for SparkR for now.
dongjoon-hyun left a comment
+1, LGTM. It would be great if we can enforce Arrow 1.0 for all languages.
appveyor.yml (Outdated)

```yaml
# AppVeyor does not have python3 yet which is used by default.
PYSPARK_PYTHON: python
# TODO(SPARK-32453): Remove SPARK_SCALA_VERSION environment and let load-spark-env scripts detect it.
SPARK_SCALA_VERSION: 2.12
```
Just a question. Is this enough to fix AppVeyor build?
I'm testing now.. :). If it needs a more complicated fix, I'll just revert this bit and merge.
I love to have this patch on
Test build #126607 has finished for PR 29253 at commit
I am going to investigate AppVeyor test failures at HyukjinKwon#13
I am merging this since the tests already passed. The last commit just reverted some additional changes.
Thank you @viirya and @dongjoon-hyun for reviewing it. Merged to master.
Thanks!
What changes were proposed in this pull request?
This PR proposes to set the minimum Arrow version to 1.0.0 to minimise the maintenance overhead and keep the minimum version up to date.
Other required changes to support 1.0.0 were already made in SPARK-32451.
Why are the changes needed?
On the R side, the community rather aggressively encourages users to adopt the latest Arrow version, and SparkR vectorization is very experimental, having only been added in Spark 3.0.
Also, we're technically not testing old Arrow versions in SparkR for now.
Does this PR introduce any user-facing change?
Yes, users wouldn't be able to use SparkR's Arrow optimization with Arrow versions older than 1.0.0.
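To illustrate what this floor means for users, here is a sketch in R of the kind of version guard that now applies. This is an illustration only: the variable names are hypothetical and the exact error message SparkR raises may differ.

```r
# Sketch (hypothetical names): reject an installed 'arrow' package older
# than the new 1.0.0 minimum before enabling SparkR vectorization.
minimum_arrow_version <- "1.0.0"
if (requireNamespace("arrow", quietly = TRUE)) {
  installed <- utils::packageVersion("arrow")
  if (installed < minimum_arrow_version) {
    stop(sprintf("arrow >= %s is required for SparkR vectorization; found %s",
                 minimum_arrow_version, installed))
  }
}
```

Users on an older Arrow can upgrade with `install.packages("arrow")`.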
How was this patch tested?
GitHub Actions and AppVeyor are already testing these changes.