[SPARK-48329][SQL] Enable spark.sql.sources.v2.bucketing.pushPartValues.enabled by default
#46673
sql/core/src/test/scala/org/apache/spark/sql/connector/KeyGroupedPartitioningSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala
The AS-IS PR title seems to be partial because this PR changes two configurations.
[SPARK-48329][SQL] SPJ: Default spark.sql.sources.v2.bucketing.pushPartValues.enabled to true
Also, please resolve the conflicts and revert the test coverage change, @szehon-ho .
Thanks @dongjoon-hyun, sorry the PR was not ready. I was trying to integrate the changes from @superdiaodiao, who I saw had also made a PR for the same change, so that we can be co-authors. I reverted the additional config change and test case, and will check the test result.
I will close my PR and you can continue; let's co-author this time.
Thanks! @dongjoon-hyun @sunchao can you take another look?
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @szehon-ho and @superdiaodiao
Merged to master for Apache Spark 4.0.0.
Thank you~~~
BTW, @szehon-ho and @superdiaodiao, the umbrella JIRA was already closed for Apache Spark 3.3.0. I moved SPARK-48329 to a subtask of SPARK-44111 for now.
For the other new tasks, please create a new umbrella JIRA and use it.
OK
@dongjoon-hyun do you have any guidance on how we could set up a new Spark 4.0+ tracking JIRA to link all the new SPJ items? I think it would be nice to have all of them in one list.
Feel free to create a new umbrella JIRA issue with a proper, meaningful title instead of duplicating the old JIRA issue title. Then link it to SPARK-44111, @szehon-ho. That's enough.
### What changes were proposed in this pull request?
Add docs for SPJ

### Why are the changes needed?
There are no docs describing SPJ, even though it is mentioned in migration notes: #46673

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Checked the new text

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #46745 from szehon-ho/doc_spj.

Authored-by: Szehon Ho <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>

What changes were proposed in this pull request?
This PR aims to enable spark.sql.sources.v2.bucketing.pushPartValues.enabled by default for Apache Spark 4.0.0 while keeping spark.sql.sources.v2.bucketing.enabled as false.
Why are the changes needed?
spark.sql.sources.v2.bucketing.pushPartValues.enabled was added in Apache Spark 3.4.0 and has been used as one of the datasource v2 bucketing features. This PR will help datasource v2 bucketing users use this feature more easily.
Note that this change is technically a no-op for users on the default configuration because spark.sql.sources.v2.bucketing.enabled is still false.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Pass the CIs.
Was this patch authored or co-authored using generative AI tooling?
No
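
For readers who want to see how the two settings relate, here is a minimal sketch in Scala. It assumes a local session with the spark-sql dependency on the classpath; the object name and app name are hypothetical and not part of this PR. Because spark.sql.sources.v2.bucketing.enabled stays false by default, the sketch turns it on explicitly and then prints the effective value of both keys rather than assuming them.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, for illustration only.
object SpjConfigSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spj-config-sketch") // hypothetical app name
      .master("local[*]")           // assumption: local run for experimentation
      // Still false by default, so it has to be enabled explicitly.
      .config("spark.sql.sources.v2.bucketing.enabled", "true")
      .getOrCreate()

    // Print the effective values; the second key is the one whose default
    // this PR changes for Apache Spark 4.0.0.
    Seq(
      "spark.sql.sources.v2.bucketing.enabled",
      "spark.sql.sources.v2.bucketing.pushPartValues.enabled"
    ).foreach { key =>
      println(s"$key = ${spark.conf.get(key)}")
    }

    spark.stop()
  }
}
```

On releases before 4.0.0, the second key would likely also need to be set explicitly, for example with spark.conf.set, to get the same behavior.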