[SPARK-30719][SQL] do not log warning if AQE is intentionally skipped and add a config to force apply #27452
Conversation
val ADAPTIVE_EXECUTION_FORCE_APPLY = buildConf("spark.sql.adaptive.forceApply")
  .internal()
  .doc("Adaptive query execution is skipped when the query does not have exchanges or " +
    "subqueries. By setting this config to true, Spark will be forced to apply adaptive " +
nit: reword as "Spark will force apply adaptive query execution for all supported queries."
// only know if the query may contain exchange or not by checking
// `SparkPlan.requiredChildDistribution`.
// - The query contains sub-query.
private def shouldApplyAQE(plan: SparkPlan, isSubquery: Boolean): Boolean = {
How about we just move the logic of mayContainExchange and containSubquery here and get rid of those methods?
// `SparkPlan.requiredChildDistribution`.
// - The query contains sub-query.
private def shouldApplyAQE(plan: SparkPlan, isSubquery: Boolean): Boolean = {
  conf.getConf(SQLConf.ADAPTIVE_EXECUTION_FORCE_APPLY) || isSubquery ||
nit: we could move conf.adaptiveExecutionEnabled here too.
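For readers following this thread, the decision under review can be sketched outside Spark roughly as below. This is a toy model under stated assumptions, not the code in the PR: `PlanNode`, its fields, the `Distribution` objects, and the `forceApply` parameter are hypothetical stand-ins for `SparkPlan`, `requiredChildDistribution`, and `SQLConf.ADAPTIVE_EXECUTION_FORCE_APPLY`. It only illustrates the idea of folding the "may contain exchange" and "contains sub-query" checks into one traversal.

```scala
// Hypothetical stand-ins; these are NOT Spark's real classes.
sealed trait Distribution
case object Unspecified extends Distribution
case object Clustered extends Distribution

final case class PlanNode(
    name: String,
    isExchange: Boolean = false,
    requiredChildDistribution: Seq[Distribution] = Seq.empty,
    hasSubqueryExpression: Boolean = false,
    children: Seq[PlanNode] = Seq.empty) {

  // Pre-order search over the tree; returns the first node matching `p`.
  def find(p: PlanNode => Boolean): Option[PlanNode] =
    if (p(this)) Some(this)
    else children.foldLeft(Option.empty[PlanNode])((acc, c) => acc.orElse(c.find(p)))
}

object AqeDecision {
  // AQE is worth applying when it is forced, when the plan is itself a
  // sub-query, or when some node has or may later need an exchange, or
  // carries a sub-query expression.
  def shouldApplyAQE(plan: PlanNode, isSubquery: Boolean, forceApply: Boolean): Boolean =
    forceApply || isSubquery || plan.find { node =>
      node.isExchange ||
        !node.requiredChildDistribution.forall(_ == Unspecified) ||
        node.hasSubqueryExpression
    }.isDefined

  def main(args: Array[String]): Unit = {
    val scanOnly = PlanNode("Scan")
    val join = PlanNode(
      "Join",
      requiredChildDistribution = Seq(Clustered, Clustered),
      children = Seq(PlanNode("Scan"), PlanNode("Scan")))
    println(shouldApplyAQE(scanOnly, isSubquery = false, forceApply = false))
    println(shouldApplyAQE(join, isSubquery = false, forceApply = false))
    println(shouldApplyAQE(scanOnly, isSubquery = false, forceApply = true))
  }
}
```

In this model a scan-only plan is skipped unless `forceApply` is set, while a join that requires a clustered child distribution qualifies because an exchange may be added later.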
Test build #117833 has finished for PR 27452 at commit
maryannxue left a comment
LGTM
Test build #117895 has finished for PR 27452 at commit
Test build #117884 has finished for PR 27452 at commit
LGTM.
retest this please
Test build #117917 has finished for PR 27452 at commit
retest this please
Test build #117931 has finished for PR 27452 at commit
  .booleanConf
  .createWithDefault(false)

val ADAPTIVE_EXECUTION_FORCE_APPLY = buildConf("spark.sql.adaptive.forceApply")
spark.sql.adaptive.forceApply.enabled?
cc @gatorsmile
No... that sounds weird.
@maryannxue, the .enabled suffix is a general guideline for boolean flags from @gatorsmile.
No... that sounds weird.
It's usually xxx.featureName.enabled, but forceApply is a verb, so the suffix does not fit. Compare, for example, spark.sql.join.preferSortMergeJoin.
Oh. Got it. I misunderstood the rule at that part until now. My bad. Thank you, @cloud-fan and @maryannxue .
Do we have a policy that a config name must be xxx.featureName.enabled? At least for internal configs, we follow the PR author's personal preference, AFAIK.
@gatorsmile do we need to add the ".enabled" suffix to all boolean configs?
After this PR, it would be great to have a documented policy for this, @gatorsmile and @cloud-fan.
Thank you for the references, @HyukjinKwon .
  .internal()
  .doc("Adaptive query execution is skipped when the query does not have exchanges or " +
    "subqueries. By setting this config to true, Spark will force apply adaptive query " +
    "execution for all supported queries.")
To be clear, shall we say something like "By setting both spark.sql.adaptive.enabled and this config to true, ..."?
By setting this config (together with spark.sql.adaptive.enabled) to true
That sounds good.
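As a rough illustration of the wording agreed above, the two settings could be combined in a `spark-shell` session along these lines. This is a sketch assuming a Spark build that includes this PR, with `spark` being the shell's predefined `SparkSession`; the sample query is only a placeholder chosen because it has no exchange or sub-query.

```scala
// Assumes spark-shell, where `spark` is the predefined SparkSession.
// forceApply only matters when adaptive execution itself is enabled.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.forceApply", "true")

// A query with no exchange or sub-query, which AQE would otherwise skip.
// Inspect the printed plan to see whether an adaptive plan node wraps it.
spark.range(10).filter("id > 5").explain()
```

Since the config is marked `.internal()`, it is presumably meant for tests and debugging rather than for everyday tuning.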
dongjoon-hyun left a comment
@cloud-fan, could you revise the PR title? Currently it reads like a warning suppression, but technically we are also adding a new configuration, spark.sql.adaptive.forceApply.
Test build #117975 has finished for PR 27452 at commit
retest this please
Test build #117982 has finished for PR 27452 at commit
// - The query may need to add exchanges. It's an overkill to run `EnsureRequirements` here, so
//   we just check `SparkPlan.requiredChildDistribution` and see if it's possible that
//   the query needs to add exchanges later.
// - The query contains sub-query.
This is much clearer than the original code.
gatorsmile left a comment
@JkSelf Could you write a test suite to test whether we are correctly logging a warning?
… and add a config to force apply

### What changes were proposed in this pull request?
Update `InsertAdaptiveSparkPlan` to not log a warning if AQE is skipped intentionally. This PR also adds a config to not skip AQE.

### Why are the changes needed?
It's not a warning at all if we intentionally skip AQE.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Ran `AdaptiveQueryExecSuite` locally and verified that there are no warning logs.

Closes #27452 from cloud-fan/aqe.
Authored-by: Wenchen Fan
Signed-off-by: Xiao Li
(cherry picked from commit 8ce5862)
Signed-off-by: Xiao Li
Thanks! Merged to master/3.0
@gatorsmile Add unit test in #27515.
… intentionally skip AQE

### What changes were proposed in this pull request?
This is a follow-up to #27452. Add a unit test to verify whether the warning is logged when AQE is skipped intentionally.

### Why are the changes needed?
Add a unit test.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
By adding a unit test.

Closes #27515 from JkSelf/aqeLoggingWarningTest.
Authored-by: jiake
Signed-off-by: Wenchen Fan
(cherry picked from commit 5a24060)
Signed-off-by: Wenchen Fan
What changes were proposed in this pull request?
Update `InsertAdaptiveSparkPlan` to not log a warning if AQE is skipped intentionally. This PR also adds a config to not skip AQE.
Why are the changes needed?
It's not a warning at all if we intentionally skip AQE.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Ran `AdaptiveQueryExecSuite` locally and verified that there are no warning logs.
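The logging behaviour described above can be modelled without Spark roughly as follows. This is a minimal sketch under stated assumptions: `InsertAqeSketch`, `Outcome`, the in-memory `warnings` buffer, and the message text are all hypothetical and stand in for `InsertAdaptiveSparkPlan` and its logger. The point is only that an intentional skip stays silent while an unsupported query still produces a warning.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical model; not the real InsertAdaptiveSparkPlan rule.
object InsertAqeSketch {
  sealed trait Outcome
  case object Applied extends Outcome
  case object Skipped extends Outcome      // intentional skip, no warning
  case object Unsupported extends Outcome  // wanted AQE but could not apply it

  // Warnings are collected in memory so a test can assert on them.
  val warnings: ArrayBuffer[String] = ArrayBuffer.empty

  def insert(aqeEnabled: Boolean, shouldApply: Boolean, supported: Boolean): Outcome =
    if (!aqeEnabled || !shouldApply) {
      // AQE is off, or the query has nothing for AQE to optimize: stay silent.
      Skipped
    } else if (supported) {
      Applied
    } else {
      warnings += "adaptive execution is enabled but is not supported for this query"
      Unsupported
    }
}
```

A unit test in the spirit of the one requested in this thread could then check that `warnings` stays empty after an intentional skip and grows by one entry after an unsupported query.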