-
Notifications
You must be signed in to change notification settings - Fork 1.8k
experiment: Selectively remove CoalesceBatchesExec #15479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The partial aggregation can switch to a pass-through mode. In this case, coalescing might make sense
|
That makes a lot of sense! |
|
|
||
| /// Remove CoalesceBatchesExec that are in front of a AggregateExec | ||
| #[derive(Default, Debug)] | ||
| pub struct UnCoalesceBatches {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we instead can avoid adding them in the CoalesceBatches optimizer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly something to attempt. I've not done it (yet) because it's not necessary to evaluate the impact of this change
|
I think you can generalize this logic by tracking the |
|
BTW thank you very much @ctsk -- it is really cool to see the joins get some careful love and attention ❤️ |
|
Here is my result after removing |
I can give some guidance on that BTW, if you prefer so |
|
Marking as draft to signify the implementation is not finished yet Thanks for the benchmark @Rachelint. Did you use the implementation in this PR or write your own? Overall the impact seems small (still good for how little it takes to implement!). I think it's worth investigating if a Coalesce with a different threshold makes sense. Thanks for the offer @berkaysynnada. I think I know what to do, but haven't found the time the past few days. I'll give it a try later today. |
It looks to me like clickbench results from bench.sh: https://github.com/apache/datafusion/blob/main/benchmarks/README.md |
|
@berkaysynnada I had a look at I could add a blanket rule for other plans - potentially outside of datafusion repo - that removes the coalesce for each child of a plan that does not have EmissionType::Incremental. Unfortunately this does not cover the rule for the aggregation: Here I purposefully kept the CoalesceBatchesExec underneath partial aggregations, because those can switch to passing-through batches without aggregating. So far, this PR is a lot of reasoning what might make sense, but in the end it's down to measuring the impact for each operator. I plan on trying different coalesce thresholds tomorrow and see what works best. |
I understand why it doesn't fit in this use case. Maybe we should have another API for operators like pipeline_behavior: |
|
I think this is a nice experiment. |
|
Sorry for the silence. I've been extensively benchmarking this PR and the results have been fairly mixed. I've also tried different thresholds for coalescing. I plan on generating tables and push the results later today. |
|
benchmarks/hashagg-results.md I've checked in the results because I think they would be too large to include as a comment. Each file contains the results of reducing the coalesce threshold for a single operator - joins, hash aggregations, and sorts. Coalescing before all other operators remains unchanged. The value behind each configuration describes what the coalesce threshold was set to: SORT0 means that CoalesceBatches operators were fully removed, whereas SORT256 means that the CoalesceBatches operator in front of a SORT was configured to emit a batch once it had 256 rows buffered. The same applies to joins and hash aggregations. The benchmarks were run with 16 target partitions. I suspect that the more target partitions there are, the smaller the batches produced by RepartitionExec become. Therefore, removing coalesce might work better with smaller target partition counts (for hash aggregation and joins). |
Relates to Issue: #15478
Rationale for this change
The blocking operators (HJ buid side, Aggregation) are often planned on top of a RepartitionExec with a CoalesceBatchesExec in-between. However, one of the first things these operators do is concatenate the freshly CoalescedBatches.
This PR is to test if the overhead of the 2-step coalesce+concat outweighs the gains of fewer dispatches of the consuming operators.
What changes are included in this PR?
This PR adds a physical optimizer rule
UncoalesceBatches. It runs after theCoalesceBatchesrule and removesCoalesceBatchesExecthat are at the build side of HashJoins and in front of non-partial aggregationsAre these changes tested?
Not yet!
Are there any user-facing changes?
Yes.