Skipping partial aggregation when it is not helping for high cardinality aggregates #11627
Merged
5 commits
b54fa14 rfc: optional skipping partial aggregation (korowa)
f7db8e3 benchmarks for convert_to_state (korowa)
8549c60 speeding up conversion to state (korowa)
8913fcf Fix MSRV error on 1.76.0 (alamb)
b3c033f Improve aggregatation documentation (alamb)
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -62,10 +62,12 @@ pub(crate) enum ExecutionState { | |||||
| /// When producing output, the remaining rows to output are stored | ||||||
| /// here and are sliced off as needed in batch_size chunks | ||||||
| ProducingOutput(RecordBatch), | ||||||
| - /// Indicates that GroupedHashAggregateStream should produce | ||||||
| - /// intermediate aggregate state for each input rows without | ||||||
| - /// their aggregation | ||||||
| + /// Produce intermediate aggregate state for each input row without | ||||||
| + /// aggregation. | ||||||
| + /// | ||||||
| + /// See "partial aggregation" discussion on [`GroupedHashAggregateStream`] | ||||||
| SkippingAggregation, | ||||||
| /// All input has been consumed and all groups have been emitted | ||||||
| Done, | ||||||
| } | ||||||
|
|
||||||
|
|
@@ -94,6 +96,9 @@ struct SpillState { | |||||
| merging_group_by: PhysicalGroupBy, | ||||||
| } | ||||||
|
|
||||||
| + /// Tracks if the aggregate should skip partial aggregations | ||||||
| + /// | ||||||
| + /// See "partial aggregation" discussion on [`GroupedHashAggregateStream`] | ||||||
| struct SkipAggregationProbe { | ||||||
| /// Number of processed input rows | ||||||
|
| input_rows: usize, | ||||||
|
|
@@ -204,7 +209,7 @@ impl SkipAggregationProbe { | |||||
| /// of `x` and one accumulator for `SUM(y)`, specialized for the data | ||||||
| /// type of `y`. | ||||||
| /// | ||||||
| - /// # Description | ||||||
| + /// # Discussion | ||||||
| /// | ||||||
| /// [`group_values`] does not store any aggregate state inline. It only | ||||||
| /// assigns "group indices", one for each (distinct) group value. The | ||||||
|
|
@@ -222,7 +227,25 @@ impl SkipAggregationProbe { | |||||
| /// | ||||||
| /// [`group_values`]: Self::group_values | ||||||
| /// | ||||||
| - /// # Spilling | ||||||
| + /// # Partial Aggregate and multi-phase grouping | ||||||
| + /// | ||||||
| + /// As described on [`Accumulator::state`], this operator is used in the context | ||||||
| + /// of "multi-phase" grouping when the mode is [`AggregateMode::Partial`]. | ||||||
| + /// | ||||||
| + /// An important optimization for multi-phase partial aggregation is to skip | ||||||
| + /// partial aggregation when it is not effective enough to warrant the memory or | ||||||
| + /// CPU cost, as is often the case for queries with many distinct groups (high | ||||||
| + /// cardinality group by). Memory is particularly important because each Partial | ||||||
| + /// aggregator must store the intermediate state for each group. | ||||||
| + /// | ||||||
| + /// If the ratio of the number of groups to the number of input rows exceeds a | ||||||
| + /// threshold, and [`GroupsAccumulator::supports_convert_to_state`] is | ||||||
| + /// supported, this operator will stop applying Partial aggregation and directly | ||||||
| + /// pass the input rows to the next aggregation phase. | ||||||
| + /// | ||||||
| + /// [`Accumulator::state`]: datafusion_expr::Accumulator::state | ||||||
| + /// | ||||||
| + /// # Spilling (to disk) | ||||||
| /// | ||||||
| /// The sizes of group values and accumulators can become large. Before that causes out of memory, | ||||||
| /// this hash aggregator outputs partial states early for partial aggregation or spills to local | ||||||
|
|
@@ -344,7 +367,7 @@ pub(crate) struct GroupedHashAggregateStream { | |||||
| group_values_soft_limit: Option<usize>, | ||||||
|
|
||||||
| /// Optional probe for skipping data aggregation, if supported by | ||||||
| - /// current stream | ||||||
| + /// current stream. | ||||||
| skip_aggregation_probe: Option<SkipAggregationProbe>, | ||||||
| } | ||||||
|
|
||||||
|
|
||||||
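The ratio check described in the new "Partial Aggregate and multi-phase grouping" doc section can be sketched in isolation. This is a minimal illustration, not the code from this PR: only the `input_rows` field is visible in the diff, so `SkipProbeSketch`, `num_groups`, `min_rows`, `ratio_threshold`, and the `update` method are all hypothetical names chosen for the example. It assumes the probe observes a fixed number of rows, compares groups to rows once, and then keeps that decision.

```rust
/// Hypothetical, simplified stand-in for `SkipAggregationProbe`.
/// Only `input_rows` appears in the diff; every other field is assumed.
#[derive(Debug)]
struct SkipProbeSketch {
    /// Number of processed input rows
    input_rows: usize,
    /// Number of distinct groups seen so far (assumed field)
    num_groups: usize,
    /// Rows to observe before making a decision (assumed parameter)
    min_rows: usize,
    /// Skip when num_groups / input_rows exceeds this value (assumed parameter)
    ratio_threshold: f64,
    /// The decision; meaningful only once `decided` is true
    should_skip: bool,
    /// Set once enough rows have been observed
    decided: bool,
}

impl SkipProbeSketch {
    fn new(min_rows: usize, ratio_threshold: f64) -> Self {
        Self {
            input_rows: 0,
            num_groups: 0,
            min_rows,
            ratio_threshold,
            should_skip: false,
            decided: false,
        }
    }

    /// Record one input batch and the total number of groups after it.
    fn update(&mut self, batch_rows: usize, total_groups: usize) {
        if self.decided {
            return; // the decision is made once and then kept
        }
        self.input_rows += batch_rows;
        self.num_groups = total_groups;
        if self.input_rows >= self.min_rows {
            let ratio = self.num_groups as f64 / self.input_rows as f64;
            self.should_skip = ratio > self.ratio_threshold;
            self.decided = true;
        }
    }

    fn should_skip(&self) -> bool {
        self.should_skip
    }
}

fn main() {
    // High cardinality: almost every row starts a new group.
    let mut high = SkipProbeSketch::new(1_000, 0.8);
    high.update(1_000, 950);
    println!("high cardinality, skip = {}", high.should_skip());

    // Low cardinality: partial aggregation shrinks the data a lot.
    let mut low = SkipProbeSketch::new(1_000, 0.8);
    low.update(1_000, 10);
    println!("low cardinality, skip = {}", low.should_skip());
}
```

In the operator itself, a positive decision would correspond to moving into `ExecutionState::SkippingAggregation`, and only when every accumulator supports converting input rows to state.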
FYI @kazuyukitanimura -- I wonder if you have time to review this change to hash aggregate spilling as you originally contributed #7400
Context: