-
Notifications
You must be signed in to change notification settings - Fork 29k
SPARK-8153 Add configuration for disabling partial aggregation in runtime #6696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
ok to test |
|
Test build #34736 has finished for PR 6696 at commit
|
|
Test fail was just caused by appearance order. Added order-by for deterministic result |
|
Test build #34819 has finished for PR 6696 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Scala classes in Scala code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to scala. Could you suggest one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, sorry, I take that back. There is no equivalent in Scala that I can find, that sets an existing array's values. Can I suggest thought that importing just java.util might look confusing, so either just import it (there is no Scala Arrays to mix it up with) or if this is just one usage, write java.util.Arrays.fill(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed comments. Thanks!
|
Test build #34822 has finished for PR 6696 at commit
|
|
It looks like this is failing its own test. |
|
Strange.. cannot reproduce the fail in local env. I'll check it again. |
|
The memory leakage caused the test fail was a existing bug in master branch. Currently, unsafe-based hash is released on 'next' call but if input is empty, it would not be called ever. |
|
@navis, good catch on finding the memory leak in the unsafe aggregation path. I think that maybe we should extract that bugfix into its own PR so that it's easier to backport to 1.4.x. |
|
@JoshRosen ok, sure. |
|
Test build #34864 has finished for PR 6696 at commit
|
|
done in SPARK-8357 |
|
Test build #41344 timed out for PR 6696 at commit |
|
Test build #41593 has finished for PR 6696 at commit
|
|
A note about the naming: I think you meant |
|
@JoshRosen @marmbrus any updates on the details of this patch? |
|
Unfortunately, the Aggregate1 code path is deprecated and going to be removed shortly. We should probably close this issue and design the feature on JIRA for the new aggregation code path. |
Same thing with "hive.map.aggr.hash.min.reduction" in hive, which disables hash aggregation if it's not sufficiently decreasing the output size.
Added two configuration