SPARK-8153 Add configuration for disabling partial aggregation in runtime #6696

navis · 2015-06-08T06:26:04Z

This is the same idea as "hive.map.aggr.hash.min.reduction" in Hive, which disables hash aggregation when it does not sufficiently reduce the output size.

Added two configurations:

spark.sql.partial.aggregation.checkInterval
spark.sql.partial.aggregation.minReduction
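
For readers unfamiliar with the Hive heuristic, the check could look roughly like the sketch below. This is not code from the patch: the class name `PartialAggregationMonitor`, its method names, and the choice to re-check every `checkInterval` rows are assumptions made for illustration. The idea is that after every `checkInterval` input rows, the number of distinct groups in the hash map is compared with the number of rows seen; if the ratio exceeds `minReduction`, partial aggregation is switched off for the rest of the partition.

```scala
// Hypothetical helper, not taken from the patch: decides whether
// partial (map-side) hash aggregation is still worth doing.
final class PartialAggregationMonitor(checkInterval: Long, minReduction: Double) {
  require(checkInterval > 0, "checkInterval must be positive")

  private var rowsSeen = 0L
  private var enabled = true

  /** Records one input row; `groupCount` is the current number of distinct keys in the hash map. */
  def onRow(groupCount: Long): Boolean = {
    rowsSeen += 1
    // Evaluate the ratio only once per checkInterval rows to keep the check cheap.
    if (enabled && rowsSeen % checkInterval == 0) {
      val ratio = groupCount.toDouble / rowsSeen
      // A ratio close to 1.0 means almost every row is its own group,
      // so hashing barely shrinks the output.
      if (ratio > minReduction) enabled = false
    }
    enabled
  }
}

object PartialAggregationMonitorDemo {
  /** Feeds keys through the monitor and reports whether partial aggregation stayed on. */
  def staysEnabled(keys: Seq[Int], checkInterval: Long, minReduction: Double): Boolean = {
    val monitor = new PartialAggregationMonitor(checkInterval, minReduction)
    val groups = scala.collection.mutable.HashSet.empty[Int]
    var on = true
    keys.foreach { k =>
      groups += k
      on = monitor.onRow(groups.size.toLong)
    }
    on
  }
}
```

With a single repeated key the monitor keeps partial aggregation on; with all-distinct keys it turns it off at the first check.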

marmbrus · 2015-06-12T01:04:37Z

ok to test

SparkQA · 2015-06-12T02:41:27Z

Test build #34736 has finished for PR 6696 at commit 388ea7a.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

navis · 2015-06-13T05:15:17Z

The test failure was caused only by the order in which rows appeared. Added an order-by for a deterministic result.

SparkQA · 2015-06-13T06:55:19Z

Test build #34819 has finished for PR 6696 at commit 527c7b5.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2015-06-13T07:32:16Z

core/src/main/scala/org/apache/spark/util/collection/BitSet.scala

Use Scala classes in Scala code

I'm new to scala. Could you suggest one?

Hm, sorry, I take that back. There is no equivalent in Scala that I can find that sets an existing array's values. Can I suggest, though, that importing just java.util might look confusing? Either import java.util.Arrays directly (there is no Scala Arrays to mix it up with) or, if this is the only usage, write java.util.Arrays.fill(...)

Addressed comments. Thanks!
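
As an illustration of the suggestion above, a fully qualified call might look like the following. This is a minimal sketch, not the code from the patch; the object name `ClearWords` and the assumption that the backing store is an `Array[Long]` are hypothetical.

```scala
object ClearWords {
  /** Zeroes every element of an existing array in place, without allocating a new one. */
  def clear(words: Array[Long]): Unit =
    // Fully qualified, so no java.util import is needed for a single usage.
    java.util.Arrays.fill(words, 0L)
}
```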

SparkQA · 2015-06-13T12:20:46Z

Test build #34822 has finished for PR 6696 at commit 4cf1c99.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

JoshRosen · 2015-06-13T19:51:39Z

It looks like this is failing its own test.

navis · 2015-06-14T00:43:35Z

Strange. I cannot reproduce the failure in my local environment. I'll check it again.

navis · 2015-06-14T02:24:11Z

The memory leak that caused the test failure was an existing bug in the master branch. Currently, the unsafe-based hash is released on a 'next' call, but if the input is empty, 'next' is never called.
The patch now returns an empty iterator if the input is empty.
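
The shape of that fix can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual Spark code: `TrackedBuffer` is a hypothetical stand-in for the unsafe hash map, and summing integers stands in for the real aggregation. The point is that the buffer is only allocated once the input is known to be non-empty, because it is only freed from inside `next()`.

```scala
// Hypothetical stand-in for an off-heap hash map; it only tracks whether it was freed.
final class TrackedBuffer {
  var freed = false
  def free(): Unit = freed = true
}

object EmptyInputGuard {
  /**
   * Returns the aggregation output iterator. `allocate` is called only when
   * there is input, so an empty input never creates a buffer that nothing frees.
   */
  def aggregate(input: Iterator[Int], allocate: () => TrackedBuffer): Iterator[Int] = {
    if (!input.hasNext) {
      // Nothing to aggregate: skip allocation entirely.
      Iterator.empty
    } else {
      val buffer = allocate()
      val total = input.sum // stand-in for the real aggregation work
      new Iterator[Int] {
        private var done = false
        def hasNext: Boolean = !done
        def next(): Int = {
          if (done) throw new NoSuchElementException("next on empty iterator")
          done = true
          buffer.free() // released when the last row is handed out
          total
        }
      }
    }
  }
}
```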

JoshRosen · 2015-06-14T02:31:01Z

@navis, good catch on finding the memory leak in the unsafe aggregation path. I think that maybe we should extract that bugfix into its own PR so that it's easier to backport to 1.4.x.

navis · 2015-06-14T02:58:11Z

@JoshRosen ok, sure.

SparkQA · 2015-06-14T04:47:30Z

Test build #34864 has finished for PR 6696 at commit f9616b9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

navis · 2015-06-14T07:52:18Z

done in SPARK-8357

SparkQA · 2015-08-21T05:11:03Z

Test build #41344 timed out for PR 6696 at commit 2c73bbd after a configured wait of 175m.

SparkQA · 2015-08-26T07:29:03Z

Test build #41593 has finished for PR 6696 at commit 0b2da51.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- class LogisticRegressionModel @Since("1.3.0") (
- class SVMModel @Since("1.1.0") (
- class FPGrowthModel[Item: ClassTag] @Since("1.3.0") (
- class FreqItemset[Item] @Since("1.3.0") (
- class FreqSequence[Item] @Since("1.5.0") (
- class PrefixSpanModel[Item] @Since("1.5.0") (
- abstract class SetOperation(left: LogicalPlan, right: LogicalPlan) extends BinaryNode
- case class Union(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)
- case class Intersect(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)
- case class Except(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)

andrewor14 · 2015-09-01T20:19:47Z

A note about the naming: I think you meant spark.sql.partialAggregation.* instead? It doesn't make sense to have a spark.sql.partial.* namespace.

andrewor14 · 2015-09-01T20:20:04Z

@JoshRosen @marmbrus any updates on the details of this patch?

marmbrus · 2015-09-01T21:58:59Z

Unfortunately, the Aggregate1 code path is deprecated and going to be removed shortly. We should probably close this issue and design the feature on JIRA for the new aggregation code path.

navis force-pushed the SPARK-8153 branch from 388ea7a to ce58cf5 Compare June 13, 2015 05:13

navis force-pushed the SPARK-8153 branch from ce58cf5 to 527c7b5 Compare June 13, 2015 05:16

srowen reviewed Jun 13, 2015

navis force-pushed the SPARK-8153 branch from 527c7b5 to 4cf1c99 Compare June 13, 2015 10:34

navis force-pushed the SPARK-8153 branch from 4cf1c99 to f9616b9 Compare June 14, 2015 02:18

navis force-pushed the SPARK-8153 branch from f9616b9 to 2c73bbd Compare August 21, 2015 02:10

SPARK-8153 Add configuration for disabling partial aggregation in run…

0b2da51

…time

navis force-pushed the SPARK-8153 branch from 2c73bbd to 0b2da51 Compare August 26, 2015 04:35

asfgit closed this in 804a012 Sep 4, 2015
