Contributor Author

This change is for testing purposes only.

SparkQA commented Oct 26, 2015

Test build #44342 has finished for PR 9276 at commit 3abf78f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

", in bytes" ?

SparkQA commented Oct 28, 2015

Test build #44472 has finished for PR 9276 at commit e31ed6b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Oct 28, 2015

Test build #44487 has finished for PR 9276 at commit 27af247.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

yhuai commented Oct 28, 2015

yhuai@24e1caf is only for testing purposes. We will remove it before we commit the code.

yhuai commented Oct 28, 2015

@JoshRosen I changed InnerJoinSuite so that our physical join operator does not use the same SparkPlan as both its left and right children (yhuai@c2463da).

yhuai changed the title from "[SPARK-9858][SPARK-9859][SPARK-9861][SQL][WIP] Add an ExchangeCoordinator to estimate the number of post-shuffle partitions for aggregates and joins" to "[SPARK-9858][SPARK-9859][SPARK-9861][SQL] Add an ExchangeCoordinator to estimate the number of post-shuffle partitions for aggregates and joins" on Oct 28, 2015

SparkQA commented Oct 28, 2015

Test build #44544 has finished for PR 9276 at commit 24e1caf.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

Contributor

Can you briefly comment on why that is required?

Contributor Author

Done

SparkQA commented Oct 29, 2015

Test build #44561 has finished for PR 9276 at commit 2130068.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

yhuai commented Oct 29, 2015

The last three commits are just for testing purposes.

yhuai commented Oct 29, 2015

test this please

SparkQA commented Oct 29, 2015

Test build #44632 has finished for PR 9276 at commit f97ea39.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

SparkQA commented Oct 29, 2015

Test build #44638 has finished for PR 9276 at commit 197b63a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * class HasSolver(Params):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

Contributor Author

Will add a toString method to this class.

Contributor Author

Done

yhuai commented Nov 1, 2015

test this please

SparkQA commented Nov 1, 2015

Test build #44748 has finished for PR 9276 at commit 2d1f262.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

SparkQA commented Nov 2, 2015

Test build #44781 has finished for PR 9276 at commit 890521e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

Contributor

Add an @GuardedBy("this")?

Contributor

Should this be a set instead of an array buffer in order to guard against double-registration?

Contributor Author

If that is the case, I guess we should expose that and fix the double-registration problem, right? In doEstimationIfNecessary, we have the assertion assert(exchanges.length == numExchanges).
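
The trade-off discussed in this thread can be sketched in a few lines. This is a minimal Python sketch, not the code from this PR: `CoordinatorSketch`, `register`, and `do_estimation_if_necessary` are hypothetical stand-ins, and a `threading.Lock` plays the role that an `@GuardedBy("this")` annotation would document. It assumes registrations are kept in a list, so a double registration changes the count and trips the check instead of being silently deduplicated, as it would be with a set.

```python
import threading
from typing import List


class CoordinatorSketch:
    """Hypothetical stand-in for the coordinator discussed above."""

    def __init__(self, num_exchanges: int) -> None:
        self._num_exchanges = num_exchanges
        self._lock = threading.Lock()
        # Guarded by self._lock. A list keeps duplicates, so registering the
        # same exchange twice changes the length and is caught below.
        self._exchanges: List[object] = []

    def register(self, exchange: object) -> None:
        with self._lock:
            self._exchanges.append(exchange)

    def do_estimation_if_necessary(self) -> int:
        with self._lock:
            # Mirrors the assertion quoted in the thread: the number of
            # registered exchanges must match the expected count.
            if len(self._exchanges) != self._num_exchanges:
                raise AssertionError(
                    f"expected {self._num_exchanges} exchanges, "
                    f"got {len(self._exchanges)}"
                )
            return len(self._exchanges)
```

With a set in place of the list, registering the same exchange twice would leave the count unchanged, and the check could not tell a double registration from a correct one.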

@JoshRosen
Contributor

Let's merge this now and post-hoc review in more detail later.

SparkQA commented Nov 3, 2015

Test build #44882 has finished for PR 9276 at commit 60e371e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * case class Exchange(
      * class CoalescedPartitioner(val parent: Partitioner, val partitionStartIndices: Array[Int])

Contributor

Should this call super.beforeAll()?

Contributor Author

Seems we are fine since SparkFunSuite does not extend BeforeAndAfterAll.

yhuai commented Nov 3, 2015

OK, I am merging it now. I will open a follow-up PR to address @JoshRosen's comments.

@asfgit asfgit closed this in d728d5c Nov 3, 2015
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Nov 3, 2015
asfgit pushed a commit that referenced this pull request Nov 6, 2015
…f post-shuffle partitions for aggregates and joins (follow-up)

https://issues.apache.org/jira/browse/SPARK-9858

This PR is the follow-up work of #9276. It addresses JoshRosen's comments.

Author: Yin Huai <[email protected]>

Closes #9453 from yhuai/numReducer-followUp.

(cherry picked from commit 8211aab)
Signed-off-by: Yin Huai <[email protected]>
asfgit pushed a commit that referenced this pull request Nov 6, 2015
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Nov 11, 2015
markhamstra pushed a commit to markhamstra/spark that referenced this pull request Nov 11, 2015
kiszk pushed a commit to kiszk/spark-gpu that referenced this pull request Dec 26, 2015
Contributor

According to the implementation, we can't guarantee that minNumPostShufflePartitions is satisfied, right?

Contributor Author

Yes, you are right. It is advisory.
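
A small sketch can show why such a minimum can only be advisory. The following Python is a simplified illustration, not the code from this PR: the function name, its parameters, and the greedy packing rule are assumptions made for the example. It lowers the target size when a minimum is requested, then packs consecutive pre-shuffle partitions until the target would be exceeded.

```python
import math
from typing import List, Optional


def estimate_partition_start_indices(
    bytes_by_partition: List[int],
    advisory_target_size: int,
    min_num_post_shuffle_partitions: Optional[int] = None,
) -> List[int]:
    """Return the start index of each coalesced (post-shuffle) partition.

    Hypothetical helper: all names here are illustrative only.
    """
    target = advisory_target_size
    if min_num_post_shuffle_partitions is not None:
        total = sum(bytes_by_partition)
        # Shrink the target so that the requested minimum is more likely to
        # be reached; this is a hint, not a guarantee.
        max_target = max(math.ceil(total / min_num_post_shuffle_partitions), 1)
        target = min(max_target, advisory_target_size)

    start_indices = [0]
    current_size = 0
    for i, size in enumerate(bytes_by_partition):
        if i > 0 and current_size + size > target:
            # Adding this partition would exceed the target: start a new one.
            start_indices.append(i)
            current_size = size
        else:
            current_size += size
    return start_indices


# Four pre-shuffle partitions, 50-byte target: three coalesced partitions.
print(estimate_partition_start_indices([10, 20, 30, 40], 50))    # [0, 2, 3]
# Only two pre-shuffle partitions exist, so a minimum of 5 cannot be met.
print(estimate_partition_start_indices([100, 100], 1000, 5))     # [0, 1]
# Rounding the target up can also undershoot: 2 partitions, not 3.
print(estimate_partition_start_indices([1, 1, 1, 1], 100, 3))    # [0, 2]
```

In this sketch the minimum only influences the target size. The number of coalesced partitions is bounded by the number of pre-shuffle partitions and depends on how their sizes pack, so the result can fall below the requested minimum.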

darabos commented Mar 2, 2016

Is spark.sql.adaptive.enabled documented somewhere? It's not in http://spark.apache.org/docs/1.6.0/configuration.html.

yhuai commented Mar 2, 2016

No, it is not documented. Right now, it is mainly for people who are interested in experimenting with it. There is still work to be done to make it support more cases.

@dreamworks007

@yhuai, could you please let us know whether there are any known issues or limitations with this feature? Has it been tested on large jobs?

We are also considering determining shuffle partitions automatically and happened to see this PR, so we are interested in exploring this feature to see whether we could productionize it for all jobs (by default).
