-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-18067] Avoid shuffling child if join keys are superset of child's partitioning keys #19054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #81131 has finished for PR 19054 at commit
|
|
cc @hvanhovell @cloud-fan for review |
9ba8add to
ec8bd80
Compare
|
Test build #81562 has finished for PR 19054 at commit
|
ec8bd80 to
b0db6aa
Compare
|
Test build #84366 has finished for PR 19054 at commit
|
b0db6aa to
69e288e
Compare
|
Test build #84368 has finished for PR 19054 at commit
|
69e288e to
c689ff1
Compare
|
Test build #85985 has finished for PR 19054 at commit
|
|
cc @hvanhovell @cloud-fan for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some documentation to explain what the return value is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added more doc. I wasn't sure how to make it easier to understand. Hope that the example helps with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if leftPartitioning is HashPartitioning, we don't need to care about rightPartitioning at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that this was only done over SortMergeJoinExec and ShuffledHashJoinExec where both the partitionings are HashPartitioning, things worked fine. I have changed this to have a stricter check.
c689ff1 to
00bb14b
Compare
| */ | ||
| private def reorderJoinPredicates(plan: SparkPlan): SparkPlan = { | ||
| plan.transformUp { | ||
| case BroadcastHashJoinExec(leftKeys, rightKeys, joinType, buildSide, condition, left, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal of BroadcastHashJoinExec is intentional. The children are expected to have BroadcastDistribution or UnspecifiedDistribution so this method wont help here (this optimization only helps in case of shuffle based joins)
|
Test build #86406 has finished for PR 19054 at commit
|
| val rightKeysBuffer = ArrayBuffer[Expression]() | ||
| expectedOrderOfKeys: Seq[Expression], // comes from child's output partitioning | ||
| currentOrderOfKeys: Seq[Expression]): // comes from join predicate | ||
| (Seq[Expression], Seq[Expression], Seq[Expression], Seq[Expression]) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please add a comment describing the return type? a tuple4 is not such a descriptive type 😃
| rightKeysBuffer.append(rightKeys(index)) | ||
| val index = currentOrderOfKeys.zipWithIndex.find { case (currKey, i) => | ||
| !processedIndicies.contains(i) && currKey.semanticEquals(expression) | ||
| }.get._2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the find guaranteed to always succeed?
if so, worth a comment on method's pre/post conditions.
a getOrElse(sys error "...") might also be a good way of documenting this.
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
Jira : https://issues.apache.org/jira/browse/SPARK-18067
What problem is being addressed in this PR ?
Currently shuffle based joins require its children to be shuffled over all the columns in the join condition. In case the child node is already distributed over a subset of columns in the join condition, this shuffle is not needed (eg. if the input is bucketed, if the input is output of a subquery). Avoiding the shuffle makes the join run faster and more stably as its single stage.
To dive deeper, lets look at this example. Both input tables
table1andtable2are bucketed on columnsiandjand have 8 buckets. The query is joining the 2 tables overi,j,k. With bucketing, all the rows with the same values ofiandjshould reside in the same bucket of both the inputs. So, if we simply sort the corresponding buckets over the join columns and perform the join, that would suffice the requirements.What changes were proposed in this pull request?
Both shuffled hash join and sort merge join would not keep track of
which keys should the children be distributed on ?. To start off, this is same as the join keys. The ruleReorderJoinPredicatesis modified to detect if the child's output partitioning is over a subset of join keys and based on that the distribution keys for the join operator are revised.How was this patch tested?
Query:
BEFORE
AFTER