Commit 4a6b2b9
[SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
### What changes were proposed in this pull request?
- move the rule `OptimizeSkewedJoin` from stage optimization phase to stage preparation phase.
- run the rule `EnsureRequirements` one more time after the `OptimizeSkewedJoin` rule in the stage preparation phase.
- add `SkewJoinAwareCost` to support estimate skewed join cost
- add new config to decide if force optimize skewed join
- in `OptimizeSkewedJoin`, we generate 2 physical plans, one with skew join optimization and one without. Then we use the cost evaluator w.r.t. the force-skew-join flag and pick the plan with lower cost.
### Why are the changes needed?
In general, skewed join has more impact on performance than once more shuffle. It makes sense to force optimize skewed join even if introduce extra shuffle.
A common case:
```
HashAggregate
SortMergJoin
Sort
Exchange
Sort
Exchange
```
and after this PR, the plan looks like:
```
HashAggregate
Exchange
SortMergJoin (isSkew=true)
Sort
Exchange
Sort
Exchange
```
Note that, the new introduced shuffle also can be optimized by AQE.
### Does this PR introduce _any_ user-facing change?
Yes, a new config.
### How was this patch tested?
* Add new test
* pass exists test `SPARK-30524: Do not optimize skew join if introduce additional shuffle`
* pass exists test `SPARK-33551: Do not use custom shuffle reader for repartition`
Closes #32816 from ulysses-you/support-extra-shuffle.
Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>1 parent e1e1961 commit 4a6b2b9
File tree
6 files changed
+217
-58
lines changed- sql
- catalyst/src/main/scala/org/apache/spark/sql/internal
- core/src
- main/scala/org/apache/spark/sql/execution
- adaptive
- exchange
- test/scala/org/apache/spark/sql/execution/adaptive
6 files changed
+217
-58
lines changedLines changed: 7 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
666 | 666 | | |
667 | 667 | | |
668 | 668 | | |
| 669 | + | |
| 670 | + | |
| 671 | + | |
| 672 | + | |
| 673 | + | |
| 674 | + | |
| 675 | + | |
669 | 676 | | |
670 | 677 | | |
671 | 678 | | |
| |||
Lines changed: 17 additions & 14 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
97 | 97 | | |
98 | 98 | | |
99 | 99 | | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
100 | 106 | | |
101 | 107 | | |
102 | 108 | | |
103 | | - | |
104 | | - | |
| 109 | + | |
105 | 110 | | |
106 | 111 | | |
107 | 112 | | |
108 | 113 | | |
109 | | - | |
110 | | - | |
111 | | - | |
112 | | - | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
113 | 124 | | |
114 | 125 | | |
115 | 126 | | |
116 | 127 | | |
117 | 128 | | |
118 | 129 | | |
119 | | - | |
120 | | - | |
121 | 130 | | |
122 | 131 | | |
123 | 132 | | |
| |||
169 | 178 | | |
170 | 179 | | |
171 | 180 | | |
172 | | - | |
173 | | - | |
174 | | - | |
175 | | - | |
176 | | - | |
177 | | - | |
178 | 181 | | |
179 | 182 | | |
180 | 183 | | |
| |||
Lines changed: 19 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| 25 | + | |
25 | 26 | | |
26 | | - | |
| 27 | + | |
27 | 28 | | |
28 | 29 | | |
29 | 30 | | |
| |||
48 | 49 | | |
49 | 50 | | |
50 | 51 | | |
51 | | - | |
52 | | - | |
53 | | - | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
54 | 56 | | |
55 | 57 | | |
56 | 58 | | |
| |||
250 | 252 | | |
251 | 253 | | |
252 | 254 | | |
253 | | - | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
254 | 266 | | |
255 | 267 | | |
256 | 268 | | |
257 | 269 | | |
258 | 270 | | |
259 | 271 | | |
260 | 272 | | |
261 | | - | |
| 273 | + | |
| 274 | + | |
262 | 275 | | |
263 | 276 | | |
264 | 277 | | |
| |||
Lines changed: 43 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| 23 | + | |
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| |||
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
38 | | - | |
39 | | - | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
40 | 44 | | |
41 | | - | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
42 | 67 | | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
43 | 73 | | |
44 | | - | |
| 74 | + | |
45 | 75 | | |
46 | 76 | | |
47 | | - | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
48 | 86 | | |
49 | 87 | | |
Lines changed: 63 additions & 33 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
| 41 | + | |
| 42 | + | |
41 | 43 | | |
42 | | - | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | | - | |
49 | | - | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
50 | 48 | | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
51 | 56 | | |
52 | | - | |
| 57 | + | |
53 | 58 | | |
54 | 59 | | |
55 | 60 | | |
56 | 61 | | |
57 | 62 | | |
58 | 63 | | |
59 | 64 | | |
60 | | - | |
| 65 | + | |
61 | 66 | | |
62 | 67 | | |
63 | 68 | | |
| |||
69 | 74 | | |
70 | 75 | | |
71 | 76 | | |
72 | | - | |
| 77 | + | |
73 | 78 | | |
74 | 79 | | |
75 | 80 | | |
| |||
78 | 83 | | |
79 | 84 | | |
80 | 85 | | |
81 | | - | |
| 86 | + | |
82 | 87 | | |
83 | 88 | | |
84 | 89 | | |
| |||
87 | 92 | | |
88 | 93 | | |
89 | 94 | | |
90 | | - | |
| 95 | + | |
91 | 96 | | |
92 | 97 | | |
93 | 98 | | |
| |||
106 | 111 | | |
107 | 112 | | |
108 | 113 | | |
109 | | - | |
| 114 | + | |
110 | 115 | | |
111 | 116 | | |
112 | 117 | | |
| |||
124 | 129 | | |
125 | 130 | | |
126 | 131 | | |
127 | | - | |
| 132 | + | |
128 | 133 | | |
129 | 134 | | |
130 | 135 | | |
| |||
133 | 138 | | |
134 | 139 | | |
135 | 140 | | |
136 | | - | |
| 141 | + | |
137 | 142 | | |
138 | 143 | | |
139 | 144 | | |
| |||
254 | 259 | | |
255 | 260 | | |
256 | 261 | | |
257 | | - | |
258 | | - | |
259 | | - | |
260 | | - | |
261 | | - | |
262 | | - | |
263 | | - | |
264 | | - | |
265 | | - | |
266 | | - | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
267 | 274 | | |
268 | | - | |
269 | | - | |
270 | | - | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
271 | 294 | | |
272 | | - | |
| 295 | + | |
273 | 296 | | |
274 | | - | |
275 | | - | |
276 | | - | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
277 | 307 | | |
278 | 308 | | |
0 commit comments