Commit e0dcb3c
[SPARK-24013][SQL] Remove unneeded compress in ApproximatePercentile
## What changes were proposed in this pull request?
`ApproximatePercentile` contains workaround logic that compresses the samples, because `QuantileSummaries` originally ignored the compression threshold. That problem was fixed in SPARK-17439, but the workaround was never removed, so the samples are now compressed many more times than needed.
This can cause severe performance degradation in queries like:
```
select approx_percentile(id, array(0.1)) from range(10000000)
```
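To illustrate why redundant compression is costly, here is a minimal toy model in Python. It is not Spark's implementation: `ToySummary`, `count_compressions`, `max_samples`, and `workaround_limit` are hypothetical names and values chosen for this sketch, and the thinning step only stands in for a real quantile-sketch compression. It assumes a summary that already compresses itself once its insert buffer reaches a threshold, and compares that with a caller that also forces compression whenever the retained size exceeds its own limit.

```python
import math


class ToySummary:
    """Toy stand-in for a quantile summary (hypothetical, not Spark's class)."""

    def __init__(self, compress_threshold=100, max_samples=50):
        self.compress_threshold = compress_threshold
        self.max_samples = max_samples
        self.buffer = []          # values inserted since the last compression
        self.samples = []         # retained (already compressed) values
        self.compress_calls = 0   # how often compress() has run

    def insert(self, value):
        self.buffer.append(value)
        # The summary honours its own threshold, so callers need not compress.
        if len(self.buffer) >= self.compress_threshold:
            self.compress()

    def compress(self):
        self.compress_calls += 1
        merged = sorted(self.samples + self.buffer)
        # Toy thinning: keep evenly spaced values, at most max_samples of them.
        step = max(1, math.ceil(len(merged) / self.max_samples))
        self.samples = merged[::step]
        self.buffer = []

    def size(self):
        return len(self.samples) + len(self.buffer)


def count_compressions(n, workaround_limit=None):
    """Insert n values; optionally apply an extra caller-side compression."""
    summary = ToySummary()
    for value in range(n):
        summary.insert(value)
        # Redundant workaround: the caller compresses again on its own limit.
        if workaround_limit is not None and summary.size() > workaround_limit:
            summary.compress()
    return summary.compress_calls


baseline = count_compressions(10_000)
with_workaround = count_compressions(10_000, workaround_limit=60)
print(baseline, with_workaround)
```

In this toy setup the caller-side check fires well before the summary's own threshold is reached, so each compression sorts and thins the retained samples far more often than the summary alone would require.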
## How was this patch tested?
Added a unit test.
Author: Marco Gaido
Closes apache#21133 from mgaido91/SPARK-24013.

1 parent cc96c94, commit e0dcb3c
3 files changed, 26 additions and 31 deletions, under:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util
- sql/core/src/test/scala/org/apache/spark/sql

Lines changed: 6 additions & 27 deletions
| Hunk (original lines) | Hunk (new lines) | Removed (original line numbers) | Added (new line numbers) |
|---|---|---|---|
| 206-232 | 206-220 | 209-210, 212-224, 227 | 210, 213, 216-217 |
| 236-249 | 224-229 | 239-246 | none |
| 280-286 | 260-265 | 283 | none |
| 335-342 | 314-321 | 338-339 | 317-318 |

Lines changed: 7 additions & 4 deletions
| Hunk (original lines) | Hunk (new lines) | Removed (original line numbers) | Added (new line numbers) |
|---|---|---|---|
| 40-51 | 40-53 | 48 | 43, 49-50 |
| 60-65 | 62-68 | none | 65 |
| 135-145 | 138-148 | 138, 142 | 141, 145 |
| 163-169 | 166-172 | 166 | 169 |

Lines changed: 13 additions & 0 deletions
| Hunk (original lines) | Hunk (new lines) | Removed (original line numbers) | Added (new line numbers) |
|---|---|---|---|
| 19-24 | 19-25 | none | 22 |
| 279-282 | 280-295 | none | 283-294 |