Commit e45ffcc
[SPARK-14739][PYSPARK] Fix Vectors parser bugs
## What changes were proposed in this pull request?
The PySpark deserialization has a bug that shows while deserializing all zero sparse vectors. This fix filters out empty string tokens before casting, hence properly stringified SparseVectors successfully get parsed.
## How was this patch tested?
Standard unit-tests similar to other methods.
Author: Arash Parsa <arash@ip-192-168-50-106.ec2.internal>
Author: Arash Parsa <arashpa@gmail.com>
Author: Vishnu Prasad <vishnu667@gmail.com>
Author: Vishnu Prasad S <vishnu667@gmail.com>
Closes apache#12516 from arashpa/SPARK-14739.
(cherry picked from commit 2b8906c)
Signed-off-by: Sean Owen <sowen@cloudera.com>
(cherry picked from commit 1cda10b)1 parent 4d90ecf commit e45ffcc
2 files changed
+14
-8
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
293 | 293 | | |
294 | 294 | | |
295 | 295 | | |
296 | | - | |
| 296 | + | |
297 | 297 | | |
298 | 298 | | |
299 | 299 | | |
| |||
584 | 584 | | |
585 | 585 | | |
586 | 586 | | |
587 | | - | |
| 587 | + | |
588 | 588 | | |
589 | 589 | | |
590 | 590 | | |
| |||
597 | 597 | | |
598 | 598 | | |
599 | 599 | | |
600 | | - | |
| 600 | + | |
601 | 601 | | |
602 | 602 | | |
603 | 603 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
388 | 388 | | |
389 | 389 | | |
390 | 390 | | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
391 | 394 | | |
392 | | - | |
393 | | - | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
394 | 400 | | |
395 | | - | |
396 | | - | |
| 401 | + | |
| 402 | + | |
397 | 403 | | |
398 | | - | |
| 404 | + | |
399 | 405 | | |
400 | 406 | | |
401 | 407 | | |
| |||
0 commit comments