|
2 | 2 | Benchmark to measure CSV read/write performance |
3 | 3 | ================================================================================================ |
4 | 4 |
|
5 | | -OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws |
6 | | -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz |
| 5 | +Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7 |
| 6 | +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz |
7 | 7 | Parsing quoted values: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
8 | 8 | ------------------------------------------------------------------------------------------------------------------------ |
9 | | -One quoted string 47588 47831 244 0.0 951755.4 1.0X |
| 9 | +One quoted string 24185 24195 10 0.0 483694.2 1.0X |
10 | 10 |
|
11 | | -OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws |
12 | | -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz |
| 11 | +Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7 |
| 12 | +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz |
13 | 13 | Wide rows with 1000 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
14 | 14 | ------------------------------------------------------------------------------------------------------------------------ |
15 | | -Select 1000 columns 129509 130323 1388 0.0 129509.4 1.0X |
16 | | -Select 100 columns 42474 42572 108 0.0 42473.6 3.0X |
17 | | -Select one column 35479 35586 93 0.0 35479.1 3.7X |
18 | | -count() 11021 11071 47 0.1 11021.3 11.8X |
19 | | -Select 100 columns, one bad input field 94652 94795 134 0.0 94652.0 1.4X |
20 | | -Select 100 columns, corrupt record field 115336 115542 350 0.0 115336.0 1.1X |
| 15 | +Select 1000 columns 61793 62388 532 0.0 61793.4 1.0X |
| 16 | +Select 100 columns 21958 21993 34 0.0 21957.9 2.8X |
| 17 | +Select one column 18215 18515 505 0.1 18215.0 3.4X |
| 18 | +count() 5865 6168 296 0.2 5865.1 10.5X |
| 19 | +Select 100 columns, one bad input field 39638 39739 124 0.0 39637.5 1.6X |
| 20 | +Select 100 columns, corrupt record field 47290 48133 741 0.0 47290.0 1.3X |
21 | 21 |
|
22 | | -OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws |
23 | | -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz |
| 22 | +Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7 |
| 23 | +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz |
24 | 24 | Count a dataset with 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
25 | 25 | ------------------------------------------------------------------------------------------------------------------------ |
26 | | -Select 10 columns + count() 19959 20022 76 0.5 1995.9 1.0X |
27 | | -Select 1 column + count() 13920 13968 54 0.7 1392.0 1.4X |
28 | | -count() 3928 3938 11 2.5 392.8 5.1X |
| 26 | +Select 10 columns + count() 9935 10460 461 1.0 993.5 1.0X |
| 27 | +Select 1 column + count() 6786 7179 342 1.5 678.6 1.5X |
| 28 | +count() 2281 2458 165 4.4 228.1 4.4X |
29 | 29 |
|
30 | | -OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws |
31 | | -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz |
| 30 | +Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7 |
| 31 | +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz |
32 | 32 | Write dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
33 | 33 | ------------------------------------------------------------------------------------------------------------------------ |
34 | | -Create a dataset of timestamps 1940 1977 56 5.2 194.0 1.0X |
35 | | -to_csv(timestamp) 15398 15669 458 0.6 1539.8 0.1X |
36 | | -write timestamps to files 12438 12454 19 0.8 1243.8 0.2X |
37 | | -Create a dataset of dates 2157 2171 18 4.6 215.7 0.9X |
38 | | -to_csv(date) 11764 11839 95 0.9 1176.4 0.2X |
39 | | -write dates to files 8893 8907 12 1.1 889.3 0.2X |
| 34 | +Create a dataset of timestamps 812 826 14 12.3 81.2 1.0X |
| 35 | +to_csv(timestamp) 7548 7764 192 1.3 754.8 0.1X |
| 36 | +write timestamps to files 7052 7193 141 1.4 705.2 0.1X |
| 37 | +Create a dataset of dates 897 909 13 11.1 89.7 0.9X |
| 38 | +to_csv(date) 4778 4787 10 2.1 477.8 0.2X |
| 39 | +write dates to files 3853 3891 33 2.6 385.3 0.2X |
40 | 40 |
|
41 | | -OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws |
42 | | -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz |
| 41 | +Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7 |
| 42 | +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz |
43 | 43 | Read dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
44 | 44 | ------------------------------------------------------------------------------------------------------------------------ |
45 | | -read timestamp text from files 2219 2230 11 4.5 221.9 1.0X |
46 | | -read timestamps from files 51519 51725 192 0.2 5151.9 0.0X |
47 | | -infer timestamps from files 104744 104885 124 0.1 10474.4 0.0X |
48 | | -read date text from files 1940 1943 4 5.2 194.0 1.1X |
49 | | -read date from files 27099 27118 33 0.4 2709.9 0.1X |
50 | | -infer date from files 27662 27703 61 0.4 2766.2 0.1X |
51 | | -timestamp strings 4225 4242 15 2.4 422.5 0.5X |
52 | | -parse timestamps from Dataset[String] 56090 56479 376 0.2 5609.0 0.0X |
53 | | -infer timestamps from Dataset[String] 115629 116245 1049 0.1 11562.9 0.0X |
54 | | -date strings 4337 4344 10 2.3 433.7 0.5X |
55 | | -parse dates from Dataset[String] 32373 32476 120 0.3 3237.3 0.1X |
56 | | -from_csv(timestamp) 54952 55157 300 0.2 5495.2 0.0X |
57 | | -from_csv(date) 30924 30985 66 0.3 3092.4 0.1X |
| 45 | +read timestamp text from files 1259 1262 4 7.9 125.9 1.0X |
| 46 | +read timestamps from files 20030 20105 80 0.5 2003.0 0.1X |
| 47 | +infer timestamps from files 39621 39674 61 0.3 3962.1 0.0X |
| 48 | +read date text from files 1039 1068 40 9.6 103.9 1.2X |
| 49 | +read date from files 9352 9363 10 1.1 935.2 0.1X |
| 50 | +infer date from files 11465 11485 23 0.9 1146.5 0.1X |
| 51 | +timestamp strings 1759 1812 59 5.7 175.9 0.7X |
| 52 | +parse timestamps from Dataset[String] 20806 20858 75 0.5 2080.6 0.1X |
| 53 | +infer timestamps from Dataset[String] 40537 40821 258 0.2 4053.7 0.0X |
| 54 | +date strings 1808 1816 12 5.5 180.8 0.7X |
| 55 | +parse dates from Dataset[String] 12080 12311 245 0.8 1208.0 0.1X |
| 56 | +from_csv(timestamp) 20120 21503 1224 0.5 2012.0 0.1X |
| 57 | +from_csv(date) 10607 10768 246 0.9 1060.7 0.1X |
58 | 58 |
|
59 | | -OpenJDK 64-Bit Server VM 1.8.0_252-8u252-b09-1~18.04-b09 on Linux 4.15.0-1063-aws |
60 | | -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz |
| 59 | +Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7 |
| 60 | +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz |
61 | 61 | Filters pushdown: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
62 | 62 | ------------------------------------------------------------------------------------------------------------------------ |
63 | | -w/o filters 25630 25636 8 0.0 256301.4 1.0X |
64 | | -pushdown disabled 25673 25681 9 0.0 256734.0 1.0X |
65 | | -w/ filters 1873 1886 15 0.1 18733.1 13.7X |
| 63 | +w/o filters 13109 13249 151 0.0 131086.4 1.0X |
| 64 | +pushdown disabled 12951 12994 63 0.0 129509.7 1.0X |
| 65 | +w/ filters 1095 1113 15 0.1 10953.7 12.0X |
66 | 66 |
|
67 | 67 |
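Reviewer note: the derived columns above can be sanity-checked against Best Time(ms). The sketch below is not taken from the benchmark harness; it assumes Per Row(ns) is best time divided by row count, Rate(M/s) is rows per second in millions, and Relative is the first row's best time divided by the current row's. The row count of 10,000,000 is an assumption inferred from the "Count a dataset with 10 columns" table (9935 ms at 993.5 ns per row), and `derived_columns` is a hypothetical helper name. The harness presumably times at finer than millisecond resolution, so other tables may differ in the last digit.

```python
def derived_columns(best_ms, rows, baseline_best_ms):
    """Recompute (Rate(M/s), Per Row(ns), Relative) from a Best Time(ms) value."""
    per_row_ns = best_ms * 1e6 / rows            # ms -> ns, spread over all rows
    rate_m_per_s = rows / (best_ms / 1e3) / 1e6  # rows per second, in millions
    relative = baseline_best_ms / best_ms        # speedup versus the first case
    return round(rate_m_per_s, 1), round(per_row_ns, 1), round(relative, 1)


# Assumption: row count inferred from the table, not read from the benchmark source.
ROWS = 10_000_000

# Best Time(ms) values from the new "+" rows of "Count a dataset with 10 columns".
best_times_ms = {
    "Select 10 columns + count()": 9935,
    "Select 1 column + count()": 6786,
    "count()": 2281,
}

baseline = best_times_ms["Select 10 columns + count()"]
for name, best in best_times_ms.items():
    rate, per_row, rel = derived_columns(best, ROWS, baseline)
    print(f"{name:<30} {best:>6} {rate:>5} {per_row:>8} {rel}X")
```

Under these assumptions the recomputed values agree with the Rate(M/s), Per Row(ns), and Relative columns reported for that table.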
|