MaxGekk commented Jun 8, 2020

What changes were proposed in this pull request?

Add benchmarks for HiveResult.hiveResultString()/toHiveString() to measure throughput of toHiveString for the date/timestamp types:

  • java.sql.Date/Timestamp
  • java.time.Instant
  • java.time.LocalDate
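To make "throughput of string conversion" concrete, here is a minimal standalone sketch in plain Java. It is not the code added by this PR and it does not call Spark: the class `DateTimeToStringBench`, the `render` helper, and the output patterns are all hypothetical stand-ins for `toHiveString`, chosen only to show the shape of such a measurement over the same value types.

```java
import java.sql.Date;
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.function.IntFunction;

public class DateTimeToStringBench {
    // Assumed output pattern for instants; Spark's real formatter may differ.
    private static final DateTimeFormatter INSTANT_FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Keeps the JIT from discarding the conversion results.
    private static volatile long blackhole;

    // Hypothetical stand-in for the conversion under test (NOT Spark's toHiveString).
    public static String render(Object value) {
        if (value instanceof Instant) {
            return INSTANT_FORMAT.format((Instant) value);
        }
        if (value instanceof LocalDate) {
            return DateTimeFormatter.ISO_LOCAL_DATE.format((LocalDate) value);
        }
        // java.sql.Date and java.sql.Timestamp fall back to their own toString().
        return String.valueOf(value);
    }

    // Converts `rows` generated values to strings and returns rows per second.
    public static double rowsPerSecond(int rows, IntFunction<Object> generator) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < rows; i++) {
            sink += render(generator.apply(i)).length();
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        blackhole = sink;
        return rows * 1e9 / elapsedNanos;
    }

    private static void report(String name, int rows, IntFunction<Object> generator) {
        rowsPerSecond(rows, generator); // warm-up pass, result ignored
        System.out.printf("%-20s %,.0f rows/s%n", name, rowsPerSecond(rows, generator));
    }

    public static void main(String[] args) {
        int rows = 1_000_000;
        long baseSeconds = Instant.parse("2020-06-08T00:00:00Z").getEpochSecond();
        LocalDate baseDate = LocalDate.of(2020, 6, 8);

        report("java.sql.Date", rows, i -> new Date((baseSeconds + i) * 1000L));
        report("java.sql.Timestamp", rows, i -> new Timestamp((baseSeconds + i) * 1000L));
        report("java.time.Instant", rows, i -> Instant.ofEpochSecond(baseSeconds + i));
        report("java.time.LocalDate", rows, i -> baseDate.plusDays(i % 365));
    }
}
```

A single timed loop like this is far noisier than a real benchmark harness, so the numbers it prints are only useful for rough comparisons between the four types on one machine.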

Benchmark results were generated in the environment:

| Item | Description |
| ---- | ---- |
| Region | us-west-2 (Oregon) |
| Instance | r3.xlarge |
| AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) |
| Java | OpenJDK 64-Bit Server VM 1.8.0_242 and OpenJDK 64-Bit Server VM 11.0.6+10 |

Why are the changes needed?

To detect perf regressions of toHiveString in the future.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By running DateTimeBenchmark and checking the dataset content.

MaxGekk commented Jun 8, 2020

@cloud-fan @juliuszsompolski @HyukjinKwon Please review this PR.

MaxGekk changed the title from "[SPARK-31932][SQL][TESTS] Add date/timestamp benchmarks for toHiveString" to "[SPARK-31932][SQL][TESTS] Add date/timestamp benchmarks for HiveResult.hiveResultString()" on Jun 8, 2020
SparkQA commented Jun 8, 2020

Test build #123646 has finished for PR 28757 at commit 37f5ba1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

cloud-fan commented

I'm merging it to 3.0 as well, so that it's easier to compare the perf between 3.0 and 3.1 in the future.

thanks, merging to master/3.0!

cloud-fan closed this in ddd8d5f on Jun 9, 2020
cloud-fan pushed a commit that referenced this pull request Jun 9, 2020
…lt.hiveResultString()`

### What changes were proposed in this pull request?
Add benchmarks for `HiveResult.hiveResultString()/toHiveString()` to measure throughput of `toHiveString` for the date/timestamp types:
- java.sql.Date/Timestamp
- java.time.Instant
- java.time.LocalDate

Benchmark results were generated in the environment:

| Item | Description |
| ---- | ----|
| Region | us-west-2 (Oregon) |
| Instance | r3.xlarge |
| AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) |
| Java | OpenJDK 64-Bit Server VM 1.8.0_242 and OpenJDK 64-Bit Server VM 11.0.6+10 |

### Why are the changes needed?
To detect perf regressions of `toHiveString` in the future.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By running `DateTimeBenchmark` and checking the dataset content.

Closes #28757 from MaxGekk/benchmark-toHiveString.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit ddd8d5f)
Signed-off-by: Wenchen Fan <[email protected]>
MaxGekk deleted the benchmark-toHiveString branch on December 11, 2020 20:26