Add repartition workload #608
Conversation
carsonwang left a comment
Thanks for adding this!
hibench.repartition.large.datasize     100000000
hibench.repartition.huge.datasize      1000000000
hibench.repartition.gigantic.datasize  10000000000
hibench.repartition.bigdata.datasize   60000000000
Can we change this to the same size defined in TeraSort to be consistent?
No problem. I made them bigger because this workload takes less time than TeraSort, but since we write an output by default, the durations should be on the same level.
private def reparition(previous: RDD[Array[Byte]], numReducers: Int): ShuffledRDD[Int, Array[Byte], Array[Byte]] = {
  /** Distributes elements evenly across output partitions, starting from a random partition. */
  val distributePartition = (index: Int, items: Iterator[Array[Byte]]) => {
    var position = (new Random(index)).nextInt(numReducers)
In the Spark code, I noticed hashing.byteswap32(index) is used for the seed. Was the hashing removed here on purpose because there is no difference?
Good catch! I copied that from Spark 2.0.0; the code you mentioned was introduced in apache/spark#18990, which fixes skewed repartitioning when numReducers is a power of 2.
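Not part of this PR: a small standalone sketch for comparing the two seedings discussed above. The object name and the mapper/reducer counts are made up for illustration; it only prints how the chosen start positions spread over the reducers.

import java.util.Random
import scala.util.hashing

object SeedSkewDemo {
  def main(args: Array[String]): Unit = {
    val numReducers = 64   // power of 2, the case addressed by apache/spark#18990
    val numMappers = 1024

    // Start position each map task would pick with a given seeding strategy.
    def startPositions(seed: Int => Int): Seq[Int] =
      (0 until numMappers).map(i => new Random(seed(i)).nextInt(numReducers))

    // With a uniform choice, every reducer should be a start position roughly
    // numMappers / numReducers times.
    def spread(positions: Seq[Int]): String = {
      val counts = positions.groupBy(identity).values.map(_.size)
      s"positions used: ${counts.size} of $numReducers, per-position min=${counts.min} max=${counts.max}"
    }

    println("raw index seed:  " + spread(startPositions(identity)))
    println("byteswap32 seed: " + spread(startPositions(hashing.byteswap32)))
  }
}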
README.md (Outdated)
4. Repartition (micro/repartition)
This workload benchmarks shuffle performance. Input data is generated by Hadoop TeraGen. By default the data is first cached in memory, then shuffle-written and shuffle-read in order to repartition it, so the last two stages reflect only shuffle performance, excluding I/O and other compute. Note: the parameter hibench.repartition.cacheinmemory (default is true) allows reading from storage in the first stage instead of caching.
What about setting hibench.repartition.cacheinmemory to false by default? HiBench measures the execution time of the entire workload and calculates the throughput. Caching in memory seems to be for our own need to measure the shuffle write and shuffle read, so we would need to look at the stage-level execution times ourselves.
Makes sense!
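For reference, a sketch of how the relevant entries in micro/repartition.conf could look with that default. Only the property names are taken from this PR; the values, comments, and file layout shown here are assumptions.

# micro/repartition.conf (illustrative values)
hibench.repartition.huge.datasize    1000000000
# when false, the first stage reads the TeraGen input from storage instead of caching it
hibench.repartition.cacheinmemory    false
# set to true to skip writing the repartitioned data back out
hibench.repartition.disableoutput    false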
Why is Travis not triggered?
  /** Distributes elements evenly across output partitions, starting from a random partition. */
  val distributePartition = (index: Int, items: Iterator[Array[Byte]]) => {
-   var position = (new Random(index)).nextInt(numReducers)
+   var position = new Random(hashing.byteswap32(index)).nextInt(numReducers)
(new Random(hashing.byteswap32(index))) ?
Merged this. Thanks!
We need a workload that benchmarks shuffle performance alone, excluding I/O operations and non-shuffle-related compute. Without repartition we often use TeraSort as a substitute, but I/O and sorting can take most of its time, making it hard to see the shuffle performance.
This workload is Spark-only and consists of the following steps (a sketch of the resulting code follows the list):
1. Read the TeraGen-generated input; caching it in memory first is controlled by hibench.repartition.cacheinmemory in micro/repartition.conf.
2. Shuffle write and shuffle read to repartition the data.
3. Write the repartitioned data back out; this can be turned off with hibench.repartition.disableoutput.
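Putting the quoted diff fragments together, here is a minimal sketch of the repartition step as it would look after the byteswap32 change. It follows the same round-robin pattern as Spark's own repartition/coalesce; the wrapping object, imports, and anything else not visible in the diff above are assumptions.

import java.util.Random
import scala.util.hashing
import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.{RDD, ShuffledRDD}

object RepartitionSketch {
  // Round-robin each input partition's records across numReducers output
  // partitions, starting from a pseudo-random position derived from the
  // partition index (byteswap32 avoids the power-of-2 skew discussed above).
  def reparition(previous: RDD[Array[Byte]],
                 numReducers: Int): ShuffledRDD[Int, Array[Byte], Array[Byte]] = {
    val distributePartition = (index: Int, items: Iterator[Array[Byte]]) => {
      var position = new Random(hashing.byteswap32(index)).nextInt(numReducers)
      items.map { item =>
        position += 1
        (position, item)   // HashPartitioner mods the key into [0, numReducers)
      }
    }
    // The shuffle itself: write the (position, record) pairs and read them
    // back grouped by reducer. Its stages reflect only shuffle cost.
    new ShuffledRDD[Int, Array[Byte], Array[Byte]](
      previous.mapPartitionsWithIndex(distributePartition),
      new HashPartitioner(numReducers))
  }
}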