Closed
Changes from 1 commit
Mark the combineByKeyWithClassTag methods as experimental
massie committed Sep 10, 2015
commit adcdfafdfbc5cad3a77aace8900fefa95962ec30
@@ -59,6 +59,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
{

/**
* :: Experimental ::
* Generic function to combine the elements for each key using a custom set of aggregation
* functions. Turns an RDD[(K, V)] into a result of type RDD[(K, C)], for a "combined type" C
* Note that V and C can be different -- for example, one might group an RDD of type
@@ -71,6 +72,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
* In addition, users can control the partitioning of the output RDD, and whether to perform
* map-side aggregation (if a mapper can produce multiple items with the same key).
*/
@Experimental
def combineByKeyWithClassTag[C](
Contributor

Actually, why call this something else? Does it not compile if you just call this combineByKey as well?

Contributor Author

Because PairRDDFunctions is a stable API, we can't change the method signature of combineByKey. Adding the ClassTag would add an implicit argument. If we leave the old combineByKey methods and add new combineByKey methods with ClassTags, the compiler fails with errors about being unable to resolve the combineByKey symbol.

If you know a way of doing this more cleanly, I would be happy to make that change.
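To illustrate the point about the implicit argument, here is a minimal sketch that does not depend on Spark. `PairFunctionsSketch` and `ClassTagSketch` are hypothetical names, and a plain `Seq` and `Map` stand in for the RDD types; only `scala.reflect.ClassTag` is a real library type. It shows that a method taking a `ClassTag[C]` has an extra implicit parameter list, which is what makes it a different signature from the existing method. It does not try to reproduce the overload-resolution errors described above.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for PairRDDFunctions, backed by a Seq instead of an RDD.
class PairFunctionsSketch[K, V](self: Seq[(K, V)]) {

  // Existing-style signature: no ClassTag, so the runtime class of C is not available.
  def combineByKey[C](
      createCombiner: V => C,
      mergeValue: (C, V) => C): Map[K, C] =
    self.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
      val combined = acc.get(k) match {
        case Some(c) => mergeValue(c, v) // key seen before: fold the value in
        case None => createCombiner(v)   // first value for this key
      }
      acc.updated(k, combined)
    }

  // New-style signature: the ClassTag arrives through an extra implicit
  // parameter list, so callers compiled against the old signature would not match it.
  def combineByKeyWithClassTag[C](
      createCombiner: V => C,
      mergeValue: (C, V) => C)(implicit ct: ClassTag[C]): (Map[K, C], Class[_]) =
    (combineByKey(createCombiner, mergeValue), ct.runtimeClass)
}

object ClassTagSketch {
  // Groups the values for each key into a List and also returns the runtime class of C.
  def run(): (Map[String, List[Int]], Class[_]) = {
    val pairs = Seq("a" -> 1, "b" -> 2, "a" -> 3)
    new PairFunctionsSketch(pairs).combineByKeyWithClassTag[List[Int]](
      (v: Int) => List(v),
      (c: List[Int], v: Int) => v :: c)
  }
}
```

Giving the new method a distinct name keeps the old `combineByKey` call sites untouched while letting new callers supply the tag, which is the trade-off this commit takes.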

createCombiner: V => C,
mergeValue: (C, V) => C,
@@ -138,8 +140,10 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}

/**
* :: Experimental ::
* Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
*/
@Experimental
def combineByKeyWithClassTag[C](
createCombiner: V => C,
mergeValue: (C, V) => C,
@@ -619,9 +623,11 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
}

/**
* :: Experimental ::
* Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the
* existing partitioner/parallelism level.
*/
@Experimental
def combineByKeyWithClassTag[C](
createCombiner: V => C,
mergeValue: (C, V) => C,