Closed
add sample functions with seeds
gatorsmile committed Dec 5, 2015
commit ec770100452ca1a869058e448b1b41c8efb810d9
31 changes: 25 additions & 6 deletions R/pkg/R/DataFrame.R
@@ -677,25 +677,44 @@ setMethod("unique",
#' collect(sample(df, TRUE, 0.5))
#'}
setMethod("sample",
# TODO : Figure out how to send integer as java.lang.Long to JVM so
# we can send seed as an argument through callJMethod
signature(x = "DataFrame", withReplacement = "logical",
Contributor

Don't delete this comment. Move it close to as.integer(seed). This is a known limitation of the serde for now.

Member Author

True. Added it back.
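
For context on the limitation discussed in this thread, here is a minimal base-R sketch (not SparkR code, and not part of the diff) of why narrowing a seed with as.integer loses range: R's integer type is 32-bit, so it cannot represent every java.lang.Long value. The variable names are illustrative assumptions.

```r
# Base R only; no SparkR or JVM involved.
# R integers are 32-bit, so they cannot hold every java.lang.Long value.
max_int <- .Machine$integer.max          # largest representable R integer

small_seed <- as.integer(42)             # fits in 32 bits
# 2^31 is one past the 32-bit maximum; as.integer() yields NA with a warning,
# which is suppressed here so the sketch runs quietly.
big_seed <- suppressWarnings(as.integer(2^31))

is_small_ok <- identical(small_seed, 42L)
is_big_na <- is.na(big_seed)
```

A seed that fits in 32 bits survives the conversion unchanged, while a larger one becomes NA, which is why the TODO about sending a java.lang.Long is worth keeping next to the conversion.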

fraction = "numeric"),
function(x, withReplacement, fraction) {
fraction = "numeric", seed = "missing"),
function(x, withReplacement, fraction, seed) {
Contributor

@felixcheung Shouldn't we document this param in the roxygen doc above? Otherwise, how would anybody know we support a seed?

Member

Yes, we should add a @param seed above. Thanks for catching it.
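
A sketch of what the requested roxygen entry might look like. The description wording and the seeded example call are assumptions for illustration, not the text that ended up in DataFrame.R.

```r
# Hypothetical roxygen lines; the wording is assumed, not taken from the PR.
#' @param seed (Optional) numeric seed for the random sampling. It is
#'   converted with as.integer() before being passed to the JVM.
#' @examples
#'\dontrun{
#' collect(sample(df, TRUE, 0.5, 42))
#'}
```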

if (fraction < 0.0) stop(cat("Negative fraction value:", fraction))
sdf <- callJMethod(x@sdf, "sample", withReplacement, fraction)
dataFrame(sdf)
})

#' @rdname sample
#' @name sample
setMethod("sample",
# we can send seed as an argument through callJMethod
signature(x = "DataFrame", withReplacement = "logical",
fraction = "numeric", seed = "numeric"),
Member

You could in fact merge these overloads/variants into one. Please see this for an example:

if (!missing(j)) {

if (!missing(seed)) {
   sdf <- callJMethod(x@sdf, "sample", withReplacement, fraction, as.integer(seed))
} else {
   sdf <- callJMethod(x@sdf, "sample", withReplacement, fraction)
}

Contributor

+1
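
The merge suggested in this thread can be tried without Spark. The following is a minimal sketch under stated assumptions: sampleRows is a hypothetical generic (named so that base::sample is not masked), it works on a local data.frame rather than a SparkR DataFrame, and it draws a fixed number of rows with base::sample, which is not how Spark's sampling works. It only illustrates a single S4 method handling both the seeded and unseeded call through missing().

```r
library(methods)

# Hypothetical generic for illustration; not part of SparkR.
setGeneric("sampleRows", function(x, withReplacement, fraction, seed) {
  standardGeneric("sampleRows")
})

# One method covers both call forms: seed is left out of the signature,
# so dispatch succeeds whether or not the caller supplies it.
setMethod("sampleRows",
          signature(x = "data.frame", withReplacement = "logical",
                    fraction = "numeric"),
          function(x, withReplacement, fraction, seed) {
            if (fraction < 0.0) stop("Negative fraction value: ", fraction)
            if (!missing(seed)) {
              # Note: this sets R's global RNG state.
              set.seed(as.integer(seed))
            }
            n <- round(nrow(x) * fraction)
            idx <- base::sample(nrow(x), size = n, replace = withReplacement)
            x[idx, , drop = FALSE]
          })

df <- data.frame(id = 1:10)
a <- sampleRows(df, FALSE, 0.5, 42)       # seeded
b <- sampleRows(df, FALSE, 0.5, 42)       # same seed, same rows
c_unseeded <- sampleRows(df, FALSE, 0.5)  # seed omitted
```

Both seeded calls return the same rows, and the unseeded call goes through the same method body, so no second setMethod with seed = "missing" is needed.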

function(x, withReplacement, fraction, seed) {
if (fraction < 0.0) stop(cat("Negative fraction value:", fraction))
sdf <- callJMethod(x@sdf, "sample", withReplacement, fraction, as.integer(seed))
dataFrame(sdf)
})

#' @rdname sample
#' @name sample_frac
setMethod("sample_frac",
signature(x = "DataFrame", withReplacement = "logical",
fraction = "numeric"),
function(x, withReplacement, fraction) {
fraction = "numeric", seed = "missing"),
function(x, withReplacement, fraction, seed) {
sample(x, withReplacement, fraction)
})

#' @rdname sample
#' @name sample_frac
setMethod("sample_frac",
signature(x = "DataFrame", withReplacement = "logical",
fraction = "numeric", seed = "numeric"),
function(x, withReplacement, fraction, seed) {
sample(x, withReplacement, fraction, as.integer(seed))
Member

Same here.

})

#' nrow
#'
#' Returns the number of rows in a DataFrame