58 commits
82e2f09
Fix part of undocumented/duplicated arguments warnings by CRAN-check
junyangq Aug 9, 2016
41d9dca
[SPARK-16950] [PYSPARK] fromOffsets parameter support in KafkaUtils.c…
Aug 9, 2016
44115e9
[SPARK-16956] Make ApplicationState.MAX_NUM_RETRY configurable
JoshRosen Aug 9, 2016
2d136db
[SPARK-16905] SQL DDL: MSCK REPAIR TABLE
Aug 9, 2016
901edbb
More fixes of the docs.
junyangq Aug 10, 2016
475ee38
Fixed typo
jupblb Aug 10, 2016
2285de7
[SPARK-16522][MESOS] Spark application throws exception on exit.
sun-rui Aug 10, 2016
20efb79
[SPARK-16324][SQL] regexp_extract should doc that it returns empty st…
srowen Aug 10, 2016
719ac5f
[SPARK-15899][SQL] Fix the construction of the file path with hadoop …
avulanov Aug 10, 2016
15637f7
Revert "[SPARK-15899][SQL] Fix the construction of the file path with…
srowen Aug 10, 2016
977fbbf
[SPARK-15639] [SPARK-16321] [SQL] Push down filter at RowGroups level…
viirya Aug 10, 2016
d3a30d2
[SPARK-16579][SPARKR] add install.spark function
junyangq Aug 10, 2016
1e40135
[SPARK-17010][MINOR][DOC] Wrong description in memory management docu…
WangTaoTheTonic Aug 11, 2016
8611bc2
[SPARK-16866][SQL] Infrastructure for file-based SQL end-to-end tests
petermaxlee Aug 10, 2016
51b1016
[SPARK-17008][SPARK-17009][SQL] Normalization and isolation in SQLQue…
petermaxlee Aug 11, 2016
ea8a198
[SPARK-17007][SQL] Move test data files into a test-data folder
petermaxlee Aug 11, 2016
4b434e7
[SPARK-17011][SQL] Support testing exceptions in SQLQueryTestSuite
petermaxlee Aug 11, 2016
0ed6236
Correct example value for spark.ssl.YYY.XXX settings
ash211 Aug 11, 2016
33a213f
[SPARK-15899][SQL] Fix the construction of the file path with hadoop …
avulanov Aug 11, 2016
b87ba8f
Fix remaining undocumented/duplicated warnings
junyangq Aug 11, 2016
6bf20cd
[SPARK-17015][SQL] group-by/order-by ordinal and arithmetic tests
petermaxlee Aug 11, 2016
bc683f0
[SPARK-17018][SQL] literals.sql for testing literal parsing
petermaxlee Aug 11, 2016
0fb0149
[SPARK-17022][YARN] Handle potential deadlock in driver handling mess…
WangTaoTheTonic Aug 11, 2016
d2c1d64
Keep to the convention where we have docs for generic and the function.
junyangq Aug 12, 2016
b4047fc
[SPARK-16975][SQL] Column-partition path starting '_' should be handl…
dongjoon-hyun Aug 12, 2016
bde94cd
[SPARK-17013][SQL] Parse negative numeric literals
petermaxlee Aug 12, 2016
38378f5
[SPARK-12370][DOCUMENTATION] Documentation should link to examples …
jagadeesanas2 Aug 13, 2016
a21ecc9
[SPARK-17023][BUILD] Upgrade to Kafka 0.10.0.1 release
lresende Aug 13, 2016
750f880
[SPARK-16966][SQL][CORE] App Name is a randomUUID even when "spark.ap…
srowen Aug 13, 2016
e02d0d0
[SPARK-17027][ML] Avoid integer overflow in PolynomialExpansion.getPo…
zero323 Aug 14, 2016
8f4cacd
[SPARK-16508][SPARKR] Split docs for arrange and orderBy methods
junyangq Aug 15, 2016
4503632
[SPARK-17065][SQL] Improve the error message when encountering an inc…
zsxwing Aug 15, 2016
e5771a1
Fix docs for window functions
junyangq Aug 16, 2016
2e2c787
[SPARK-16964][SQL] Remove private[hive] from sql.hive.execution package
hvanhovell Aug 16, 2016
237ae54
Revert "[SPARK-16964][SQL] Remove private[hive] from sql.hive.executi…
rxin Aug 16, 2016
1c56971
[SPARK-16964][SQL] Remove private[sql] and private[spark] from sql.ex…
hvanhovell Aug 16, 2016
022230c
[SPARK-16519][SPARKR] Handle SparkR RDD generics that create warnings…
felixcheung Aug 16, 2016
6cb3eab
[SPARK-17089][DOCS] Remove api doc link for mapReduceTriplets operator
phalodi Aug 16, 2016
3e0163b
[SPARK-17084][SQL] Rename ParserUtils.assert to validate
hvanhovell Aug 17, 2016
68a24d3
[MINOR][DOC] Fix the descriptions for `properties` argument in the do…
Aug 17, 2016
22c7660
[SPARK-15285][SQL] Generated SpecificSafeProjection.apply method grow…
kiszk Aug 17, 2016
394d598
[SPARK-17102][SQL] bypass UserDefinedGenerator for json format check
cloud-fan Aug 17, 2016
9406f82
[SPARK-17096][SQL][STREAMING] Improve exception string reported throu…
tdas Aug 17, 2016
585d1d9
[SPARK-17038][STREAMING] fix metrics retrieval source of 'lastReceive…
keypointt Aug 17, 2016
91aa532
[SPARK-16995][SQL] TreeNodeException when flat mapping RelationalGrou…
viirya Aug 18, 2016
5735b8b
[SPARK-16391][SQL] Support partial aggregation for reduceGroups
rxin Aug 18, 2016
ec5f157
[SPARK-17117][SQL] 1 / NULL should not fail analysis
petermaxlee Aug 18, 2016
0bc3753
Fix part of undocumented/duplicated arguments warnings by CRAN-check
junyangq Aug 9, 2016
6d5233e
More fixes of the docs.
junyangq Aug 10, 2016
0edfd7d
Fix remaining undocumented/duplicated warnings
junyangq Aug 11, 2016
e72a6aa
Keep to the convention where we have docs for generic and the function.
junyangq Aug 12, 2016
afa69ed
Fix docs for window functions
junyangq Aug 16, 2016
c9cfe43
some fixes of R doc
junyangq Aug 18, 2016
3aafaa7
Move param docs from generic function to method definition.
junyangq Aug 18, 2016
315a0dd
some fixes of R doc
junyangq Aug 18, 2016
aa3d233
Move param docs from generic function to method definition.
junyangq Aug 18, 2016
71170e9
Solve conflicts.
junyangq Aug 18, 2016
2682719
Revert "Fix docs for window functions"
junyangq Aug 18, 2016
5 changes: 5 additions & 0 deletions .gitignore
@@ -77,3 +77,8 @@ spark-warehouse/
# For R session data
.RData
.RHistory
.Rhistory
Member
Duplicate line. Can these be pushed down into a .gitignore in the subdirectory? This file is a mess.

Contributor Author
Hmm... Is the upper/lower case difference caused by different R versions or platforms?

Member
I'm not sure if this is essential for this PR. I'd suggest leaving this out.

Contributor
+1. This is not part of this PR.

*.Rproj
*.Rproj.*

.Rproj.user
132 changes: 67 additions & 65 deletions R/pkg/R/DataFrame.R
@@ -120,7 +120,6 @@ setMethod("schema",
#'
#' Print the logical and physical Catalyst plans to the console for debugging.
#'
#' @param x A SparkDataFrame
#' @param extended Logical. If extended is FALSE, explain() only prints the physical plan.
#' @family SparkDataFrame functions
#' @aliases explain,SparkDataFrame-method
@@ -177,11 +176,10 @@ setMethod("isLocal",
#'
#' Print the first numRows rows of a SparkDataFrame
#'
#' @param x A SparkDataFrame
#' @param numRows The number of rows to print. Defaults to 20.
#' @param truncate Whether truncate long strings. If true, strings more than 20 characters will be
#' truncated and all cells will be aligned right
#'
#' @param numRows the number of rows to print. Defaults to 20.
#' @param truncate whether truncate long strings. If true, strings more than 20 characters will be
Member
true -> TRUE

#' truncated. However, if set greater than zero, truncates strings longer than `truncate`
#' characters and all cells will be aligned right.
#' @family SparkDataFrame functions
#' @aliases showDF,SparkDataFrame-method
#' @rdname showDF
@@ -206,7 +204,7 @@ setMethod("showDF",
#'
#' Print the SparkDataFrame column names and types
#'
#' @param x A SparkDataFrame
#' @param object a SparkDataFrame.
#'
#' @family SparkDataFrame functions
#' @rdname show
@@ -318,6 +316,7 @@ setMethod("colnames",
columns(x)
})

#' @param value a character vector. Must have the same length as the number of columns in the SparkDataFrame.
#' @rdname columns
#' @aliases colnames<-,SparkDataFrame-method
#' @name colnames<-
@@ -406,7 +405,6 @@ setMethod("coltypes",
#'
#' Set the column types of a SparkDataFrame.
#'
#' @param x A SparkDataFrame
#' @param value A character vector with the target column types for the given
#' SparkDataFrame. Column types can be one of integer, numeric/double, character, logical, or NA
#' to keep that column as-is.
@@ -510,9 +508,9 @@ setMethod("registerTempTable",
#'
#' Insert the contents of a SparkDataFrame into a table registered in the current SparkSession.
#'
#' @param x A SparkDataFrame
#' @param tableName A character vector containing the name of the table
#' @param overwrite A logical argument indicating whether or not to overwrite
#' @param x a SparkDataFrame.
#' @param tableName a character vector containing the name of the table.
#' @param overwrite a logical argument indicating whether or not to overwrite.
#' the existing rows in the table.
Member
Why is tableName moved?

Contributor Author
junyangq Aug 15, 2016
The reasons are:

  1. Its generic counterpart has ... and we have to document that.
  2. I think it may be preferred that ... follows the other explicit arguments in order?

Does that make sense?

Member
I see.

We typically have

setGeneric("foo", function(x, ...) { ... })

setMethod("foo", signature(x = "A"), function(x, a, b, moreParameters) { ... })

In many cases the setGeneric has fewer parameters - whether because it has to match an existing generic or because some parameters are not put in the signature with a type - so we end up with a parameter plus ... in the generic.

I see your point that if the parameter is in the generic then perhaps we should document it with @param there as well.

I think we should try to keep it near the function definition/body though because

  1. that's where the parameter is actually being used and its meaning could change
  2. it would be easier to review - we have a few changes out where the first parameter does not have a @param (it should help when we turn on check-cran in Jenkins)
  3. many of our functions are like this, so if we were to move the @param for the first parameter to generics.R (when there is ... there) then there would be dozens or hundreds of lines changing (e.g. bround, contains, subset and many more)

with the exception we've talked about, where the generic applies to multiple function definitions with different classes, so that first parameter could be of different classes and needs a central place.

What do you think?
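
To make the convention concrete, here is a minimal self-contained sketch of that pattern (the FakeDataFrame class and the describeCols generic are hypothetical stand-ins, not actual SparkR code): the generic declares only x and ..., while the roxygen @param tags, including the one for ..., stay next to the method definition.

library(methods)

# Hypothetical class standing in for SparkDataFrame so the sketch runs on its own.
setClass("FakeDataFrame", slots = c(data = "data.frame"))

# generics.R side: the generic fixes only the dispatch argument and routes
# everything else through ... .
setGeneric("describeCols", function(x, ...) { standardGeneric("describeCols") })

# DataFrame.R side: the @param docs sit next to the method body, where the
# arguments are actually used; ... is documented after the explicit arguments
# even though it comes right after x in the generic signature.
#' @param x a FakeDataFrame.
#' @param cols a character vector of column names to describe.
#' @param ... additional arguments (unused in this sketch).
setMethod("describeCols", signature(x = "FakeDataFrame"),
          function(x, cols = names(x@data), ...) {
            summary(x@data[, cols, drop = FALSE])
          })

# Usage:
df <- new("FakeDataFrame", data = data.frame(a = 1:3, b = c(2.5, 3.5, 4.5)))
describeCols(df, cols = "a")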

Contributor Author
Yes, I agree that ideally the doc should be kept near the function definition/body, and that's consistent with many other functions. So then the only issue is that ... would come first in the argument list of the doc. I'm not sure whether that would bother users.

Member
I agree, that is an unfortunate side effect. I wonder if there is a way to order this in the Rd generated by roxygen; maybe this could be a reasonable PR against roxygen to special-case `...`.
The only other solution I've seen would be to hand-craft a .Rd file...

For now I think we should prioritize maintainability and keep the doc close to the function as much as possible.

#'
#' @family SparkDataFrame functions
@@ -571,7 +569,9 @@ setMethod("cache",
#' supported storage levels, refer to
#' \url{http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence}.
#'
#' @param x The SparkDataFrame to persist
#' @param x the SparkDataFrame to persist.
#' @param newLevel storage level chosen for the persistance. See available options in
#' the description.
#'
#' @family SparkDataFrame functions
#' @rdname persist
@@ -634,9 +634,10 @@ setMethod("unpersist",
#' \item{3.} {Return a new SparkDataFrame partitioned by the given column(s),
#' using `spark.sql.shuffle.partitions` as number of partitions.}
#'}
#' @param x A SparkDataFrame
#' @param numPartitions The number of partitions to use.
#' @param col The column by which the partitioning will be performed.
#' @param x a SparkDataFrame.
#' @param numPartitions the number of partitions to use.
#' @param col the column by which the partitioning will be performed.
#' @param ... additional column(s) to be used in the partitioning.
#'
#' @family SparkDataFrame functions
#' @rdname repartition
@@ -915,8 +916,6 @@ setMethod("sample_frac",

#' Returns the number of rows in a SparkDataFrame
#'
#' @param x A SparkDataFrame
#'
#' @family SparkDataFrame functions
#' @rdname nrow
#' @name count
@@ -1092,8 +1091,10 @@ setMethod("limit",
dataFrame(res)
})

#' Take the first NUM rows of a SparkDataFrame and return a the results as a R data.frame
#' Take the first NUM rows of a SparkDataFrame and return the results as a R data.frame
#'
#' @param x a SparkDataFrame.
#' @param num number of rows to take.
#' @family SparkDataFrame functions
#' @rdname take
#' @name take
@@ -1120,9 +1121,9 @@ setMethod("take",
#' then head() returns the first 6 rows in keeping with the current data.frame
#' convention in R.
#'
#' @param x A SparkDataFrame
#' @param num The number of rows to return. Default is 6.
#' @return A data.frame
#' @param x a SparkDataFrame.
#' @param num the number of rows to return. Default is 6.
#' @return A data.frame.
#'
#' @family SparkDataFrame functions
#' @aliases head,SparkDataFrame-method
@@ -1146,7 +1147,7 @@ setMethod("head",

#' Return the first row of a SparkDataFrame
#'
Member
I think, similar to what you have for other functions, this could go to generics.R - do you have any other idea how to document functions that work with multiple classes?

Contributor Author
I don't have a good answer yet. When I tried to move the param there, it seemed that the generic functions for RDD Actions and Transformations are not exposed. Do you know the specific reason for that, by chance?

Member
Right - they are not - RDD functions are not exported from the package (not public) and we don't want Rd files generated for them. Please see PR #14626 - we want a separate setGeneric for non-RDD functions, and then this line documenting both the DataFrame and Column parameter can go to generics.R.
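
Roughly what that would look like, as a self-contained sketch with stand-in classes (FakeDataFrame, FakeColumn and the firstItem generic are hypothetical, not the actual SparkR generics): when one generic dispatches on several classes, the @param for the shared first argument can be documented once on the generic in generics.R, and each method adds only what is specific to it.

library(methods)

# Stand-in classes; in SparkR these would be SparkDataFrame and Column.
setClass("FakeDataFrame", slots = c(data = "data.frame"))
setClass("FakeColumn", slots = c(values = "numeric"))

# generics.R side: the generic is shared by methods on different classes, so
# the doc for the shared first parameter is kept here, in one place.
#' @param x a FakeDataFrame or a FakeColumn.
#' @param ... further arguments passed to the individual methods.
setGeneric("firstItem", function(x, ...) { standardGeneric("firstItem") })

# Each method then documents only what is specific to it.
setMethod("firstItem", signature(x = "FakeDataFrame"),
          function(x, ...) { x@data[1, , drop = FALSE] })

setMethod("firstItem", signature(x = "FakeColumn"),
          function(x, ...) { x@values[1] })

# Usage:
firstItem(new("FakeDataFrame", data = data.frame(a = 1:3)))   # first row
firstItem(new("FakeColumn", values = c(10, 20, 30)))          # first value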

#' @param x A SparkDataFrame
#' @param x a SparkDataFrame or a column used in aggregation function.
#'
#' @family SparkDataFrame functions
#' @aliases first,SparkDataFrame-method
@@ -1240,7 +1241,6 @@ setMethod("group_by",
#'
#' Compute aggregates by specifying a list of columns
#'
#' @param x a SparkDataFrame
#' @family SparkDataFrame functions
#' @aliases agg,SparkDataFrame-method
#' @rdname summarize
@@ -1387,16 +1387,15 @@ setMethod("dapplyCollect",
#' Groups the SparkDataFrame using the specified columns and applies the R function to each
#' group.
#'
#' @param x A SparkDataFrame
#' @param cols Grouping columns
#' @param func A function to be applied to each group partition specified by grouping
#' @param cols grouping columns.
#' @param func a function to be applied to each group partition specified by grouping
#' column of the SparkDataFrame. The function `func` takes as argument
#' a key - grouping columns and a data frame - a local R data.frame.
#' The output of `func` is a local R data.frame.
#' @param schema The schema of the resulting SparkDataFrame after the function is applied.
#' @param schema the schema of the resulting SparkDataFrame after the function is applied.
#' The schema must match to output of `func`. It has to be defined for each
#' output column with preferred output column name and corresponding data type.
#' @return a SparkDataFrame
#' @return A SparkDataFrame.
#' @family SparkDataFrame functions
#' @aliases gapply,SparkDataFrame-method
#' @rdname gapply
@@ -1479,13 +1478,12 @@ setMethod("gapply",
#' Groups the SparkDataFrame using the specified columns, applies the R function to each
#' group and collects the result back to R as data.frame.
#'
#' @param x A SparkDataFrame
#' @param cols Grouping columns
#' @param func A function to be applied to each group partition specified by grouping
#' @param cols grouping columns.
#' @param func a function to be applied to each group partition specified by grouping
#' column of the SparkDataFrame. The function `func` takes as argument
#' a key - grouping columns and a data frame - a local R data.frame.
#' The output of `func` is a local R data.frame.
#' @return a data.frame
#' @return A data.frame.
#' @family SparkDataFrame functions
#' @aliases gapplyCollect,SparkDataFrame-method
#' @rdname gapplyCollect
@@ -2461,8 +2459,8 @@ setMethod("unionAll",
#' Union two or more SparkDataFrames. This is equivalent to `UNION ALL` in SQL.
#' Note that this does not remove duplicate rows across the two SparkDataFrames.
#'
#' @param x A SparkDataFrame
#' @param ... Additional SparkDataFrame
#' @param x a SparkDataFrame.
#' @param ... additional SparkDataFrame(s).
#' @return A SparkDataFrame containing the result of the union.
#' @family SparkDataFrame functions
#' @aliases rbind,SparkDataFrame-method
@@ -2519,8 +2517,8 @@ setMethod("intersect",
#' Return a new SparkDataFrame containing rows in this SparkDataFrame
#' but not in another SparkDataFrame. This is equivalent to `EXCEPT` in SQL.
#'
#' @param x A SparkDataFrame
#' @param y A SparkDataFrame
#' @param x a SparkDataFrame.
#' @param y a SparkDataFrame.
#' @return A SparkDataFrame containing the result of the except operation.
#' @family SparkDataFrame functions
#' @aliases except,SparkDataFrame,SparkDataFrame-method
@@ -2561,10 +2559,11 @@ setMethod("except",
#' and to not change the existing data.
#' }
#'
#' @param df A SparkDataFrame
#' @param path A name for the table
#' @param source A name for external data source
#' @param mode One of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default)
#' @param df a SparkDataFrame.
#' @param path a name for the table.
#' @param source a name for external data source.
#' @param mode one of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default)
#' @param ... additional argument(s) passed to the method.
#'
#' @family SparkDataFrame functions
#' @aliases write.df,SparkDataFrame,character-method
@@ -2623,10 +2622,11 @@ setMethod("saveDF",
#' ignore: The save operation is expected to not save the contents of the SparkDataFrame
#' and to not change the existing data. \cr
#'
#' @param df A SparkDataFrame
#' @param tableName A name for the table
#' @param source A name for external data source
#' @param mode One of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default)
#' @param df a SparkDataFrame.
#' @param tableName a name for the table.
#' @param source a name for external data source.
#' @param mode one of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default).
#' @param ... additional option(s) passed to the method.
#'
#' @family SparkDataFrame functions
#' @aliases saveAsTable,SparkDataFrame,character-method
@@ -2662,10 +2662,10 @@ setMethod("saveAsTable",
#' Computes statistics for numeric columns.
#' If no columns are given, this function computes statistics for all numerical columns.
#'
#' @param x A SparkDataFrame to be computed.
#' @param col A string of name
#' @param ... Additional expressions
#' @return A SparkDataFrame
#' @param x a SparkDataFrame to be computed.
#' @param col a string of name.
#' @param ... additional expressions.
#' @return A SparkDataFrame.
#' @family SparkDataFrame functions
#' @aliases describe,SparkDataFrame,character-method describe,SparkDataFrame,ANY-method
#' @rdname summary
@@ -2700,6 +2700,7 @@ setMethod("describe",
dataFrame(sdf)
})

#' @param object a SparkDataFrame to be summarized.
#' @rdname summary
#' @name summary
#' @aliases summary,SparkDataFrame-method
@@ -2715,16 +2716,20 @@ setMethod("summary",
#'
#' dropna, na.omit - Returns a new SparkDataFrame omitting rows with null values.
#'
#' @param x A SparkDataFrame.
#' @param x a SparkDataFrame.
#' @param how "any" or "all".
#' if "any", drop a row if it contains any nulls.
#' if "all", drop a row only if all its values are null.
#' if minNonNulls is specified, how is ignored.
#' @param minNonNulls If specified, drop rows that have less than
#' @param minNonNulls if specified, drop rows that have less than
#' minNonNulls non-null values.
#' This overwrites the how parameter.
#' @param cols Optional list of column names to consider.
#' @return A SparkDataFrame
#' @param cols optional list of column names to consider. In `fillna`,
#' columns specified in cols that do not have matching data
#' type are ignored. For example, if value is a character, and
#' subset contains a non-character column, then the non-character
#' column is simply ignored.
#' @return A SparkDataFrame.
#'
#' @family SparkDataFrame functions
#' @rdname nafunctions
@@ -2769,18 +2774,12 @@ setMethod("na.omit",

#' fillna - Replace null values.
#'
#' @param x A SparkDataFrame.
#' @param value Value to replace null values with.
#' @param value value to replace null values with.
#' Should be an integer, numeric, character or named list.
#' If the value is a named list, then cols is ignored and
#' value must be a mapping from column name (character) to
#' replacement value. The replacement value must be an
#' integer, numeric or character.
#' @param cols optional list of column names to consider.
#' Columns specified in cols that do not have matching data
#' type are ignored. For example, if value is a character, and
#' subset contains a non-character column, then the non-character
#' column is simply ignored.
#'
#' @rdname nafunctions
#' @name fillna
@@ -2845,8 +2844,11 @@ setMethod("fillna",
#' Since data.frames are held in memory, ensure that you have enough memory
#' in your system to accommodate the contents.
#'
#' @param x a SparkDataFrame
#' @return a data.frame
#' @param x a SparkDataFrame.
#' @param row.names NULL or a character vector giving the row names for the data frame.
#' @param optional If `TRUE`, converting column names is optional.
#' @param ... additional arguments passed to the method.
Member
in this case "additional arguments to pass to base::as.data.frame"

#' @return A data.frame.
#' @family SparkDataFrame functions
#' @aliases as.data.frame,SparkDataFrame-method
#' @rdname as.data.frame
@@ -3000,9 +3002,8 @@ setMethod("str",
#' Returns a new SparkDataFrame with columns dropped.
#' This is a no-op if schema doesn't contain column name(s).
#'
#' @param x A SparkDataFrame.
#' @param cols A character vector of column names or a Column.
#' @return A SparkDataFrame
#' @param col a character vector of column names or a Column.
#' @return A SparkDataFrame.
#'
#' @family SparkDataFrame functions
#' @rdname drop
@@ -3049,8 +3050,8 @@ setMethod("drop",
#'
#' @name histogram
#' @param nbins the number of bins (optional). Default value is 10.
#' @param col the column (described by character or Column object) to build the histogram from.
Member
we generally don't say "Column object" - the object system in R is a bit different, and I think "S4 class" would make more sense here?
but I'd suggest simplifying it to:
"@param col the column as Character string or a Column to build the histogram from"

#' @param df the SparkDataFrame containing the Column to build the histogram from.
#' @param colname the name of the column to build the histogram from.
#' @return a data.frame with the histogram statistics, i.e., counts and centroids.
#' @rdname histogram
#' @aliases histogram,SparkDataFrame,characterOrColumn-method
@@ -3184,6 +3185,7 @@ setMethod("histogram",
#' @param x A SparkDataFrame
#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
#' @param tableName The name of the table in the external database
#' @param ... additional argument(s) passed to the method
Member
ditto, something like "additional JDBC database connection properties"

#' @param mode One of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default)
#' @family SparkDataFrame functions
#' @rdname write.jdbc