update docs
adrian-ionescu committed Nov 29, 2017
commit 012d617430befd8b028562c4d8f7b49cf8776659
@@ -846,10 +846,10 @@ case class RepartitionByExpression(
     s"${getClass.getSimpleName} expects that either all its `partitionExpressions` are of type " +
       "`SortOrder`, which means `RangePartitioning`, or none of them are `SortOrder`, which " +
       "means `HashPartitioning`. In this case we have:" +
-      s""""
-        |SortOrder: ${sortOrder}
-        |NonSortOrder: ${nonSortOrder}
-      """.stripMargin)
+      s"""
+        |SortOrder: ${sortOrder}
+        |NonSortOrder: ${nonSortOrder}
+      """.stripMargin)

if (sortOrder.nonEmpty) {
RangePartitioning(sortOrder.map(_.asInstanceOf[SortOrder]), numPartitions)
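The hunk above fixes a stray quote in the interpolated error message: with `s""""`, the first three quotes open a triple-quoted interpolated string and the fourth is parsed as string *content*, so a literal `"` leaks into the message. A minimal standalone illustration of the two spellings (names here are illustrative, not from the patch):

```scala
// Buggy spelling: s"""" opens the triple-quoted string after the first
// three quotes; the fourth quote becomes the string's first character.
val buggy = s""""
  |SortOrder: a
  """.stripMargin

// Fixed spelling: s""" opens the string cleanly.
val fixed = s"""
  |SortOrder: a
  """.stripMargin

// buggy begins with a stray '"' character; fixed begins with a newline.
```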
20 changes: 10 additions & 10 deletions sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -2723,8 +2723,8 @@ class Dataset[T] private[sql](
}

/**
-   * Returns a new Dataset partitioned by the given partitioning expressions into
-   * `numPartitions`. The resulting Dataset is hash partitioned.
+   * Returns a new Dataset that is hash partitioned by the given expressions into `numPartitions`.
+   * If no expressions are specified, round robin partitioning is used.
*
* This is the same operation as "DISTRIBUTE BY" in SQL (Hive QL).
*
@@ -2745,9 +2745,9 @@ class Dataset[T] private[sql](
}

/**
-   * Returns a new Dataset partitioned by the given partitioning expressions, using
-   * `spark.sql.shuffle.partitions` as number of partitions.
-   * The resulting Dataset is hash partitioned.
+   * Returns a new Dataset that is hash partitioned by the given expressions, using
+   * `spark.sql.shuffle.partitions` as the number of partitions. If no expressions are specified,
+   * round robin partitioning is used.
*
* This is the same operation as "DISTRIBUTE BY" in SQL (Hive QL).
*
@@ -2760,8 +2760,8 @@ class Dataset[T] private[sql](
}

/**
-   * Returns a new Dataset partitioned by the given partitioning expressions into
-   * `numPartitions`. The resulting Dataset is range partitioned.
+   * Returns a new Dataset that is hash partitioned by the given expressions into `numPartitions`.
Review comment (Member): hash -> range
+   * If no expressions are specified, round robin partitioning is used.
*
* @group typedrel
* @since 2.3.0
@@ -2780,9 +2780,9 @@ class Dataset[T] private[sql](
}

/**
-   * Returns a new Dataset partitioned by the given partitioning expressions, using
-   * `spark.sql.shuffle.partitions` as number of partitions.
-   * The resulting Dataset is range partitioned.
+   * Returns a new Dataset that is range partitioned by the given expressions, using
+   * `spark.sql.shuffle.partitions` as the number of partitions. If no expressions are specified,
+   * round robin partitioning is used.
*
* @group typedrel
* @since 2.3.0
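The doc changes above distinguish three partitioning modes on `Dataset`: hash (with expressions), range (with expressions), and round robin (no expressions). A minimal usage sketch of the three variants, assuming a local `SparkSession` named `spark` and Spark 2.3+ (where `repartitionByRange` lands):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Hypothetical session for illustration; any existing SparkSession works.
val spark = SparkSession.builder().master("local[*]").appName("repartition-demo").getOrCreate()
import spark.implicits._

val ds = (1 to 100).toDS()

// Hash partitioned by an expression into 4 partitions (same as DISTRIBUTE BY in SQL).
val hashed = ds.repartition(4, col("value"))

// Range partitioned: rows fall into contiguous, non-overlapping value ranges.
val ranged = ds.repartitionByRange(4, col("value"))

// No partitioning expressions: round robin partitioning spreads rows evenly.
val roundRobin = ds.repartition(4)
```

Omitting the partition count in either call falls back to `spark.sql.shuffle.partitions`, per the updated Scaladoc.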