Closed
23 commits
5384f7a
[SPARK-21213][SQL] Support collecting partition-level statistics: row…
mbasmanova Jun 12, 2017
3ee5ebf
[SPARK-21213][SQL] review comments
mbasmanova Jun 28, 2017
d17aa4b
[SPARK-21213][SQL] improved comments per review feedback
mbasmanova Jun 28, 2017
e0e351e
[SPARK-21213][SQL] typo
mbasmanova Jun 28, 2017
8dad9bc
[SPARK-21213][SQL] add support for partial partition specs
mbasmanova Jun 29, 2017
4fdefd5
[SPARK-21213][SQL] add support for partition specs where some partiti…
mbasmanova Jun 29, 2017
1d696c3
[SPARK-21213][SQL] comment update
mbasmanova Jun 29, 2017
89c0767
[SPARK-21213][SQL] removed extra space
mbasmanova Jun 29, 2017
7210568
[SPARK-21213][SQL] addressed easy review comments
mbasmanova Jul 5, 2017
9aa2a1e
[SPARK-21213][SQL] addressed remaining review comments
mbasmanova Jul 5, 2017
fa21860
[SPARK-21213][SQL] added test case for (ds, hr=11) partition spec
mbasmanova Jul 5, 2017
f76f49f
[SPARK-21213][SQL] addressed review comments; fixed PARTITION (ds, hr…
mbasmanova Jul 11, 2017
8f31f53
[SPARK-21213][SQL] shorted new test
mbasmanova Jul 11, 2017
fae6d49
[SPARK-21213][SQL] added documentation; added test for an empty table
mbasmanova Jul 11, 2017
8880fbd
[SPARK-21213][SQL] review comments
mbasmanova Jul 31, 2017
1053991
[SPARK-21213][SQL] fixed bad merge of SPARK-21599
mbasmanova Aug 7, 2017
41ab30d
[SPARK-21213][SQL] added support for spark.sql.caseSensitive; address…
mbasmanova Aug 8, 2017
dc488e5
[SPARK-21213][SQL] addressed remaining review comments
mbasmanova Aug 8, 2017
c839855
[SPARK-21213][SQL] Added a test for DESC PARTITION after ANALYZE; rev…
mbasmanova Aug 10, 2017
72e2cd5
[SPARK-21213][SQL] added DROP TABLE to describe-part-after-analyze.sql
mbasmanova Aug 10, 2017
87594d6
[SPARK-21213][SQL] check that partition columns in the partition spec…
mbasmanova Aug 17, 2017
3353afa
[SPARK-21213][SQL] use PartitioningUtils.normalizePartitionSpec to ha…
mbasmanova Aug 18, 2017
8ffb140
[SPARK-21213][SQL] review comments
mbasmanova Aug 18, 2017
[SPARK-21213][SQL] review comments
mbasmanova committed Aug 16, 2017
commit 3ee5ebf2b8ab35b4122849f0e15153bfd079585e
@@ -91,7 +91,8 @@ object CatalogStorageFormat {
  *
  * @param spec partition spec values indexed by column name
  * @param storage storage format of the partition
- * @param parameters some parameters for the partition, for example, stats.
+ * @param parameters some parameters for the partition
+ * @param stats optional statistics (number of rows, total size, etc.)
  */
 case class CatalogTablePartition(
     spec: CatalogTypes.TablePartitionSpec,
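
As a rough illustration of the documentation change above, the sketch below models a partition whose statistics are an explicit optional field rather than entries in the generic parameter map. `PartitionStats`, `PartitionSketch`, and `describeStats` are hypothetical stand-ins for illustration, not Spark's own classes.

```scala
object PartitionStatsSketch {
  // Hypothetical stand-in for partition-level statistics (not Spark's class).
  final case class PartitionStats(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

  // Simplified stand-in for a catalog partition: stats is optional and
  // kept separate from the free-form parameters map.
  final case class PartitionSketch(
      spec: Map[String, String],
      parameters: Map[String, String] = Map.empty,
      stats: Option[PartitionStats] = None)

  // Renders the statistics, or a placeholder when none were collected.
  def describeStats(p: PartitionSketch): String = p.stats match {
    case Some(PartitionStats(size, Some(rows))) => s"$size bytes, $rows rows"
    case Some(PartitionStats(size, None)) => s"$size bytes"
    case None => "no statistics"
  }
}
```

A partition that was never analyzed simply carries `stats = None`, so callers do not need to probe the parameter map for statistics keys.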
@@ -95,30 +95,26 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
* {{{
* ANALYZE TABLE table COMPUTE STATISTICS [NOSCAN];
Member

Also here.

* }}}
* Example SQL for analyzing a single partition :
* {{{
* ANALYZE TABLE table PARTITION (key=value,..) COMPUTE STATISTICS [NOSCAN];
Member

@gatorsmile Jun 28, 2017

The existing syntax is not very clear. Could you improve it?

ANALYZE TABLE [db_name.]tablename [PARTITION(partcol1[=val1], partcol2[=val2], ...)] 
COMPUTE STATISTICS [NOSCAN]

In addition, since we have a restriction, please do not call visitNonOptionalPartitionSpec. Instead, we can capture the non-set partition column and issue a more user-friendly exception message.

Contributor Author

@mbasmanova Jun 28, 2017

I'm seeing that visitNonOptionalPartitionSpec detects unset partition columns and throws an exception: Found an empty partition key '$key'. Is this sufficient or do you have something else in mind?

    /**
     * Create a partition specification map without optional values.
     */
    protected def visitNonOptionalPartitionSpec(
        ctx: PartitionSpecContext): Map[String, String] = withOrigin(ctx) {
      visitPartitionSpec(ctx).map {
        case (key, None) => throw new ParseException(s"Found an empty partition key '$key'.", ctx)
        case (key, Some(value)) => key -> value
      }
    }
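
One possible reading of the request for a more user-friendly message is to collect every partition column that has no value and report them together. The sketch below assumes that interpretation. `requireAllValues` and its message text are hypothetical, and `IllegalArgumentException` stands in for `ParseException` so the example runs without Spark.

```scala
object PartitionSpecCheck {
  // Converts a spec with optional values into one with mandatory values.
  // Reports all unset partition columns at once (hypothetical message text).
  def requireAllValues(spec: Map[String, Option[String]]): Map[String, String] = {
    val missing = spec.collect { case (key, None) => key }.toSeq.sorted
    if (missing.nonEmpty) {
      throw new IllegalArgumentException(
        s"Partition columns without a value: ${missing.mkString(", ")}")
    }
    spec.collect { case (key, Some(value)) => key -> value }
  }
}
```

Compared with throwing on the first empty key, this variant lets the user fix the whole `PARTITION (...)` clause in one pass.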

* }}}
* Example SQL for analyzing columns :
* {{{
* ANALYZE TABLE table COMPUTE STATISTICS FOR COLUMNS column1, column2;
* }}}
*/
   override def visitAnalyze(ctx: AnalyzeContext): LogicalPlan = withOrigin(ctx) {
-    val noscan = if (ctx.identifier != null) {
-      if (ctx.identifier.getText.toLowerCase(Locale.ROOT) != "noscan") {
-        throw new ParseException(s"Expected `NOSCAN` instead of `${ctx.identifier.getText}`", ctx)
-      }
-      true
-    } else {
-      false
+    if (ctx.identifier != null &&
+        ctx.identifier.getText.toLowerCase(Locale.ROOT) != "noscan") {
+      throw new ParseException(s"Expected `NOSCAN` instead of `${ctx.identifier.getText}`", ctx)
     }
Member

    if (ctx.identifier != null &&
        ctx.identifier.getText.toLowerCase(Locale.ROOT) != "noscan") {
      throw new ParseException(s"Expected `NOSCAN` instead of `${ctx.identifier.getText}`", ctx)
    }
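
The suggested shape separates validation from the flag itself: validate first, then derive `noscan` from whether an identifier is present. A minimal standalone sketch of that idea follows, with a nullable `String` standing in for `ctx.identifier` and `IllegalArgumentException` standing in for `ParseException` (both are assumptions made so the example runs without Spark).

```scala
import java.util.Locale

object NoscanCheck {
  // `identifier` models the optional trailing token; null means it was absent.
  def parseNoscan(identifier: String): Boolean = {
    if (identifier != null && identifier.toLowerCase(Locale.ROOT) != "noscan") {
      throw new IllegalArgumentException(s"Expected `NOSCAN` instead of `$identifier`")
    }
    // Any identifier that passed validation must be NOSCAN.
    identifier != null
  }
}
```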

Member

val partitionSpec = Option(ctx.partitionSpec).map(visitNonOptionalPartitionSpec)
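
`Option(x)` already maps a `null` reference to `None`, which is why an explicit null check around it is redundant. A small sketch of the idiom, where `parseSpec` is a hypothetical stand-in for `visitNonOptionalPartitionSpec` and a nullable `String` stands in for `ctx.partitionSpec`:

```scala
object OptionNullSketch {
  // Hypothetical parser: splits "k=v,k2=v2" into a map.
  def parseSpec(text: String): Map[String, String] =
    text.split(",").map { kv =>
      val Array(k, v) = kv.split("=", 2)
      k.trim -> v.trim
    }.toMap

  // Option(null) is None, so no explicit null check is needed.
  def specOf(raw: String): Option[Map[String, String]] =
    Option(raw).map(parseSpec)
}
```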


-    val partitionSpec = if (ctx.partitionSpec != null) {
-      Option(ctx.partitionSpec).map(visitNonOptionalPartitionSpec)
-    } else {
-      None
-    }
+    val partitionSpec = Option(ctx.partitionSpec).map(visitNonOptionalPartitionSpec)

     val table = visitTableIdentifier(ctx.tableIdentifier)
     if (ctx.identifierSeq() == null) {
-      AnalyzeTableCommand(table, noscan, partitionSpec)
+      AnalyzeTableCommand(table, noscan = ctx.identifier != null, partitionSpec)
     } else {
       if (partitionSpec.isDefined) {
         logWarning(s"Partition specification is ignored: ${ctx.partitionSpec.getText}")
@@ -28,6 +28,8 @@ import org.apache.spark.sql.catalyst.expressions.{And, EqualTo, Expression, Lite
 /**
  * Analyzes the given table or partition to generate statistics, which will be used in
  * query optimizations.
+ *
+ * If certain partition spec is specified, then statistics are gathered for only that partition.
  */
 case class AnalyzeTableCommand(
     tableIdent: TableIdentifier,