@zhengruifeng
Contributor

What changes were proposed in this pull request?

In Spark Connect, pivot should fail when the number of distinct values in the pivot column is too large.

Why are the changes needed?

The following check is missing in Spark Connect:

    if (values.length > maxValues) {
      throw QueryCompilationErrors.aggregationFunctionAppliedOnNonNumericColumnError(
        pivotColumn.toString, maxValues)
    }
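
To illustrate the shape of this guard outside Spark, here is a minimal, self-contained sketch in plain Scala. The names `PivotLimitExceeded`, `PivotGuard`, and `distinctPivotValues` are hypothetical stand-ins invented for this example, not Spark APIs, and the error message is illustrative only. It assumes the distinct values are capped at `maxValues + 1` before counting, so that exceeding the limit can be detected without keeping every distinct value.

```scala
// Hypothetical stand-in for the exception Spark raises; not a Spark class.
final case class PivotLimitExceeded(pivotColumn: String, maxValues: Int)
  extends RuntimeException(
    s"The pivot column $pivotColumn has more than $maxValues distinct values")

object PivotGuard {
  // Returns the sorted distinct values of a column, or throws when there are
  // more than maxValues of them. Keeping at most maxValues + 1 values is
  // enough to tell "within the limit" apart from "over the limit".
  def distinctPivotValues[A: Ordering](
      pivotColumn: String,
      columnValues: Seq[A],
      maxValues: Int): Seq[A] = {
    val values = columnValues.distinct.take(maxValues + 1)
    if (values.length > maxValues) {
      throw PivotLimitExceeded(pivotColumn, maxValues)
    }
    values.sorted
  }
}
```

With a limit of 2, a column holding `2013, 2012, 2012` passes the guard, while a column holding three different years is rejected.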

Does this PR introduce any user-facing change?

Yes. pivot will now fail when the number of distinct values exceeds DATAFRAME_PIVOT_MAX_VALUES, which makes Spark Connect consistent with classic Spark.

How was this patch tested?

Added a test.

Was this patch authored or co-authored using generative AI tooling?

No.

@HyukjinKwon
Member

Merged to master.

@zhengruifeng zhengruifeng deleted the connect_pivot branch April 8, 2024 00:24