@jovanpavl-db
Contributor

What changes were proposed in this pull request?

This PR introduces a new specifier for trim collations (both leading and trailing trimming). These are initial changes: the trim specifier is recognized, but it is kept behind a feature flag and all code paths are blocked.
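
To make "recognized but blocked behind a feature flag" concrete, here is a minimal sketch in Python. It is not Spark's implementation: the specifier spellings (`LTRIM`, `RTRIM`, `TRIM`), the function name, the flag parameter, and the error type are all assumptions made up for illustration.

```python
# Hypothetical sketch: none of these names are taken from Spark.
TRIM_SPECIFIERS = {"LTRIM", "RTRIM", "TRIM"}  # assumed spellings


class TrimCollationDisabledError(ValueError):
    """Raised when a trim specifier is used while the feature flag is off."""


def parse_collation_name(name: str, trim_enabled: bool = False):
    """Split a collation name into (base_name, trim_specifier or None)."""
    upper = name.upper()
    base, sep, suffix = upper.rpartition("_")
    if sep and suffix in TRIM_SPECIFIERS:
        # The specifier is recognized, but every path is blocked
        # unless the (hypothetical) flag is switched on.
        if not trim_enabled:
            raise TrimCollationDisabledError(
                f"trim collation '{name}' is not enabled")
        return base, suffix
    # No trim specifier present: the whole name is the base collation.
    return upper, None
```

With the flag off, a name such as `UTF8_BINARY_RTRIM` is parsed far enough to be identified and then rejected, while plain names such as `UTF8_BINARY` pass through unchanged.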

Why are the changes needed?

Support for trailing-space trimming is a feature that users have requested.

Does this PR introduce any user-facing change?

This is guarded by a feature flag.

How was this patch tested?

Added tests to CollationSuite, SqlConfSuite and QueryCompilationErrorSuite.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Sep 24, 2024
Contributor

@stefankandic stefankandic left a comment

Nice work! Left some minor comments but looks good overall

Contributor

@stefankandic stefankandic left a comment

LGTM pending scalastyle fixes!

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in c54c017 Sep 30, 2024
MaxGekk pushed a commit that referenced this pull request Nov 14, 2024
…ationNameToId` outside of cases

### What changes were proposed in this pull request?
This PR addresses the UTF8_BINARY performance regression that was first identified in #48721. The regression first appeared with #48222, but that PR is not the actual source of the performance degradation.

### Why are the changes needed?
PR #48222 surfaced the regression because it changed the `collationNameToId` function and made it slightly slower by removing a short-circuit for fetching the UTF8_BINARY collation. However, this function should be called a fixed number of times per query, and at most once from the benchmark framework. That was not the case, and the repeated calls were the largest contributor to the performance regression.

This PR changes the benchmarking framework so that it calls this function once per test case instead of at each expression.
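
The pattern described here, moving a name-to-id lookup out of the per-expression loop, can be sketched as follows. This is an illustration in Python, not the Spark benchmark code: the function names, the call counter, and the id values are hypothetical stand-ins.

```python
# Illustrative sketch only; names and ids are hypothetical stand-ins.
def collation_name_to_id(name: str) -> int:
    """Stand-in for a name-to-id lookup that is not free to call."""
    collation_name_to_id.calls += 1
    return {"UTF8_BINARY": 0, "UTF8_LCASE": 1}[name.upper()]


collation_name_to_id.calls = 0  # counts how often the lookup runs


def compare(a: str, b: str, collation_id: int) -> bool:
    # id 0 compares strings as-is; id 1 compares them lowercased.
    return a == b if collation_id == 0 else a.lower() == b.lower()


def run_case_per_expression(name, pairs):
    # Before: the lookup runs once per expression inside the measured loop.
    return [compare(a, b, collation_name_to_id(name)) for a, b in pairs]


def run_case_hoisted(name, pairs):
    # After: resolve the id once per test case, then reuse it.
    collation_id = collation_name_to_id(name)
    return [compare(a, b, collation_id) for a, b in pairs]
```

Both variants return the same results; the difference is that the first performs one lookup per pair, so any slowdown in the lookup is multiplied by the number of expressions and shows up in the measured time.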

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing testing surface, benchmarks.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #48804 from stevomitric/stevomitric/fix-utf8_binary-regression.

Authored-by: Stevo Mitric <[email protected]>
Signed-off-by: Max Gekk <[email protected]>