[SPARK-40585][SQL] Double-quoted identifiers should be available only in ANSI mode #38147

gengliangwang · 2022-10-07T05:38:02Z

What changes were proposed in this pull request?

#38022 introduces an optional feature for supporting double-quoted identifiers. The feature is controlled by the flag spark.sql.ansi.double_quoted_identifiers, which is independent of the flag spark.sql.ansi.enabled.
This is inconsistent with another ANSI SQL feature, "Enforce ANSI reserved keywords" (https://spark.apache.org/docs/latest/sql-ref-ansi-compliance.html#sql-keywords-optional-disabled-by-default), which is only available when spark.sql.ansi.enabled is true.

Thus, to make the ANSI flags consistent, I suggest making double-quoted identifiers available only under ANSI SQL mode.
In addition, this PR renames the flag from spark.sql.ansi.double_quoted_identifiers to spark.sql.ansi.doubleQuotedIdentifiers.
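
As a rough illustration of the gating proposed here, the following Python sketch models how the two flags might combine. It is not Spark code: the function names and the dictionary-based config are hypothetical, and the default values used for both flags are assumptions made for the example.

```python
# Hypothetical model of the flag gating proposed in this PR; not Spark's implementation.
ANSI_ENABLED = "spark.sql.ansi.enabled"
DOUBLE_QUOTED_IDENTIFIERS = "spark.sql.ansi.doubleQuotedIdentifiers"


def double_quoted_identifiers_active(conf: dict) -> bool:
    """Return True only when ANSI mode and the sub-flag are both on."""
    ansi = bool(conf.get(ANSI_ENABLED, False))  # assumed default for this sketch
    sub_flag = bool(conf.get(DOUBLE_QUOTED_IDENTIFIERS, False))  # assumed default
    return ansi and sub_flag


def interpret_double_quoted(conf: dict) -> str:
    """Say how a double-quoted token would be read under the given config."""
    return "identifier" if double_quoted_identifiers_active(conf) else "string literal"


# The sub-flag alone has no effect when ANSI mode is off.
print(interpret_double_quoted({DOUBLE_QUOTED_IDENTIFIERS: True}))
print(interpret_double_quoted({ANSI_ENABLED: True, DOUBLE_QUOTED_IDENTIFIERS: True}))
```

The point of the sketch is only that the sub-flag is consulted after the ANSI flag, which mirrors how "Enforce ANSI reserved keywords" is described in the linked documentation.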

Why are the changes needed?

To make the ANSI SQL related features consistent.

Does this PR introduce any user-facing change?

No, the feature is not released yet.

How was this patch tested?

New SQL test input file under ANSI mode.

gengliangwang · 2022-10-07T05:39:28Z

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

      .booleanConf
      .createWithDefault(false)

-  val ENFORCE_RESERVED_KEYWORDS = buildConf("spark.sql.ansi.enforceReservedKeywords")


Not related to this PR, but I moved this code next to the ANSI_ENABLED flag to group the ANSI SQL related flags.

gengliangwang · 2022-10-07T05:40:07Z

cc @srielau @entong as well

cloud-fan · 2022-10-07T14:12:40Z

sql/core/src/test/resources/sql-tests/results/double-quoted-identifiers.sql.out

 -- !query output
-org.apache.spark.sql.AnalysisException
-Table or view not found: not_exist; line 1 pos 14
+org.apache.spark.sql.catalyst.parser.ParseException


hmm, do we really need to run this SQL gold file in non-ansi mode? We will just see a bunch of parser errors.

I intend to ensure it won't work when ANSI mode is off.

I am seeing that there might be some extra testing here: this test file itself contains both double_quoted_identifiers=false and double_quoted_identifiers=true, and now we run it in both ANSI and non-ANSI mode.

So I think the unique testing coverage is:

1. non-ANSI and double_quoted_identifiers=false, so a double-quoted token is still a string.
2. non-ANSI and double_quoted_identifiers=true, so we see a parser exception.
3. ANSI and double_quoted_identifiers=true, so this feature is on and being tested.

But ANSI and double_quoted_identifiers=false seems to be the same as number 2 above?

Again, the test set is small. I don't see the downside of testing all the combinations.
Let's consider reducing the tests when the set becomes bigger, which is quite unlikely.

I think it's time to refactor it now, as we run it again in the ansi mode tests.

It's not just unnecessary test time, but also brings confusion: why do we test double_quoted_identifiers=false in ansi mode?

"but also brings confusion: why do we test double_quoted_identifiers=false in ansi mode?"

Do you mean "in non-ansi mode"?

@cloud-fan so we at least need to test the 3 cases below:

1. non-ansi mode and double-quoted id enabled
2. ansi mode and double-quoted id enabled
3. ansi mode and double-quoted id disabled

Any suggestion on the refactor?

We can have a double-quoted-identifiers.sql without duplicate queries. In .../inputs/ansi/, we add 2 files to import the golden file with different configs:

// file double-quoted-identifiers-enabled.sql
--SET spark.sql.ansi.doubleQuotedIdentifiers = true
--IMPORT double-quoted-identifiers.sql

// file double-quoted-identifiers-disabled.sql
--SET spark.sql.ansi.doubleQuotedIdentifiers = false
--IMPORT double-quoted-identifiers.sql

"non-ansi mode and double-quoted id enabled"

I don't think so. This is a subconfig of ansi mode, and we only need to test the default case in non-ansi mode.
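
The coverage discussed in this thread can be laid out as a small matrix. The sketch below, in plain Python, enumerates the four flag combinations and marks the three that the suggestion above appears to keep. The `KEPT` set is one reading of the discussion, and it assumes the sub-flag defaults to false, so that "the default case in non-ansi mode" means both flags off.

```python
from itertools import product

# Combinations kept after the refactor, as read from the discussion
# (assumption: the sub-flag defaults to false in non-ANSI mode).
KEPT = {
    (False, False),  # non-ANSI, default sub-flag
    (True, True),    # ANSI, sub-flag enabled
    (True, False),   # ANSI, sub-flag disabled
}


def coverage_matrix():
    """Return rows of (ansi, double_quoted, kept) for all four combinations."""
    return [
        (ansi, dq, (ansi, dq) in KEPT)
        for ansi, dq in product([False, True], repeat=2)
    ]


for ansi, dq, kept in coverage_matrix():
    print(f"ansi={ansi!s:<5} doubleQuotedIdentifiers={dq!s:<5} tested={kept}")
```

The only combination left out is non-ANSI mode with the sub-flag enabled, which matches the argument that a subconfig of ANSI mode need not be exercised when ANSI mode is off.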

@cloud-fan Thanks for the suggestion. I updated the tests.

gengliangwang · 2022-10-11T17:43:16Z

Merging to master

move the flag under spark.sql.ansi.enabled

028a804

gengliangwang force-pushed the doubleQuoteFlag branch from 41dc5be to 028a804 Compare October 7, 2022 05:38

github-actions bot added the SQL label Oct 7, 2022

gengliangwang requested a review from cloud-fan October 7, 2022 05:39

cloud-fan reviewed Oct 7, 2022

gengliangwang added 2 commits October 7, 2022 14:47

fix gloden files

086775b

refactor tests

e68a18a

cloud-fan approved these changes Oct 11, 2022

gengliangwang closed this in 6603b82 Oct 11, 2022

dengziming mentioned this pull request Jun 23, 2025

[SPARK-52545][SQL] Standardize double-quote escaping to follow SQL specification #51242

Closed
