@gengliangwang
Member

What changes were proposed in this pull request?

Currently, try_to_timestamp will throw an exception on legacy timestamp input.

> SELECT try_to_timestamp('2016-12-1', 'yyyy-MM-dd')

org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:
Fail to parse '2016-12-1' in the new parser.
You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string. SQLSTATE: 42K0B 

It should return null instead of throwing an error.
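To illustrate why '2016-12-1' trips the new parser, here is a minimal sketch using only JDK classes. It assumes that the strict `java.time.format.DateTimeFormatter` and the lenient `java.text.SimpleDateFormat` are reasonable stand-ins for the new and legacy parsing behavior; Spark's actual formatters are built differently, and the object and method names below are hypothetical.

```scala
import java.text.SimpleDateFormat
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical helper for illustration; not part of Spark.
object ParserComparison {
  val pattern = "yyyy-MM-dd"

  // Strict java.time parsing: "dd" requires exactly two digits,
  // so a single-digit day such as "2016-12-1" is rejected.
  def parseStrict(s: String): Option[LocalDate] =
    Try(LocalDate.parse(s, DateTimeFormatter.ofPattern(pattern))).toOption

  // Lenient legacy-style parsing: a single-digit day is accepted.
  def parseLenient(s: String): Option[java.util.Date] =
    Try(new SimpleDateFormat(pattern).parse(s)).toOption

  def main(args: Array[String]): Unit = {
    val input = "2016-12-1"
    println(s"strict parse succeeded:  ${parseStrict(input).isDefined}")
    println(s"lenient parse succeeded: ${parseLenient(input).isDefined}")
  }
}
```

An input that only the lenient parser accepts is the case the upgrade exception is meant to flag.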

Why are the changes needed?

Fix a bug in the function try_to_timestamp.

Does this PR introduce any user-facing change?

Yes, this PR introduces a bug fix: try_to_timestamp will return null instead of throwing an error on legacy timestamp string input.

How was this patch tested?

New unit test

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Apr 2, 2024
ParseToTimestamp(
// The expression ParseToTimestamp will throw an SparkUpgradeException if the input is invalid
// even when failOnError is false. We need to catch the exception and return null.
TryEval(ParseToTimestamp(
Contributor

Shall we simply update ToTimestamp#eval and codegen to additionally catch SparkUpgradeException? I'm a bit worried about increasing the scope and swallowing all errors.
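The scope concern can be sketched in plain Scala. This is only an illustration under stated assumptions: `UpgradeException` and `parse` are hypothetical stand-ins rather than Spark classes, and `Option` stands in for a SQL null.

```scala
import scala.util.Try

// Hypothetical stand-ins for illustration; not Spark's classes.
object CatchScope {
  final class UpgradeException(msg: String) extends RuntimeException(msg)

  // A toy parser that can fail in two distinct ways.
  def parse(input: String): Long = input match {
    case "legacy-only" => throw new UpgradeException("parsable only by the legacy parser")
    case "bug"         => throw new IllegalStateException("unrelated internal error")
    case other         => other.toLong
  }

  // Broad: wrapping the whole call turns every non-fatal failure into None,
  // including errors that have nothing to do with datetime parsing.
  def parseSwallowAll(input: String): Option[Long] = Try(parse(input)).toOption

  // Narrow: only the upgrade exception becomes None; anything else propagates.
  def parseCatchUpgrade(input: String): Option[Long] =
    try Some(parse(input))
    catch { case _: UpgradeException => None }
}
```

With the broad version, `parseSwallowAll("bug")` silently yields `None`, while `parseCatchUpgrade("bug")` still surfaces the `IllegalStateException`.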

Member Author

So in ToTimestamp, there is supposed to be a SparkUpgradeException even when failOnError is false.
To avoid swallowing all errors, we can add another flag, isTryFunction, besides failOnError. It will make the code more complicated, though.

Contributor

So we need to maintain 3 behaviors:

  1. try_to_timestamp never fails
  2. to_timestamp in ANSI mode always fails on invalid input
  3. to_timestamp in non-ANSI mode only fails with SparkUpgradeException

Do I understand it correctly?
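The three behaviors listed above can be written down as a small decision table. This is a sketch of the reviewer's summary, not of Spark's implementation; all type and value names are hypothetical, and it assumes that "only fails with SparkUpgradeException" means other invalid input yields null.

```scala
// Hypothetical names for illustration only; not Spark's API.
object TimestampModes {
  sealed trait Mode
  case object TryFunction extends Mode // try_to_timestamp
  case object Ansi extends Mode        // to_timestamp, ANSI mode
  case object NonAnsi extends Mode     // to_timestamp, non-ANSI mode

  sealed trait ParseFailure
  case object Invalid extends ParseFailure    // not parsable by either parser
  case object LegacyOnly extends ParseFailure // parsable only by the legacy parser

  sealed trait Outcome
  case object ReturnsNull extends Outcome
  case object Throws extends Outcome

  // One case per behavior in the list; non-ANSI mode splits on the failure kind.
  def outcome(mode: Mode, failure: ParseFailure): Outcome = (mode, failure) match {
    case (TryFunction, _)      => ReturnsNull
    case (Ansi, _)             => Throws
    case (NonAnsi, LegacyOnly) => Throws
    case (NonAnsi, Invalid)    => ReturnsNull
  }
}
```

The only row where try_to_timestamp and non-ANSI to_timestamp differ is `LegacyOnly`, which is the case the rest of the thread debates.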

Contributor

I'm not sure why try_to_timestamp needs a different behavior from to_timestamp in non-ANSI mode.

Member Author

> I'm not sure why try_to_timestamp needs a different behavior from to_timestamp in non-ANSI mode.

The try_* functions are supposed to return null on invalid inputs, rather than being compatible with non-ANSI mode. cc @srielau on this one.

Contributor

This upgrade issue seems special, and we should always throw it to provide upgrade instructions. How about #45853?

Member Author

@cloud-fan After #45853, users will face this bug when setting the timeParserPolicy to legacy.

Contributor

We can argue this is not a bug as the upgrade exception must be thrown.

Member Author

@cloud-fan ok, let me close this one for now.
