@@ -1235,11 +1235,13 @@ object TryToTimestampExpressionBuilder extends ExpressionBuilder {
   override def build(funcName: String, expressions: Seq[Expression]): Expression = {
     val numArgs = expressions.length
     if (numArgs == 1 || numArgs == 2) {
-      ParseToTimestamp(
+      // The expression ParseToTimestamp will throw a SparkUpgradeException if the input is invalid,
+      // even when failOnError is false. We need to catch the exception and return null.
+      TryEval(ParseToTimestamp(
Contributor:
Shall we simply update ToTimestamp#eval and codegen to additionally catch SparkUpgradeException? I'm a bit worried about increasing the scope and swallowing all errors.

Member Author:
So in ToTimestamp, a SparkUpgradeException is supposed to be thrown even when failOnError is false.
To avoid swallowing all errors, we can add another flag, isTryFunction, besides failOnError. It will make the code more complicated, though.
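
A rough sketch of how such a flag could work, using stand-in types rather than Spark's actual ToTimestamp and SparkUpgradeException classes (isTryFunction and all the names below are hypothetical, not part of this PR):

```scala
object TryFlagSketch {
  // Stand-in for org.apache.spark.SparkUpgradeException, to keep the sketch self-contained.
  class UpgradeException(msg: String) extends RuntimeException(msg)

  // evalTimestamp stands in for ToTimestamp#eval; isTryFunction is the hypothetical extra flag.
  def evalTimestamp(
      input: String,
      parse: String => Long,            // the underlying parser, which may throw
      failOnError: Boolean,             // ANSI mode: propagate every error
      isTryFunction: Boolean): Any = {  // set only by try_to_timestamp
    try {
      parse(input)
    } catch {
      case e: UpgradeException =>
        // Thrown even when failOnError is false; only the try_ variant swallows it.
        if (isTryFunction) null else throw e
      case e: Exception =>
        if (failOnError) throw e else null
    }
  }
}
```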

Contributor:
So we need to maintain 3 behaviors:

  1. try_to_timestamp never fails
  2. to_timestamp in ANSI mode always fails on invalid input
  3. to_timestamp in non-ANSI mode only fails with SparkUpgradeException

Do I understand it correctly?
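
For reference, those three behaviors can be restated as a small sketch (stand-in names only, not Spark code):

```scala
object BehaviorSketch {
  sealed trait ErrorKind
  case object UpgradeError extends ErrorKind     // the SparkUpgradeException case
  case object OtherParseError extends ErrorKind  // any other parse failure

  // Returns true if the call should fail, false if it should return null.
  def shouldFail(isTry: Boolean, ansiMode: Boolean, error: ErrorKind): Boolean =
    if (isTry) false               // 1. try_to_timestamp never fails
    else if (ansiMode) true        // 2. to_timestamp, ANSI mode: always fails on invalid input
    else error == UpgradeError     // 3. to_timestamp, non-ANSI mode: fails only on the upgrade exception
}
```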

Contributor:
I'm not sure why try_to_timestamp needs a different behavior from to_timestamp in non-ANSI mode.

Member Author:
> I'm not sure why try_to_timestamp needs a different behavior from to_timestamp in non-ANSI mode.

The try_* functions are supposed to return null on invalid inputs, rather than being compatible with non-ANSI mode. cc @srielau on this one.

Contributor:
This upgrade issue seems special, and we should always throw it to provide upgrade instructions. How about #45853?

Member Author:
@cloud-fan after #45853, users will still face this bug when setting timeParserPolicy to legacy.

Contributor:
We can argue this is not a bug, as the upgrade exception must be thrown.

Member Author:
@cloud-fan ok, let me close this one for now.

         expressions.head,
         expressions.drop(1).lastOption,
         SQLConf.get.timestampType,
-        failOnError = false)
+        failOnError = false))
     } else {
       throw QueryCompilationErrors.wrongNumArgsError(funcName, Seq(1, 2), numArgs)
     }
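
For context on the fix itself: TryEval is used here to convert an exception raised while evaluating the child expression into null. A minimal standalone analogue (a sketch only, not Spark's actual implementation) is below; it also shows why the scope concern above applies, since every exception is swallowed, not just the SparkUpgradeException:

```scala
object TryEvalSketch {
  // Evaluate the child; any exception during evaluation becomes null.
  def tryEval(child: () => Any): Any =
    try child() catch { case _: Exception => null }
}

// e.g. TryEvalSketch.tryEval(() => sys.error("invalid timestamp")) returns null
// instead of propagating the error.
```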
@@ -1392,4 +1392,9 @@ class DateFunctionsSuite extends QueryTest with SharedSparkSession {
     checkAnswer(df.selectExpr("try_to_timestamp(a)"), Seq(Row(ts)))
     checkAnswer(df.select(try_to_timestamp(col("a"))), Seq(Row(ts)))
   }
+
+  test("try_to_timestamp: return null on SparkUpgradeException") {
+    val df = spark.sql("SELECT try_to_timestamp('2016-12-1', 'yyyy-MM-dd')")
+    checkAnswer(df, Seq(Row(null)))
+  }
 }