Skip to content
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
Improve error reporting for expression data type mismatch
  • Loading branch information
cloud-fan committed Jun 1, 2015
commit cb77e4f644441b137a9ab8ed2568ddce3bc0f053
Original file line number Diff line number Diff line change
Expand Up @@ -62,15 +62,15 @@ trait CheckAnalysis {
val from = operator.inputSet.map(_.name).mkString(", ")
a.failAnalysis(s"cannot resolve '${a.prettyString}' given input columns $from")

case e: Expression if !e.validInputTypes =>
e.failAnalysis(
s"cannot resolve '${t.prettyString}' due to data type mismatch: " +
e.typeMismatchErrorMessage.get)

case c: Cast if !c.resolved =>
failAnalysis(
s"invalid cast from ${c.child.dataType.simpleString} to ${c.dataType.simpleString}")

case b: BinaryExpression if !b.resolved =>
failAnalysis(
s"invalid expression ${b.prettyString} " +
s"between ${b.left.dataType.simpleString} and ${b.right.dataType.simpleString}")

case WindowExpression(UnresolvedWindowFunction(name, _), _) =>
failAnalysis(
s"Could not resolve window function '$name'. " +
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -407,7 +407,7 @@ trait HiveTypeCoercion {
Union(newLeft, newRight)

// fix decimal precision for expressions
case q => q.transformExpressions {
case q => q.transformExpressionsUp {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we separate type checking logic from resolved, we need to ensure all children have valid input types before we do type coercion(think about Add(Add(decimal1, decimal2), decimal3)). Here I just use transformExpressionsUp as a quick fix(and only for DecimalPrecision rule as it breaks a test...), should we skip not only !childrenResolved but also !children.forall(_.validInputTypes) during all type coercion, or still checking types in resolved? @rxin @marmbrus

// Skip nodes whose children have not been resolved yet
case e if !e.childrenResolved => e

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -86,12 +86,16 @@ abstract class Expression extends TreeNode[Expression] {
case (i1, i2) => i1 == i2
}
}

def typeMismatchErrorMessage: Option[String] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about rename this

checkInputDataTypes(): Option[String]

and remove validInputTypes function.


def validInputTypes: Boolean = typeMismatchErrorMessage.isEmpty
}

abstract class BinaryExpression extends Expression with trees.BinaryNode[Expression] {
self: Product =>

def symbol: String
def symbol: String = sys.error(s"BinaryExpressions must either override toString or symbol")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please swap "either" and "override".


override def foldable: Boolean = left.foldable && right.foldable

Expand All @@ -106,6 +110,10 @@ abstract class LeafExpression extends Expression with trees.LeafNode[Expression]

abstract class UnaryExpression extends Expression with trees.UnaryNode[Expression] {
self: Product =>

override def foldable: Boolean = child.foldable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not always correct, is it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for nulalble

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just revert this change, and push it into the individual expressions or their base classes. Otherwise it is too error prone. If we add an expression that has different behavior, we will likely forget to change foldable/nullable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also remove the foldable and nullable methods from BinaryExpressions?


override def nullable: Boolean = child.nullable
}

// TODO Semantically we probably not need GroupExpression
Expand All @@ -125,7 +133,9 @@ case class GroupExpression(children: Seq[Expression]) extends Expression {
* so that the proper type conversions can be performed in the analyzer.
*/
trait ExpectsInputTypes {
self: Expression =>

def expectedChildTypes: Seq[DataType]

override def validInputTypes: Boolean = children.map(_.dataType) == expectedChildTypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happened to error reporting if the type is invalid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will always do type casting for ExpectsInputTypes, so the error will be reported for Cast, not ExpectsInputTypes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I don't like the handling of ExpectsInputTypes: we do type casting anyway, then if something is wrong, error will be reported for Cast, not the root cause -- wrong input types for a expression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be fine right?

Also - add inline comment explaining this (i.e. things get casted).

}
Loading