[SPARK-12558] AnalysisException when multiple functions applied in GROUP BY clause
dilipbiswal committed Jan 8, 2016
commit c9f2e92527aa552900a48e6ab8dc503e5f7dea89
12 changes: 12 additions & 0 deletions sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala
@@ -232,6 +232,18 @@ private[hive] case class HiveGenericUDF(funcWrapper: HiveFunctionWrapper, childr
udfType != null && udfType.deterministic()
}

override def semanticEquals(other: Expression): Boolean = {
Contributor

Can you comment on why the version in Expression doesn't work? What about it is too strict?

Contributor Author

@nongli Thanks a lot for your feedback. In this case, we have two instances of HiveGenericUDF: one from the grouping expression and the other from the aggregation expression. When checking semantic equality between them, we unwrap the case class and fall through to the last case statement, which tests equality between the two HiveFunctionWrapper(s), and that comparison fails.
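
To make the failure mode above concrete, here is a minimal sketch in plain Scala. `FunctionWrapper` and `GenericUdf` are hypothetical stand-ins, not the Spark classes, and the sketch assumes the wrapper carries a per-instance object that takes part in case-class equality.

```scala
// Hypothetical stand-in for HiveFunctionWrapper. Assumption: the wrapper
// holds a per-instance object, and that field participates in equality.
final case class FunctionWrapper(functionClassName: String, instance: AnyRef)

// Hypothetical stand-in for HiveGenericUDF; children are plain strings here.
final case class GenericUdf(funcWrapper: FunctionWrapper, children: Seq[String])

object WrapperEqualityDemo {
  // Same UDF class and same children, but each wrapper has its own instance,
  // as would happen when the query text is resolved in two separate places.
  val grouping: GenericUdf =
    GenericUdf(FunctionWrapper("org.example.DateUdf", new Object), Seq("ts"))
  val aggregate: GenericUdf =
    GenericUdf(FunctionWrapper("org.example.DateUdf", new Object), Seq("ts"))

  def main(args: Array[String]): Unit = {
    // Structural equality descends into the wrapper and compares the two
    // distinct instance objects by reference, so the result is false.
    println(grouping == aggregate) // false
    // The class names alone do match.
    println(grouping.funcWrapper.functionClassName ==
      aggregate.funcWrapper.functionClassName) // true
  }
}
```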

Contributor Author

@nongli @cloud-fan We could also override equals in HiveFunctionWrapper to compare only the functionClassName. Would this be a better option? Please let me know.

Contributor

I'd prefer overriding HiveFunctionWrapper.equals, provided HiveFunctionWrapper.instance does not contribute to equality.
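
The alternative discussed here could look roughly like the sketch below. `NameOnlyWrapper` is a hypothetical stand-in for illustration, not the actual HiveFunctionWrapper, and it assumes the held instance really is irrelevant to equality. `hashCode` is overridden together with `equals` so the two stay consistent.

```scala
// Hypothetical wrapper whose equality depends only on the function class name.
// Assumption: the cached instance carries no state that should affect equality.
class NameOnlyWrapper(val functionClassName: String, val instance: AnyRef) {
  override def equals(other: Any): Boolean = other match {
    case that: NameOnlyWrapper => functionClassName == that.functionClassName
    case _ => false
  }

  // Equal objects must have equal hash codes, so hash on the same field.
  override def hashCode(): Int = functionClassName.hashCode
}

object NameOnlyWrapperDemo {
  val first = new NameOnlyWrapper("org.example.DateUdf", new Object)
  val second = new NameOnlyWrapper("org.example.DateUdf", new Object)
  val different = new NameOnlyWrapper("org.example.OtherUdf", new Object)

  def main(args: Array[String]): Unit = {
    println(first == second)    // true: same class name, different instances
    println(first == different) // false: class names differ
  }
}
```

With equality defined this way, a generic field-by-field comparison of the enclosing expression no longer needs a UDF-specific override.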

val eqClass = other.isInstanceOf[HiveGenericUDF] &&
funcWrapper.functionClassName ==
other.asInstanceOf[HiveGenericUDF].funcWrapper.functionClassName

val isEqual = eqClass && children.zip(other.asInstanceOf[HiveGenericUDF].children).forall {
Contributor

I think you need to check children.length == other.children.length.

Contributor Author

@nongli Sure. Will make the change.
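
The length check matters because `zip` silently truncates to the shorter sequence, so a pairwise `forall` alone can report two child lists of different sizes as equal. A minimal sketch, using plain strings in place of expressions and hypothetical helper names:

```scala
object ChildrenCompare {
  // Pairwise comparison only: extra trailing elements on either side are ignored,
  // because zip stops at the end of the shorter sequence.
  def zipOnly(a: Seq[String], b: Seq[String]): Boolean =
    a.zip(b).forall { case (x, y) => x == y }

  // Pairwise comparison guarded by a length check.
  def lengthChecked(a: Seq[String], b: Seq[String]): Boolean =
    a.length == b.length && zipOnly(a, b)

  def main(args: Array[String]): Unit = {
    println(zipOnly(Seq("a"), Seq("a", "b")))       // true (misleading)
    println(lengthChecked(Seq("a"), Seq("a", "b"))) // false
  }
}
```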

case (e1: Expression, e2: Expression) => e1 semanticEquals e2
case (i1, i2) => i1 == i2
}
isEqual
}

@transient
private lazy val deferedObjects = argumentInspectors.zip(children).map { case (inspect, child) =>
new DeferredObjectAdapter(inspect, child.dataType)
@@ -31,5 +31,10 @@ class UDFSuite extends QueryTest with TestHiveSingleton {
assert(hiveContext.sql("SELECT RANDOM0() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
assert(hiveContext.sql("SELECT RANDOm1() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
assert(hiveContext.sql("SELECT strlenscala('test', 1) FROM src LIMIT 1").head().getInt(0) === 5)

assert(hiveContext.sql(
"select date(cast('1997-01-01 10:10:10' as timestamp)) from src" +
" group by date(cast('1997-01-01 10:10:10' as timestamp))")
.head().getDate(0).toString == "1997-01-01")
}
}