@gatorsmile
Member

What changes were proposed in this pull request?

Decimal is an Avro logical type. We need to ensure that Hive's Avro serde support for it works correctly in Spark.

How was this patch tested?

N/A


if (isPartitioned) {
val insertStmt = s"INSERT OVERWRITE TABLE $tableName partition (ds='a') SELECT 1.3"
if (version == "0.12" || version == "0.13") {
Member Author

Hive 0.13 and earlier do not support the Decimal logical type. See https://issues.apache.org/jira/browse/HIVE-5823
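
For context, the Avro specification defines decimal as an annotation on a `bytes` or `fixed` value that holds the unscaled integer in big-endian two's-complement form, with precision and scale kept in the schema. The sketch below illustrates that encoding using only `java.math`; `AvroDecimalSketch`, `encode`, and `decode` are hypothetical names and not part of Avro, Hive, or Spark. A reader that does not recognize the annotation has only the raw bytes to work with.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Hypothetical helper illustrating the Avro decimal byte layout.
object AvroDecimalSketch {
  // Encode the unscaled integer as big-endian two's-complement bytes.
  // The scale is not stored in the bytes; it lives in the schema.
  // setScale throws ArithmeticException if the value needs rounding.
  def encode(value: JBigDecimal, scale: Int): Array[Byte] =
    value.setScale(scale).unscaledValue().toByteArray()

  // Decoding requires the scale from the schema; without it the
  // payload is just an opaque byte array.
  def decode(bytes: Array[Byte], scale: Int): JBigDecimal =
    new JBigDecimal(new BigInteger(bytes), scale)
}
```

With the value `1.3` from the INSERT statement and a scale of 2, `encode` produces the two bytes of the unscaled integer 130, and `decode` with the same scale yields `1.30`.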

Member

Quick question: I was trying to understand why this depends on the metastore version and realized that on a schema mismatch (such as DecimalType versus BinaryType here), HiveExternalCatalog always prefers the table schema from Hive over the one from Spark SQL. Is this limitation worth keeping for the general case?

Member Author

@gatorsmile Aug 17, 2017

Yes. Our dear Hive metastore changes the schema we inferred. We got this warning message for 0.12 and 0.13:

12:04:09.816 WARN org.apache.spark.sql.hive.test.TestHiveExternalCatalog: The table schema given by Hive metastore(struct<f0:binary,ds:string>) is different from the schema when this table was created by Spark SQL(struct<f0:decimal(38,2),ds:string>). We have to fall back to the table schema from Hive metastore which is not case preserving.
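
The fallback this warning describes can be modeled roughly as follows. This is a simplified sketch and not Spark's `HiveExternalCatalog` code: `Field`, `Schema`, `compatible`, and `resolve` are hypothetical names, data types are plain strings, and the message text is invented for illustration.

```scala
// Hypothetical, simplified model of the schema fallback.
object SchemaFallbackSketch {
  final case class Field(name: String, dataType: String)
  final case class Schema(fields: Seq[Field])

  // Assumption: schemas are compatible when field names match
  // ignoring case and data types match exactly.
  def compatible(a: Schema, b: Schema): Boolean =
    a.fields.length == b.fields.length &&
      a.fields.zip(b.fields).forall { case (x, y) =>
        x.name.equalsIgnoreCase(y.name) && x.dataType == y.dataType
      }

  // Prefer the schema recorded by Spark SQL (it preserves case);
  // on a mismatch, fall back to the metastore schema and warn.
  def resolve(fromMetastore: Schema, fromSpark: Schema): (Schema, Option[String]) =
    if (compatible(fromMetastore, fromSpark)) (fromSpark, None)
    else (fromMetastore, Some("schema mismatch: using the metastore schema"))
}
```

Under this model, a metastore schema of `f0:binary,ds:string` and a Spark schema of `f0:decimal(38,2),ds:string` are incompatible, so `resolve` returns the metastore schema together with a warning.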

Member Author

@gatorsmile Aug 17, 2017

I will submit a separate PR that adds a conf to ignore the schema overwritten by the Hive metastore.

@gatorsmile
Member Author

cc @sameeragarwal @cloud-fan

}
}

test(s"$version: Decimal support of Avro Hive serde") {
Member

We should consider generalizing this test to cover all supported Avro logical types: https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types. Perhaps add a TODO?

Member Author

Yes. Let me add a TODO there.
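
As a rough starting point for such a generalization, the sketch below enumerates the logical types listed in the Avro 1.8.1 specification and builds a field schema and a test name for each. Everything here is hypothetical scaffolding (`AvroLogicalTypesSketch`, `fieldSchema`, `testName`, the fixed name `dur`), the decimal precision and scale are taken from the `decimal(38,2)` column above, and it makes no claim about which of these types a given Hive version can read.

```scala
// Hypothetical scaffolding for a generalized logical-type test.
object AvroLogicalTypesSketch {
  // Logical type name, underlying Avro type, and any extra schema attributes.
  final case class LogicalType(name: String, underlying: String, extra: String = "")

  // Logical types defined in the Avro 1.8.1 specification.
  val all: Seq[LogicalType] = Seq(
    LogicalType("decimal", "bytes", ""","precision":38,"scale":2"""),
    LogicalType("date", "int"),
    LogicalType("time-millis", "int"),
    LogicalType("time-micros", "long"),
    LogicalType("timestamp-millis", "long"),
    LogicalType("timestamp-micros", "long"),
    // duration annotates a fixed type of size 12
    LogicalType("duration", "fixed", ""","name":"dur","size":12""")
  )

  // Avro schema JSON for a field of the given logical type.
  def fieldSchema(t: LogicalType): String =
    s"""{"type":"${t.underlying}","logicalType":"${t.name}"${t.extra}}"""

  // Test name following the pattern of the existing Decimal test.
  def testName(version: String, t: LogicalType): String =
    s"$version: ${t.name} support of Avro Hive serde"
}
```

A generalized test could iterate over `AvroLogicalTypesSketch.all`, create a table from each `fieldSchema`, and skip the combinations that a given metastore version does not support.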

@SparkQA

SparkQA commented Aug 17, 2017

Test build #80800 has finished for PR 18977 at commit db15653.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sameeragarwal
Member

Thanks, LGTM!

@SparkQA

SparkQA commented Aug 17, 2017

Test build #80808 has finished for PR 18977 at commit 5d6b616.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

Thanks! Merging to master.

@asfgit closed this in 2caaed9 Aug 17, 2017