[SPARK-26538][SQL] Set default precision and scale for elements of postgres numeric array #23456

a-shkarupin · 2019-01-04T21:12:33Z

What changes were proposed in this pull request?

When determining the Catalyst type for Postgres columns of type numeric[], set the array element type to DecimalType(38, 18) instead of DecimalType(0, 0).
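To illustrate why an element type of (0, 0) is a problem, here is a minimal standalone sketch. It does not use Spark; `DecimalFitSketch` and `fits` are hypothetical names, and the check only loosely models a "does this value fit in (precision, scale)" test using `java.math.BigDecimal`.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalFitSketch {
  // Hypothetical helper, not a Spark API: true if `value` needs at most
  // `precision` total digits once rescaled to `scale` fractional digits.
  def fits(value: JBigDecimal, precision: Int, scale: Int): Boolean =
    value.setScale(scale, RoundingMode.HALF_UP).precision() <= precision

  def main(args: Array[String]): Unit = {
    val v = new JBigDecimal("1.5")
    // (0, 0) leaves no room for any digit.
    println(s"fits (0, 0):   ${fits(v, 0, 0)}")
    // (38, 18) is the default named in the description above.
    println(s"fits (38, 18): ${fits(v, 38, 18)}")
  }
}
```

Under these assumptions, a value as small as 1.5 cannot be represented with precision 0, while it fits comfortably in (38, 18).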

How was this patch tested?

Tested with modified org.apache.spark.sql.jdbc.JDBCSuite.
Ran the PostgresIntegrationSuite manually.

mgaido91

just a minor comment, otherwise seems reasonable. cc @gatorsmile @srowen for triggering the build when they are comfortable with the change.

mgaido91 · 2019-01-04T21:28:29Z

sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala

    case "timestamp" | "timestamptz" | "time" | "timetz" => Some(TimestampType)
    case "date" => Some(DateType)
-    case "numeric" | "decimal" => Some(DecimalType.bounded(precision, scale))
+    case "numeric" | "decimal" if precision != 0 => Some(DecimalType.bounded(precision, scale))


what about

    case "numeric" | "decimal" =>
      if (precision > 0) {
        Some(DecimalType.bounded(precision, scale))
      } else {
        // Here a small comment explaining when this can happen and why we do this.
        Some(DecimalType.SYSTEM_DEFAULT)
      }

Updated per your suggestion.

a-shkarupin · 2019-01-08T17:20:27Z

Would it make sense to add the tests from 74215de as well?

mgaido91 · 2019-01-08T20:02:08Z

@a-shkarupin yes, I think so. cc @dongjoon-hyun who prepared that patch and may have other/better suggestions.

maropu · 2019-01-09T10:00:03Z

ok to test

maropu · 2019-01-09T10:04:31Z

@a-shkarupin Could you please add @dongjoon-hyun's test to PostgresIntegrationSuite.scala? Also, since Jenkins doesn't run that test, please check that it passes in your env.

maropu · 2019-01-09T10:19:46Z

sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala

    case "timestamp" | "timestamptz" | "time" | "timetz" => Some(TimestampType)
    case "date" => Some(DateType)
-    case "numeric" | "decimal" => Some(DecimalType.bounded(precision, scale))
+    case "numeric" | "decimal" => if (precision > 0) {


I would like to confirm, just in case: we don't check scale in this PR? This might be related to the discussion in #23458 (comment).

The Postgres doc says: "The precision must be positive, the scale zero or positive."

In the case of numeric, the Postgres JDBC driver returned 0 for both scale and precision.
The condition proposed in the linked ticket and currently used here was roughly precision > 0 || scale > 0, but I cannot come up with a valid case that has precision <= 0 and scale > 0.
Is there another case where we would have a decimal with precision 0?
Could someone explain?

Yeah, but I think we'd be better off adding the scale > 0 check too, just as a safeguard.

I think this is fine, actually. We do not support decimals with precision < 0, so this is most likely enough.

Actually, I still agree with @maropu (#23456 (comment)), but it looks okay because this is PostgresDialect.scala.

If we added || scale > 0, we'd allow a precision <= 0, which doesn't make any sense and is not supported by Spark's decimal. So I think this is fine.
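The disagreement in this thread comes down to where the two candidate guards differ. A small standalone sketch (hypothetical names, no Spark dependency) that enumerates the (precision, scale) pairs on which `precision > 0` and `precision > 0 || scale > 0` give different answers:

```scala
object GuardComparisonSketch {
  // The two candidate conditions from the thread, written as plain predicates.
  def precisionOnly(precision: Int, scale: Int): Boolean = precision > 0
  def precisionOrScale(precision: Int, scale: Int): Boolean = precision > 0 || scale > 0

  // All (precision, scale) pairs within `range` on which the two guards disagree.
  def disagreements(range: Range): Seq[(Int, Int)] =
    for {
      p <- range
      s <- range
      if precisionOnly(p, s) != precisionOrScale(p, s)
    } yield (p, s)

  def main(args: Array[String]): Unit =
    disagreements(-1 to 2).foreach { case (p, s) =>
      println(s"guards differ at precision=$p scale=$s")
    }
}
```

Every pair it reports has precision <= 0 and scale > 0, which is the combination the thread describes as not meaningful; for the (0, 0) case reported by the driver both guards agree.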

maropu · 2019-01-09T10:33:16Z

nit: Could you clean up the title (plz move …stgres numeric array into the title)?

SparkQA · 2019-01-09T11:36:40Z

Test build #100956 has finished for PR 23456 at commit 31b0b04.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

a-shkarupin · 2019-01-09T15:58:52Z

> @a-shkarupin Could you please add @dongjoon-hyun's test to PostgresIntegrationSuite.scala? Also, since Jenkins doesn't run that test, please check that it passes in your env.

Added the test.

Ran tests as follows:

./build/mvn install -DskipTests
./build/mvn test -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12

Got the following result:

Run completed in 3 minutes, 36 seconds.
Total number of tests run: 21
Suites: completed 5, aborted 0
Tests: succeeded 21, failed 0, canceled 0, ignored 5, pending 0
All tests passed.

> nit: Could you clean up the title (plz move …stgres numeric array into the title)?

Cleaned up.

SparkQA · 2019-01-09T20:28:42Z

Test build #100976 has finished for PR 23456 at commit 77bbcb5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2019-01-09T22:44:02Z

Can you update the PR description (How was this patch tested?)?

dongjoon-hyun · 2019-01-10T07:55:15Z

sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala

+    } else {
+      // SPARK-26538: handle numeric without explicit precision and scale.
+      Some(DecimalType.SYSTEM_DEFAULT)
+    }


Hi, @a-shkarupin . Thank you for your first contribution.
Could you follow the existing succinct style? What I mean is having two case "numeric" | "decimal"s.

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L215-L217

Updated per your suggestion, kept the comment as suggested here.
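For readers following along, the "two cases" style requested above could look roughly like the sketch below. It is not the merged code: `SketchDecimal`, `SystemDefault`, and `toSketchType` are hypothetical stand-ins so the example runs without Spark, and unlike `DecimalType.bounded` it does not clamp large precisions.

```scala
object TwoCaseStyleSketch {
  // Hypothetical stand-in for Spark's DecimalType.
  final case class SketchDecimal(precision: Int, scale: Int)

  // Mirrors the (38, 18) default named in the PR description.
  val SystemDefault: SketchDecimal = SketchDecimal(38, 18)

  def toSketchType(typeName: String, precision: Int, scale: Int): Option[SketchDecimal] =
    typeName match {
      // First case: a pattern guard handles an explicit, positive precision.
      case "numeric" | "decimal" if precision > 0 =>
        Some(SketchDecimal(precision, scale))
      // Second case: numeric without explicit precision and scale.
      case "numeric" | "decimal" =>
        Some(SystemDefault)
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    println(toSketchType("numeric", 10, 2))
    println(toSketchType("numeric", 0, 0))
    println(toSketchType("text", 0, 0))
  }
}
```

The guard on the first case replaces the if/else from the earlier suggestion; a match that fails the guard simply falls through to the second, unguarded case.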

a-shkarupin · 2019-01-10T17:54:37Z

> Can you update the PR description (How was this patch tested?)?

Updated.

SparkQA · 2019-01-10T22:14:41Z

Test build #101030 has finished for PR 23456 at commit c72e214.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2019-01-10T23:19:17Z

LGTM

mgaido91 · 2019-01-11T09:51:51Z

LGTM too, thanks.

dongjoon-hyun

+1, LGTM

dongjoon-hyun · 2019-01-12T19:06:05Z

Thank you all! Merged to master/branch-2.4/branch-2.3.

…stgres numeric array

## What changes were proposed in this pull request?

When determining CatalystType for postgres columns with type `numeric[]` set the type of array element to `DecimalType(38, 18)` instead of `DecimalType(0,0)`.

## How was this patch tested?

Tested with modified `org.apache.spark.sql.jdbc.JDBCSuite`.
Ran the `PostgresIntegrationSuite` manually.

Closes #23456 from a-shkarupin/postgres_numeric_array.

Lead-authored-by: Oleksii Shkarupin
Co-authored-by: Dongjoon Hyun
Signed-off-by: Dongjoon Hyun
(cherry picked from commit 5b37092)
Signed-off-by: Dongjoon Hyun

dongjoon-hyun · 2019-01-12T19:12:38Z

Hi, @a-shkarupin. What is your Apache JIRA id?
I'm trying to add you to the Apache Spark contributor group.

a-shkarupin · 2019-01-14T09:06:49Z

Hi @dongjoon-hyun . My Apache JIRA username is alsh. I reported SPARK-26538.
Thanks.

dongjoon-hyun · 2019-01-15T07:57:15Z

Hi, @a-shkarupin. Yep, alsh has been added to the Spark contributor group and SPARK-26538 is assigned to you. We cannot assign an issue to someone who is not in the contributor group; now that you have been added, there is no problem assigning it.

[SPARK-26538][SQL] Set default precision and scale for elements of po…

b004ee3

…stgres numeric array

mgaido91 reviewed Jan 4, 2019

maropu mentioned this pull request Jan 5, 2019

[SPARK-26540][SQL] Support PostgreSQL numeric arrays without precision/scale #23458

Closed

[SPARK-26538][SQL] add a comment explaining the issue

31b0b04

maropu reviewed Jan 9, 2019

a-shkarupin changed the title ~~[SPARK-26538][SQL] Set default precision and scale for elements of po…~~ [SPARK-26538][SQL] Set default precision and scale for elements of postgres numeric array Jan 9, 2019

Add a test at PostgresIntegrationSuite

77bbcb5

dongjoon-hyun reviewed Jan 10, 2019

[SPARK-26538][SQL] follow existing style

c72e214

dongjoon-hyun approved these changes Jan 11, 2019

dongjoon-hyun closed this in 5b37092 Jan 12, 2019

fpompermaier mentioned this pull request Apr 26, 2020

[FLINK-17385][jdbc][postgres] Handled problem of numeric with 0 precision apache/flink#11914

Closed
