Conversation

@praveentandra
Contributor

@praveentandra praveentandra commented Aug 17, 2025

When use_high_precision is false, the type for NUMBER columns with non-zero scale is incorrectly returned as Int64 instead of Float64, causing a data discrepancy.

This appears to be a corner case affecting only schema operations at the client; the data path itself seems to be correct.

…lse :

When use_high_precision is false, NUMBER columns with non-zero scale are incorrectly returned as Int64 instead of Float64, causing a data discrepancy.

This fix checks the scale value to determine the appropriate Arrow type (Int64 vs. Float64), matching the behavior documented at https://arrow.apache.org/adbc/main/driver/snowflake.html
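The type-selection rule can be sketched as follows; `arrowTypeName` is a hypothetical helper for illustration, not the driver's actual function:

```go
package main

import "fmt"

// arrowTypeName illustrates the mapping described above for Snowflake
// NUMBER(p, s) columns. With use_high_precision=false, integer NUMBERs
// (scale == 0) map to int64 and fractional NUMBERs (scale > 0) map to
// float64; with use_high_precision=true the exact decimal128 is kept.
func arrowTypeName(scale int64, useHighPrecision bool) string {
	if useHighPrecision {
		return "decimal128"
	}
	if scale == 0 {
		return "int64"
	}
	return "float64"
}

func main() {
	fmt.Println(arrowTypeName(0, false)) // NUMBER(38,0) -> int64
	fmt.Println(arrowTypeName(2, false)) // NUMBER(12,2) -> float64
	fmt.Println(arrowTypeName(2, true))  // NUMBER(12,2) -> decimal128
}
```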
@github-actions github-actions bot added this to the ADBC Libraries 20 milestone Aug 17, 2025
@praveentandra praveentandra changed the title Fix for returning ARROW numeric type correctly when use_high_precision is false fix (go/adbc/driver/snowflake): returning ARROW numeric type correctly when use_high_precision is false Aug 17, 2025
@praveentandra praveentandra changed the title fix (go/adbc/driver/snowflake): returning ARROW numeric type correctly when use_high_precision is false fix (go/adbc/driver/snowflake): return arrow numeric type correctly when use_high_precision is false Aug 17, 2025
@praveentandra praveentandra changed the title fix (go/adbc/driver/snowflake): return arrow numeric type correctly when use_high_precision is false fix(go/adbc/driver/snowflake): return arrow numeric type correctly when use_high_precision is false Aug 17, 2025
Member

@lidavidm lidavidm left a comment

Wouldn't it make more sense to fall back to decimal?

@zeroshade
Member

Can we add a unit test for this?

praveentandra added a commit to praveentandra/arrow-adbc that referenced this pull request Aug 25, 2025
… when use_high_precision=false

  Previously, when use_high_precision=false, NUMBER columns with scale>0 were returned as scaled Int64 values from Snowflake.
  This mismatch caused data corruption at clients, with decimal data showing up as Int64.

  The fix changes the behavior to use Decimal128 for all non-integer NUMBER types
  (scale>0) even when use_high_precision=false, ensuring:
  - Type consistency between schema and data
  - Exact precision preservation (no floating-point approximation)
  - The Int64 optimization for NUMBER(p,0) is preserved

  Changes:
  - record_reader.go: Use Decimal128 for NUMBER(p,s>0) when use_high_precision=false
  - connection.go: Update schema inference to match record_reader logic
  - driver_test.go: Add comprehensive tests for NUMBER type handling
  - snowflake.rst: Update documentation to reflect new behavior

  This is a different issue from apache#1242 (fixed in PR apache#1267), which addressed the
  Int64→Decimal128 conversion for use_high_precision=true. This fix addresses the
  type mismatch in the use_high_precision=false code path.

  Breaking change: Applications expecting Float64 for NUMBER(p,s>0) with
  use_high_precision=false will now receive Decimal128. While this is a breaking change, the previous behavior returned incorrect values (scaled Int64) to the client, and the documentation has been updated accordingly. I don't think returning decimal data as float is right, since float/double are a separate category. This follows an observation by @lidavidm at apache#3295
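To make the "exact precision preservation" point concrete: a scaled integer can be rendered as exact decimal text with no floating-point step, which is the property Decimal128 preserves. A standalone sketch (`formatScaled` is a hypothetical helper, not driver code):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatScaled renders a Snowflake-style scaled integer (the true value
// multiplied by 10^scale) as exact decimal text. No float64 is involved,
// so the value survives the round-trip exactly -- which float64 cannot
// guarantee for all 38-digit NUMBER values.
func formatScaled(v int64, scale int) string {
	neg := v < 0
	if neg {
		v = -v
	}
	s := strconv.FormatInt(v, 10)
	if scale > 0 {
		if len(s) <= scale {
			// Pad so there is at least one digit before the point.
			s = strings.Repeat("0", scale-len(s)+1) + s
		}
		s = s[:len(s)-scale] + "." + s[len(s)-scale:]
	}
	if neg {
		s = "-" + s
	}
	return s
}

func main() {
	fmt.Println(formatScaled(71156, 2)) // 711.56, e.g. a NUMBER(12,2)
	fmt.Println(formatScaled(5, 2))     // 0.05
}
```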
@CurtHagenlocher
Contributor

Wouldn't it make more sense to fall back to decimal?

As I understand it, the premise behind the original implementation was that not all consumers are able to meaningfully use a decimal128 value. So the driver was using the "best possible non-decimal128" type to store the value -- with possible loss of precision but no loss of scale. If we assume that all consumers can work with decimal128 then I think the flag is effectively obsolete.

@zeroshade
Member

As I understand it, the premise behind the original implementation was that not all consumers are able to meaningfully use a decimal128 value. So the driver was using the "best possible non-decimal128" type to store the value -- with possible loss of precision but no loss of scale.

Precisely

@CurtHagenlocher
Contributor

As I understand it, the premise behind the original implementation was that not all consumers are able to meaningfully use a decimal128 value. So the driver was using the "best possible non-decimal128" type to store the value -- with possible loss of precision but no loss of scale.

Precisely

Right, so given this I think that the original change which simply picked float64 instead of int64 when the scale was nonzero is a better choice.

@praveentandra
Contributor Author

As I understand it, the premise behind the original implementation was that not all consumers are able to meaningfully use a decimal128 value. So the driver was using the "best possible non-decimal128" type to store the value -- with possible loss of precision but no loss of scale.

Precisely

Right, so given this I think that the original change which simply picked float64 instead of int64 when the scale was nonzero is a better choice.

That's what I thought about Decimal128 vs the 64-bit types, and it makes sense, but it has a bug, as described below. As you may know, Snowflake doesn't have a good way to distinguish integers from decimals, since it doesn't retain type aliases after table creation. I was looking at this flag as a way to work around that problem, but I'm unable to use the flag=false setting because of the bug. Below is how the data shows up in DuckDB after querying Snowflake via ADBC.

D select c_custkey, c_name, c_acctbal from sf_db.tpch_sf1.customer order by c_custkey limit 5;

use_high_precision = true - inefficient type at client for integers (c_custkey)
┌───────────────┬────────────────────┬───────────────┐
│   C_CUSTKEY   │       C_NAME       │   C_ACCTBAL   │
│ decimal(38,0) │      varchar       │ decimal(12,2) │
├───────────────┼────────────────────┼───────────────┤
│             1 │ Customer#000000001 │        711.56 │
│             2 │ Customer#000000002 │        121.65 │
│             3 │ Customer#000000003 │       7498.12 │
│             4 │ Customer#000000004 │       2866.83 │
│             5 │ Customer#000000005 │        794.47 │
└───────────────┴────────────────────┴───────────────┘
use_high_precision = false - incorrect type and value at client for decimals (c_acctbal)
┌───────────┬────────────────────┬─────────────────────┐
│ C_CUSTKEY │       C_NAME       │      C_ACCTBAL      │
│   int64   │      varchar       │        int64        │
├───────────┼────────────────────┼─────────────────────┤
│         1 │ Customer#000000001 │ 4649470163769863700 │
│         2 │ Customer#000000002 │ 4638260774666082714 │
│         3 │ Customer#000000003 │ 4664966284827552645 │
│         4 │ Customer#000000004 │ 4658522640913436508 │
│         5 │ Customer#000000005 │ 4650199447842334966 │
└───────────┴────────────────────┴─────────────────────┘

Given this behavior, I am not sure if any of the clients are using use_high_precision = false.

Please advise on one of the below:

  • Fix the issue for float64
  • Fix the issue and retain decimal128
  • Leave it as is; let's solve the decimal(p, s=0)->int conversion separately

@zeroshade
Member

In my opinion this should be the behavior:

  1. If useHighPrecision == true then just use the decimal type as-is that was given to us
  2. If useHighPrecision == false then: for scale == 0 use int64, otherwise use float64.
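On the data path, option 2 amounts to dividing the scaled integer Snowflake delivers back out into a float64. A minimal sketch (`scaledToFloat64` is a hypothetical name, not the driver's code):

```go
package main

import (
	"fmt"
	"math"
)

// scaledToFloat64 converts a Snowflake scaled-integer NUMBER(p, s>0)
// value (true value * 10^s) to float64, accepting the possible loss of
// precision that useHighPrecision=false trades away.
func scaledToFloat64(v int64, scale int) float64 {
	return float64(v) / math.Pow10(scale)
}

func main() {
	fmt.Println(scaledToFloat64(71156, 2)) // 711.56
}
```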

@praveentandra praveentandra force-pushed the fix-snowflake-number-scale branch from 1007db7 to 7687b15 on August 27, 2025 at 21:43
@praveentandra
Contributor Author

In my opinion this should be the behavior:

  1. If useHighPrecision == true then just use the decimal type as-is that was given to us
  2. If useHighPrecision == false then: for scale == 0 use int64, otherwise use float64.

I have updated the PR accordingly.

@lidavidm
Member

lidavidm commented Sep 1, 2025

@zeroshade any comments?

@lidavidm lidavidm merged commit 206e02f into apache:main Oct 7, 2025
41 of 42 checks passed