[SPARK-22036][SQL] Decimal multiplication with high precision/scale often returns NULL #20023
@@ -1795,6 +1795,11 @@ options.

- Since Spark 2.3, when all inputs are binary, SQL `elt()` returns an output as binary. Otherwise, it returns a string. Until Spark 2.3, it always returned a string regardless of the input types. To keep the old behavior, set `spark.sql.function.eltOutputAsString` to `true`.

- Since Spark 2.3, by default arithmetic operations between decimals return a rounded value if an exact representation is not possible. This is compliant with the SQL standard and with Hive's behavior introduced in HIVE-15331. This involves the following changes:
  - The rules to determine the result type of an arithmetic operation have been updated. In particular, if the needed precision / scale is out of the range of available values, the scale is reduced (to a minimum of 6 digits) in order to prevent the truncation of the integer part of the decimals.
  - Literal values used in SQL operations are converted to DECIMAL with the exact precision and scale needed by them.
  - The configuration `spark.sql.decimalOperations.allowPrecisionLoss` has been introduced. It defaults to `true`, which enables the new behavior described here; if set to `false`, Spark will use the previous rules and behavior.
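To illustrate the migration note above, here is a minimal sketch (object and app names are made up, and it assumes a Spark build that includes this change) of a decimal product whose exact result type would be DECIMAL(77, 36), well beyond the maximum precision of 38:

```scala
import org.apache.spark.sql.SparkSession

object DecimalMigrationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("decimal-migration-sketch").getOrCreate()

    // The exact product type would be DECIMAL(77, 36), which exceeds the maximum precision of 38.
    val query = "SELECT CAST(10 AS DECIMAL(38, 18)) * CAST(10 AS DECIMAL(38, 18)) AS product"

    // New default (allowPrecisionLoss = true): the result type is adjusted to DECIMAL(38, 6)
    // and 100.000000 is returned.
    spark.sql(query).show(truncate = false)

    // Old behavior: the result type stays DECIMAL(38, 36); 100 does not fit in the remaining
    // 2 integer digits, so the value overflows and NULL is returned.
    spark.conf.set("spark.sql.decimalOperations.allowPrecisionLoss", "false")
    spark.sql(query).show(truncate = false)

    spark.stop()
  }
}
```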
## Upgrading From Spark SQL 2.1 to 2.2

- Spark 2.1.1 introduced a new configuration key: `spark.sql.hive.caseSensitiveInferenceMode`. It had a default setting of `NEVER_INFER`, which kept behavior identical to 2.1.0. However, Spark 2.2.0 changes this setting's default value to `INFER_AND_SAVE` to restore compatibility with reading Hive metastore tables whose underlying file schemas have mixed-case column names. With the `INFER_AND_SAVE` configuration value, on first access Spark will perform schema inference on any Hive metastore table for which it has not already saved an inferred schema. Note that schema inference can be a very time-consuming operation for tables with thousands of partitions. If compatibility with mixed-case column names is not a concern, you can safely set `spark.sql.hive.caseSensitiveInferenceMode` to `NEVER_INFER` to avoid the initial overhead of schema inference. Note that with the new default `INFER_AND_SAVE` setting, the results of the schema inference are saved as a metastore key for future use. Therefore, the initial schema inference occurs only at a table's first access.
@@ -21,6 +21,7 @@ import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.expressions.Literal._
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types._

@@ -42,8 +43,10 @@ import org.apache.spark.sql.types._
 *   e1 / e2      p1 - s1 + s2 + max(6, s1 + p2 + 1)     max(6, s1 + p2 + 1)
 *   e1 % e2      min(p1-s1, p2-s2) + max(s1, s2)        max(s1, s2)
 *   e1 union e2  max(s1, s2) + max(p1-s1, p2-s2)        max(s1, s2)
 *   sum(e1)      p1 + 10                                s1
 *   avg(e1)      p1 + 4                                 s1 + 4
 *
 * When `spark.sql.decimalOperations.allowPrecisionLoss` is set to true, if the precision / scale
 * needed are out of the range of available values, the scale is reduced up to 6, in order to
 * prevent the truncation of the integer part of the decimals.
 *
 * To implement the rules for fixed-precision types, we introduce casts to turn them to unlimited
 * precision, do the math on unlimited-precision numbers, then introduce casts back to the
@@ -56,6 +59,7 @@ import org.apache.spark.sql.types._
 * - INT gets turned into DECIMAL(10, 0)
 * - LONG gets turned into DECIMAL(20, 0)
 * - FLOAT and DOUBLE cause fixed-length decimals to turn into DOUBLE
 * - Literals INT and LONG get turned into DECIMAL with the precision strictly needed by the value
 */
// scalastyle:on
object DecimalPrecision extends TypeCoercionRule {
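As a worked instance of the table above (a stand-alone sketch, not code from the patch), take `e1 * e2` with `e1: DECIMAL(38, 10)` and `e2: DECIMAL(9, 0)`; the raw result type already exceeds the maximum precision of 38, so under the new flag the scale is reduced while keeping at least `MINIMUM_ADJUSTED_SCALE` fractional digits:

```scala
object MultiplicationRuleSketch extends App {
  // e1 * e2 => precision = p1 + p2 + 1, scale = s1 + s2, before any adjustment.
  val (p1, s1) = (38, 10)
  val (p2, s2) = (9, 0)

  val rawPrecision = p1 + p2 + 1 // 48, above the maximum precision of 38
  val rawScale = s1 + s2         // 10

  // With allowPrecisionLoss enabled, adjustPrecisionScale(48, 10) keeps the 38 integer digits and
  // reduces the scale to max(38 - 38, 6) = 6, i.e. DECIMAL(38, 6).
  // With the flag disabled, the type is simply bounded to DECIMAL(38, 10), so any product whose
  // integer part needs more than 28 digits overflows to NULL.
  println(s"raw type: DECIMAL($rawPrecision, $rawScale); adjusted: DECIMAL(38, 6); bounded: DECIMAL(38, 10)")
}
```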
@@ -93,41 +97,76 @@ object DecimalPrecision extends TypeCoercionRule {
    case e: BinaryArithmetic if e.left.isInstanceOf[PromotePrecision] => e

    case Add(e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
      val dt = DecimalType.bounded(max(s1, s2) + max(p1 - s1, p2 - s2) + 1, max(s1, s2))
      CheckOverflow(Add(promotePrecision(e1, dt), promotePrecision(e2, dt)), dt)
      val resultScale = max(s1, s2)
      val resultType = if (SQLConf.get.decimalOperationsAllowPrecisionLoss) {
        DecimalType.adjustPrecisionScale(max(p1 - s1, p2 - s2) + resultScale + 1,
Member: This is an example.

Contributor (Author): Yes, that may be a difference indeed. But I think it is a minor one, since in 99% of the cases the precision is exceeded only in multiplications and divisions.

Member: We need to make a decision. You know, we try our best to keep our rule as stable as possible.

Contributor: I'm ok to do the adjustment for all operations, which is the same as Hive.
          resultScale)
      } else {
        DecimalType.bounded(max(p1 - s1, p2 - s2) + resultScale + 1, resultScale)
      }
      CheckOverflow(Add(promotePrecision(e1, resultType), promotePrecision(e2, resultType)),
        resultType)

    case Subtract(e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
      val dt = DecimalType.bounded(max(s1, s2) + max(p1 - s1, p2 - s2) + 1, max(s1, s2))
      CheckOverflow(Subtract(promotePrecision(e1, dt), promotePrecision(e2, dt)), dt)
      val resultScale = max(s1, s2)
      val resultType = if (SQLConf.get.decimalOperationsAllowPrecisionLoss) {
        DecimalType.adjustPrecisionScale(max(p1 - s1, p2 - s2) + resultScale + 1,
          resultScale)
      } else {
        DecimalType.bounded(max(p1 - s1, p2 - s2) + resultScale + 1, resultScale)
      }
      CheckOverflow(Subtract(promotePrecision(e1, resultType), promotePrecision(e2, resultType)),
        resultType)

    case Multiply(e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
      val resultType = DecimalType.bounded(p1 + p2 + 1, s1 + s2)
      val resultType = if (SQLConf.get.decimalOperationsAllowPrecisionLoss) {
        DecimalType.adjustPrecisionScale(p1 + p2 + 1, s1 + s2)
      } else {
        DecimalType.bounded(p1 + p2 + 1, s1 + s2)
      }
      val widerType = widerDecimalType(p1, s1, p2, s2)
      CheckOverflow(Multiply(promotePrecision(e1, widerType), promotePrecision(e2, widerType)),
        resultType)

    case Divide(e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
      var intDig = min(DecimalType.MAX_SCALE, p1 - s1 + s2)
      var decDig = min(DecimalType.MAX_SCALE, max(6, s1 + p2 + 1))
      val diff = (intDig + decDig) - DecimalType.MAX_SCALE
      if (diff > 0) {
        decDig -= diff / 2 + 1
        intDig = DecimalType.MAX_SCALE - decDig
      val resultType = if (SQLConf.get.decimalOperationsAllowPrecisionLoss) {
        // Precision: p1 - s1 + s2 + max(6, s1 + p2 + 1)
        // Scale: max(6, s1 + p2 + 1)
        val intDig = p1 - s1 + s2
        val scale = max(DecimalType.MINIMUM_ADJUSTED_SCALE, s1 + p2 + 1)
        val prec = intDig + scale
        DecimalType.adjustPrecisionScale(prec, scale)
      } else {
        var intDig = min(DecimalType.MAX_SCALE, p1 - s1 + s2)
        var decDig = min(DecimalType.MAX_SCALE, max(6, s1 + p2 + 1))
        val diff = (intDig + decDig) - DecimalType.MAX_SCALE
        if (diff > 0) {
          decDig -= diff / 2 + 1
          intDig = DecimalType.MAX_SCALE - decDig
        }
        DecimalType.bounded(intDig + decDig, decDig)
      }
      val resultType = DecimalType.bounded(intDig + decDig, decDig)
      val widerType = widerDecimalType(p1, s1, p2, s2)
      CheckOverflow(Divide(promotePrecision(e1, widerType), promotePrecision(e2, widerType)),
        resultType)
    case Remainder(e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
      val resultType = DecimalType.bounded(min(p1 - s1, p2 - s2) + max(s1, s2), max(s1, s2))
      val resultType = if (SQLConf.get.decimalOperationsAllowPrecisionLoss) {
        DecimalType.adjustPrecisionScale(min(p1 - s1, p2 - s2) + max(s1, s2), max(s1, s2))
      } else {
        DecimalType.bounded(min(p1 - s1, p2 - s2) + max(s1, s2), max(s1, s2))
      }
      // resultType may have lower precision, so we cast them into wider type first.
      val widerType = widerDecimalType(p1, s1, p2, s2)
      CheckOverflow(Remainder(promotePrecision(e1, widerType), promotePrecision(e2, widerType)),
        resultType)

    case Pmod(e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
      val resultType = DecimalType.bounded(min(p1 - s1, p2 - s2) + max(s1, s2), max(s1, s2))
      val resultType = if (SQLConf.get.decimalOperationsAllowPrecisionLoss) {
        DecimalType.adjustPrecisionScale(min(p1 - s1, p2 - s2) + max(s1, s2), max(s1, s2))
      } else {
        DecimalType.bounded(min(p1 - s1, p2 - s2) + max(s1, s2), max(s1, s2))
      }
      // resultType may have lower precision, so we cast them into wider type first.
      val widerType = widerDecimalType(p1, s1, p2, s2)
      CheckOverflow(Pmod(promotePrecision(e1, widerType), promotePrecision(e2, widerType)),

@@ -137,9 +176,6 @@ object DecimalPrecision extends TypeCoercionRule {
        e2 @ DecimalType.Expression(p2, s2)) if p1 != p2 || s1 != s2 =>
      val resultType = widerDecimalType(p1, s1, p2, s2)
      b.makeCopy(Array(Cast(e1, resultType), Cast(e2, resultType)))

    // TODO: MaxOf, MinOf, etc might want other rules
    // SUM and AVERAGE are handled by the implementations of those expressions
  }
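The division case is where the two code paths above differ most visibly. The following stand-alone sketch (a re-implementation for illustration, not the rule itself) works out the result type of `DECIMAL(38, 18) / DECIMAL(10, 0)` under both settings:

```scala
object DivisionResultTypeSketch extends App {
  // e1 / e2 with e1: DECIMAL(38, 18) and e2: DECIMAL(10, 0).
  val (p1, s1) = (38, 18)
  val (p2, s2) = (10, 0)
  val MaxScale = 38 // mirrors DecimalType.MAX_SCALE

  // New path (allowPrecisionLoss = true):
  //   scale     = max(6, s1 + p2 + 1)  = 29
  //   precision = p1 - s1 + s2 + scale = 49
  //   adjustPrecisionScale(49, 29) keeps the 20 integer digits, giving DECIMAL(38, 18).

  // Legacy path (allowPrecisionLoss = false): shrink the scale by half of the excess plus one.
  var intDig = math.min(MaxScale, p1 - s1 + s2)             // 20
  var decDig = math.min(MaxScale, math.max(6, s1 + p2 + 1)) // 29
  val diff = (intDig + decDig) - MaxScale                   // 11
  if (diff > 0) {
    decDig -= diff / 2 + 1                                  // 29 - 6 = 23
    intDig = MaxScale - decDig                              // 15
  }
  println(s"legacy result type: DECIMAL(${intDig + decDig}, $decDig)") // DECIMAL(38, 23)
}
```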
  /**

@@ -243,17 +279,35 @@ object DecimalPrecision extends TypeCoercionRule {
    // Promote integers inside a binary expression with fixed-precision decimals to decimals,
    // and fixed-precision decimals in an expression with floats / doubles to doubles
    case b @ BinaryOperator(left, right) if left.dataType != right.dataType =>
      (left.dataType, right.dataType) match {
        case (t: IntegralType, DecimalType.Fixed(p, s)) =>
Contributor: nit: I feel it's more readable to just put the new cases for literals before these 4 cases.

Contributor (Author): Unfortunately this is not really feasible, since we match on different things: here we match on

Contributor: we can do
          b.makeCopy(Array(Cast(left, DecimalType.forType(t)), right))
        case (DecimalType.Fixed(p, s), t: IntegralType) =>
          b.makeCopy(Array(left, Cast(right, DecimalType.forType(t))))
        case (t, DecimalType.Fixed(p, s)) if isFloat(t) =>
          b.makeCopy(Array(left, Cast(right, DoubleType)))
        case (DecimalType.Fixed(p, s), t) if isFloat(t) =>
          b.makeCopy(Array(Cast(left, DoubleType), right))
        case _ =>
          b
      (left, right) match {
        // Promote literal integers inside a binary expression with fixed-precision decimals to
Contributor: We only need to deal with integers? How about float and double?

Contributor (Author): When float and double are involved, the decimal is converted to double.
        // decimals. The precision and scale are the ones strictly needed by the integer value.
        // Requiring more precision than necessary may lead to a useless loss of precision.
        // Consider the following example: multiplying a column which is DECIMAL(38, 18) by 2.
        // If we use the default precision and scale for the integer type, 2 is considered a
        // DECIMAL(10, 0). According to the rules, the result would be DECIMAL(38 + 10 + 1, 18),
        // which is out of range and therefore it will become DECIMAL(38, 7), leading to
        // potentially losing 11 digits of the fractional part. Using only the precision needed
        // by the Literal, instead, the result would be DECIMAL(38 + 1 + 1, 18), which would
        // become DECIMAL(38, 16), safely having a much lower precision loss.
        case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType]
          && l.dataType.isInstanceOf[IntegralType] =>
          b.makeCopy(Array(Cast(l, DecimalType.fromLiteral(l)), r))
        case (l, r: Literal) if l.dataType.isInstanceOf[DecimalType]
          && r.dataType.isInstanceOf[IntegralType] =>
          b.makeCopy(Array(l, Cast(r, DecimalType.fromLiteral(r))))
        // Promote integers inside a binary expression with fixed-precision decimals to decimals,
        // and fixed-precision decimals in an expression with floats / doubles to doubles
        case (l @ IntegralType(), r @ DecimalType.Expression(_, _)) =>
          b.makeCopy(Array(Cast(l, DecimalType.forType(l.dataType)), r))
        case (l @ DecimalType.Expression(_, _), r @ IntegralType()) =>
          b.makeCopy(Array(l, Cast(r, DecimalType.forType(r.dataType))))
        case (l, r @ DecimalType.Expression(_, _)) if isFloat(l.dataType) =>
          b.makeCopy(Array(l, Cast(r, DoubleType)))
        case (l @ DecimalType.Expression(_, _), r) if isFloat(r.dataType) =>
          b.makeCopy(Array(Cast(l, DoubleType), r))
        case _ => b
      }
  }

}
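A quick way to observe the literal rule end to end is to check the schema of a decimal value multiplied by an integer literal. The sketch below (illustrative names, assuming a build with this patch and the default `allowPrecisionLoss=true`) reproduces the DECIMAL(38, 18) * 2 example from the comment above:

```scala
import org.apache.spark.sql.SparkSession

object LiteralPromotionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("literal-promotion-sketch").getOrCreate()

    // The literal 2 is now treated as DECIMAL(1, 0), so the product type is the adjusted
    // DECIMAL(38, 16) instead of the DECIMAL(38, 7) that the old DECIMAL(10, 0) default produced.
    spark.sql("SELECT CAST(1 AS DECIMAL(38, 18)) * 2 AS doubled").printSchema()

    spark.stop()
  }
}
```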
@@ -1048,6 +1048,16 @@ object SQLConf {
    .booleanConf
    .createWithDefault(true)

  val DECIMAL_OPERATIONS_ALLOW_PREC_LOSS =
    buildConf("spark.sql.decimalOperations.allowPrecisionLoss")
      .internal()
      .doc("When true (default), establishing the result type of an arithmetic operation " +
        "happens according to Hive behavior and SQL ANSI 2011 specification, ie. rounding the " +
        "decimal part of the result if an exact representation is not possible. Otherwise, NULL " +
        "is returned in those cases, as previously.")

Member: Yeah. This is better.

      .booleanConf
      .createWithDefault(true)

  val SQL_STRING_REDACTION_PATTERN =
    ConfigBuilder("spark.sql.redaction.string.regex")
      .doc("Regex to decide which parts of strings produced by Spark contain sensitive " +

@@ -1423,6 +1433,8 @@ class SQLConf extends Serializable with Logging {

  def replaceExceptWithFilter: Boolean = getConf(REPLACE_EXCEPT_WITH_FILTER)

  def decimalOperationsAllowPrecisionLoss: Boolean = getConf(DECIMAL_OPERATIONS_ALLOW_PREC_LOSS)

  def continuousStreamingExecutorQueueSize: Int = getConf(CONTINUOUS_STREAMING_EXECUTOR_QUEUE_SIZE)

  def continuousStreamingExecutorPollIntervalMs: Long =
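Since the entry is internal and boolean, it would normally be driven through the runtime conf; a small, hypothetical sketch of checking and overriding it per session:

```scala
import org.apache.spark.sql.SparkSession

object DecimalConfCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("decimal-conf-check").getOrCreate()
    val key = "spark.sql.decimalOperations.allowPrecisionLoss"

    // Read the effective value, falling back to the documented default of true.
    println(spark.conf.get(key, "true"))

    // Opt back into the pre-2.3 rules: NULL instead of an adjusted, possibly rounded, result.
    spark.conf.set(key, "false")
    println(spark.conf.get(key, "true"))

    spark.stop()
  }
}
```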
@@ -23,7 +23,7 @@ import scala.reflect.runtime.universe.typeTag

import org.apache.spark.annotation.InterfaceStability
import org.apache.spark.sql.AnalysisException
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.expressions.{Expression, Literal}

/**

@@ -117,6 +117,7 @@ object DecimalType extends AbstractDataType {
  val MAX_SCALE = 38
  val SYSTEM_DEFAULT: DecimalType = DecimalType(MAX_PRECISION, 18)
  val USER_DEFAULT: DecimalType = DecimalType(10, 0)
  val MINIMUM_ADJUSTED_SCALE = 6
Member: Before naming a conf, I need to understand the rule you are following. https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql The SQL Server only applies

Contributor (Author): Yes, I followed Hive's implementation, which works like this and applies this 6-digit minimum to all operations. This means that SQLServer allows rounding more digits than we do in those cases, i.e. we ensure at least 6 digits for the scale, while SQLServer doesn't.

Contributor (Author): @gatorsmile what about

Contributor: how about

Contributor: We should make it an internal conf and remove it after some releases.

Contributor (Author): ok, I'll go with that, thanks @cloud-fan.
  // The decimal types compatible with other numeric types
  private[sql] val ByteDecimal = DecimalType(3, 0)

@@ -136,10 +137,52 @@ object DecimalType extends AbstractDataType {
    case DoubleType => DoubleDecimal
  }

  private[sql] def fromLiteral(literal: Literal): DecimalType = literal.value match {
    case v: Short => fromBigDecimal(BigDecimal(v))
Member: Can't we just use

Contributor (Author): No, please see my comments above.
    case v: Int => fromBigDecimal(BigDecimal(v))
    case v: Long => fromBigDecimal(BigDecimal(v))
    case _ => forType(literal.dataType)
Member: This list is incomplete. Is it possible that the input literal is

Contributor (Author): This problem was present before this PR. Should we fix it here? Is this fix needed? I guess that if it had been a problem, it would already have been reported.
  }

  private[sql] def fromBigDecimal(d: BigDecimal): DecimalType = {
    DecimalType(Math.max(d.precision, d.scale), d.scale)
  }

  private[sql] def bounded(precision: Int, scale: Int): DecimalType = {
    DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
  }
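As a sanity check of the helpers above, a tiny stand-alone sketch (it re-implements the `fromBigDecimal` formula rather than calling the `private[sql]` method) showing the types produced for a few integer literals:

```scala
object LiteralDecimalTypeSketch extends App {
  // Mirrors DecimalType.fromBigDecimal: precision = max(precision, scale), scale = scale.
  def typeFor(d: BigDecimal): (Int, Int) = (math.max(d.precision, d.scale), d.scale)

  println(typeFor(BigDecimal(2)))             // (1, 0)  -> DECIMAL(1, 0) rather than the DECIMAL(10, 0) default
  println(typeFor(BigDecimal(123456)))        // (6, 0)  -> DECIMAL(6, 0)
  println(typeFor(BigDecimal(Long.MaxValue))) // (19, 0) -> DECIMAL(19, 0)
}
```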
  /**
   * Scale adjustment implementation is based on Hive's one, which is itself inspired by
   * SQLServer's one. In particular, when a result precision is greater than
   * {@link #MAX_PRECISION}, the corresponding scale is reduced to prevent the integral part of a
   * result from being truncated.
   *
   * This method is used only when `spark.sql.decimalOperations.allowPrecisionLoss` is set to true.
   */
  private[sql] def adjustPrecisionScale(precision: Int, scale: Int): DecimalType = {
Member: The logic in this adjustment function is also different from the MS SQL Server docs.

Contributor (Author): Sorry, but I think this is exactly what is described there. The implementation might seem to do different things, but actually the result will be the same. They both take the min between 6 and the desired scale if the precision is not enough to represent the whole scale.

Contributor: So the rule in the document is ... I think this is a little different from the current rule. Think about the case

Contributor (Author): @cloud-fan yes, but you have to keep in mind that we are doing so only when precision is > 38. With some simple math (given

Contributor: ah i see, makes sense

Member: Yeah, this part is consistent.
    // Assumptions:
    assert(precision >= scale)
    assert(scale >= 0)

    if (precision <= MAX_PRECISION) {
      // Adjustment only needed when we exceed max precision
      DecimalType(precision, scale)
Member: Shouldn't we also prevent

Contributor (Author): This is prevented outside this function.
    } else {
      // Precision/scale exceed maximum precision. Result must be adjusted to MAX_PRECISION.
      val intDigits = precision - scale
      // If original scale is less than MINIMUM_ADJUSTED_SCALE, use original scale value; otherwise
      // preserve at least MINIMUM_ADJUSTED_SCALE fractional digits
      val minScaleValue = Math.min(scale, MINIMUM_ADJUSTED_SCALE)
Member: Sounds like

Contributor (Author): It is the

Member: Don't you get a scale lower than

Contributor (Author): Yes, sorry, my answer was very poor, I will rephrase.
      // The resulting scale is the maximum between what is available without causing a loss of
      // digits for the integer part of the decimal and the minimum guaranteed scale, which is
      // computed above
      val adjustedScale = Math.max(MAX_PRECISION - intDigits, minScaleValue)
Member: Sounds like

Contributor (Author): It is

Contributor: This line needs some comments.
      DecimalType(MAX_PRECISION, adjustedScale)
    }
  }

  override private[sql] def defaultConcreteType: DataType = SYSTEM_DEFAULT

  override private[sql] def acceptsType(other: DataType): Boolean = {
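To tie the adjustment back to the examples discussed in this review, here is a stand-alone sketch that copies the function's arithmetic (the method itself is `private[sql]`, so it is re-implemented rather than called) and evaluates it on a few precision/scale pairs produced by the arithmetic rules:

```scala
object AdjustPrecisionScaleSketch extends App {
  val MaxPrecision = 38
  val MinimumAdjustedScale = 6

  // Copy of the adjustment arithmetic shown in the diff above, for illustration only.
  def adjust(precision: Int, scale: Int): (Int, Int) = {
    require(precision >= scale && scale >= 0)
    if (precision <= MaxPrecision) {
      (precision, scale)
    } else {
      val intDigits = precision - scale
      val minScaleValue = math.min(scale, MinimumAdjustedScale)
      val adjustedScale = math.max(MaxPrecision - intDigits, minScaleValue)
      (MaxPrecision, adjustedScale)
    }
  }

  println(adjust(40, 18)) // (38, 16): DECIMAL(38, 18) * DECIMAL(1, 0), the literal example above
  println(adjust(49, 18)) // (38, 7):  DECIMAL(38, 18) * DECIMAL(10, 0)
  println(adjust(48, 10)) // (38, 6):  DECIMAL(38, 10) * DECIMAL(9, 0)
  println(adjust(30, 20)) // (30, 20): within bounds, returned unchanged
}
```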
Review comment: This is the new behavior introduced in Hive 2.2. We have to emphasize it.