Closed
Changes from 1 commit
Commits
36 commits
af2e7ab
Add multiply to CalendarInterval
MaxGekk Oct 15, 2019
4227915
Test for multiply()
MaxGekk Oct 15, 2019
379a20a
Add MultiplyInterval expression
MaxGekk Oct 15, 2019
3e9ed0f
Add divide to CalendarInterval
MaxGekk Oct 15, 2019
670a7c6
Test for divide()
MaxGekk Oct 15, 2019
3ae94cf
Handle ArithmeticException in MultiplyInterval
MaxGekk Oct 15, 2019
3bce68e
Test for the MultiplyInterval expression
MaxGekk Oct 15, 2019
754109c
Add DivideInterval expression
MaxGekk Oct 15, 2019
166dbd8
Test for the DivideInterval expression
MaxGekk Oct 15, 2019
b4dc59a
Remove unused import
MaxGekk Oct 15, 2019
1e2a9a6
Add new rules
MaxGekk Oct 15, 2019
a6b6d81
Tests for new rules
MaxGekk Oct 15, 2019
6e569c0
Add tests to datetime.sql
MaxGekk Oct 15, 2019
69a3cc7
Regen datetime.sql.out
MaxGekk Oct 15, 2019
001d17b
Long -> Double
MaxGekk Oct 15, 2019
014cde5
Fix comment
MaxGekk Oct 15, 2019
049f428
Merge branch 'master' into interval-mul-div
MaxGekk Oct 16, 2019
1ca7c89
Regen datetime.sql.out
MaxGekk Oct 16, 2019
9e6745a
Implement multiply and divide as PostgreSQL does
MaxGekk Oct 17, 2019
b428070
rounded -> truncated
MaxGekk Oct 17, 2019
8ad4001
Merge remote-tracking branch 'origin/master' into interval-mul-div
MaxGekk Oct 18, 2019
91337e5
Merge remote-tracking branch 'remotes/origin/master' into interval-mu…
MaxGekk Oct 25, 2019
2bb916f
Add new line at the end of datetime.sql
MaxGekk Oct 25, 2019
6ba53f0
Avoid fromString() in CalendarIntervalSuite
MaxGekk Oct 25, 2019
719fe6c
Merge remote-tracking branch 'remotes/origin/master' into interval-mu…
MaxGekk Nov 1, 2019
d05ffa4
Rebase on interval with days
MaxGekk Nov 1, 2019
34f6605
Regenerate datetime.sql.out
MaxGekk Nov 1, 2019
00ede6c
Check round micros
MaxGekk Nov 1, 2019
690d9c1
Modify div test to check days div
MaxGekk Nov 1, 2019
5b25432
Regenerate datetime.sql.out
MaxGekk Nov 1, 2019
2265449
Merge remote-tracking branch 'remotes/origin/master' into interval-mu…
MaxGekk Nov 4, 2019
e559fb9
Move multiply() and divide() to IntervalUtils
MaxGekk Nov 5, 2019
35ab9c0
Use DAYS_PER_MONTH
MaxGekk Nov 5, 2019
dbc39e8
Simplify MultiplyInterval and DivideInterval
MaxGekk Nov 5, 2019
8244460
Minor
MaxGekk Nov 5, 2019
b70c0f8
Simplify tests
MaxGekk Nov 5, 2019
Long -> Double
MaxGekk committed Oct 15, 2019
commit 001d17b3b01c0782d6c73b67bfb7c9d7bf50a5be
@@ -355,15 +355,15 @@ public CalendarInterval subtract(CalendarInterval that) {
return new CalendarInterval(months, microseconds);
}

public CalendarInterval multiply(long num) {
int months = Math.toIntExact(Math.multiplyExact(this.months, num));
long microseconds = Math.multiplyExact(this.microseconds, num);
public CalendarInterval multiply(double num) {
Contributor

Shall we use decimal instead? A double is an approximate value, and we may truncate.

Member Author

  • In that case, our implementation will deviate from PostgreSQL, which uses double internally. At the moment, we return the same results as PostgreSQL (or I haven't yet found a case where the results differ).
  • Most likely, it will be slower.

If it is ok, I will switch to decimals.

Contributor

OK let's use double if it's what pgsql uses.

Can we move the add, subtract, multiply and divide to IntervalUtils? In case we want to change them in the future.

Member Author

> Can we move the add, subtract ... to IntervalUtils?

Do you want to move + and - in this PR? I just want to double check this because those methods are not related to this PR directly.

Contributor

Both are fine. We can move them together, or have a follow-up PR to move +/-.

Member

@yaooqinn yaooqinn Nov 5, 2019

Double is not big enough to support the average aggregate for interval (#26347); I prefer decimal personally.

Member Author

> Double is not big enough to support the average aggregate ...

@yaooqinn Could you explain what you mean? I could imagine that double is not precise enough, but not that it is not big enough ...

Member

yea, precise, not big.
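
The precision concern in this thread can be illustrated with a small sketch. It assumes only standard Java semantics: a double has a 53-bit significand, so a microsecond count above 2^53 (roughly 285 years) does not always survive a round trip through double, while BigDecimal arithmetic stays exact. The class and method names are hypothetical and are not part of the PR.

```java
import java.math.BigDecimal;

public class DoublePrecisionSketch {
    // Round-trips a microsecond count through double, which is what happens
    // implicitly when a long is scaled by a double factor.
    static long roundTrip(long micros) {
        return (long) (double) micros;
    }

    // Multiplies exactly with BigDecimal and drops any fractional part.
    static long multiplyExact(long micros, String factor) {
        return new BigDecimal(micros).multiply(new BigDecimal(factor)).longValue();
    }

    public static void main(String[] args) {
        long big = (1L << 53) + 1;                   // 9007199254740993
        System.out.println(roundTrip(big));          // 9007199254740992
        System.out.println(multiplyExact(big, "1")); // 9007199254740993
    }
}
```

The range of double is not the issue here: the value fits easily, but its lowest bit is lost.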

int months = Math.toIntExact((long)(num * this.months));
Member

What happens when dividing an interval of 1 month by, say, 2? You'd end up with an interval of 0 time. I suppose the right answer is whatever PostgreSQL does.

Member Author

I will check what it does.

Member Author

PostgreSQL does this differently. I have re-implemented the operations.
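
A rough sketch of the difference raised in this thread. `truncatedMonths` mirrors the months computation shown in this commit, so 1 month divided by 2 yields 0 months. `divideWithSpill` is a hypothetical alternative that carries the fractional month into days and the fractional day into microseconds. The 30-day month, the three-field layout and all names are assumptions made for illustration; this is not the code from the later commits.

```java
public class IntervalDivideSketch {
    static final int DAYS_PER_MONTH = 30;               // assumption: fixed-length month for spill-over
    static final long MICROS_PER_DAY = 86_400_000_000L;

    // Truncating version, as in this commit: the fractional month is dropped.
    static int truncatedMonths(int months, double num) {
        return Math.toIntExact((long) (months / num));
    }

    // Hypothetical spill-over version; returns {months, days, microseconds}.
    static long[] divideWithSpill(int months, int days, long micros, double num) {
        double monthsD = months / num;
        long wholeMonths = (long) monthsD;
        // Carry the fractional month into days.
        double daysD = days / num + (monthsD - wholeMonths) * DAYS_PER_MONTH;
        long wholeDays = (long) daysD;
        // Carry the fractional day into microseconds.
        long wholeMicros = Math.round(micros / num + (daysD - wholeDays) * MICROS_PER_DAY);
        return new long[] {wholeMonths, wholeDays, wholeMicros};
    }

    public static void main(String[] args) {
        System.out.println(truncatedMonths(1, 2));        // 0
        long[] r = divideWithSpill(1, 0, 0L, 2);
        System.out.println(r[0] + " months " + r[1] + " days " + r[2] + " microseconds");
        // 0 months 15 days 0 microseconds
    }
}
```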

long microseconds = (long)(num * this.microseconds);
return new CalendarInterval(months, microseconds);
}

public CalendarInterval divide(long num) {
int months = Math.toIntExact(this.months / num);
long microseconds = this.microseconds / num;
public CalendarInterval divide(double num) {
int months = Math.toIntExact((long)(this.months / num));
long microseconds = (long)(this.microseconds / num);
return new CalendarInterval(months, microseconds);
}

@@ -302,13 +302,17 @@ public void fromStringCaseSensitivityTest() {
@Test
public void multiplyTest() {
CalendarInterval interval = new CalendarInterval(0, 0);
assertEquals(interval.multiply(0), interval);
assertEquals(interval, interval.multiply(0));

interval = new CalendarInterval(123, 456);
assertEquals(interval.multiply(42), new CalendarInterval(123 * 42, 456 * 42));
assertEquals(new CalendarInterval(123 * 42, 456 * 42), interval.multiply(42));

interval = new CalendarInterval(-123, -456);
assertEquals(interval.multiply(42), new CalendarInterval(-123 * 42, -456 * 42));
assertEquals(new CalendarInterval(-123 * 42, -456 * 42), interval.multiply(42));

assertEquals(
new CalendarInterval((-123 * 3) / 2, (-456 * 3) / 2),
interval.multiply(1.5));

try {
interval = new CalendarInterval(2, 0);
@@ -317,33 +321,27 @@ public void multiplyTest() {
} catch (java.lang.ArithmeticException e) {
assertTrue(e.getMessage().contains("overflow"));
}

try {
interval = new CalendarInterval(0, 2);
interval.multiply(Long.MAX_VALUE);
fail("Expected to throw an exception on microseconds overflow");
} catch (java.lang.ArithmeticException e) {
assertTrue(e.getMessage().contains("overflow"));
}
}

@Test
public void divideTest() {
CalendarInterval interval = new CalendarInterval(0, 0);
assertEquals(interval.divide(10), interval);
assertEquals(interval, interval.divide(10));

interval = new CalendarInterval(10, 100);
assertEquals(interval.divide(3), new CalendarInterval(3, 33));
assertEquals(new CalendarInterval(3, 33), interval.divide(3));
assertEquals(new CalendarInterval(20, 200), interval.divide(0.5));

interval = new CalendarInterval(-10, -100);
assertEquals(interval.divide(3), new CalendarInterval(-3, -33));
assertEquals(new CalendarInterval(-3, -33), interval.divide(3));
assertEquals(new CalendarInterval(-6, -66), interval.divide(1.5));

try {
interval = new CalendarInterval(123, 456);
interval.divide(0);
fail("Expected to throw an exception on divide by zero");
} catch (java.lang.ArithmeticException e) {
assertTrue(e.getMessage().contains("by zero"));
assertTrue(e.getMessage().contains("overflow"));
}
}
}
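
Note on the changed expectation in divideTest: with a double divisor, dividing by zero no longer raises "/ by zero". The sketch below assumes only standard Java semantics: `123 / 0.0` is Infinity, the narrowing cast to long saturates at `Long.MAX_VALUE`, and `Math.toIntExact` then reports an overflow. The same saturation may explain why the microseconds overflow check was dropped from multiplyTest, since the cast does not throw. Class and method names are hypothetical.

```java
public class DivideByZeroSketch {
    // Mirrors the months computation of divide(double) in this commit.
    static int divideMonths(int months, double num) {
        return Math.toIntExact((long) (months / num));
    }

    // Returns the exception message, or "no exception" if none was thrown.
    static String messageFor(int months, double num) {
        try {
            divideMonths(months, num);
            return "no exception";
        } catch (ArithmeticException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println((long) (123 / 0.0)); // 9223372036854775807
        System.out.println(messageFor(123, 0)); // integer overflow
        // 0 / 0.0 is NaN, and NaN casts to 0, so nothing is thrown here.
        System.out.println(messageFor(0, 0));   // no exception
        // Out-of-range doubles saturate on the cast instead of throwing.
        System.out.println((long) (2 * (double) Long.MAX_VALUE) == Long.MAX_VALUE); // true
    }
}
```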
@@ -847,11 +847,11 @@ object TypeCoercion {
Cast(TimeAdd(l, r), l.dataType)
case Subtract(l, r @ CalendarIntervalType()) if acceptedTypes.contains(l.dataType) =>
Cast(TimeSub(l, r), l.dataType)
case Multiply(l @ CalendarIntervalType(), r @ IntegralType()) =>
case Multiply(l @ CalendarIntervalType(), r @ NumericType()) =>
MultiplyInterval(l, r)
case Multiply(l @ IntegralType(), r @ CalendarIntervalType()) =>
case Multiply(l @ NumericType(), r @ CalendarIntervalType()) =>
MultiplyInterval(r, l)
case Divide(l @ CalendarIntervalType(), r @ IntegralType()) =>
case Divide(l @ CalendarIntervalType(), r @ NumericType()) =>
Member

postgres=# select interval '1 year' / '365';
   ?column?
---------------
 23:40:16.4064
(1 row)

could this be supported?

Member Author

Taking into account the discussion in #26165, I am not sure. @cloud-fan Should I support this?

Contributor

We can, but this should only apply to literals, not string columns.

Contributor

If we want to support it, please open another PR.

DivideInterval(l, r)

case Add(l @ DateType(), r @ IntegerType()) => DateAdd(l, r)
@@ -18,26 +18,26 @@
package org.apache.spark.sql.catalyst.expressions

import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCode}
import org.apache.spark.sql.types.{AbstractDataType, CalendarIntervalType, DataType, LongType}
import org.apache.spark.sql.types.{AbstractDataType, CalendarIntervalType, DataType, DoubleType}
import org.apache.spark.unsafe.types.CalendarInterval

abstract class IntervalNumOperation(
interval: Expression,
num: Expression,
operation: (CalendarInterval, Long) => CalendarInterval,
operation: (CalendarInterval, Double) => CalendarInterval,
operationName: String)
extends BinaryExpression with ImplicitCastInputTypes with Serializable {
override def left: Expression = interval
override def right: Expression = num

override def inputTypes: Seq[AbstractDataType] = Seq(CalendarIntervalType, LongType)
override def inputTypes: Seq[AbstractDataType] = Seq(CalendarIntervalType, DoubleType)
override def dataType: DataType = CalendarIntervalType

override def nullable: Boolean = true

override def nullSafeEval(interval: Any, num: Any): Any = {
try {
operation(interval.asInstanceOf[CalendarInterval], num.asInstanceOf[Long])
operation(interval.asInstanceOf[CalendarInterval], num.asInstanceOf[Double])
} catch {
case _: java.lang.ArithmeticException => null
}
@@ -62,12 +62,12 @@ case class MultiplyInterval(interval: Expression, num: Expression)
extends IntervalNumOperation(
interval,
num,
(i: CalendarInterval, n: Long) => i.multiply(n),
(i: CalendarInterval, n: Double) => i.multiply(n),
"multiply")

case class DivideInterval(interval: Expression, num: Expression)
extends IntervalNumOperation(
interval,
num,
(i: CalendarInterval, n: Long) => i.divide(n),
(i: CalendarInterval, n: Double) => i.divide(n),
"divide")
@@ -1601,14 +1601,22 @@ class TypeCoercionSuite extends AnalysisTest {
test("rule for interval operations") {
val dateTimeOperations = TypeCoercion.DateTimeOperations
val interval = Literal(new CalendarInterval(0, 0))
val longValue = Literal(10L, LongType)

ruleTest(dateTimeOperations, Multiply(interval, longValue),
MultiplyInterval(interval, longValue))
ruleTest(dateTimeOperations, Multiply(longValue, interval),
MultiplyInterval(interval, longValue))
ruleTest(dateTimeOperations, Divide(interval, longValue),
DivideInterval(interval, longValue))

Seq(
Literal(10.toByte, ByteType),
Literal(10.toShort, ShortType),
Literal(10, IntegerType),
Literal(10L, LongType),
Literal(Decimal(10), DecimalType.SYSTEM_DEFAULT),
Literal(10.5.toFloat, FloatType),
Literal(10.5, DoubleType)).foreach { num =>
ruleTest(dateTimeOperations, Multiply(interval, num),
MultiplyInterval(interval, num))
ruleTest(dateTimeOperations, Multiply(num, interval),
MultiplyInterval(interval, num))
ruleTest(dateTimeOperations, Divide(interval, num),
DivideInterval(interval, num))
}
}
}

@@ -22,19 +22,25 @@ import org.apache.spark.unsafe.types.CalendarInterval.fromString

class IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
test("multiply") {
def multiply(interval: String, num: Long): Expression = {
def multiply(interval: String, num: Double): Expression = {
MultiplyInterval(Literal(fromString(interval)), Literal(num))
}
checkEvaluation(multiply("0 seconds", 10), fromString("0 seconds"))
checkEvaluation(multiply("10 hours", 0), fromString("0 hours"))
checkEvaluation(multiply("12 months 1 microseconds", 2), fromString("2 years 2 microseconds"))
checkEvaluation(multiply("-5 year 3 seconds", 3), fromString("-15 years 9 seconds"))
checkEvaluation(multiply("1 year 1 second", 0.5), fromString("6 months 500 milliseconds"))
checkEvaluation(
multiply("-100 years -1 millisecond", 0.5),
fromString("-50 years -500 microseconds"))
checkEvaluation(
multiply("2 months 4 seconds", -0.5),
fromString("-1 months -2 seconds"))
checkEvaluation(multiply("2 months", Int.MaxValue), null)
checkEvaluation(multiply("2 days", Long.MaxValue), null)
}

test("divide") {
def divide(interval: String, num: Long): Expression = {
def divide(interval: String, num: Double): Expression = {
DivideInterval(Literal(fromString(interval)), Literal(num))
}
checkEvaluation(divide("0 seconds", 10), fromString("0 seconds"))
@@ -47,6 +53,12 @@ class IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
checkEvaluation(
divide("6 years -7 seconds", 3),
fromString("2 years -2 seconds -333 milliseconds -333 microseconds"))
checkEvaluation(
divide("2 years -8 seconds", 0.5),
fromString("4 years -16 seconds"))
checkEvaluation(
divide("-1 month 2 microseconds", -0.25),
fromString("4 months -8 microseconds"))
checkEvaluation(divide("2 months", 0), null)
}
}
4 changes: 2 additions & 2 deletions sql/core/src/test/resources/sql-tests/inputs/datetime.sql
@@ -39,5 +39,5 @@ select timestamp'2019-10-06 10:11:12.345678' - date'2020-01-01';

-- interval operations
select 3 * (timestamp'2019-10-15 10:11:12.001002' - date'2019-10-15');
select interval 1 month 2 weeks 3 microseconds * 2;
select (3 * (timestamp'2019-10-15' - timestamp'2019-10-14')) / 2;
select interval 4 month 2 weeks 3 microseconds * 1.5;
select (timestamp'2019-10-15' - timestamp'2019-10-14') / 1.5;
14 changes: 7 additions & 7 deletions sql/core/src/test/resources/sql-tests/results/datetime.sql.out
@@ -150,22 +150,22 @@ interval -12 weeks -2 days -14 hours -48 minutes -47 seconds -654 milliseconds -
-- !query 17
select 3 * (timestamp'2019-10-15 10:11:12.001002' - date'2019-10-15')
-- !query 17 schema
struct<multiply_interval(timestampdiff(TIMESTAMP('2019-10-15 10:11:12.001002'), CAST(DATE '2019-10-15' AS TIMESTAMP)), CAST(3 AS BIGINT)):interval>
struct<multiply_interval(timestampdiff(TIMESTAMP('2019-10-15 10:11:12.001002'), CAST(DATE '2019-10-15' AS TIMESTAMP)), CAST(3 AS DOUBLE)):interval>
-- !query 17 output
interval 1 days 6 hours 33 minutes 36 seconds 3 milliseconds 6 microseconds


-- !query 18
select interval 1 month 2 weeks 3 microseconds * 2
select interval 4 month 2 weeks 3 microseconds * 1.5
-- !query 18 schema
struct<multiply_interval(interval 1 months 2 weeks 3 microseconds, CAST(2 AS BIGINT)):interval>
struct<multiply_interval(interval 4 months 2 weeks 3 microseconds, CAST(1.5 AS DOUBLE)):interval>
-- !query 18 output
interval 2 months 4 weeks 6 microseconds
interval 6 months 3 weeks 4 microseconds


-- !query 19
select (3 * (timestamp'2019-10-15' - timestamp'2019-10-14')) / 2
select (timestamp'2019-10-15' - timestamp'2019-10-14') / 1.5
-- !query 19 schema
struct<divide_interval(multiply_interval(timestampdiff(TIMESTAMP('2019-10-15 00:00:00'), TIMESTAMP('2019-10-14 00:00:00')), CAST(3 AS BIGINT)), CAST(2 AS BIGINT)):interval>
struct<divide_interval(timestampdiff(TIMESTAMP('2019-10-15 00:00:00'), TIMESTAMP('2019-10-14 00:00:00')), CAST(1.5 AS DOUBLE)):interval>
-- !query 19 output
interval 1 days 12 hours
interval 16 hours
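
The regenerated outputs for queries 18 and 19 can be checked by hand. The sketch below assumes that, at this commit, an interval holds only months and microseconds (so "2 weeks" is stored as 14 days' worth of microseconds) and that scaling truncates toward zero as in the `multiply(double)` and `divide(double)` bodies above. Class and method names are hypothetical.

```java
public class IntervalScaleSketch {
    static final long MICROS_PER_DAY = 86_400_000_000L;

    // Months are scaled and truncated toward zero.
    static int scaleMonths(int months, double num) {
        return Math.toIntExact((long) (num * months));
    }

    // Microseconds are scaled and truncated toward zero.
    static long scaleMicros(long micros, double num) {
        return (long) (num * micros);
    }

    static long divideMicros(long micros, double num) {
        return (long) (micros / num);
    }

    public static void main(String[] args) {
        // Query 18: interval 4 month 2 weeks 3 microseconds * 1.5
        long micros = 14 * MICROS_PER_DAY + 3;
        System.out.println(scaleMonths(4, 1.5));      // 6
        long scaled = scaleMicros(micros, 1.5);       // 1814400000004.5 truncated
        System.out.println(scaled / MICROS_PER_DAY);  // 21 (days, i.e. 3 weeks)
        System.out.println(scaled % MICROS_PER_DAY);  // 4 (microseconds)

        // Query 19: a one-day difference divided by 1.5
        long hourMicros = 3_600_000_000L;
        System.out.println(divideMicros(MICROS_PER_DAY, 1.5) / hourMicros); // 16 (hours)
    }
}
```

The 3 microseconds become 4.5 after scaling, and the fractional half is dropped, which matches the "4 microseconds" in the expected output.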