Closed
Simplify the tests
dbtsai committed May 29, 2018
commit fed2846fe7c9ca2cb4534b23803cd29d5a18d4f9
@@ -394,6 +394,14 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
checkAnswer(df.filter($"b".isin("z", "y")),
df.collect().toSeq.filter(r => r.getString(1) == "z" || r.getString(1) == "y"))

// Auto casting should work with mixture of different types in collections
checkAnswer(df.filter($"a".isin(1.toShort, "2")),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isin("3", 2.toLong)),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isin(3, "1")),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 1))

val df2 = Seq((1, Seq(1)), (2, Seq(2)), (3, Seq(3))).toDF("a", "b")

val e = intercept[AnalysisException] {
@@ -407,29 +415,9 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {

test("isInCollection: Scala Collection") {
Contributor
Can we simplify the test cases? You are just testing this API as a wrapper, so you don't need to run so many queries for type coercion.

Member Author
Done.
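
To illustrate the "wrapper" point in plain Scala, here is a minimal sketch that does not use Spark at all. `MiniColumn` is a hypothetical stand-in invented for this example, and the delegation shown is an assumption about the general shape of such an API, not a copy of the actual `Column` code. It models only membership, with no type coercion.

```scala
import scala.collection.JavaConverters._

// Hypothetical stand-in for a column type (not Spark's Column).
// It models membership checks only; no type coercion is performed.
final case class MiniColumn(name: String) {
  // Varargs membership predicate, similar in shape to `isin`.
  def isin(values: Any*): Any => Boolean = v => values.contains(v)

  // Scala-collection overload: forwards to the varargs method.
  def isInCollection(values: scala.collection.Iterable[_]): Any => Boolean =
    isin(values.toSeq: _*)

  // Java-collection overload: converts to a Scala view, then forwards.
  def isInCollection(values: java.lang.Iterable[_]): Any => Boolean =
    isInCollection(values.asScala)
}

object WrapperSketch {
  def main(args: Array[String]): Unit = {
    val a = MiniColumn("a")
    val inputs = Seq(1, 2, 3)
    // All three entry points should agree, because two of them only forward.
    val viaVarargs = inputs.map(a.isin(1, 2))
    val viaScala = inputs.map(a.isInCollection(Seq(1, 2)))
    val viaJava = inputs.map(a.isInCollection(Seq(1, 2).asJava))
    println(viaVarargs == viaScala && viaScala == viaJava)
  }
}
```

If the overloads only forward like this, one check per overload covers the wrapper, and type-coercion behavior can stay with the `isin` tests.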

val df = Seq((1, "x"), (2, "y"), (3, "z")).toDF("a", "b")
checkAnswer(df.filter($"a".isInCollection(Seq(1, 2))),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, 2))),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 2))
// Test with different types of collections
checkAnswer(df.filter($"a".isInCollection(Seq(3, 1))),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 1))

// Auto casting should work with mixture of different types in collections
checkAnswer(df.filter($"a".isInCollection(Seq(1.toShort, "2"))),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq("3", 2.toLong))),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, "1"))),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 1))

checkAnswer(df.filter($"b".isInCollection(Seq("y", "x"))),
df.collect().toSeq.filter(r => r.getString(1) == "y" || r.getString(1) == "x"))
checkAnswer(df.filter($"b".isInCollection(Seq("z", "x"))),
df.collect().toSeq.filter(r => r.getString(1) == "z" || r.getString(1) == "x"))
checkAnswer(df.filter($"b".isInCollection(Seq("z", "y"))),
df.collect().toSeq.filter(r => r.getString(1) == "z" || r.getString(1) == "y"))

// Test with different types of collections
checkAnswer(df.filter($"a".isInCollection(Seq(1, 2).toSet)),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, 2).toArray)),
@@ -450,29 +438,9 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {

test("isInCollection: Java Collection") {
Contributor
As stated up above, maybe this would make sense to do in Java, but your call.

Member Author
I totally agree with you that we should have tests written natively in Java, instead of converting the types to Java in Scala and hoping that it will work in Java. Let's do it in a follow-up PR.

val df = Seq((1, "x"), (2, "y"), (3, "z")).toDF("a", "b")
Contributor
Same thing here. Just run a single test case.

Member Author
Done.

// Test with different types of collections
checkAnswer(df.filter($"a".isInCollection(Seq(1, 2).asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, 2).asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, 1).asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 1))

// Auto casting should work with mixture of different types in collections
checkAnswer(df.filter($"a".isInCollection(Seq(1.toShort, "2").asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq("3", 2.toLong).asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, "1").asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 3 || r.getInt(0) == 1))

checkAnswer(df.filter($"b".isInCollection(Seq("y", "x").asJava)),
df.collect().toSeq.filter(r => r.getString(1) == "y" || r.getString(1) == "x"))
checkAnswer(df.filter($"b".isInCollection(Seq("z", "x").asJava)),
df.collect().toSeq.filter(r => r.getString(1) == "z" || r.getString(1) == "x"))
checkAnswer(df.filter($"b".isInCollection(Seq("z", "y").asJava)),
df.collect().toSeq.filter(r => r.getString(1) == "z" || r.getString(1) == "y"))

// Test with different types of collections
checkAnswer(df.filter($"a".isInCollection(Seq(1, 2).toSet.asJava)),
df.collect().toSeq.filter(r => r.getInt(0) == 1 || r.getInt(0) == 2))
checkAnswer(df.filter($"a".isInCollection(Seq(3, 1).toList.asJava)),