Skip to content

Conversation

@adrian-wang
Copy link
Contributor

I'll add test case in #4040

@SparkQA
Copy link

SparkQA commented Jan 15, 2015

Test build #25599 has started for PR 4057 at commit 1b81275.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 15, 2015

Test build #25599 has finished for PR 4057 at commit 1b81275.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25599/
Test PASSed.

@adrian-wang adrian-wang changed the title [SPARK-5262] [SQL] coalesce should allow NullType and 1 another type in parameters [SPARK-5262] [SQL] [WIP] coalesce should allow NullType and 1 another type in parameters Jan 20, 2015
@adrian-wang adrian-wang changed the title [SPARK-5262] [SQL] [WIP] coalesce should allow NullType and 1 another type in parameters [SPARK-5262] [SQL] coalesce should allow NullType and 1 another type in parameters Jan 20, 2015
@yhuai
Copy link
Contributor

yhuai commented Jan 21, 2015

Thank you for working it!

Can you explain why your change can make Coalesce(null, 1, null) work? To me, it looks like an issue in constant folding instead of the return type.

Also, if we want to parse Coalesce, I think we really need to correctly set the return type of it (and correctly set nullable). Just taking the first expression's type seems not right. Can you take a look at this issue?

@marmbrus
Copy link
Contributor

Also, I already merged #4040, can you add the test here?

@yhuai
Copy link
Contributor

yhuai commented Jan 22, 2015

@adrian-wang Seems we need to add rule in org.apache.spark.sql.catalyst.analysis.HiveTypeCoercion.FunctionArgumentConversion to cast the data types of expressions in Coalesce. Also, seems ExpressionEvaluationSuite will be a good place to add more tests for evaluating Coalesce.

@adrian-wang
Copy link
Contributor Author

@yhuai I don't think we need to cast data types, since coalesce() only return the first non-null value from its parameters, so use the first parameter's datatype is OK. I was adding tests in ExpressionEvaluationSuite but was blocked by some code-gen tests, and I am working on that.

@marmbrus it seems #4040 is reverted due to some conflicts... should I port all the code into this PR together?

@yhuai
Copy link
Contributor

yhuai commented Jan 22, 2015

Although coalesce only returns the first non-null value, but we need to correctly set the type of coalesce. Let's say we have two columns. The type of one is LongType and the type of another one is IntType, when we do coalesce, the return type of coalesce should be LongType. Right now, there will be an error when we are trying to call coalesce for these two columns. In FunctionArgumentConversion, we should make sure all expressions of coalesce are in the same type and if they are not, we should add cast.

@SparkQA
Copy link

SparkQA commented Jan 23, 2015

Test build #26016 has started for PR 4057 at commit e47c03a.

  • This patch merges cleanly.

@adrian-wang
Copy link
Contributor Author

Yes, I moved my work to FunctionArgumentConversion, and since #4040 is reverted due to conflicts, I added the code together here. So I leave Coalesce() untouched, since we would have the same type in Coalesce for sure. I'll change the title accordingly.

@adrian-wang adrian-wang changed the title [SPARK-5262] [SQL] coalesce should allow NullType and 1 another type in parameters [SPARK-5262] [SPARK-5244] [SQL] add coalesce in SQLParser and widen types for parameters of coalesce Jan 23, 2015
@SparkQA
Copy link

SparkQA commented Jan 23, 2015

Test build #26016 has finished for PR 4057 at commit e47c03a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Rating(userId: Int, movieId: Int, rating: Float, timestamp: Long)
    • case class Movie(movieId: Int, title: String, genres: Seq[String])
    • case class Params(
    • class ALS extends Estimator[ALSModel] with ALSParams
    • case class RatingBlock(srcIds: Array[Int], dstIds: Array[Int], ratings: Array[Float])

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26016/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Jan 23, 2015

Test build #26017 has started for PR 4057 at commit c393e18.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 23, 2015

Test build #26017 has finished for PR 4057 at commit c393e18.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26017/
Test PASSed.

@adrian-wang
Copy link
Contributor Author

ping @yhuai

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be good to add a comment to explain what we are doing at here.

@yhuai
Copy link
Contributor

yhuai commented Jan 28, 2015

A few things I feel we should add in this PR.

  • Parsing Coalesce in HiveQL.
  • Unit tests to make sure we can correctly determine the return type (if we do not have these tests)
  • Unit tests to make sure we can throw the correct exception when we get cases that we do not handle (for example, complex types).

I think we also need to add a rule in DecimalPrecision for Coalesce. But, I think it is fine to do it in a follow-up PR (Can you create a PR for it?).

@adrian-wang
Copy link
Contributor Author

@yhuai Thanks for your comments! I'll modify this PR accordingly.

@SparkQA
Copy link

SparkQA commented Jan 30, 2015

Test build #26385 has started for PR 4057 at commit 4d0111a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 30, 2015

Test build #26385 has finished for PR 4057 at commit 4d0111a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26385/
Test PASSed.

@marmbrus
Copy link
Contributor

marmbrus commented Feb 2, 2015

Thanks! Merged to master.

@asfgit asfgit closed this in 8cf4a1f Feb 2, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants