Skip to content

Conversation

@davies
Copy link
Contributor

@davies davies commented Aug 19, 2015

DataFrame.withColumn in Python should be consistent with the Scala one (replacing the existing column that has the same name).

cc @marmbrus

@SparkQA
Copy link

SparkQA commented Aug 19, 2015

Test build #41205 has finished for PR 8300 at commit 9eb857b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not consistent with scala version, see the doc, we should not report error here.

Actually I'm wondering why we need to do checking at python side(not only this one)? Can we just call the scala API and catch the java exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will remove it. I'm really surprised by the behavior in Scala.

The reason we want to have some check on Python side is that the Java exception is not easy to understand for Python programmer (nested inside a Py4j exception). The Python exception or messages does improve the experience for Python programmer, especially beginners.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive my lack of python skills... could have have a generic wrapper that we use whenever we call back into scala that catches and unwraps certain expressions nicely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already did this for some of them, for example, AnalysisException and IllegalArgumentException, these could help.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh cool, ideally we will pretty much only throw those two unless there is a bug. If there are cases where that is not true we should consider fixing on the scala side.

@SparkQA
Copy link

SparkQA commented Aug 19, 2015

Test build #41260 has finished for PR 8300 at commit 68ff0a9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Aug 19, 2015
DataFrame.withColumn in Python should be consistent with the Scala one (replacing the existing column  that has the same name).

cc marmbrus

Author: Davies Liu <[email protected]>

Closes #8300 from davies/with_column.

(cherry picked from commit 0888736)
Signed-off-by: Michael Armbrust <[email protected]>
@marmbrus
Copy link
Contributor

Thanks! Merging to master and 1.5.

@asfgit asfgit closed this in 0888736 Aug 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants