[SPARK-10073] [SQL] Python withColumn should replace the old column #8300
Test build #41205 has finished for PR 8300 at commit
python/pyspark/sql/dataframe.py
This is not consistent with the Scala version (see the doc); we should not report an error here.
Actually, I'm wondering why we need to do this checking on the Python side (and not only for this one). Can we just call the Scala API and catch the Java exception?
Good catch, will remove it. I'm really surprised by the behavior in Scala.
The reason we want some checks on the Python side is that the Java exception is not easy for a Python programmer to understand (it is nested inside a Py4J exception). A Python exception or message does improve the experience for Python programmers, especially beginners.
Forgive my lack of Python skills... could we have a generic wrapper that we use whenever we call back into Scala that catches and unwraps certain exceptions nicely?
We already do this for some of them, for example AnalysisException and IllegalArgumentException; these could help.
Oh cool, ideally we will pretty much only throw those two unless there is a bug. If there are cases where that is not true, we should consider fixing them on the Scala side.
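For readers following the thread, the generic wrapper idea could be sketched roughly as below. This is only an illustration, not the code in PySpark: `FakeJavaError` is a hypothetical stand-in for the Py4J error type so the snippet runs without Spark, and `unwrap_java_errors`, the two Python exception classes, and the prefix table are assumed names made up for this sketch.

```python
import functools


class FakeJavaError(Exception):
    """Hypothetical stand-in for a Py4J error; holds the Java exception's string form."""

    def __init__(self, java_message):
        super().__init__(java_message)
        self.java_message = java_message


class AnalysisException(Exception):
    """Python-side error for a query that failed analysis (illustrative)."""


class IllegalArgumentException(Exception):
    """Python-side error for an invalid argument (illustrative)."""


# Java class-name prefixes mapped to the Python exception to raise instead.
_CONVERSIONS = (
    ("org.apache.spark.sql.AnalysisException: ", AnalysisException),
    ("java.lang.IllegalArgumentException: ", IllegalArgumentException),
)


def unwrap_java_errors(func):
    """Decorator: convert known Java errors into plain Python exceptions."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except FakeJavaError as err:
            for prefix, exc_type in _CONVERSIONS:
                if err.java_message.startswith(prefix):
                    # Drop the Java class name and hide the nested traceback.
                    raise exc_type(err.java_message[len(prefix):]) from None
            # Unknown Java errors propagate unchanged.
            raise

    return wrapper


@unwrap_java_errors
def select_missing_column():
    # Simulates a JVM call that fails analysis.
    raise FakeJavaError("org.apache.spark.sql.AnalysisException: cannot resolve 'x'")


@unwrap_java_errors
def unknown_failure():
    # Simulates a JVM failure that the wrapper does not recognise.
    raise FakeJavaError("java.lang.RuntimeException: boom")
```

With a wrapper like this, a caller of `select_missing_column()` sees a short `AnalysisException` with the message `cannot resolve 'x'`, while unrecognised errors such as the one from `unknown_failure()` surface as-is, which matches the idea that anything outside the two expected types probably indicates a bug.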
Test build #41260 has finished for PR 8300 at commit
DataFrame.withColumn in Python should be consistent with the Scala one (replacing the existing column that has the same name).
cc marmbrus
Author: Davies Liu <[email protected]>
Closes #8300 from davies/with_column.
(cherry picked from commit 0888736)
Signed-off-by: Michael Armbrust <[email protected]>
Thanks! Merging to master and 1.5.
DataFrame.withColumn in Python should be consistent with the Scala one (replacing the existing column that has the same name).
cc @marmbrus
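A minimal model of the replace-on-same-name behavior described above, written in plain Python so it runs without Spark. The helper `with_column` and the representation of a frame as a list of `(name, expression)` pairs are assumptions made for this sketch, not the PySpark implementation; it also assumes exact, case-sensitive name matching and that a replaced column keeps its original position.

```python
def with_column(columns, name, expr):
    """Return a new column list with `name` bound to `expr`.

    `columns` is a list of (name, expression) pairs. If `name` already
    exists, its expression is replaced in place; otherwise the new
    column is appended at the end. The input list is not modified.
    """
    if any(existing == name for existing, _ in columns):
        return [(n, expr if n == name else e) for n, e in columns]
    return columns + [(name, expr)]


cols = [("age", "age"), ("name", "name")]

# Same name: the old column is replaced, so no duplicate "age" appears.
replaced = with_column(cols, "age", "age + 1")

# New name: the column is appended after the existing ones.
appended = with_column(cols, "age2", "age + 2")
```

Here `replaced` still has two columns, with `age` now bound to `age + 1`, while `appended` has three.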