[SPARK-40587][CONNECT][FOLLOW-UP] Make sure python client support select * #38218
The best approach is to support dataframe.schema and then check the schema. However, StructType support in the proto is still under development (#38200), and there could be other blocking issues there.
Once schema is supported in the proto, I will follow up to use .schema on the Python side wherever it makes sense.
so it can not support selecting struct.* now?
No, not yet. That is on the TODO list. We need a corresponding proto change to support it.
Right now we don't even have complete Struct support yet :)
python/pyspark/sql/connect/plan.py
Outdated
how did this not break before in the mypy checks?
I don't know. This breaks when the code is running.
I should probably ask you, as I remember you enabled mypy :-)
Can one of the admins verify this patch?
704b953 to
1af9f98
Compare
zhengruifeng
left a comment
LGTM pending CI
33b16a6 to
fefc9cb
Compare
fefc9cb to
b9b9bb3
Compare
    for c in self._raw_columns:
        if isinstance(c, Expression):
            proj_exprs.append(c.to_plan(session))
        elif c == "*":
how about t1.*?
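To make the question concrete, here is a minimal sketch of the branch above. It is not the actual plan.py code: the `Expression` stand-in, the `build_projection` helper, and the dict-based plan nodes are all hypothetical, and the final `else` branch is an assumption about how other strings might be treated. It only illustrates that an equality check against `"*"` matches the bare star and nothing else, so a qualified star such as `t1.*` would fall through to whatever handles ordinary column names.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Expression:
    """Hypothetical stand-in for the client's Expression class."""

    name: str

    def to_plan(self) -> dict:
        # The real client builds a proto message; a dict is used here for illustration.
        return {"kind": "expression", "name": self.name}


def build_projection(raw_columns: List[Union[Expression, str]]) -> List[dict]:
    """Mirror the shape of the reviewed loop with hypothetical plan nodes."""
    proj_exprs = []
    for c in raw_columns:
        if isinstance(c, Expression):
            proj_exprs.append(c.to_plan())
        elif c == "*":
            # Only the bare star takes this branch.
            proj_exprs.append({"kind": "star"})
        else:
            # Assumption: any other string is treated as a column name.
            proj_exprs.append({"kind": "attribute", "name": c})
    return proj_exprs


plan = build_projection([Expression("a"), "*", "t1.*"])
```

In this sketch `plan[1]` is the star node, while `"t1.*"` ends up as an attribute whose name is literally `t1.*`, which is the case the question is pointing at.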
cloud-fan
left a comment
LGTM as this is a followup, but we need to think more about how to support star completely in spark connect.
thanks, merging to master!
…ect *

### What changes were proposed in this pull request?

1. Sync newest proto to python client.
2. Update Aggregate to match proto change.
3. Change `select` to have it accept both `column` and `str`
4. Make sure `*` pass through the entire path which has been implemented on the server side apache#38023

### Why are the changes needed?

Update python client side to match the change in connect proto.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes apache#38218 from amaliujia/select_start_in_python.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
1. Sync newest proto to python client.
2. Update Aggregate to match proto change.
3. Change `select` to have it accept both `column` and `str`.
4. Make sure `*` passes through the entire path, which has been implemented on the server side: [SPARK-40587][CONNECT] Support SELECT * in an explicit way in connect proto #38023
Why are the changes needed?
Update python client side to match the change in connect proto.
Does this PR introduce any user-facing change?
No
How was this patch tested?
UT