[SPARK-16734][EXAMPLES][SQL] Revise examples of all language bindings #14368
Conversation
Test build #62883 has finished for PR 14368 at commit
## +----+-------+
## | age|   name|
## +----+-------+
## |null|Michael|
Hmm, is this correct? I think it says NA in R.
I'm not familiar with SparkR, so please correct me if I'm wrong. IIRC, showDF is mapped to the Scala method Dataset.show(), which executes in the JVM, where NA isn't available, so we have to use null as the only reasonable alternative. You may see that the previous head(df) call does print NA instead of null.
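For illustration, a minimal SparkR sketch of the difference (assuming Spark 2.0+, SPARK_HOME set, and the bundled people.json example file; the printed output shown in comments is approximate):

    library(SparkR)
    sparkR.session(appName = "NAvsNullDemo")

    # people.json ships with the Spark distribution; Michael has no age.
    df <- read.json(file.path(Sys.getenv("SPARK_HOME"),
                              "examples/src/main/resources/people.json"))

    # head() (like take()) collects rows into an R data.frame,
    # so the missing age is converted to R's NA:
    head(df)
    #   age    name
    # 1  NA Michael

    # showDF() just calls Dataset.show() in the JVM,
    # which renders the missing value as SQL null:
    showDF(df)
    # +----+-------+
    # | age|   name|
    # +----+-------+
    # |null|Michael|
    # ...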
Also, I simply executed the example R script and copied/pasted the output here from the console.
Ah, I checked the code and saw why: head and take collect the DataFrame, and that's where the JVM-to-R conversion takes place. As you said, showDF calls a JVM method. I guess it's not super friendly to R users, but null is also used in some Spark DataFrame SQL functions.
OK with this doc update then.
Thanks for the review!
@liancheng on second thought, I think it makes sense to also merge this to branch-2.0 to avoid potential conflicts on doc fixes.
# $example on:init_session$
sparkR.session(appName = "MyApp", sparkConfig = list(spark.executor.memory = "1g"))
sparkR.session(appName = "MyApp", sparkConfig = list(spark.some.config.option = "some-value"))
Where do we read this config?
It's just an example of how to set extra configuration options. It's not read anywhere.
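Not part of this PR, but to illustrate the point: if an application did want to read such an option back, something like the following sketch should work (assuming SparkR 2.0+, where sparkR.conf is available; spark.some.config.option is only a placeholder name):

    # Set an arbitrary, application-defined option at session start.
    sparkR.session(appName = "MyApp",
                   sparkConfig = list(spark.some.config.option = "some-value"))

    # The option ends up in the session's runtime config; Spark itself never
    # consumes it, but the application can read it back explicitly.
    sparkR.conf("spark.some.config.option")   # should return "some-value"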
retest this please
LGTM
Test build #63103 has finished for PR 14368 at commit
What changes were proposed in this pull request?
This PR makes various minor updates to examples of all language bindings to make sure they are consistent with each other. Some typos and missing parts (JDBC example in Scala/Java/Python) are also fixed.
How was this patch tested?
Manually tested.
Author: Cheng Lian <[email protected]>
Closes #14368 from liancheng/revise-examples.
(cherry picked from commit 10e1c0e)
Signed-off-by: Wenchen Fan <[email protected]>
thanks, merging to master and 2.0!
What changes were proposed in this pull request?
This PR makes various minor updates to examples of all language bindings to make sure they are consistent with each other. Some typos and missing parts (JDBC example in Scala/Java/Python) are also fixed.
How was this patch tested?
Manually tested.
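For reference, a rough SparkR sketch of the kind of JDBC read example mentioned above (the PR itself adds the Scala/Java/Python variants; the URL, table name, and credentials here are placeholders, and it assumes an active SparkR session with the matching JDBC driver jar on the classpath, e.g. via --jars):

    # Placeholder connection details for a hypothetical PostgreSQL database.
    jdbcDF <- read.jdbc("jdbc:postgresql://dbserver:5432/mydb",
                        "schema.tablename",
                        user = "username",
                        password = "password")

    # Quick sanity check on the loaded table.
    head(jdbcDF)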