Closed
[SPARK-16381] make sql_query example runnable
keypointt committed Jul 8, 2016
commit 05ee46bc46ddcb6855ab85ea79f256b1d6d27b90
5 changes: 5 additions & 0 deletions examples/src/main/r/RSparkSQLExample.R
@@ -76,6 +76,11 @@ head(count(groupBy(df, "age")))
# $example off:dataframe_operations$


# Create a DataFrame from json file
path <- file.path(Sys.getenv("SPARK_HOME"), "examples/src/main/resources/people.json")
Contributor

Can we use the same df from before, or do we need to create a new one here?

Contributor Author

We can use the df from before; peopleDF here is basically the same as the earlier df.

But the line right after this one is df <- sql("SELECT * FROM table"), so I think it's better to use another name in this example for clarity.

What do you think?

Contributor

Hmm, I see. Since this is a SELECT * query, the contents don't change. Would something like

createOrReplaceTempView(df, "table")
df <- sql("SELECT * FROM table")

work?

Contributor Author

This is great, I'll do it.

peopleDF <- read.json(path)
# Register this DataFrame as a table.
createOrReplaceTempView(peopleDF, "table")
# $example on:sql_query$
df <- sql("SELECT * FROM table")
Member

This example line is fine in the docs, but it won't run in an example R file. It does illustrate how to run a SQL query, but there is no setup that creates a temp view before the query uses it...

Contributor Author

Should I add setup here to create the table, or just leave it as is since it's only for demonstration purposes?

Contributor

Let's register df from above using createExternalTable and then run the query. We should aim for this R file to be executable on its own.

# $example off:sql_query$
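
For reference, the goal discussed in this thread (an example that registers a view before querying it, so the file runs on its own) could look roughly like the sketch below. It is not the code from this commit. It assumes a Spark 2.0+ SparkR install with `SPARK_HOME` set, starts its own session with `sparkR.session`, and uses a view named `people` (my choice for illustration, instead of `table`) so the name cannot be confused with the SQL keyword.

```r
# Minimal standalone sketch; assumes SparkR (Spark 2.0+) is installed
# and SPARK_HOME points at a Spark distribution.
library(SparkR)

# Start (or reuse) a Spark session. The appName is an arbitrary label.
sparkR.session(appName = "sql-query-sketch")

# Load the sample data that ships with the Spark examples.
path <- file.path(Sys.getenv("SPARK_HOME"), "examples/src/main/resources/people.json")
df <- read.json(path)

# Register the DataFrame as a temporary view so SQL can refer to it by name.
# "people" is a hypothetical view name chosen for this sketch.
createOrReplaceTempView(df, "people")

# Run the query against the view; the result is another SparkDataFrame.
result <- sql("SELECT * FROM people")

# head() collects the first rows into a local R data.frame.
localRows <- head(result)
print(localRows)

sparkR.session.stop()
```

Reusing `df` for both the source and the query result, as suggested above, also works because `SELECT *` leaves the contents unchanged; the sketch keeps a separate `result` name only to make the two steps easier to tell apart.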