[SPARK-12562][SQL] DataFrame.write.format(text) requires the column name to be called value #10515
Changes from 2 commits
@@ -58,6 +58,17 @@ class TextSuite extends QueryTest with SharedSQLContext {
     }
   }

+  test("SPARK-12562 verify write.text() can handle column name beyond `value`") {
Contributor
Why don't you just change the existing test case to rename the DataFrame column and leave the following as a comment there? SPARK-12562 verify write.text() can handle column name beyond `value`
Contributor
Author
@rxin I thought about it, but was not sure whether it was a good idea to change the existing test case. In the existing test, should I add a second DataFrame with the column renamed, or just replace the original DataFrame with one that renames the column?
Contributor
Just replace the original one with something weird, like "adwrasdf".
+    val df = sqlContext.read.text(testFile).withColumnRenamed("value", "col1")
+
+    val tempFile = Utils.createTempDir()
+    tempFile.delete()
+    df.write.text(tempFile.getCanonicalPath)
+    verifyFrame(sqlContext.read.text(tempFile.getCanonicalPath))
+
+    Utils.deleteRecursively(tempFile)
+  }
+
   private def testFile: String = {
     Thread.currentThread().getContextClassLoader.getResource("text-suite.txt").toString
   }
Should we make sure that `textSchema` is a struct type that has only one string field?
@cloud-fan DefaultSource.scala is the only place that creates a TextRelation, and it verifies that the schema is size 1 and of type string before creating a TextRelation. So I think it is fine not to verify again here. What do you think?
Oh, then it's fine.
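
For readers following this thread, here is a minimal sketch of the kind of check described above: accept a schema only if it has exactly one column and that column is a string, regardless of the column's name. This is not the actual DefaultSource.scala code. The `DataType`, `StructField`, and `StructType` definitions are hypothetical stand-ins for Spark's types so the sketch runs without Spark, and `TextSchemaCheck`, `verifySchema`, and the error messages are assumptions made for illustration.

```scala
// Hypothetical stand-ins for Spark's DataType / StructField / StructType,
// defined here only so the sketch runs without Spark on the classpath.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType

final case class StructField(name: String, dataType: DataType)
final case class StructType(fields: Seq[StructField])

object TextSchemaCheck {
  // Accepts any schema with exactly one string column, whatever its name.
  // Returns the single field on success, or an illustrative error message.
  def verifySchema(schema: StructType): Either[String, StructField] =
    schema.fields match {
      case Seq(field @ StructField(_, StringType)) =>
        Right(field)
      case Seq(StructField(name, other)) =>
        Left(s"Expected a string column, but column '$name' has type $other")
      case fields =>
        Left(s"Expected exactly one column, but the schema has ${fields.size}")
    }
}
```

With a check like this performed once where the relation is created, a column named `col1` or `adwrasdf` passes as long as it is the only column and is a string, which is why a second check on `textSchema` inside the relation would be redundant.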