[SPARK-3011][SQL] _temporary directory should be filtered out by sqlContext.parquetFile #1924
|
Can one of the admins verify this patch? |
|
test this please |
|
QA tests have started for PR 1924. This patch merges cleanly. |
|
QA results for PR 1924: |
|
BTW you can check style on your local machine by running |
|
code fixed. |
|
test this please |
|
QA tests have started for PR 1924. This patch merges cleanly. |
|
QA results for PR 1924: |
|
The test failures look unrelated. I've merged this to master and 1.1. Thanks! |
…ontext.parquetFile Author: Chia-Yung Su <[email protected]> Closes #1924 from joesu/bugfix-spark3011 and squashes the following commits: c7e44f2 [Chia-Yung Su] match syntax f8fc32a [Chia-Yung Su] filter out tmp dir (cherry picked from commit 078f3fb) Signed-off-by: Michael Armbrust <[email protected]>
|
I'm seeing build failures from this change on hadoop 0.23. I haven't had time to dig into it. sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala:382: value TEMP_DIR_NAME is not a member of object org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter |
|
I'll revert this change. |
…by sqlContext.parquetFile Reverts #1924 due to build failures with hadoop 0.23. Author: Michael Armbrust <[email protected]> Closes #1949 from marmbrus/revert1924 and squashes the following commits: 6bff940 [Michael Armbrust] Revert "[SPARK-3011][SQL] _temporary directory should be filtered out by sqlContext.parquetFile" (cherry picked from commit a7f8a4f) Signed-off-by: Michael Armbrust <[email protected]>
|
Thanks, I took a quick look to see what the problem is, and it appears that TEMP_DIR_NAME was removed in hadoop 0.23 and then put back in hadoop 2.x to be compatible with hadoop 1.x: https://issues.apache.org/jira/browse/MAPREDUCE-5229 So unfortunately we would need a way to have it not look at that constant on hadoop 0.23. |
|
We could just hard code the string _temporary with a note about why.
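A minimal sketch of that suggestion, not the code that was eventually merged: the object and method names below are hypothetical, and plain strings stand in for Hadoop paths so the snippet compiles without Hadoop on the classpath.

```scala
// Hypothetical helper for illustration only; not taken from the Spark source.
object TempDirFilter {
  // Hard-coded on purpose: per the discussion above,
  // FileOutputCommitter.TEMP_DIR_NAME is missing in hadoop 0.23
  // (see MAPREDUCE-5229), so referencing it breaks that build.
  val TemporaryDirName: String = "_temporary"

  /** Returns the child names with the temporary directory dropped. */
  def withoutTemporary(childNames: Seq[String]): Seq[String] =
    childNames.filterNot(_ == TemporaryDirName)
}
```

The trade-off is that the literal could drift from Hadoop's own constant, which is why a comment explaining the reason is worth keeping next to it.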
|
|
Submitted another pull request: #1959 |
…ontext.parquetFile fix compile error on hadoop 0.23 for the pull request #1924. Author: Chia-Yung Su <[email protected]> Closes #1959 from joesu/bugfix-spark3011 and squashes the following commits: be30793 [Chia-Yung Su] remove .* and _* except _metadata 8fe2398 [Chia-Yung Su] add note to explain 40ea9bd [Chia-Yung Su] fix hadoop-0.23 compile error c7e44f2 [Chia-Yung Su] match syntax f8fc32a [Chia-Yung Su] filter out tmp dir (cherry picked from commit 4243bb6) Signed-off-by: Michael Armbrust <[email protected]>
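The commit note "remove .* and _* except _metadata" can be read as a name-based filter. The following is one possible interpretation, sketched under that assumption; `ParquetNameFilter` and `shouldKeep` are hypothetical names, and the actual change in #1959 may differ.

```scala
// Hypothetical sketch of the rule described in the commit note;
// not copied from the merged change.
object ParquetNameFilter {
  /** Keeps _metadata, and drops any other name starting with '.' or '_'. */
  def shouldKeep(name: String): Boolean =
    name == "_metadata" || !(name.startsWith(".") || name.startsWith("_"))
}
```

Under this reading, `_temporary` is dropped by the general `_*` rule, so no reference to a Hadoop constant is needed at all.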