10 changes: 5 additions & 5 deletions dev/deps/spark-deps-hadoop-2.6
@@ -161,13 +161,13 @@ orc-mapreduce-1.4.3-nohive.jar
 oro-2.0.8.jar
 osgi-resource-locator-1.0.1.jar
 paranamer-2.8.jar
-parquet-column-1.8.2.jar
-parquet-common-1.8.2.jar
-parquet-encoding-1.8.2.jar
+parquet-column-1.8.3.jar
+parquet-common-1.8.3.jar
+parquet-encoding-1.8.3.jar
 parquet-format-2.3.1.jar
-parquet-hadoop-1.8.2.jar
+parquet-hadoop-1.8.3.jar
 parquet-hadoop-bundle-1.6.0.jar
-parquet-jackson-1.8.2.jar
+parquet-jackson-1.8.3.jar
 protobuf-java-2.5.0.jar
 py4j-0.10.7.jar
 pyrolite-4.13.jar
10 changes: 5 additions & 5 deletions dev/deps/spark-deps-hadoop-2.7
@@ -162,13 +162,13 @@ orc-mapreduce-1.4.3-nohive.jar
 oro-2.0.8.jar
 osgi-resource-locator-1.0.1.jar
 paranamer-2.8.jar
-parquet-column-1.8.2.jar
-parquet-common-1.8.2.jar
-parquet-encoding-1.8.2.jar
+parquet-column-1.8.3.jar
+parquet-common-1.8.3.jar
+parquet-encoding-1.8.3.jar
 parquet-format-2.3.1.jar
-parquet-hadoop-1.8.2.jar
+parquet-hadoop-1.8.3.jar
 parquet-hadoop-bundle-1.6.0.jar
-parquet-jackson-1.8.2.jar
+parquet-jackson-1.8.3.jar
 protobuf-java-2.5.0.jar
 py4j-0.10.7.jar
 pyrolite-4.13.jar
2 changes: 1 addition & 1 deletion pom.xml
@@ -129,7 +129,7 @@
     <!-- Version used for internal directory structure -->
     <hive.version.short>1.2.1</hive.version.short>
     <derby.version>10.12.1.1</derby.version>
-    <parquet.version>1.8.2</parquet.version>
+    <parquet.version>1.8.3</parquet.version>
     <orc.version>1.4.3</orc.version>
     <orc.classifier>nohive</orc.classifier>
     <hive.parquet.version>1.6.0</hive.parquet.version>
Binary file not shown.
@@ -602,6 +602,16 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContext
       }
     }
   }
 
+  test("SPARK-23852: Broken Parquet push-down for partially-written stats") {
+    // parquet-1217.parquet contains a single column with values -1, 0, 1, 2 and null.
+    // The row-group statistics include null counts, but not min and max values, which
+    // triggers PARQUET-1217.
+    val df = readResourceParquetFile("test-data/parquet-1217.parquet")
dongjoon-hyun (Member) commented on May 11, 2018:

Since this test case assumes spark.sql.parquet.filterPushdown=true, let's use the following. Otherwise, this test case will fail when we change the default configuration value.

    withSQLConf(SQLConf.PARQUET_FILTER_PUSHDOWN_ENABLED.key -> "true",

Contributor commented:
+1

Contributor commented:
That should be done in master (and backported to 2.3 if desired).

Contributor (PR author) commented:
PR for master is #21323. My guess is there's no reason to block this backport and 2.3.1 by waiting for it to land, but happy to do whatever.
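Completed, the reviewer's suggestion would look roughly like the sketch below. This is a guess at the final shape, not the merged code: the suggestion above is truncated mid-expression, so the closing of the `withSQLConf` call and the placement of the existing assertions inside it are assumptions, and the sketch relies on the suite's existing `withSQLConf` and `readResourceParquetFile` helpers.

```scala
// Sketch: the new test wrapped in withSQLConf so it pins
// spark.sql.parquet.filterPushdown=true for its duration instead of
// relying on the session default.
test("SPARK-23852: Broken Parquet push-down for partially-written stats") {
  withSQLConf(SQLConf.PARQUET_FILTER_PUSHDOWN_ENABLED.key -> "true") {
    val df = readResourceParquetFile("test-data/parquet-1217.parquet")
    // Will return 0 rows if PARQUET-1217 is not fixed.
    assert(df.where("col > 0").count() === 2)
  }
}
```

`withSQLConf` restores the previous value when the block exits, so the wrapper keeps the test hermetic even if the default for filter push-down changes later.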


+    // Will return 0 rows if PARQUET-1217 is not fixed.
+    assert(df.where("col > 0").count() === 2)
+  }
 }
 
 class NumRowGroupsAcc extends AccumulatorV2[Integer, Integer] {