Merged
Changes from 1 commit (58 commits in this pull request)
3d567a3
[MINOR][SQL] Avoid unnecessary invocation on checkAndGlobPathIfNecessary
Ngone51 Oct 22, 2019
484f93e
[SPARK-29530][SQL] Make SQLConf in SQL parse process thread safe
AngersZhuuuu Oct 22, 2019
467c3f6
[SPARK-29529][DOCS] Remove unnecessary orc version and hive version i…
denglingang Oct 22, 2019
811d563
[SPARK-29536][PYTHON] Upgrade cloudpickle to 1.1.1 to support Python 3.8
HyukjinKwon Oct 22, 2019
868d851
[SPARK-29232][ML] Update the parameter maps of the DecisionTreeRegres…
huaxingao Oct 22, 2019
3163b6b
[SPARK-29516][SQL][TEST] Test ThriftServerQueryTestSuite asynchronously
wangyum Oct 22, 2019
bb49c80
[SPARK-21492][SQL] Fix memory leak in SortMergeJoin
xuanyuanking Oct 22, 2019
b4844ee
[SPARK-29517][SQL] TRUNCATE TABLE should look up catalog/table like v…
viirya Oct 22, 2019
8779938
[SPARK-28787][DOC][SQL] Document LOAD DATA statement in SQL Reference
huaxingao Oct 22, 2019
c1c6485
[SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
dilipbiswal Oct 22, 2019
2036a8c
[SPARK-29488][WEBUI] In Web UI, stage page has js error when sort table
jennyinspur Oct 22, 2019
8009468
[SPARK-29556][CORE] Avoid putting request path in error response in E…
srowen Oct 22, 2019
3bf5355
[SPARK-29539][SQL] SHOW PARTITIONS should look up catalog/table like …
huaxingao Oct 22, 2019
f23c5d7
[SPARK-29560][BUILD] Add typesafe bintray repo for sbt-mima-plugin
dongjoon-hyun Oct 22, 2019
e674909
[SPARK-29107][SQL][TESTS] Port window.sql (Part 1)
DylanGuedes Oct 23, 2019
c128ac5
[SPARK-29511][SQL] DataSourceV2: Support CREATE NAMESPACE
imback82 Oct 23, 2019
8c34690
[SPARK-29546][TESTS] Recover jersey-guava test dependency in docker-i…
dongjoon-hyun Oct 23, 2019
cbe6ead
[SPARK-29352][SQL][SS] Track active streaming queries in the SparkSes…
brkyvz Oct 23, 2019
70dd9c0
[SPARK-29542][SQL][DOC] Make the descriptions of spark.sql.files.* be…
turboFei Oct 23, 2019
0a70951
[SPARK-29499][CORE][PYSPARK] Add mapPartitionsWithIndex for RDDBarrier
ConeyLiu Oct 23, 2019
df00b5c
[SPARK-29569][BUILD][DOCS] Copy and paste minified jquery instead whe…
HyukjinKwon Oct 23, 2019
53a5f17
[SPARK-29513][SQL] REFRESH TABLE should look up catalog/table like v2…
imback82 Oct 23, 2019
bfbf282
[SPARK-29503][SQL] Remove conversion CreateNamedStruct to CreateNamed…
HeartSaVioR Oct 23, 2019
7e8e4c0
[SPARK-29552][SQL] Execute the "OptimizeLocalShuffleReader" rule when…
JkSelf Oct 23, 2019
5867707
[SPARK-29557][BUILD] Update dropwizard/codahale metrics library to 3.2.6
LucaCanali Oct 23, 2019
b91356e
[SPARK-29533][SQL][TESTS][FOLLOWUP] Regenerate the result on EC2
dongjoon-hyun Oct 23, 2019
7ecf968
[SPARK-29567][TESTS] Update JDBC Integration Test Docker Images
dongjoon-hyun Oct 23, 2019
fd899d6
[SPARK-29576][CORE] Use Spark's CompressionCodec for Ser/Deser of Map…
dbtsai Oct 24, 2019
55ced9c
[SPARK-29571][SQL][TESTS][FOLLOWUP] Fix UT in AllExecutionsPageSuite
07ARB Oct 24, 2019
177bf67
[SPARK-29522][SQL] CACHE TABLE should look up catalog/table like v2 c…
viirya Oct 24, 2019
9e77d48
[SPARK-21492][SQL][FOLLOW UP] Reimplement UnsafeExternalRowSorter in …
xuanyuanking Oct 24, 2019
1296bbb
[SPARK-29504][WEBUI] Toggle full job description on click
PavithraRamachandran Oct 24, 2019
67cf043
[SPARK-29145][SQL] Support sub-queries in join conditions
AngersZhuuuu Oct 24, 2019
1ec1b2b
[SPARK-28791][DOC] Documentation for Alter table Command
PavithraRamachandran Oct 24, 2019
76d4beb
[SPARK-29559][WEBUI] Support pagination for JDBC/ODBC Server page
shahidki31 Oct 24, 2019
a35fb4f
[SPARK-29578][TESTS] Add "8634" as another skipped day for Kwajalein …
srowen Oct 24, 2019
cdea520
[SPARK-29532][SQL] Simplify interval string parsing
cloud-fan Oct 24, 2019
dcf5eaf
[SPARK-29444][FOLLOWUP] add doc and python parameter for ignoreNullFi…
Oct 24, 2019
92b2529
[SPARK-21287][SQL] Remove requirement of fetch_size>=0 from JDBCOptions
fuwhu Oct 24, 2019
dec99d8
[SPARK-29526][SQL] UNCACHE TABLE should look up catalog/table like v2…
imback82 Oct 24, 2019
40df9d2
[SPARK-29227][SS] Track rule info in optimization phase
wenxuanguan Oct 25, 2019
7417c3e
[SPARK-29597][DOCS] Deprecate old Java 8 versions prior to 8u92
dongjoon-hyun Oct 25, 2019
1474ed0
[SPARK-29562][SQL] Speed up and slim down metric aggregation in SQL l…
Oct 25, 2019
091cbc3
[SPARK-9612][ML] Add instance weight support for GBTs
zhengruifeng Oct 25, 2019
cfbdd9d
[SPARK-29461][SQL] Measure the number of records being updated for JD…
HeartSaVioR Oct 25, 2019
8bd8f49
[SPARK-29500][SQL][SS] Support partition column when writing to Kafka
redsk Oct 25, 2019
0cf4f07
[SPARK-29545][SQL] Add support for bit_xor aggregate function
yaooqinn Oct 25, 2019
68dca9a
[SPARK-29527][SQL] SHOW CREATE TABLE should look up catalog/table lik…
viirya Oct 25, 2019
ae5b60d
[SPARK-29182][CORE][FOLLOWUP] Cache preferred locations of checkpoint…
viirya Oct 25, 2019
2baf7a1
[SPARK-29608][BUILD] Add `hadoop-3.2` profile to release build
dongjoon-hyun Oct 25, 2019
2549391
[SPARK-29580][TESTS] Add kerberos debug messages for Kafka secure tests
gaborgsomogyi Oct 25, 2019
5bdc58b
[SPARK-27653][SQL][FOLLOWUP] Fix `since` version of `min_by/max_by`
dongjoon-hyun Oct 26, 2019
9a46702
[SPARK-29554][SQL] Add `version` SQL function
yaooqinn Oct 26, 2019
2115bf6
[SPARK-29490][SQL] Reset 'WritableColumnVector' in 'RowToColumnarExec'
marin-ma Oct 26, 2019
077fb99
[SPARK-29589][WEBUI] Support pagination for sqlstats session table in…
shahidki31 Oct 26, 2019
74514b4
[SPARK-29614][SQL][TEST] Fix failures of DateTimeUtilsSuite and Times…
MaxGekk Oct 27, 2019
a43b966
[SPARK-29613][BUILD][SS] Upgrade to Kafka 2.3.1
dongjoon-hyun Oct 27, 2019
b19fd48
[SPARK-29093][PYTHON][ML] Remove automatically generated param setter…
huaxingao Oct 28, 2019
[SPARK-29490][SQL] Reset 'WritableColumnVector' in 'RowToColumnarExec'
### What changes were proposed in this pull request?

Reset the `WritableColumnVector`s when fetching the next `ColumnarBatch` in `RowToColumnarExec`.

### Why are the changes needed?

When converting an `Iterator[InternalRow]` to an `Iterator[ColumnarBatch]`, the vectors used to build each new `ColumnarBatch` are reused across batches, so they must be reset in the iterator's `next()` method; otherwise state left over from the previous batch leaks into subsequent batches.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A
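The failure mode described above can be illustrated with a minimal, self-contained sketch. The class and method names below (`ReusableVector`, `toBatches`) are hypothetical stand-ins, not Spark's actual `WritableColumnVector` API: when a single vector is reused across batches without a `reset()`, rows from the previous batch accumulate into the next one.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a writable column vector that is reused per batch.
class ReusableVector {
  private val buf = ArrayBuffer.empty[Long]
  def appendLong(v: Long): Unit = buf += v
  def reset(): Unit = buf.clear() // analogous to WritableColumnVector.reset()
  def values: Seq[Long] = buf.toList // snapshot copy of current contents
}

object ResetDemo {
  // Convert row batches to "columnar" batches, reusing one vector,
  // with and without the per-batch reset that this PR adds.
  def toBatches(rows: Iterator[Seq[Long]], doReset: Boolean): Seq[Seq[Long]] = {
    val vector = new ReusableVector
    rows.map { batch =>
      if (doReset) vector.reset() // the fix: clear leftover state first
      batch.foreach(vector.appendLong)
      vector.values
    }.toSeq
  }

  def main(args: Array[String]): Unit = {
    val input = Seq(Seq(1L, 2L), Seq(3L, 4L))
    // Without reset, the second batch contains the first batch's rows too.
    println(toBatches(input.iterator, doReset = false))
    // With reset, each batch holds only its own rows.
    println(toBatches(input.iterator, doReset = true))
  }
}
```

In the real code the symptom is subtler, since `cb.setNumRows(0)` alone resets the batch's row count but not the underlying vectors' write positions, which is why the added `vectors.foreach(_.reset())` is needed alongside it.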

Closes apache#26137 from rongma1997/reset-WritableColumnVector.

Authored-by: rongma1997 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
marin-ma authored and dongjoon-hyun committed Oct 26, 2019
commit 2115bf61465b504bc21e37465cb34878039b5cb8
```diff
@@ -454,6 +454,7 @@ case class RowToColumnarExec(child: SparkPlan) extends UnaryExecNode {

       override def next(): ColumnarBatch = {
         cb.setNumRows(0)
+        vectors.foreach(_.reset())
         var rowCount = 0
         while (rowCount < numRows && rowIterator.hasNext) {
           val row = rowIterator.next()
```
```diff
@@ -28,6 +28,7 @@ import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan,
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.execution._
 import org.apache.spark.sql.execution.vectorized.OnHeapColumnVector
+import org.apache.spark.sql.internal.SQLConf.COLUMN_BATCH_SIZE
 import org.apache.spark.sql.internal.StaticSQLConf.SPARK_SESSION_EXTENSIONS
 import org.apache.spark.sql.types.{DataType, Decimal, IntegerType, LongType, Metadata, StructType}
 import org.apache.spark.sql.vectorized.{ColumnarArray, ColumnarBatch, ColumnarMap, ColumnVector}
@@ -171,6 +172,30 @@ class SparkSessionExtensionSuite extends SparkFunSuite {
     }
   }

+  test("reset column vectors") {
+    val session = SparkSession.builder()
+      .master("local[1]")
+      .config(COLUMN_BATCH_SIZE.key, 2)
+      .withExtensions { extensions =>
+        extensions.injectColumnar(session =>
+          MyColumarRule(PreRuleReplaceAddWithBrokenVersion(), MyPostRule())) }
+      .getOrCreate()
+
+    try {
+      assert(session.sessionState.columnarRules.contains(
+        MyColumarRule(PreRuleReplaceAddWithBrokenVersion(), MyPostRule())))
+      import session.sqlContext.implicits._
+
+      val input = Seq((100L), (200L), (300L))
+      val data = input.toDF("vals").repartition(1)
+      val df = data.selectExpr("vals + 1")
+      val result = df.collect()
+      assert(result sameElements input.map(x => Row(x + 2)))
+    } finally {
+      stop(session)
+    }
+  }
+
   test("use custom class for extensions") {
     val session = SparkSession.builder()
       .master("local[1]")
```