[SPARK-51421][SQL] Get seconds of TIME datatype
### What changes were proposed in this pull request?
This PR adds support for extracting the second component from TIME (TimeType) values in Spark SQL. For example:
```
scala> spark.sql("SELECT SECOND(TIME'13:59:45.99')").show()
+--------------------------+
|second(TIME '13:59:45.99')|
+--------------------------+
|                        45|
+--------------------------+

scala> spark.sql("select second(cast('12:00:01.123' as time(4)))").show(false)
+-------------------------------------+
|second(CAST(12:00:01.123 AS TIME(4)))|
+-------------------------------------+
|1                                    |
+-------------------------------------+
```

### Why are the changes needed?
Previously, Spark supported `second()` only for TIMESTAMP values. Given a TIME argument, it attempted an implicit cast to TIMESTAMP, which is incorrect. This PR makes `second(TIME'HH:MM:SS.######')` operate on the TIME value directly, without unnecessary type coercion.

### Does this PR introduce _any_ user-facing change?
Yes

- Before this PR, calling `second(TIME'HH:MM:SS.######')` resulted in a type mismatch error or an incorrect implicit cast attempt to TIMESTAMP.
- With this PR, `second(TIME'HH:MM:SS.######')` extracts the second component natively from TIME values, without implicit casting.
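
The behavior change above comes down to choosing an expression by the type of the first argument, as `SecondExpressionBuilder` does in the diff. The following is a minimal sketch of that dispatch under a toy type model. Every name ending in `Sketch` is hypothetical and only stands in for the Catalyst classes touched by this commit; it is not Spark's API.

```scala
object SecondDispatchSketch {
  // Hypothetical, simplified stand-ins for Catalyst data types.
  sealed trait DataTypeSketch
  final case class TimeTypeSketch(precision: Int) extends DataTypeSketch
  case object StringTypeSketch extends DataTypeSketch

  // Hypothetical argument node: a value tagged with its data type.
  final case class ArgSketch(value: Any, dataType: DataTypeSketch)

  // Hypothetical result nodes, named after the expressions in the diff.
  sealed trait BuiltSketch
  final case class SecondsOfTimeSketch(child: ArgSketch) extends BuiltSketch
  final case class SecondSketch(child: ArgSketch) extends BuiltSketch

  // Reject an empty argument list, then pick the node from the
  // data type of the first argument.
  def build(name: String, args: Seq[ArgSketch]): BuiltSketch =
    args.headOption match {
      case None =>
        throw new IllegalArgumentException(s"$name expects > 0 arguments, got 0")
      case Some(child) =>
        child.dataType match {
          case _: TimeTypeSketch => SecondsOfTimeSketch(child) // TIME path
          case _ => SecondSketch(child) // every other type keeps the old path
        }
    }
}
```

As in the builder shown in the diff, only the first argument is inspected, and anything that is not a TIME value falls through to the pre-existing `Second` path.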

### How was this patch tested?
By running new tests:

```
$ build/sbt "test:testOnly *TimeExpressionsSuite"
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#50525 from senthh/getSeconds.

Authored-by: senthh <senthil.kumar@acceldata.io>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
senthh authored and MaxGekk committed Apr 9, 2025
commit 5602fbf93f9ea3d5accd8c89b1131165fccd386e
@@ -648,7 +648,7 @@ object FunctionRegistry {
     expression[NextDay]("next_day"),
     expression[Now]("now"),
     expression[Quarter]("quarter"),
-    expression[Second]("second"),
+    expressionBuilder("second", SecondExpressionBuilder),
     expression[ParseToTimestamp]("to_timestamp"),
     expression[ParseToDate]("to_date"),
     expression[ToTime]("to_time"),
@@ -25,7 +25,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeUtils
 import org.apache.spark.sql.catalyst.util.TimeFormatter
 import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryExecutionErrors}
 import org.apache.spark.sql.internal.types.StringTypeWithCollation
-import org.apache.spark.sql.types.{AbstractDataType, IntegerType, ObjectType, TimeType}
+import org.apache.spark.sql.types.{AbstractDataType, IntegerType, ObjectType, TimeType, TypeCollection}
 import org.apache.spark.unsafe.types.UTF8String

 /**
@@ -290,3 +290,62 @@ object HourExpressionBuilder extends ExpressionBuilder {
     }
   }
 }

+case class SecondsOfTime(child: Expression)
+  extends RuntimeReplaceable
+  with ExpectsInputTypes {
+
+  override def replacement: Expression = StaticInvoke(
+    classOf[DateTimeUtils.type],
+    IntegerType,
+    "getSecondsOfTime",
+    Seq(child),
+    Seq(child.dataType)
+  )
+
+  override def inputTypes: Seq[AbstractDataType] =
+    Seq(TypeCollection(TimeType.MIN_PRECISION to TimeType.MAX_PRECISION map TimeType: _*))
+
+  override def children: Seq[Expression] = Seq(child)
+
+  override def prettyName: String = "second"
+
+  override protected def withNewChildrenInternal(
+      newChildren: IndexedSeq[Expression]): Expression = {
+    copy(child = newChildren.head)
+  }
+}

+@ExpressionDescription(
+  usage = """
+    _FUNC_(expr) - Returns the second component of the given expression.
+
+    If `expr` is a TIMESTAMP or a string that can be cast to timestamp,
+    it returns the second of that timestamp.
+    If `expr` is a TIME type (since 4.1.0), it returns the second of the time-of-day.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_('2018-02-14 12:58:59');
+       59
+      > SELECT _FUNC_(TIME'13:25:59.999999');
+       59
+  """,
+  since = "1.5.0",
+  group = "datetime_funcs")
+object SecondExpressionBuilder extends ExpressionBuilder {
+  override def build(name: String, expressions: Seq[Expression]): Expression = {
+    if (expressions.isEmpty) {
+      throw QueryCompilationErrors.wrongNumArgsError(name, Seq("> 0"), expressions.length)
+    } else {
+      val child = expressions.head
+      child.dataType match {
+        case _: TimeType =>
+          SecondsOfTime(child)
+        case _ =>
+          Second(child)
+      }
+    }
+  }
+}
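
The `inputTypes` line in `SecondsOfTime` packs several steps into one expression: it walks a range of precisions, builds one `TimeType` per precision, and passes the resulting sequence as varargs with `: _*`. A rough standalone sketch of that idiom follows. `TimeTypeSketch`, `typeCollection`, and the bounds 0 and 6 are assumptions made for illustration; the diff itself only names `TimeType.MIN_PRECISION` and `TimeType.MAX_PRECISION`.

```scala
object TimePrecisionSketch {
  // Hypothetical stand-in for Spark's TimeType; only the precision is modeled.
  final case class TimeTypeSketch(precision: Int)

  // Assumed bounds for illustration; the diff does not state the actual values.
  val MinPrecision = 0
  val MaxPrecision = 6

  // Hypothetical varargs collector, standing in for TypeCollection(...).
  def typeCollection(types: TimeTypeSketch*): Seq[TimeTypeSketch] = types.toList

  // Build one type per precision, then splat the sequence into the varargs parameter.
  val acceptedTypes: Seq[TimeTypeSketch] =
    typeCollection((MinPrecision to MaxPrecision).map(TimeTypeSketch(_)): _*)
}
```

Under these assumed bounds, `acceptedTypes` holds one entry per precision from 0 through 6, which is how a single expression can declare that it accepts a TIME value of any valid precision.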

@@ -135,6 +135,12 @@ object DateTimeUtils extends SparkDateTimeUtils {
     getLocalDateTime(micros, zoneId).getSecond
   }

+  /**
+   * Returns the second value of a given TIME (TimeType) value.
+   */
+  def getSecondsOfTime(micros: Long): Int = {
+    microsToLocalTime(micros).getSecond
+  }
   /**
    * Returns the seconds part and its fractional part with microseconds.
    */
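
For readers without the Spark sources at hand, here is a minimal sketch of what `getSecondsOfTime` computes, assuming a TIME value is a count of microseconds since midnight (as the test comments in this commit describe). `microsToLocalTimeSketch` is a hypothetical stand-in for Spark's `microsToLocalTime`, written here with plain `java.time`.

```scala
import java.time.LocalTime

object TimeSecondsSketch {
  // Assumption: a TIME value is stored as microseconds since midnight.
  // Hypothetical stand-in for Spark's microsToLocalTime helper.
  def microsToLocalTimeSketch(micros: Long): LocalTime =
    LocalTime.ofNanoOfDay(Math.multiplyExact(micros, 1000L))

  // The second-of-minute component; fractional seconds are dropped.
  def secondsOfTime(micros: Long): Int =
    microsToLocalTimeSketch(micros).getSecond

  // Helper for examples: LocalTime -> microseconds since midnight.
  def toMicros(t: LocalTime): Long = t.toNanoOfDay / 1000L
}
```

With this sketch, `TimeSecondsSketch.secondsOfTime(TimeSecondsSketch.toMicros(LocalTime.of(13, 59, 45, 990000000)))` evaluates to 45, matching the `SECOND(TIME'13:59:45.99')` example in the commit message.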
@@ -168,4 +168,62 @@ class TimeExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     checkConsistencyBetweenInterpretedAndCodegen(
       (child: Expression) => MinutesOfTime(child).replacement, TimeType())
   }
+
+  test("SecondExpressionBuilder") {
+    // Empty expressions list
+    checkError(
+      exception = intercept[AnalysisException] {
+        SecondExpressionBuilder.build("second", Seq.empty)
+      },
+      condition = "WRONG_NUM_ARGS.WITHOUT_SUGGESTION",
+      parameters = Map(
+        "functionName" -> "`second`",
+        "expectedNum" -> "> 0",
+        "actualNum" -> "0",
+        "docroot" -> SPARK_DOC_ROOT)
+    )
+
+    // test TIME-typed child should build SecondsOfTime
+    val timeExpr = Literal(localTime(12, 58, 59), TimeType())
+    val builtExprForTime = SecondExpressionBuilder.build("second", Seq(timeExpr))
+    assert(builtExprForTime.isInstanceOf[SecondsOfTime])
+    assert(builtExprForTime.asInstanceOf[SecondsOfTime].child eq timeExpr)
+
+    // test non TIME-typed child should build second
+    val tsExpr = Literal("2007-09-03 10:45:23")
+    val builtExprForTs = SecondExpressionBuilder.build("second", Seq(tsExpr))
+    assert(builtExprForTs.isInstanceOf[Second])
+    assert(builtExprForTs.asInstanceOf[Second].child eq tsExpr)
+  }
+
+  test("Second with TIME type") {
+    // A few test times in microseconds since midnight:
+    //   time in microseconds -> expected second
+    val testTimes = Seq(
+      localTime() -> 0,
+      localTime(1) -> 0,
+      localTime(0, 59) -> 0,
+      localTime(14, 30) -> 0,
+      localTime(12, 58, 59) -> 59,
+      localTime(23, 0, 1) -> 1,
+      localTime(23, 59, 59, 999999) -> 59
+    )
+
+    // Create a literal with TimeType() for each test microsecond value
+    // evaluate SecondsOfTime(...), and check that the result matches the expected second.
+    testTimes.foreach { case (micros, expectedSecond) =>
+      checkEvaluation(
+        SecondsOfTime(Literal(micros, TimeType())),
+        expectedSecond)
+    }
+
+    // Verify NULL handling
+    checkEvaluation(
+      SecondsOfTime(Literal.create(null, TimeType(TimeType.MICROS_PRECISION))),
+      null
+    )
+
+    checkConsistencyBetweenInterpretedAndCodegen(
+      (child: Expression) => SecondsOfTime(child).replacement, TimeType())
+  }
 }
@@ -286,7 +286,7 @@
 | org.apache.spark.sql.catalyst.expressions.SchemaOfJson | schema_of_json | SELECT schema_of_json('[{"col":0}]') | struct<schema_of_json([{"col":0}]):string> |
 | org.apache.spark.sql.catalyst.expressions.SchemaOfXml | schema_of_xml | SELECT schema_of_xml('<p><a>1</a></p>') | struct<schema_of_xml(<p><a>1</a></p>):string> |
 | org.apache.spark.sql.catalyst.expressions.Sec | sec | SELECT sec(0) | struct<SEC(0):double> |
-| org.apache.spark.sql.catalyst.expressions.Second | second | SELECT second('2009-07-30 12:58:59') | struct<second(2009-07-30 12:58:59):int> |
+| org.apache.spark.sql.catalyst.expressions.SecondExpressionBuilder | second | SELECT second('2018-02-14 12:58:59') | struct<second(2018-02-14 12:58:59):int> |
 | org.apache.spark.sql.catalyst.expressions.SecondsToTimestamp | timestamp_seconds | SELECT timestamp_seconds(1230219000) | struct<timestamp_seconds(1230219000):timestamp> |
 | org.apache.spark.sql.catalyst.expressions.Sentences | sentences | SELECT sentences('Hi there! Good morning.') | struct<sentences(Hi there! Good morning., , ):array<array<string>>> |
 | org.apache.spark.sql.catalyst.expressions.Sequence | sequence | SELECT sequence(1, 5) | struct<sequence(1, 5):array<int>> |