Closed

Commits: 925 (changes shown from 1 commit)
82b4ad2
[SPARK-46393][SQL][FOLLOWUP] Classify exceptions in JDBCTableCatalog.…
cloud-fan Jun 7, 2024
9491292
[SPARK-48548][BUILD] Add LICENSE/NOTICE for spark-core with shaded de…
yaooqinn Jun 7, 2024
b7d9c31
Revert "[SPARK-46393][SQL][FOLLOWUP] Classify exceptions in JDBCTable…
yaooqinn Jun 7, 2024
87b0f59
[SPARK-48561][PS][CONNECT] Throw `PandasNotImplementedError` for unsu…
zhengruifeng Jun 7, 2024
d81b1e3
[SPARK-48559][SQL] Fetch globalTempDatabase name directly without inv…
willwwt Jun 7, 2024
8911d59
[SPARK-46393][SQL][FOLLOWUP] Classify exceptions in JDBCTableCatalog.…
panbingkun Jun 7, 2024
201df0d
[MINOR][PYTHON][TESTS] Move a test out of parity tests
zhengruifeng Jun 7, 2024
24bce72
[SPARK-48012][SQL] SPJ: Support Transfrom Expressions for One Side Sh…
szehon-ho Jun 9, 2024
d9394ee
[SPARK-48560][SS][PYTHON] Make StreamingQueryListener.spark settable
HyukjinKwon Jun 9, 2024
1901669
[SPARK-48564][PYTHON][CONNECT] Propagate cached schema in set operations
zhengruifeng Jun 10, 2024
61fd936
[SPARK-48403][SQL] Fix Lower & Upper expressions for UTF8_BINARY_LCAS…
uros-db Jun 10, 2024
3857a9d
[SPARK-48410][SQL] Fix InitCap expression for UTF8_BINARY_LCASE & ICU…
uros-db Jun 10, 2024
ec6db63
[SPARK-48569][SS][CONNECT] Handle edge cases in query.name
WweiL Jun 10, 2024
5a2f374
[SPARK-48544][SQL] Reduce memory pressure of empty TreeNode BitSets
n-young-db Jun 10, 2024
3fe6abd
[SPARK-48563][BUILD] Upgrade `pickle` to 1.5
LuciferYang Jun 11, 2024
1e4750e
[SPARK-47500][PYTHON][CONNECT][FOLLOWUP] Restore error message for `D…
zhengruifeng Jun 11, 2024
53d65fd
[SPARK-48565][UI] Fix thread dump display in UI
pan3793 Jun 11, 2024
452c1b6
[SPARK-48551][SQL] Perf improvement for escapePathName
yaooqinn Jun 11, 2024
df4156a
[SPARK-48372][SPARK-45716][PYTHON][FOLLOW-UP] Remove unused helper me…
zhengruifeng Jun 11, 2024
583ab05
[SPARK-47415][SQL] Add collation support for Levenshtein expression
uros-db Jun 11, 2024
224ba16
[SPARK-48556][SQL] Fix incorrect error message pointing to UNSUPPORTE…
nikolamand-db Jun 11, 2024
aad6771
[SPARK-48576][SQL] Rename UTF8_BINARY_LCASE to UTF8_LCASE
uros-db Jun 11, 2024
6107836
[SPARK-48576][SQL][FOLLOWUP] Rename UTF8_BINARY_LCASE to UTF8_LCASE
uros-db Jun 11, 2024
82a84ed
[SPARK-46937][SQL] Revert "[] Improve concurrency performance for Fun…
cloud-fan Jun 11, 2024
72df3cb
[SPARK-48582][BUILD] Upgrade `braces` from 3.0.2 to 3.0.3 in ui-test
LuciferYang Jun 12, 2024
334816a
[SPARK-48411][SS][PYTHON] Add E2E test for DropDuplicateWithinWatermark
eason-yuchen-liu Jun 12, 2024
8870efc
[SPARK-48581][BUILD] Upgrade dropwizard metrics to 4.2.26
wayneguow Jun 12, 2024
da81d8e
[SPARK-48584][SQL] Perf improvement for unescapePathName
yaooqinn Jun 12, 2024
a3625a9
[SPARK-48595][CORE] Cleanup deprecated api usage related to `commons-…
LuciferYang Jun 12, 2024
b5e1b79
[SPARK-48596][SQL] Perf improvement for calculating hex string for long
yaooqinn Jun 12, 2024
2d0b122
[SPARK-48594][PYTHON][CONNECT] Rename `parent` field to `child` in `C…
zhengruifeng Jun 12, 2024
d1d29c9
[SPARK-48598][PYTHON][CONNECT] Propagate cached schema in dataframe o…
zhengruifeng Jun 12, 2024
0bbd049
[SPARK-48591][PYTHON] Simplify the if-else branches with `F.lit`
zhengruifeng Jun 12, 2024
c059c84
[SPARK-48421][SQL] SPJ: Add documentation
szehon-ho Jun 12, 2024
3988548
[SPARK-48593][PYTHON][CONNECT] Fix the string representation of lambd…
zhengruifeng Jun 12, 2024
fd045c9
[SPARK-48583][SQL][TESTS] Replace deprecated classes and methods of `…
wayneguow Jun 13, 2024
ea2bca7
[SPARK-48602][SQL] Make csv generator support different output style …
yaooqinn Jun 13, 2024
78fd4e3
[SPARK-48584][SQL][FOLLOWUP] Improve the unescapePathName
beliefer Jun 13, 2024
b8c7aee
[SPARK-48609][BUILD] Upgrade `scala-xml` to 2.3.0
panbingkun Jun 13, 2024
bdcb79f
[SPARK-48543][SS] Track state row validation failures using explicit …
anishshri-db Jun 13, 2024
08e741b
[SPARK-48604][SQL] Replace deprecated `new ArrowType.Decimal(precisio…
wayneguow Jun 13, 2024
be154a3
[SPARK-48622][SQL] get SQLConf once when resolving column names
andrewxue-db Jun 14, 2024
70bdcc9
[MINOR][DOCS] Fix metrics info of shuffle service
Jun 14, 2024
0b214f1
[MINOR][DOCS][TESTS] Update repo name and link from `parquet-mr` to `…
wayneguow Jun 14, 2024
75fff90
[SPARK-45685][SQL][FOLLOWUP] Add handling for `Stream` where `LazyLis…
LuciferYang Jun 14, 2024
157b1e3
[SPARK-48612][SQL][SS] Cleanup deprecated api usage related to common…
LuciferYang Jun 14, 2024
3831886
[SPARK-48625][BUILD] Upgrade `mssql-jdbc` to 12.6.2.jre11
wayneguow Jun 14, 2024
878de00
[SPARK-48626][CORE] Change the scope of object LogKeys as private in …
gengliangwang Jun 14, 2024
dd8b05f
[SPARK-42252][CORE] Add `spark.shuffle.localDisk.file.output.buffer` …
wayneguow Jun 14, 2024
2d2bedf
[SPARK-48056][CONNECT][FOLLOW-UP] Scala Client re-execute plan if a S…
Jun 14, 2024
aa4bfb0
Revert "[SPARK-48591][PYTHON] Simplify the if-else branches with `F.l…
zhengruifeng Jun 14, 2024
8ee8aba
[SPARK-48621][SQL] Fix Like simplification in Optimizer for collated …
uros-db Jun 14, 2024
0775ea7
[SPARK-48611][CORE] Log TID for input split in HadoopRDD and NewHadoo…
pan3793 Jun 15, 2024
347f9c6
[SPARK-48302][PYTHON] Preserve nulls in map columns in PyArrow Tables
ianmcook Jun 16, 2024
c09039a
[SPARK-48597][SQL] Introduce a marker for isStreaming property in tex…
HeartSaVioR Jun 17, 2024
9881e0a
[SPARK-47777] fix python streaming data source connect test
chaoqin-li1123 Jun 17, 2024
8f662fc
[SPARK-48555][SQL][PYTHON][CONNECT] Support using Columns as paramete…
Ronserruya Jun 17, 2024
33a9c5d
[MINOR][PYTHON][DOCS] Fix pyspark.sql.functions.reduce docstring typo
kaashif Jun 17, 2024
257a788
[SPARK-48615][SQL] Perf improvement for parsing hex string
yaooqinn Jun 17, 2024
42cd961
[SPARK-48587][VARIANT] Avoid storage amplification when accessing a s…
cashmand Jun 17, 2024
0a4b112
[SPARK-48633][BUILD] Upgrade `scalacheck` to 1.18.0
wayneguow Jun 17, 2024
71475f7
[SPARK-48577][SQL] Invalid UTF-8 byte sequence replacement
uros-db Jun 17, 2024
0c16624
[SPARK-48627][SQL] Perf improvement for binary to to HEX_DISCRETE string
yaooqinn Jun 17, 2024
90d302a
[SPARK-48557][SQL] Support scalar subquery with group-by on column eq…
jchen5 Jun 17, 2024
d3da240
[SPARK-48610][SQL] refactor: use auxiliary idMap instead of OP_ID_TAG
liuzqt Jun 17, 2024
e00d26f
[SPARK-48600][SQL] Fix FrameLessOffsetWindowFunction expressions impl…
mihailomilosevic2001 Jun 17, 2024
d3455df
[SPARK-48572][SQL] Fix DateSub, DateAdd, WindowTime, TimeWindow and S…
mihailomilosevic2001 Jun 17, 2024
9ef092f
[SPARK-48641][BUILD] Upgrade `curator` to 5.7.0
wayneguow Jun 17, 2024
8fdd85f
[SPARK-48603][TEST] Update *ParquetReadSchemaSuite to cover type wide…
pan3793 Jun 17, 2024
66d8a29
[SPARK-47577][SPARK-47579] Correct misleading usage of log key TASK_ID
gengliangwang Jun 17, 2024
0864bbe
[SPARK-48566][PYTHON] Fix bug where partition indices are incorrect w…
dtenedor Jun 17, 2024
00a96bb
[SPARK-48642][CORE] False SparkOutOfMemoryError caused by killing tas…
pan3793 Jun 17, 2024
d8a24b7
[SPARK-48645][BUILD] Upgrade Maven to 3.9.8
dongjoon-hyun Jun 17, 2024
042804a
[SPARK-48567][SS] StreamingQuery.lastProgress should return the actua…
WweiL Jun 17, 2024
f0b7cfa
[SPARK-48497][PYTHON][DOCS] Add an example for Python data source wri…
allisonwang-db Jun 17, 2024
e265c60
[SPARK-47910][CORE] close stream when DiskBlockObjectWriter closeReso…
JacobZheng0927 Jun 18, 2024
738acd1
[SPARK-48648][PYTHON][CONNECT] Make SparkConnectClient.tags properly …
HyukjinKwon Jun 18, 2024
05c87e5
[SPARK-48644][SQL] Do a length check and throw COLLECTION_SIZE_LIMIT_…
yaooqinn Jun 18, 2024
c5809b6
[SPARK-48647][PYTHON][CONNECT] Refine the error message for `YearMont…
zhengruifeng Jun 18, 2024
a3feffd
[SPARK-48585][SQL] Make `built-in` JdbcDialect's method `classifyExce…
panbingkun Jun 18, 2024
9898e9d
[SPARK-48342][SQL] Introduction of SQL Scripting Parser
davidm-db Jun 18, 2024
58701d8
[SPARK-47148][SQL][FOLLOWUP] Use broadcast hint to make test more stable
cloud-fan Jun 18, 2024
80bba44
[SPARK-48459][CONNECT][PYTHON] Implement DataFrameQueryContext in Spa…
HyukjinKwon Jun 18, 2024
47ffe40
[SPARK-48646][PYTHON] Refine Python data source API docstring and typ…
allisonwang-db Jun 19, 2024
6ee7c25
[SPARK-48634][PYTHON][CONNECT] Avoid statically initialize threadpool…
HyukjinKwon Jun 19, 2024
5e28e95
[SPARK-48649][SQL] Add "ignoreInvalidPartitionPaths" and "spark.sql.f…
sadikovi Jun 19, 2024
878dd6a
[SPARK-48601][SQL] Give a more user friendly error message when setti…
stevomitric Jun 19, 2024
b77caf7
[SPARK-48651][DOC] Configuring different JDK for Spark on YARN
pan3793 Jun 19, 2024
1e868b2
Revert "[SPARK-48554][INFRA] Use R 4.4.0 in `windows` R GitHub Action…
HyukjinKwon Jun 19, 2024
d067fc6
Revert "[SPARK-48567][SS] StreamingQuery.lastProgress should return t…
HyukjinKwon Jun 19, 2024
b0e2cb5
[SPARK-48623][CORE] Structured Logging Migrations
asl3 Jun 19, 2024
2fe0692
[SPARK-48466][SQL] Create dedicated node for EmptyRelation in AQE
liuzqt Jun 19, 2024
3ac31b1
[SPARK-48574][SQL] Fix support for StructTypes with collations
mihailomilosevic2001 Jun 19, 2024
484e7ac
[SPARK-48472][SQL] Enable reflect expressions with collated strings
mihailoale-db Jun 19, 2024
5458763
[SPARK-48541][CORE] Add a new exit code for executors killed by TaskR…
bozhang2820 Jun 19, 2024
5d6e9dd
[SPARK-47986][CONNECT][FOLLOW-UP] Unable to create a new session when…
Jun 19, 2024
248fd4c
[SPARK-48342][FOLLOWUP][SQL] Remove unnecessary import in AstBuilder
davidm-db Jun 20, 2024
9eadb2c
[SPARK-48634][PYTHON][CONNECT][FOLLOW-UP] Do not make a request if th…
HyukjinKwon Jun 20, 2024
0d9f8a1
[SPARK-48479][SQL] Support creating scalar and table SQL UDFs in parser
allisonwang-db Jun 20, 2024
692d869
[SPARK-48591][PYTHON] Add a helper function to simplify `Column.py`
zhengruifeng Jun 20, 2024
955349f
[SPARK-48620][PYTHON] Fix internal raw data leak in `YearMonthInterva…
zhengruifeng Jun 20, 2024
714699b
[SPARK-47911][SQL][FOLLOWUP] Enable binary format tests in ThriftServ…
yaooqinn Jun 20, 2024
904d4dd
[SPARK-48635][SQL] Assign classes to join type errors and as-of join …
wayneguow Jun 21, 2024
b0b02b2
[SPARK-48653][PYTHON] Fix invalid Python data source error class refe…
allisonwang-db Jun 21, 2024
e68b8ca
[SPARK-48677][BUILD] Upgrade `scalafmt` to 3.8.2
panbingkun Jun 21, 2024
6eb7978
[SQL][TEST] Re-run collation benchmark
uros-db Jun 21, 2024
d8099a2
[SPARK-48479][SQL][FOLLOWUP] Consolidate createOrReplaceTableColType …
allisonwang-db Jun 21, 2024
67c7187
[SPARK-48661][BUILD] Upgrade `RoaringBitmap` to 1.1.0
wayneguow Jun 21, 2024
f077759
[SPARK-48631][CORE][TEST] Fix test "error during accessing host local…
bozhang2820 Jun 21, 2024
62bad53
[SPARK-48672][DOC] Update Jakarta Servlet reference in security page
pan3793 Jun 21, 2024
b99bb00
[SPARK-48630][INFRA] Make `merge_spark_pr` keep the format of revert PR
zhengruifeng Jun 21, 2024
3469ec6
[SPARK-48656][CORE] Do a length check and throw COLLECTION_SIZE_LIMIT…
wayneguow Jun 21, 2024
cd8bf11
[SPARK-48659][SQL][TESTS] Unify v1 and v2 ALTER TABLE .. SET TBLPROPE…
panbingkun Jun 21, 2024
fdabe08
[SPARK-48490][CORE][FOLLOWUP] Properly process escape sequences
gengliangwang Jun 21, 2024
b5d0d07
[SPARK-48662][SQL] Fix StructsToXml expression with collations
mihailomilosevic2001 Jun 21, 2024
97d3add
Revert "Revert "[SPARK-48554][INFRA] Use R 4.4.0 in `windows` R GitHu…
HyukjinKwon Jun 21, 2024
32861e0
[SPARK-48684][INFRA] Print related JIRA summary before proceeding merge
yaooqinn Jun 21, 2024
f0563ef
[SPARK-47172][CORE] Add support for AES-GCM for RPC encryption
sweisdb Jun 21, 2024
0bc38ac
[SPARK-48675][SQL] Fix cache table with collated column
nikolamand-db Jun 21, 2024
9414211
[SPARK-48490][CORE][TESTS][FOLLOWUP] Add some UT for the Windows path…
panbingkun Jun 21, 2024
7e5a461
[SPARK-48655][SQL] SPJ: Add tests for shuffle skipping for aggregate …
szehon-ho Jun 21, 2024
b1677a4
[SPARK-48545][SQL] Create to_avro and from_avro SQL functions to matc…
dtenedor Jun 21, 2024
c8d75c1
[SPARK-48620][PYTHON][FOLLOW-UP] Correct the error message for `Calen…
zhengruifeng Jun 22, 2024
84d278c
[MINOR] Fix some typos in `error-states.json`
wayneguow Jun 23, 2024
4b37eb8
[SPARK-48678][CORE] Performance optimizations for SparkConf.get(Confi…
JoshRosen Jun 23, 2024
e972dae
[SPARK-48688][SQL] Return reasonable error when calling SQL to_avro a…
dtenedor Jun 23, 2024
88cc153
[SPARK-48650][PYTHON] Display correct call site from IPython Notebook
itholic Jun 24, 2024
31fa9d8
[SQL][TEST][FOLLOWUP] Re-run collation benchmark (NonASCII)
uros-db Jun 24, 2024
4663b84
[SPARK-47681][FOLLOWUP] Fix schema_of_variant for float inputs
chenhao-db Jun 24, 2024
a7dc020
[SPARK-48681][SQL] Use ICU in Lower/Upper expressions for UTF8_BINARY…
uros-db Jun 24, 2024
8b16196
[SPARK-48680][SQL][DOCS] Add missing Java APIs and language-specific …
yaooqinn Jun 24, 2024
e459674
[SPARK-48683][SQL] Fix schema evolution with `df.mergeInto` losing `w…
xupefei Jun 24, 2024
09cb592
[SPARK-48639][CONNECT][PYTHON] Add Origin to Relation.RelationCommon
HyukjinKwon Jun 24, 2024
8e02a64
[SPARK-48695][PYTHON] `TimestampNTZType.fromInternal` not use the dep…
zhengruifeng Jun 24, 2024
fb5697d
[SPARK-48658][SQL] Encode/Decode functions report coding errors inste…
yaooqinn Jun 24, 2024
2ac2710
[SPARK-48702][INFRA] Fix `Python CodeGen check`
panbingkun Jun 25, 2024
5112e58
[SPARK-48692][BUILD] Upgrade `rocksdbjni` to 9.2.1
panbingkun Jun 25, 2024
d47f34f
[SPARK-48629] Migrate the residual code to structured logging framework
panbingkun Jun 25, 2024
b49479b
[SPARK-48704][INFRA] Update `build_sparkr_window.yml` to use `windows…
panbingkun Jun 25, 2024
51f1103
[SPARK-48686][SQL] Improve performance of ParserUtils.unescapeSQLString
JoshRosen Jun 25, 2024
8c4ca7e
[SPARK-48693][SQL] Simplify and unify toString of Invoke and StaticIn…
yaooqinn Jun 25, 2024
068be4b
[SPARK-48578][SQL] add UTF8 string validation related functions
uros-db Jun 25, 2024
5928908
[SPARK-48466][SQL][FOLLOWUP] Fix missing pattern match in EmptyRelati…
liuzqt Jun 25, 2024
9d4abaf
[SPARK-48638][CONNECT] Add ExecutionInfo support for DataFrame
grundprinzip Jun 25, 2024
ebacb91
[SPARK-48718][SQL] Handle and fix the case when deserializer in cogro…
anchovYu Jun 26, 2024
c459afb
[SPARK-48573][SQL] Upgrade ICU version
mihailomilosevic2001 Jun 26, 2024
169346c
[SPARK-48578][SQL][FOLLOWUP] Fix `dev/scalastyle` error for `Expressi…
panbingkun Jun 26, 2024
07cbba6
[SPARK-48706][PYTHON] Python UDF in higher order functions should not…
HyukjinKwon Jun 26, 2024
ee3a612
[SPARK-48705][PYTHON] Explicitly use worker_main when it starts with …
HyukjinKwon Jun 26, 2024
bb21861
[SPARK-48059][CORE][FOLLOWUP] Fix bug for SparkLogger
panbingkun Jun 26, 2024
ec0ee86
[SPARK-48699][SQL] Refine collation API
uros-db Jun 26, 2024
7a1608b
[SPARK-48670][SQL] Providing suggestion as part of error message when…
dbatomic Jun 26, 2024
0fc5b0b
[SPARK-47353][SQL] Enable collation support for the Mode expression
GideonPotok Jun 26, 2024
4cf5450
[SPARK-48699][SQL][FOLLOWUP] Refine collation API
uros-db Jun 26, 2024
313479c
[SPARK-48713][SQL] Add index range check for UnsafeRow.pointTo when b…
Ngone51 Jun 26, 2024
e23d69b
[SPARK-48709][SQL] Fix varchar type resolution mismatch for DataSourc…
wangyum Jun 26, 2024
a474b88
[SPARK-48724][SQL][TESTS] Fix incorrect conf settings of `ignoreCorru…
wayneguow Jun 26, 2024
a50b30d
[SPARK-48687][SS] Add change to perform state schema validation and u…
anishshri-db Jun 26, 2024
7f5f96c
[SPARK-48691][BUILD] Upgrade scalatest related dependencies to the 3.…
wayneguow Jun 26, 2024
b47c614
writing schema
ericm-db Jun 25, 2024
c238e70
commenting
ericm-db Jun 25, 2024
acd8504
removing TODO
ericm-db Jun 26, 2024
de30c7a
tws tests pass
ericm-db Jun 26, 2024
e9d4fcc
added test case, serializing list of columnFamilyMetadata instead of …
ericm-db Jun 26, 2024
998a019
adding test case
ericm-db Jun 26, 2024
062f955
comment
ericm-db Jun 26, 2024
723f23c
adding purging logic
ericm-db Jun 26, 2024
32e73d0
added test case for purging
ericm-db Jun 26, 2024
1581264
[SPARK-48717][PYTHON][SS] Catch ForeachBatch py4j InterruptedExceptio…
WweiL Jun 27, 2024
906af78
[SPARK-48578][SQL][FOLLOWUP] add UTF8 string validation related funct…
uros-db Jun 27, 2024
b5f76b1
[SPARK-48723][INFRA] Run `git cherry-pick --abort` if backporting is …
yaooqinn Jun 27, 2024
14272e8
[SPARK-48721][SQL][DOCS] Fix the doc of `decode` function in SQL API …
yaooqinn Jun 27, 2024
ec7dde7
[SPARK-42610][CONNECT][FOLLOWUP] Add some test cases for `SQLImplicit…
wayneguow Jun 27, 2024
ea0cd01
[SPARK-39627][SQL][FOLLOWUP] Cleanup deprecated api usage related to …
wayneguow Jun 27, 2024
7c7c196
[SPARK-48712][SQL] Perf Improvement for encode with empty values or U…
yaooqinn Jun 27, 2024
48f39b8
[SPARK-48729][SQL] Add a UserDefinedFunction interface to represent a…
allisonwang-db Jun 27, 2024
2e31572
Revert "[SPARK-48639][CONNECT][PYTHON] Add Origin to Relation.Relatio…
HyukjinKwon Jun 27, 2024
b154623
[MINOR][TESTS] Always remove spark.master in ReusedConnectTestCase
HyukjinKwon Jun 27, 2024
58d1a89
[SPARK-48555][PYTHON][FOLLOW-UP] Simplify the support of `Any` parame…
zhengruifeng Jun 27, 2024
ff2d177
[MINOR][DOCS] Make pivot doctest deterministic
HyukjinKwon Jun 27, 2024
5943905
[MINOR][PYTHON][TESTS] Remove duplicate schema checking
HyukjinKwon Jun 27, 2024
642c4bb
[SPARK-48639][CONNECT][PYTHON] Add Origin to RelationCommon
HyukjinKwon Jun 27, 2024
5b53c6c
[MINOR][PYTHON][DOCS] Fix indents in function API references
zhengruifeng Jun 27, 2024
c788d12
[SPARK-48733][PYTHON][TESTS] Do not test SET command in Python UDTF test
HyukjinKwon Jun 27, 2024
7bf9119
[SPARK-48734][PYTHON][TESTS] Separate local cluster test from test_ar…
HyukjinKwon Jun 27, 2024
d89aad3
[SPARK-47927][SQL][FOLLOWUP] fix ScalaUDF output nullability
cloud-fan Jun 27, 2024
b11608c
[SPARK-48428][SQL] Fix IllegalStateException in NestedColumnAliasing
Jun 27, 2024
b5a55e4
[SPARK-46957][CORE] Decommission migrated shuffle files should be abl…
Ngone51 Jun 27, 2024
40ad829
[SPARK-48586][SS] Remove lock acquisition in doMaintenance() by makin…
riyaverm-db Jun 27, 2024
baf461b
[SPARK-48708][CORE] Remove three unnecessary type registrations from …
LuciferYang Jun 27, 2024
df13ca0
[SPARK-48735][SQL] Performance Improvement for BIN function
yaooqinn Jun 27, 2024
1cdd5fa
[SPARK-48736][PYTHON] Support infra fro additional includes for Pytho…
grundprinzip Jun 27, 2024
2c9eb1c
[SPARK-48682][SQL] Use ICU in InitCap expression for UTF8_BINARY strings
uros-db Jun 28, 2024
6304484
[SPARK-48282][SQL] Alter string search logic for UTF8_BINARY_LCASE co…
uros-db Jun 28, 2024
8cd095f
[SPARK-48738][SQL] Correct since version for built-in func alias `ran…
wayneguow Jun 28, 2024
3bf7de0
[SPARK-48668][SQL] Support ALTER NAMESPACE ... UNSET PROPERTIES in v2
panbingkun Jun 28, 2024
2eeebef
[SPARK-46957][CORE][FOLLOW-UP] Use Collections.emptyMap for Java comp…
Ngone51 Jun 28, 2024
9141aa4
[SPARK-48744][CORE] Log entry should be constructed only once
gengliangwang Jun 28, 2024
a2c4be0
[SPARK-47233][CONNECT][SS][FOLLOW-UP] Add eventually for terminated e…
HyukjinKwon Jun 28, 2024
69f3a9b
[SPARK-48586][SS][FOLLOWUP] RocksDB and RocksDBFileManager code style…
riyaverm-db Jun 28, 2024
fc98ccd
[SPARK-48746][PYTHON][SS][TESTS] Avoid using global temp view in fore…
HyukjinKwon Jun 28, 2024
80277ee
[SPARK-48745][INFRA][PYTHON][TESTS] Remove unnecessary installation `…
panbingkun Jun 28, 2024
4e57f06
[SPARK-48307][SQL][FOLLOWUP] not-inlined CTE references sibling shoul…
cloud-fan Jun 28, 2024
6bfeb09
[SPARK-48757][CORE] Make `IndexShuffleBlockResolver` have explicit co…
dongjoon-hyun Jun 28, 2024
f49418b
[SPARK-48751][INFRA][PYTHON][TESTS] Re-balance `pyspark-pandas-connec…
panbingkun Jun 30, 2024
0487d78
[SPARK-48748][SQL] Cache numChars in UTF8String
uros-db Jul 1, 2024
a84a6a4
[SPARK-48749][SQL] Simplify UnaryPositive and eliminate its Catalyst …
yaooqinn Jul 1, 2024
f70ce13
[SPARK-48638][INFRA][FOLLOW-UP] Add graphviz into CI to run the relat…
HyukjinKwon Jul 1, 2024
399980e
[MINOR][DOCS] Fix the type hints of `functions.first(..., ignorenulls…
HyukjinKwon Jul 1, 2024
bc16b24
[SPARK-48765][DEPLOY] Enhance default value evaluation for SPARK_IDEN…
pan3793 Jul 1, 2024
cd1f1af
[SPARK-48747][SQL] Add code point iterator to UTF8String
uros-db Jul 1, 2024
48eb4d5
[SPARK-48737] Perf improvement during analysis - Create exception onl…
urosstan-db Jul 1, 2024
703b076
[SPARK-48697][SQL] Add collation aware string filters
stefankandic Jul 1, 2024
bab129d
combining rules
ericm-db Jul 1, 2024
9b94439
passing in encoders to columnfamilyschema
ericm-db Jul 1, 2024
5c29d8d
[SPARK-48768][PYTHON][CONNECT] Should not cache `explain`
zhengruifeng Jul 1, 2024
5ac7c9b
[SPARK-48766][PYTHON] Document the behavior difference of `extraction…
zhengruifeng Jul 1, 2024
6768eea
Feedback
ericm-db Jul 2, 2024
afb5e39
added base class
ericm-db Jul 2, 2024
d515740
rebase
ericm-db Jul 2, 2024
a1288a4
refactors
ericm-db Jul 2, 2024
a670585
comment
ericm-db Jul 2, 2024
9304223
[SPARK-44718][FOLLOWUP][DOCS] Avoid using ConfigEntry in spark.sql.co…
yaooqinn Jul 2, 2024
8a5f4e0
[SPARK-48759][SQL] Add migration doc for CREATE TABLE AS SELECT behav…
asl3 Jul 2, 2024
4ee37ed
[SPARK-48764][PYTHON] Filtering out IPython-related frames from user …
itholic Jul 2, 2024
353c2da
feedback, creating ColumnFamilySchemaFactory
ericm-db Jul 2, 2024
f471bfe
adding version in tws
ericm-db Jul 2, 2024
db9e1ac
[SPARK-48177][BUILD] Upgrade `Apache Parquet` to 1.14.1
Fokko Jul 2, 2024
49fece8
added override modifier
ericm-db Jul 2, 2024
7ad352f
feedback
ericm-db Jul 2, 2024
ee0d306
[SPARK-48589][SQL][SS] Add option snapshotStartBatchId and snapshotPa…
eason-yuchen-liu Jul 2, 2024
efe2e74
Merge branch 'master' into state-schema-tws
ericm-db Jul 2, 2024
3ea5d29
using tempdir
ericm-db Jul 2, 2024
5eee250
using map instead of list
ericm-db Jul 2, 2024
dce9968
removing purging
ericm-db Jul 2, 2024
9b1a7b2
tests pass
ericm-db Jul 3, 2024
f2857b9
combining rules
ericm-db Jul 3, 2024
4337016
removing metadataCacheEnabled
ericm-db Jul 3, 2024
dfb122f
removing COLUMN_FAMILY_SCHEMA_VERSION
ericm-db Jul 3, 2024
21af247
reverting hdfs metadata log changes
ericm-db Jul 3, 2024
3a36d06
feedback
ericm-db Jul 4, 2024
1b6ea1a
removing unused imports
ericm-db Jul 4, 2024
1df1e5d
line
ericm-db Jul 4, 2024
25b7b80
case match
ericm-db Jul 8, 2024
ede3136
feedback
ericm-db Jul 9, 2024
0a7945e
adding todos
ericm-db Jul 9, 2024
c1a041d
adding PR link
ericm-db Jul 9, 2024
feb4c01
removing batchId as dir
ericm-db Jul 9, 2024
c24490a
removing links
ericm-db Jul 10, 2024
b250aa4
sparkthrowablesuite
ericm-db Jul 10, 2024
[SPARK-48623][CORE] Structured Logging Migrations
### What changes were proposed in this pull request?
This PR migrates Scala logging calls to comply with the Scala style changes in [apache#46979](apache#46947).
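
In practice the migration replaces `s"..."`-interpolated log messages with Spark's `log"..."` interpolator and wraps each dynamic value in `MDC(LogKeys.*, ...)` so it is emitted as a structured field. A minimal before/after sketch of the pattern, mirroring the hunks below (the surrounding class is illustrative only):

```scala
import org.apache.spark.internal.{Logging, LogKeys, MDC}

// Illustrative class, not part of this PR; it only demonstrates the style rule
// the migration enforces.
class ExampleListenerBus extends Logging {
  def onUnknownEvent(event: AnyRef): Unit = {
    // Before: plain Scala string interpolation; the value is not captured as a field.
    //   logWarning(s"Unknown StreamingQueryListener event: $event")

    // After: Spark's log"" interpolator; MDC records the value under LogKeys.EVENT.
    logWarning(log"Unknown StreamingQueryListener event: ${MDC(LogKeys.EVENT, event)}")
  }
}
```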

### Why are the changes needed?
This makes development and PR review of the structured logging migration easier.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested by ensuring `dev/scalastyle` checks pass

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#46980 from asl3/logging-migrationscala.

Authored-by: Amanda Liu <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
asl3 authored and gengliangwang committed Jun 19, 2024
commit b0e2cb575390d9dabb1142a78f4ceed48c059212
26 changes: 26 additions & 0 deletions common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala
Original file line number Diff line number Diff line change
@@ -65,7 +65,10 @@ private[spark] object LogKeys {
case object ADDED_JARS extends LogKey
case object ADMIN_ACLS extends LogKey
case object ADMIN_ACL_GROUPS extends LogKey
case object ADVISORY_TARGET_SIZE extends LogKey
case object AGGREGATE_FUNCTIONS extends LogKey
case object ALIGNED_FROM_TIME extends LogKey
case object ALIGNED_TO_TIME extends LogKey
case object ALPHA extends LogKey
case object ANALYSIS_ERROR extends LogKey
case object APP_ATTEMPT_ID extends LogKey
@@ -77,8 +80,10 @@ private[spark] object LogKeys {
case object APP_STATE extends LogKey
case object ARCHIVE_NAME extends LogKey
case object ARGS extends LogKey
case object ARTIFACT_ID extends LogKey
case object ATTRIBUTE_MAP extends LogKey
case object AUTH_ENABLED extends LogKey
case object AVG_BATCH_PROC_TIME extends LogKey
case object BACKUP_FILE extends LogKey
case object BARRIER_EPOCH extends LogKey
case object BARRIER_ID extends LogKey
@@ -99,6 +104,8 @@ private[spark] object LogKeys {
case object BROADCAST_OUTPUT_STATUS_SIZE extends LogKey
case object BUCKET extends LogKey
case object BYTECODE_SIZE extends LogKey
case object BYTE_BUFFER extends LogKey
case object BYTE_SIZE extends LogKey
case object CACHED_TABLE_PARTITION_METADATA_SIZE extends LogKey
case object CACHE_AUTO_REMOVED_SIZE extends LogKey
case object CACHE_UNTIL_HIGHEST_CONSUMED_SIZE extends LogKey
@@ -109,6 +116,7 @@ private[spark] object LogKeys {
case object CATALOG_NAME extends LogKey
case object CATEGORICAL_FEATURES extends LogKey
case object CHECKPOINT_FILE extends LogKey
case object CHECKPOINT_INTERVAL extends LogKey
case object CHECKPOINT_LOCATION extends LogKey
case object CHECKPOINT_PATH extends LogKey
case object CHECKPOINT_ROOT extends LogKey
@@ -186,6 +194,7 @@ private[spark] object LogKeys {
case object DELEGATE extends LogKey
case object DELTA extends LogKey
case object DEPRECATED_KEY extends LogKey
case object DERIVATIVE extends LogKey
case object DESCRIPTION extends LogKey
case object DESIRED_NUM_PARTITIONS extends LogKey
case object DESIRED_TREE_DEPTH extends LogKey
@@ -197,6 +206,7 @@ private[spark] object LogKeys {
case object DRIVER_MEMORY_SIZE extends LogKey
case object DRIVER_STATE extends LogKey
case object DROPPED_PARTITIONS extends LogKey
case object DSTREAM extends LogKey
case object DURATION extends LogKey
case object EARLIEST_LOADED_VERSION extends LogKey
case object EFFECTIVE_STORAGE_LEVEL extends LogKey
@@ -251,6 +261,7 @@ private[spark] object LogKeys {
case object FEATURE_NAME extends LogKey
case object FETCH_SIZE extends LogKey
case object FIELD_NAME extends LogKey
case object FILES extends LogKey
case object FILE_ABSOLUTE_PATH extends LogKey
case object FILE_END_OFFSET extends LogKey
case object FILE_FORMAT extends LogKey
@@ -307,6 +318,7 @@ private[spark] object LogKeys {
case object INIT_MODE extends LogKey
case object INPUT extends LogKey
case object INPUT_SPLIT extends LogKey
case object INTEGRAL extends LogKey
case object INTERVAL extends LogKey
case object ISOLATION_LEVEL extends LogKey
case object ISSUE_DATE extends LogKey
@@ -394,6 +406,7 @@ private[spark] object LogKeys {
case object MIN_COMPACTION_BATCH_ID extends LogKey
case object MIN_NUM_FREQUENT_PATTERN extends LogKey
case object MIN_POINT_PER_CLUSTER extends LogKey
case object MIN_RATE extends LogKey
case object MIN_SHARE extends LogKey
case object MIN_SIZE extends LogKey
case object MIN_TIME extends LogKey
@@ -490,6 +503,7 @@ private[spark] object LogKeys {
case object NUM_PREFIXES extends LogKey
case object NUM_PRUNED extends LogKey
case object NUM_PUSH_MERGED_LOCAL_BLOCKS extends LogKey
case object NUM_RECEIVERS extends LogKey
case object NUM_RECORDS_READ extends LogKey
case object NUM_RELEASED_LOCKS extends LogKey
case object NUM_REMAINED extends LogKey
@@ -547,6 +561,7 @@ private[spark] object LogKeys {
case object PARTITIONER extends LogKey
case object PARTITION_ID extends LogKey
case object PARTITION_IDS extends LogKey
case object PARTITION_SIZE extends LogKey
case object PARTITION_SPECIFICATION extends LogKey
case object PARTITION_SPECS extends LogKey
case object PATH extends LogKey
@@ -575,6 +590,7 @@ private[spark] object LogKeys {
case object PROCESSING_TIME extends LogKey
case object PRODUCER_ID extends LogKey
case object PROPERTY_NAME extends LogKey
case object PROPORTIONAL extends LogKey
case object PROTOCOL_VERSION extends LogKey
case object PROVIDER extends LogKey
case object PUSHED_FILTERS extends LogKey
@@ -595,6 +611,8 @@ private[spark] object LogKeys {
case object QUERY_PLAN_LENGTH_MAX extends LogKey
case object QUERY_RUN_ID extends LogKey
case object RANGE extends LogKey
case object RATE_LIMIT extends LogKey
case object RATIO extends LogKey
case object RDD_CHECKPOINT_DIR extends LogKey
case object RDD_DEBUG_STRING extends LogKey
case object RDD_DESCRIPTION extends LogKey
@@ -646,6 +664,8 @@ private[spark] object LogKeys {
case object RULE_NAME extends LogKey
case object RUN_ID extends LogKey
case object SCALA_VERSION extends LogKey
case object SCALING_DOWN_RATIO extends LogKey
case object SCALING_UP_RATIO extends LogKey
case object SCHEDULER_POOL_NAME extends LogKey
case object SCHEDULING_MODE extends LogKey
case object SCHEMA extends LogKey
@@ -671,12 +691,14 @@ private[spark] object LogKeys {
case object SHUFFLE_SERVICE_NAME extends LogKey
case object SIGMAS_LENGTH extends LogKey
case object SIGNAL extends LogKey
case object SINK extends LogKey
case object SIZE extends LogKey
case object SLEEP_TIME extends LogKey
case object SLIDE_DURATION extends LogKey
case object SMALLEST_CLUSTER_INDEX extends LogKey
case object SNAPSHOT_VERSION extends LogKey
case object SOCKET_ADDRESS extends LogKey
case object SOURCE extends LogKey
case object SOURCE_PATH extends LogKey
case object SPARK_BRANCH extends LogKey
case object SPARK_BUILD_DATE extends LogKey
@@ -708,6 +730,7 @@ private[spark] object LogKeys {
case object STORAGE_LEVEL_REPLICATION extends LogKey
case object STORAGE_MEMORY_SIZE extends LogKey
case object STORE_ID extends LogKey
case object STREAMING_CONTEXT extends LogKey
case object STREAMING_DATA_SOURCE_DESCRIPTION extends LogKey
case object STREAMING_DATA_SOURCE_NAME extends LogKey
case object STREAMING_OFFSETS_END extends LogKey
@@ -729,6 +752,7 @@ private[spark] object LogKeys {
case object TARGET_NUM_EXECUTOR extends LogKey
case object TARGET_NUM_EXECUTOR_DELTA extends LogKey
case object TARGET_PATH extends LogKey
case object TARGET_SIZE extends LogKey
case object TASK_ATTEMPT_ID extends LogKey
case object TASK_ID extends LogKey
case object TASK_INDEX extends LogKey
@@ -752,6 +776,7 @@ private[spark] object LogKeys {
case object THREAD_POOL_SIZE extends LogKey
case object THREAD_POOL_WAIT_QUEUE_SIZE extends LogKey
case object THRESHOLD extends LogKey
case object THRESH_TIME extends LogKey
case object TIME extends LogKey
case object TIMEOUT extends LogKey
case object TIMER extends LogKey
@@ -814,4 +839,5 @@ private[spark] object LogKeys {
case object XML_SCHEDULING_MODE extends LogKey
case object XSD_PATH extends LogKey
case object YOUNG_GENERATION_GC extends LogKey
case object ZERO_TIME extends LogKey
}
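
The keys added above are plain `case object`s extending `LogKey`; the call sites migrated in the rest of this commit reference them through `MDC` inside the `log` interpolator, and longer messages are built by concatenating `log"..."` fragments. A small usage sketch under those assumptions (the class and message are hypothetical; `LogKeys.PATH` and `LogKeys.BYTE_SIZE` are keys from this file, used purely for illustration):

```scala
import org.apache.spark.internal.{Logging, LogKeys, MDC}

// Hypothetical helper, for illustration only.
class CheckpointReporter extends Logging {
  def reportWrite(path: String, bytes: Long): Unit = {
    // Two log"" fragments are concatenated; each MDC(...) value is recorded as a
    // structured field (PATH, BYTE_SIZE) as well as appearing in the message text.
    logInfo(log"Wrote checkpoint to ${MDC(LogKeys.PATH, path)} " +
      log"(${MDC(LogKeys.BYTE_SIZE, bytes)} bytes)")
  }
}
```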
Original file line number Diff line number Diff line change
@@ -22,7 +22,7 @@ import java.util.concurrent.CopyOnWriteArrayList
import scala.jdk.CollectionConverters._

import org.apache.spark.connect.proto.{Command, ExecutePlanResponse, Plan, StreamingQueryEventType}
import org.apache.spark.internal.Logging
import org.apache.spark.internal.{Logging, LogKeys, MDC}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.connect.client.CloseableIterator
import org.apache.spark.sql.streaming.StreamingQueryListener.{Event, QueryIdleEvent, QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}
@@ -115,7 +115,7 @@ class StreamingQueryListenerBus(sparkSession: SparkSession) extends Logging {
case StreamingQueryEventType.QUERY_TERMINATED_EVENT =>
postToAll(QueryTerminatedEvent.fromJson(event.getEventJson))
case _ =>
logWarning(s"Unknown StreamingQueryListener event: $event")
logWarning(log"Unknown StreamingQueryListener event: ${MDC(LogKeys.EVENT, event)}")
}
})
}
@@ -144,7 +144,10 @@ class StreamingQueryListenerBus(sparkSession: SparkSession) extends Logging {
listener.onQueryIdle(t)
case t: QueryTerminatedEvent =>
listener.onQueryTerminated(t)
case _ => logWarning(s"Unknown StreamingQueryListener event: $event")
case _ =>
logWarning(
log"Unknown StreamingQueryListener event: " +
log"${MDC(LogKeys.EVENT, event)}")
}
} catch {
case e: Exception =>
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ import io.grpc.stub.StreamObserver
import org.apache.spark.connect.proto.ExecutePlanResponse
import org.apache.spark.connect.proto.StreamingQueryListenerBusCommand
import org.apache.spark.connect.proto.StreamingQueryListenerEventsResult
import org.apache.spark.internal.Logging
import org.apache.spark.internal.{Logging, LogKeys, MDC}
import org.apache.spark.sql.connect.service.ExecuteHolder

/**
@@ -83,20 +83,30 @@ class SparkConnectStreamingQueryListenerHandler(executeHolder: ExecuteHolder) ex
} catch {
case NonFatal(e) =>
logError(
s"[SessionId: $sessionId][UserId: $userId][operationId: " +
s"${executeHolder.operationId}] Error sending listener added response.",
log"[SessionId: ${MDC(LogKeys.SESSION_ID, sessionId)}]" +
log"[UserId: ${MDC(LogKeys.USER_ID, userId)}]" +
log"[operationId: " +
log"${MDC(LogKeys.OPERATION_HANDLE_IDENTIFIER, executeHolder.operationId)}] " +
log"Error sending listener added response.",
e)
listenerHolder.cleanUp()
return
}
}
logInfo(s"[SessionId: $sessionId][UserId: $userId][operationId: " +
s"${executeHolder.operationId}] Server side listener added. Now blocking until " +
"all client side listeners are removed or there is error transmitting the event back.")
logInfo(
log"[SessionId: ${MDC(LogKeys.SESSION_ID, sessionId)}][UserId: " +
log"${MDC(LogKeys.USER_ID, userId)}][operationId: " +
log"${MDC(LogKeys.OPERATION_HANDLE_IDENTIFIER, executeHolder.operationId)}] " +
log"Server side listener added. Now blocking until all client side listeners are " +
log"removed or there is error transmitting the event back.")
// Block the handling thread, and have serverListener continuously send back new events
listenerHolder.streamingQueryListenerLatch.await()
logInfo(s"[SessionId: $sessionId][UserId: $userId][operationId: " +
s"${executeHolder.operationId}] Server side listener long-running handling thread ended.")
logInfo(
log"[SessionId: ${MDC(LogKeys.SESSION_ID, sessionId)}][UserId: " +
log"${MDC(LogKeys.USER_ID, userId)}]" +
log"[operationId: " +
log"${MDC(LogKeys.OPERATION_HANDLE_IDENTIFIER, executeHolder.operationId)}] " +
log"Server side listener long-running handling thread ended.")
case StreamingQueryListenerBusCommand.CommandCase.REMOVE_LISTENER_BUS_LISTENER =>
listenerHolder.isServerSideListenerRegistered match {
case true =>
Original file line number Diff line number Diff line change
@@ -22,7 +22,7 @@ import java.io.Serializable
import scala.reflect.ClassTag

import org.apache.spark.SparkException
import org.apache.spark.internal.Logging
import org.apache.spark.internal.{Logging, LogKeys, MDC}
import org.apache.spark.util.Utils

/**
@@ -106,7 +106,8 @@ abstract class Broadcast[T: ClassTag](val id: Long) extends Serializable with Lo
assertValid()
_isValid = false
_destroySite = Utils.getCallSite().shortForm
logInfo("Destroying %s (from %s)".format(toString, _destroySite))
logInfo(log"Destroying ${MDC(LogKeys.BROADCAST, toString)} " +
log"(from ${MDC(LogKeys.CALL_SITE_SHORT_FORM, _destroySite)})")
doDestroy(blocking)
}

Original file line number Diff line number Diff line change
@@ -71,8 +71,8 @@ class ExternalShuffleService(sparkConf: SparkConf, securityManager: SecurityMana
if (localDirs.length >= 1) {
new File(localDirs.find(new File(_, dbName).exists()).getOrElse(localDirs(0)), dbName)
} else {
logWarning(s"'spark.local.dir' should be set first when we use db in " +
s"ExternalShuffleService. Note that this only affects standalone mode.")
logWarning("'spark.local.dir' should be set first when we use db in " +
"ExternalShuffleService. Note that this only affects standalone mode.")
null
}
}
Original file line number Diff line number Diff line change
@@ -19,7 +19,7 @@ package org.apache.spark.deploy.worker

import java.util.concurrent.atomic.AtomicBoolean

import org.apache.spark.internal.{Logging, MDC}
import org.apache.spark.internal.{Logging, LogKeys, MDC}
import org.apache.spark.internal.LogKeys.WORKER_URL
import org.apache.spark.rpc._

@@ -64,7 +64,7 @@ private[spark] class WorkerWatcher(
}

override def receive: PartialFunction[Any, Unit] = {
case e => logWarning(s"Received unexpected message: $e")
case e => logWarning(log"Received unexpected message: ${MDC(LogKeys.ERROR, e)}")
}

override def onConnected(remoteAddress: RpcAddress): Unit = {
Original file line number Diff line number Diff line change
@@ -28,7 +28,7 @@ import com.google.common.cache.CacheBuilder

import org.apache.spark.{SecurityManager, SparkConf, SparkEnv, SparkException}
import org.apache.spark.errors.SparkCoreErrors
import org.apache.spark.internal.{config, Logging, MDC}
import org.apache.spark.internal.{config, Logging, LogKeys, MDC}
import org.apache.spark.internal.LogKeys._
import org.apache.spark.io.NioBufferedFileInputStream
import org.apache.spark.network.buffer.{FileSegmentManagedBuffer, ManagedBuffer}
@@ -436,8 +436,8 @@ private[spark] class IndexShuffleBlockResolver(
if (checksumTmp.exists()) {
try {
if (!checksumTmp.delete()) {
logError(s"Failed to delete temporary checksum file " +
s"at ${checksumTmp.getAbsolutePath}")
logError(log"Failed to delete temporary checksum file at " +
log"${MDC(LogKeys.PATH, checksumTmp.getAbsolutePath)}")
}
} catch {
case e: Exception =>
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ import scala.util.control.NonFatal
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs._

import org.apache.spark.internal.Logging
import org.apache.spark.internal.{Logging, LogKeys, MDC}
import org.apache.spark.sql.execution.streaming.AbstractFileContextBasedCheckpointFileManager
import org.apache.spark.sql.execution.streaming.CheckpointFileManager.CancellableFSDataOutputStream

@@ -36,7 +36,7 @@ class AbortableStreamBasedCheckpointFileManager(path: Path, hadoopConf: Configur
s" an fs (path: $path) with abortable stream support")
}

logInfo(s"Writing atomically to $path based on abortable stream")
logInfo(log"Writing atomically to ${MDC(LogKeys.PATH, path)} based on abortable stream")

class AbortableStreamBasedFSDataOutputStream(
fsDataOutputStream: FSDataOutputStream,
Original file line number Diff line number Diff line change
@@ -503,8 +503,8 @@ class LogisticRegression @Since("1.2.0") (
tol, fitIntercept, maxBlockSizeInMB)

if (dataset.storageLevel != StorageLevel.NONE) {
instr.logWarning(s"Input instances will be standardized, blockified to blocks, and " +
s"then cached during training. Be careful of double caching!")
instr.logWarning("Input instances will be standardized, blockified to blocks, and " +
"then cached during training. Be careful of double caching!")
}

val instances = dataset.select(
@@ -569,8 +569,8 @@

val isConstantLabel = histogram.count(_ != 0.0) == 1
if ($(fitIntercept) && isConstantLabel && !usingBoundConstrainedOptimization) {
instr.logWarning(s"All labels are the same value and fitIntercept=true, so the " +
s"coefficients will be zeros. Training is not needed.")
instr.logWarning("All labels are the same value and fitIntercept=true, so the " +
"coefficients will be zeros. Training is not needed.")
val constantLabelIndex = Vectors.dense(histogram).argmax
val coefMatrix = new SparseMatrix(numCoefficientSets, numFeatures,
new Array[Int](numCoefficientSets + 1), Array.emptyIntArray, Array.emptyDoubleArray,
@@ -584,8 +584,8 @@ }
}

if (!$(fitIntercept) && isConstantLabel) {
instr.logWarning(s"All labels belong to a single class and fitIntercept=false. It's a " +
s"dangerous ground, so the algorithm may not converge.")
instr.logWarning("All labels belong to a single class and fitIntercept=false. It's a " +
"dangerous ground, so the algorithm may not converge.")
}

val featuresMean = summarizer.mean.toArray