Commit 53a330b

xinrong-meng authored and HyukjinKwon committed
[SPARK-53602][PYTHON] Profile dump improvement and profiler doc fix
### What changes were proposed in this pull request?
- Avoid race condition when creating directory of profile dump
- Fix profiler docs

### Why are the changes needed?
Race-safe dump and better docs.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #52360 from xinrong-meng/profile_impr.

Authored-by: Xinrong Meng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
1 parent 78aba00 commit 53a330b

2 files changed, 7 additions and 8 deletions

python/pyspark/sql/profiler.py

Lines changed: 5 additions & 6 deletions
@@ -194,8 +194,7 @@ def dump(id: int) -> None:
             s = stats.get(id)

             if s is not None:
-                if not os.path.exists(path):
-                    os.makedirs(path)
+                os.makedirs(path, exist_ok=True)
                 p = os.path.join(path, f"udf_{id}_perf.pstats")
                 s.dump_stats(p)
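
For illustration, the check-then-create pattern removed in this hunk can fail when two workers dump profiles into the same directory at once: both may see that the path is missing, and the slower one then gets `FileExistsError` from `os.makedirs`. The following is a minimal standalone sketch, not code from this patch. The helper names `ensure_dir_racy`, `ensure_dir_safe`, and `run_concurrently`, as well as the temporary directory layout, are assumptions made for the example.

```python
import os
import tempfile
import threading


def ensure_dir_racy(path):
    # Old pattern (shown for contrast, not executed below): another worker can
    # create `path` between the exists() check and makedirs(), so makedirs()
    # may raise FileExistsError. The failure is timing dependent.
    if not os.path.exists(path):
        os.makedirs(path)


def ensure_dir_safe(path):
    # New pattern: an already existing directory is not treated as an error.
    os.makedirs(path, exist_ok=True)


def run_concurrently(fn, path, workers=8):
    # Hypothetical helper: release all workers at once and collect any
    # FileExistsError raised while they create the same directory.
    errors = []
    barrier = threading.Barrier(workers)

    def worker():
        barrier.wait()
        try:
            fn(path)
        except FileExistsError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "profiles", "dump")
    safe_errors = run_concurrently(ensure_dir_safe, target)
    created = os.path.isdir(target)

    # Losing the race is equivalent to calling makedirs() without exist_ok
    # on a path that already exists, which can be shown deterministically.
    try:
        os.makedirs(target)
        lost_race_error = None
    except FileExistsError as exc:
        lost_race_error = exc
```

The sketch does not try to trigger the race in `ensure_dir_racy`, because whether it fails depends on scheduling. It only shows that the `exist_ok=True` form tolerates concurrent callers and what error the old form would surface when it loses.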

@@ -231,8 +230,7 @@ def dump(id: int) -> None:
             cm = code_map.get(id)

             if cm is not None:
-                if not os.path.exists(path):
-                    os.makedirs(path)
+                os.makedirs(path, exist_ok=True)
                 p = os.path.join(path, f"udf_{id}_memory.txt")

                 with open(p, "w+") as f:
@@ -316,7 +314,7 @@ class Profile:
     """User-facing profile API. This instance can be accessed by
     :attr:`spark.profile`.

-    .. versionadded: 4.0.0
+    .. versionadded:: 4.0.0
     """

     def __init__(self, profiler_collector: ProfilerCollector):
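
The one-character docstring fix in this hunk matters because reStructuredText recognizes a directive only when its name is followed by `::`. With a single colon, docutils should treat the line as a comment, so the version note would likely be missing from the rendered docs without any warning. As a hedged sketch of how such typos could be caught, the snippet below scans a docstring with a regular expression. The helper `find_single_colon_directives` is hypothetical and not part of PySpark or Sphinx, and the pattern is a heuristic rather than a full reST parser.

```python
import re

# Heuristic: an explicit-markup line (".. name:") whose name is followed by
# exactly one colon. Hyperlink targets (".. _name:") and footnotes (".. [1]")
# are skipped because the name must start with a letter.
SINGLE_COLON_DIRECTIVE = re.compile(
    r"^\s*\.\.\s+([A-Za-z][\w-]*):(?!:)", re.MULTILINE
)


def find_single_colon_directives(docstring):
    """Return the names of directive-like lines written with ':' instead of '::'."""
    return [m.group(1) for m in SINGLE_COLON_DIRECTIVE.finditer(docstring)]


before = """User-facing profile API. This instance can be accessed by
    :attr:`spark.profile`.

    .. versionadded: 4.0.0
    """

after = """User-facing profile API. This instance can be accessed by
    :attr:`spark.profile`.

    .. versionadded:: 4.0.0
    """

malformed_before = find_single_colon_directives(before)
malformed_after = find_single_colon_directives(after)
```

A check like this would also flag genuine reST comments that happen to look like `.. word: text`, so its results are candidates to review rather than definite errors.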
@@ -421,7 +419,8 @@ def render(
         id : int
             The UDF ID whose profiling results should be rendered.
         type : str, optional
-            The profiler type to clear results for, which can be either "perf" or "memory".
+            The profiler type to render results for, which can be either "perf" or "memory".
+            If not specified, defaults to "perf".
         renderer : str or callable, optional
             The renderer to use. If not specified, the default renderer will be "flameprof"
             for the "perf" profiler, which returns an :class:`IPython.display.HTML` object in

python/pyspark/sql/session.py

Lines changed: 2 additions & 2 deletions
@@ -950,13 +950,13 @@ def dataSource(self) -> "DataSourceRegistration":

     @property
     def profile(self) -> Profile:
-        """Returns a :class:`Profile` for performance/memory profiling.
+        """Returns a :class:`pyspark.sql.profile.Profile` for performance/memory profiling.

         .. versionadded:: 4.0.0

         Returns
         -------
-        :class:`Profile`
+        :class:`pyspark.sql.profile.Profile`

         Notes
         -----
