Commit 2b76ef4

[SPARK-47683][PYTHON][BUILD][FOLLOW-UP] Exclude lib/py4j*zip in pyspark-connect package
### What changes were proposed in this pull request?

This PR is a followup of #45053, which currently includes `lib/py4j*zip` in the `pyspark-connect` package. The file is being picked up by https://github.com/apache/spark/blob/master/python/MANIFEST.in#L26. Other files are not included because `setup.py` for `pyspark-connect` does not create the `deps` directory, but `lib` is still included.

### Why are the changes needed?

To exclude unrelated files.

### Does this PR introduce _any_ user-facing change?

No, the main change has not been released yet.

### How was this patch tested?

Manually packaged, and checked the contents via `vi`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46331 from HyukjinKwon/SPARK-47683-followup.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
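The manual check described under "How was this patch tested?" could also be scripted. The following is a minimal sketch and not part of this commit: it assumes the built package is a `.tar.gz` source distribution, and it uses a hypothetical in-memory archive with made-up member names in place of a real build artifact. `find_members` and `build_demo_sdist` are illustrative helper names.

```python
import fnmatch
import io
import tarfile


def find_members(archive, pattern):
    """Return member names of an open tarfile that match a shell-style pattern."""
    return [name for name in archive.getnames() if fnmatch.fnmatch(name, pattern)]


def build_demo_sdist(names):
    """Build an in-memory .tar.gz of empty files with the given names (demo only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    return buf


# Hypothetical member names standing in for a real build artifact.
demo = build_demo_sdist([
    "pkg-0.0.0/setup.py",
    "pkg-0.0.0/pyspark/__init__.py",
])
with tarfile.open(fileobj=demo, mode="r:gz") as tar:
    # An empty result means no py4j zip leaked into the archive.
    leaked = find_members(tar, "*/lib/py4j*zip")

print(leaked)  # []
```

For a real check, the in-memory archive would be replaced by `tarfile.open(path_to_sdist)`, where the path is whatever the build actually produced.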
1 parent 2fbfb21 commit 2b76ef4

2 files changed, +14 -1 lines changed

python/packaging/classic/setup.py

Lines changed: 5 additions & 0 deletions
@@ -204,8 +204,13 @@ def run(self):
     copyfile("pyspark/shell.py", "pyspark/python/pyspark/shell.py")
 
     if in_spark:
+        # !!HACK ALERT!!
+        # `setup.py` has to be located in the same directory as the package.
+        # Therefore, we copy the current file, and place it in the `spark/python` directory.
+        # After that, we remove it in the end.
         copyfile("packaging/classic/setup.py", "setup.py")
         copyfile("packaging/classic/setup.cfg", "setup.cfg")
+
         # Construct the symlink farm - this is necessary since we can't refer to
         # the path above the package root and we need to copy the jars and scripts which
         # are up above the python root.
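The copy-then-remove approach described in the comment above can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the code in `setup.py`: `staged_copies` is a hypothetical helper name, and the demo runs against a throwaway temporary directory rather than the Spark source tree.

```python
import os
import tempfile
from contextlib import contextmanager
from shutil import copyfile


@contextmanager
def staged_copies(pairs):
    """Copy each (src, dst) pair, then delete every copied dst on exit."""
    created = []
    try:
        for src, dst in pairs:
            copyfile(src, dst)
            created.append(dst)
        yield created
    finally:
        # Runs on success and on failure, so the copies never linger.
        for dst in created:
            if os.path.exists(dst):
                os.remove(dst)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "packaging"))
    src = os.path.join(root, "packaging", "setup.py")
    dst = os.path.join(root, "setup.py")
    with open(src, "w") as f:
        f.write("# placeholder\n")

    with staged_copies([(src, dst)]):
        present_during = os.path.exists(dst)
    present_after = os.path.exists(dst)

print(present_during, present_after)  # True False
```

Putting the cleanup in `finally` mirrors the commit's structure, where the copied `setup.py` and `setup.cfg` are removed whether or not packaging succeeds.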

python/packaging/connect/setup.py

Lines changed: 9 additions & 1 deletion
@@ -25,7 +25,7 @@
 import sys
 from setuptools import setup
 import os
-from shutil import copyfile
+from shutil import copyfile, move
 import glob
 from pathlib import Path

@@ -109,6 +109,13 @@

 try:
     if in_spark:
+        # !!HACK ALERT!!
+        # 1. `setup.py` has to be located in the same directory as the package.
+        #    Therefore, we copy the current file, and place it in the `spark/python` directory.
+        #    After that, we remove it in the end.
+        # 2. Here it renames `lib` to `lib.back` so MANIFEST.in does not pick `py4j` up.
+        #    We rename it back in the end.
+        move("lib", "lib.back")
         copyfile("packaging/connect/setup.py", "setup.py")
         copyfile("packaging/connect/setup.cfg", "setup.cfg")

@@ -207,5 +214,6 @@
     )
 finally:
     if in_spark:
+        move("lib.back", "lib")
         os.remove("setup.py")
         os.remove("setup.cfg")
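The rename-and-restore step in this diff can be expressed as a small reusable pattern. The sketch below is illustrative only and is not how `setup.py` is written: `hidden_dir` is a hypothetical helper, the demo uses a temporary directory, and the raised error simply simulates a packaging failure to show that the directory is still restored.

```python
import os
import tempfile
from contextlib import contextmanager
from shutil import move


@contextmanager
def hidden_dir(path, suffix=".back"):
    """Rename `path` out of the way, and rename it back on exit."""
    backup = path + suffix
    move(path, backup)
    try:
        yield backup
    finally:
        # Restore even if the body raised.
        move(backup, path)


with tempfile.TemporaryDirectory() as root:
    lib = os.path.join(root, "lib")
    os.makedirs(lib)
    try:
        with hidden_dir(lib) as backup:
            hidden_during = not os.path.exists(lib) and os.path.isdir(backup)
            raise RuntimeError("simulated packaging failure")
    except RuntimeError:
        pass
    restored = os.path.isdir(lib) and not os.path.exists(backup)

print(hidden_during, restored)  # True True
```

One caveat of this sketch: if something recreates the original directory while it is hidden, `shutil.move` would place the backup inside it instead of renaming it back, so the pattern assumes nothing else writes to that path in the meantime.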
