Commit ab00533

[SPARK-47933][PYTHON][TESTS][FOLLOW-UP] Enable doctest pyspark.sql.connect.column
### What changes were proposed in this pull request?

Enable the doctest run for `pyspark.sql.connect.column`.

### Why are the changes needed?

To improve test coverage.

### Does this PR introduce _any_ user-facing change?

No, test-only change.

### How was this patch tested?

Manual check: I manually broke a doctest in `Column` and found that `pyspark.sql.connect.column` did not fail:

```
(spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k --python-executables python3 --testnames 'pyspark.sql.classic.column'
Running PySpark tests. Output is in /Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.classic.column']
python3 python_implementation is CPython
python3 version is: Python 3.12.2
Starting test(python3): pyspark.sql.classic.column (temp output: /Users/ruifeng.zheng/Dev/spark/python/target/4bdd14b8-92ba-43ba-a7fb-655e6769aeb9/python3__pyspark.sql.classic.column__i2_c1zct.log)
WARNING: Using incubator modules: jdk.incubator.vector
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
**********************************************************************
File "/Users/ruifeng.zheng/Dev/spark/python/pyspark/sql/column.py", line 385, in pyspark.sql.column.Column.contains
Failed example:
    df.filter(df.name.contains('o')).collect()
Differences (ndiff with -expected +actual):
    - [Row(age=5, name='Bobx')]
    ?                      -
    + [Row(age=5, name='Bob')]
**********************************************************************
   1 of   2 in pyspark.sql.column.Column.contains
***Test Failed*** 1 failures.
Had test failures in pyspark.sql.classic.column with python3; see logs.

(spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k --python-executables python3 --testnames 'pyspark.sql.connect.column'
Running PySpark tests. Output is in /Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.connect.column']
python3 python_implementation is CPython
python3 version is: Python 3.12.2
Starting test(python3): pyspark.sql.connect.column (temp output: /Users/ruifeng.zheng/Dev/spark/python/target/2acaff3c-ef1d-41eb-b63e-509f3e0192c0/python3__pyspark.sql.connect.column__66td62h9.log)
Finished test(python3): pyspark.sql.connect.column (3s)
Tests passed in 3 seconds
```

After this PR, it fails as expected:

```
(spark_dev_312) ➜  spark git:(master) ✗ python/run-tests -k --python-executables python3 --testnames 'pyspark.sql.connect.column'
Running PySpark tests. Output is in /Users/ruifeng.zheng/Dev/spark/python/unit-tests.log
Will test against the following Python executables: ['python3']
Will test the following Python tests: ['pyspark.sql.connect.column']
python3 python_implementation is CPython
python3 version is: Python 3.12.2
Starting test(python3): pyspark.sql.connect.column (temp output: /Users/ruifeng.zheng/Dev/spark/python/target/390ff7ae-7683-425c-b0d2-ee336e1ad452/python3__pyspark.sql.connect.column__f69b3smc.log)
WARNING: Using incubator modules: jdk.incubator.vector
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
org.apache.spark.SparkSQLException: [INVALID_CURSOR.DISCONNECTED] The cursor is invalid. The cursor has been disconnected by the server. SQLSTATE: HY109
  at org.apache.spark.sql.connect.execution.ExecuteGrpcResponseSender.execute(ExecuteGrpcResponseSender.scala:281)
  at org.apache.spark.sql.connect.execution.ExecuteGrpcResponseSender$$anon$1.run(ExecuteGrpcResponseSender.scala:101)
**********************************************************************
File "/Users/ruifeng.zheng/Dev/spark/python/pyspark/sql/column.py", line 385, in pyspark.sql.column.Column.contains
Failed example:
    df.filter(df.name.contains('o')).collect()
Expected:
    [Row(age=5, name='Bobx')]
Got:
    [Row(age=5, name='Bob')]
**********************************************************************
   1 of   2 in pyspark.sql.column.Column.contains
***Test Failed*** 1 failures.
Had test failures in pyspark.sql.connect.column with python3; see logs.
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46895 from zhengruifeng/fix_connect_column_doc_test.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
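For context, the test being deliberately broken above is the example embedded in the `Column.contains` docstring in `python/pyspark/sql/column.py`. Below is a minimal sketch of what such a docstring-embedded example looks like; the class scaffold and the DataFrame contents are illustrative assumptions chosen to be consistent with the `[Row(age=5, name='Bob')]` output in the log, not a verbatim copy of the Spark source.

```python
class Column:
    def contains(self, other):
        """
        Sketch of a docstring whose Examples section doubles as a doctest.
        The `spark` session used below is not defined here; it is injected
        into the doctest globals by the module's `_test()` helper.

        Examples
        --------
        >>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], ["age", "name"])
        >>> df.filter(df.name.contains('o')).collect()
        [Row(age=5, name='Bob')]
        """
        ...
```

`doctest.testmod` collects these `Examples` sections and executes them, which is why editing the expected output (e.g. to `'Bobx'`) is enough to make the module-level doctest run fail.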
Parent: 8cb78a7

1 file changed: +3 −3 lines changed

python/pyspark/sql/connect/column.py

Lines changed: 3 additions & 3 deletions
```diff
@@ -579,17 +579,17 @@ def _test() -> None:
     import sys
     import doctest
     from pyspark.sql import SparkSession as PySparkSession
-    import pyspark.sql.connect.column
+    import pyspark.sql.column
 
-    globs = pyspark.sql.connect.column.__dict__.copy()
+    globs = pyspark.sql.column.__dict__.copy()
     globs["spark"] = (
         PySparkSession.builder.appName("sql.connect.column tests")
         .remote(os.environ.get("SPARK_CONNECT_TESTING_REMOTE", "local[4]"))
         .getOrCreate()
     )
 
     (failure_count, test_count) = doctest.testmod(
-        pyspark.sql.connect.column,
+        pyspark.sql.column,
         globs=globs,
         optionflags=doctest.ELLIPSIS
         | doctest.NORMALIZE_WHITESPACE
```
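Pieced together from the hunk above, the patched `_test()` helper roughly looks like the sketch below. The change points `doctest.testmod` at `pyspark.sql.column`, where the shared docstring examples actually live (as the failing example in the logs shows), while the `spark` entry in `globs` remains a Spark Connect session created via `.remote(...)`, so the same examples now run against Connect. Everything outside the shown hunk (the `os` import and the trailing teardown) is an assumption based on the common PySpark doctest pattern, not part of this diff.

```python
def _test() -> None:
    import os
    import sys
    import doctest
    from pyspark.sql import SparkSession as PySparkSession
    import pyspark.sql.column

    # Run the doctests defined in pyspark.sql.column, but against a
    # Spark Connect session so the connect code path is exercised.
    globs = pyspark.sql.column.__dict__.copy()
    globs["spark"] = (
        PySparkSession.builder.appName("sql.connect.column tests")
        .remote(os.environ.get("SPARK_CONNECT_TESTING_REMOTE", "local[4]"))
        .getOrCreate()
    )

    (failure_count, test_count) = doctest.testmod(
        pyspark.sql.column,
        globs=globs,
        optionflags=doctest.ELLIPSIS
        | doctest.NORMALIZE_WHITESPACE,
    )

    # Assumed teardown (outside the shown hunk): stop the session and
    # propagate doctest failures as a non-zero exit code.
    globs["spark"].stop()
    if failure_count:
        sys.exit(-1)
```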
