Conversation

@dongjoon-hyun (Member) commented Nov 9, 2021

What changes were proposed in this pull request?

This PR is a follow-up of #34526 that adjusts one additional `pyspark.rdd` doctest.

```python
- >>> b''.join(result).decode('utf-8')
+ >>> ''.join([r.decode('utf-8') if isinstance(r, bytes) else r for r in result])
```
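The changed expression works on both the old (bytes) and new (str) shapes of `result`. A minimal sketch of why the new doctest expression is version-neutral, using the two simulated outputs from the PR description (the helper name `join_lines` is illustrative, not part of the patch):

```python
def join_lines(result):
    """Join lines that may be bytes (Python 3.8/3.9) or str (Python 3.10+)."""
    return ''.join(r.decode('utf-8') if isinstance(r, bytes) else r
                   for r in result)

# Simulated doctest outputs from the two Python versions:
print(join_lines([b'bar\n', b'foo\n']))  # Python 3.8/3.9 shape
print(join_lines(['bar\n', 'foo\n']))    # Python 3.10 shape
```

The old expression, `b''.join(result).decode('utf-8')`, raises a `TypeError` on Python 3.10 because `bytes.join` cannot accept `str` items.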

Why are the changes needed?

**Python 3.8/3.9**

```python
Using Python version 3.8.12 (default, Nov  8 2021 17:15:19)
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1636432954207).
SparkSession available as 'spark'.
>>> from tempfile import NamedTemporaryFile
>>> tempFile3 = NamedTemporaryFile(delete=True)
>>> tempFile3.close()
>>> codec = "org.apache.hadoop.io.compress.GzipCodec"
>>> sc.parallelize(['foo', 'bar']).saveAsTextFile(tempFile3.name, codec)
>>> from fileinput import input, hook_compressed
>>> from glob import glob
>>> result = sorted(input(glob(tempFile3.name + "/part*.gz"), openhook=hook_compressed))
>>> result
[b'bar\n', b'foo\n']
```

**Python 3.10**

```python
Using Python version 3.10.0 (default, Oct 29 2021 14:35:18)
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1636433378727).
SparkSession available as 'spark'.
>>> from tempfile import NamedTemporaryFile
>>> tempFile3 = NamedTemporaryFile(delete=True)
>>> tempFile3.close()
>>> codec = "org.apache.hadoop.io.compress.GzipCodec"
>>> sc.parallelize(['foo', 'bar']).saveAsTextFile(tempFile3.name, codec)
>>> from fileinput import input, hook_compressed
>>> from glob import glob
>>> result = sorted(input(glob(tempFile3.name + "/part*.gz"), openhook=hook_compressed))
>>> result
['bar\n', 'foo\n']
```
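The bytes-vs-str difference comes from `fileinput.hook_compressed`, not from Spark, so the read side of the sessions above can be reproduced without a SparkContext. A standalone sketch (the gzip "part" file is written by hand here, standing in for `saveAsTextFile` output):

```python
import gzip
import os
import tempfile
from fileinput import input, hook_compressed
from glob import glob

# Write a gzip "part" file like the one saveAsTextFile would produce.
tmpdir = tempfile.mkdtemp()
with gzip.open(os.path.join(tmpdir, 'part-00000.gz'), 'wb') as f:
    f.write(b'foo\nbar\n')

# Read it back the way the doctest does; hook_compressed yields bytes on
# Python 3.8/3.9 and str on Python 3.10+ (per the PR description).
with input(glob(tmpdir + '/part*.gz'), openhook=hook_compressed) as fi:
    result = sorted(fi)

# The isinstance check normalizes the lines on every supported version.
text = ''.join(r.decode('utf-8') if isinstance(r, bytes) else r for r in result)
print(text)
```

On any of the versions discussed, `text` ends up as `'bar\nfoo\n'`, which is what lets a single doctest expectation cover both behaviors.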

Does this PR introduce any user-facing change?

No.

How was this patch tested?

```
$ python/run-tests --testnames pyspark.rdd
```

@SparkQA commented Nov 9, 2021

Test build #145018 has finished for PR 34529 at commit 795f083.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

Merged to master.

@SparkQA commented Nov 9, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49490/

@dongjoon-hyun (Member, Author)

Thank you, @HyukjinKwon !

@dongjoon-hyun deleted the SPARK-37244-2 branch November 9, 2021 06:51
@SparkQA commented Nov 9, 2021

Kubernetes integration test status: failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49490/

sunchao pushed a commit to sunchao/spark that referenced this pull request Dec 8, 2021
This PR aims to support building and running tests on Python 3.10.

Python 3.10 added many new features and breaking changes.
- https://docs.python.org/3/whatsnew/3.10.html

This PR is a follow-up of apache#34526 that adjusts one additional `pyspark.rdd` doctest.

```python
- >>> b''.join(result).decode('utf-8')
+ >>> ''.join([r.decode('utf-8') if isinstance(r, bytes) else r for r in result])
```

**Python 3.8/3.9**
```python
Using Python version 3.8.12 (default, Nov  8 2021 17:15:19)
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1636432954207).
SparkSession available as 'spark'.
>>> from tempfile import NamedTemporaryFile
>>> tempFile3 = NamedTemporaryFile(delete=True)
>>> tempFile3.close()
>>> codec = "org.apache.hadoop.io.compress.GzipCodec"
>>> sc.parallelize(['foo', 'bar']).saveAsTextFile(tempFile3.name, codec)
>>> from fileinput import input, hook_compressed
>>> from glob import glob
>>> result = sorted(input(glob(tempFile3.name + "/part*.gz"), openhook=hook_compressed))
>>> result
[b'bar\n', b'foo\n']
```

**Python 3.10**
```python
Using Python version 3.10.0 (default, Oct 29 2021 14:35:18)
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1636433378727).
SparkSession available as 'spark'.
>>> from tempfile import NamedTemporaryFile
>>> tempFile3 = NamedTemporaryFile(delete=True)
>>> tempFile3.close()
>>> codec = "org.apache.hadoop.io.compress.GzipCodec"
>>> sc.parallelize(['foo', 'bar']).saveAsTextFile(tempFile3.name, codec)
>>> from fileinput import input, hook_compressed
>>> from glob import glob
>>> result = sorted(input(glob(tempFile3.name + "/part*.gz"), openhook=hook_compressed))
>>> result
['bar\n', 'foo\n']
```

No.

```
$ python/run-tests --testnames pyspark.rdd
```

Closes apache#34529 from dongjoon-hyun/SPARK-37244-2.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit 47ceae4)
Signed-off-by: Dongjoon Hyun <[email protected]>