
Conversation

@holdenk (Contributor) commented Aug 20, 2019

What changes were proposed in this pull request?

This PR allows PySpark's toLocalIterator to prefetch the next partition while the first partition is being collected. The PR also adds a demo microbenchmark in the examples directory; we may or may not wish to keep it.

Why are the changes needed?

In https://issues.apache.org/jira/browse/SPARK-23961 / 5e79ae3 we changed PySpark to only pull one partition at a time. This is memory efficient, but if partitions take time to compute this can mean we're spending more time blocking.

Does this PR introduce any user-facing change?

A new parameter, prefetchPartitions, is added to toLocalIterator.
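For reference, a minimal usage sketch of the new parameter (assuming an existing SparkContext sc; behaviour without the flag is unchanged):

rdd = sc.parallelize(range(100), 4)

# Default: partitions are collected on demand, one at a time.
for x in rdd.toLocalIterator():
    pass

# With prefetching enabled, the next partition is collected in the
# background while the current one is being consumed.
for x in rdd.toLocalIterator(prefetchPartitions=True):
    pass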

How was this patch tested?

A new unit test in test_rdd.py checks the times at which the elements are evaluated. Another test, verifying that the results remain the same, is added to test_dataframe.py.

I also ran a microbenchmark in the examples directory (prefetch.py), which shows an improvement of ~40% in this specific use case:

19/08/16 17:11:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running timers:

[Stage 32:> (0 + 1) / 1]
Results:

Prefetch time:

100.228110831

Regular time:

188.341721614
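For context, a rough sketch of the kind of timing harness that produces numbers like the above (illustrative only; the names and structure here are not the actual prefetch.py from this PR, and sc is assumed to be an existing SparkContext):

import time

def slow(x):
    time.sleep(0.1)  # simulate partitions that take time to compute
    return x

rdd = sc.parallelize(range(200), 20).map(slow)

def time_iteration(prefetch):
    start = time.time()
    for _ in rdd.toLocalIterator(prefetchPartitions=prefetch):
        time.sleep(0.01)  # simulate per-element work on the driver
    return time.time() - start

print("Prefetch time:", time_iteration(True))
print("Regular time:", time_iteration(False))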

@holdenk (Contributor, Author) commented Aug 20, 2019

cc @BryanCutler who created 5e79ae3

@SparkQA commented Aug 20, 2019

Test build #109437 has finished for PR 25515 at commit c477fec.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon changed the title from "[SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalItr" to "[SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator" on Aug 21, 2019
@SparkQA commented Aug 21, 2019

Test build #109448 has finished for PR 25515 at commit e0327a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler (Member) left a comment

Thanks for doing this @holdenk ! It can definitely improve performance when calculating partitions takes some time. I know this issue was just for Python, but Scala toLocalIterators could also benefit from prefetch, I believe. WDYT?

if (prefetchPartitions) {
  prefetchIter.headOption
}
val partitionArray = ThreadUtils.awaitResult(partitionFuture, Duration.Inf)
Member commented:

It might be best to avoid awaitResult if possible. Could you make a buffered iterator yourself?
Maybe something like:

var next = collectPartitionIter.next()
val prefetchIter = collectPartitionIter.map { part =>
  val tmp = next
  next = part
  tmp
} ++ Iterator(next)

@holdenk (Contributor, Author) commented:

So awaitResult (or something similar) is required for us to use futures. If we just used a buffered iterator without allowing the job to schedule separately, we'd block for both partitions right away instead of evaluating the other future in the background while we block on the first. (Implicitly, this awaitResult was already effectively done inside the previous DAGScheduler runJob.)
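To make the pattern concrete, here is a toy prefetch-one-ahead iterator in plain Python with concurrent.futures (illustration only, not the Spark implementation): the next partition's work is submitted before the current one is handed out, so it runs in the background while the consumer is busy.

from concurrent.futures import ThreadPoolExecutor

def collect_partition(index):
    # stand-in for the per-partition Spark job
    return [index] * 3

def prefetching_local_iterator(num_partitions):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(collect_partition, 0)
        for i in range(num_partitions):
            part = future.result()  # block only on the current partition
            if i + 1 < num_partitions:
                # kick off the next partition before yielding the current one
                future = pool.submit(collect_partition, i + 1)
            yield from part

print(list(prefetching_local_iterator(3)))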

Member commented:

Ah yes, you are totally right. That would block while getting the prefetched partition. This looks pretty good to me then.

One question though, when should the first job be triggered? I think the old behavior used to start the first job as soon as toLocalIterator() was called. From what I can tell, this will wait until the first iteration and then trigger the first 2 jobs. Either way is probably fine, but you might get slightly better performance by starting the first job immediately.

@holdenk (Contributor, Author) commented:

In either case it waits for a data request from the Python side before starting a job, because the map over the partition indices is lazily evaluated.
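(As a toy illustration of that laziness in plain Python, not Spark code: a lazy map over partition indices does no work until the consumer asks for the first element.)

def run_job(index):
    print("running job for partition", index)
    return index

jobs = map(run_job, range(3))  # nothing runs yet: map is lazy
first = next(jobs)             # only now is the first job triggered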

timesPrefetchNext = next(timesIterPrefetch)
print("With prefetch times are: " + str(timesPrefetchHead) + "," + str(timesPrefetchNext))
self.assertTrue(timesNext - timesHead >= timedelta(seconds=2))
self.assertTrue(timesPrefetchNext - timesPrefetchHead < timedelta(seconds=1))
Member commented:

This is a pretty clever test! Anything with timings makes me a bit worried about flakiness, but I don't have any other idea how to test this. Is it possible to see if the jobs were scheduled?

@holdenk (Contributor, Author) commented:

I think we could if we used a fresh SparkContext, but with the reused context I'm not sure how I'd know whether the job was run or not.

@holdenk (Contributor, Author) commented Aug 23, 2019

I think Scala support is worth exploring too; I'm happy to file a follow-up issue.

@HyukjinKwon (Member) left a comment

Just a quick nit, and to double check: was the benchmark performed with #25515 (comment)? It seems the feature was mistakenly disabled there.

@@ -0,0 +1,86 @@
#
@HyukjinKwon (Member) commented Aug 23, 2019:

I think examples in this directory are meant to show how a feature or API is used rather than to show perf results - those can just be shown in the PR description.
In practice the example boils down to just .toLocalIterator(prefetchPartitions=True), which I don't think is worth a separate example file.

@holdenk (Contributor, Author) commented:

Reasonable, I'll remove it from the examples; it was mostly a simple way to share the microbenchmark.

rdd = self.sc.parallelize(range(2), 2)
times1 = rdd.map(lambda x: datetime.now())
times2 = rdd.map(lambda x: datetime.now())
timesIterPrefetch = times1.toLocalIterator(prefetchPartitions=True)
Member commented:

Shall we stick to the underscore naming rule?

@holdenk (Contributor, Author) commented Aug 23, 2019

So the benchmark was done on RDDs, not on DataFrames (you can see the benchmark code in this PR).

// Client requested more data, attempt to collect the next partition
val partitionArray = collectPartitionIter.next()
val partitionFuture = prefetchIter.next()
// Cause the next job to be submitted if prefecthPartitions is enabled.
Member commented:

typo: prefecthPartitions -> prefetchPartitions

time.sleep(2)
timesNext = next(timesIter)
timesPrefetchNext = next(timesIterPrefetch)
print("With prefetch times are: " + str(timesPrefetchHead) + "," + str(timesPrefetchNext))
Member commented:

Shall we remove print?

@SparkQA commented Sep 10, 2019

Test build #110432 has finished for PR 25515 at commit f8e67f3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk (Contributor, Author) commented Sep 14, 2019

Filed the follow-up issue: https://issues.apache.org/jira/browse/SPARK-29083

@holdenk (Contributor, Author) commented Sep 14, 2019

If there are no more comments by Monday I'll merge this :)

@asfgit closed this in 42050c3 on Sep 20, 2019
@holdenk (Contributor, Author) commented Sep 20, 2019

Merged to master

@BryanCutler (Member) commented:

Late review, but LGTM. Thanks @holdenk !
