Skip to content

Conversation

@BryanCutler
Copy link
Member

What changes were proposed in this pull request?

When using PySpark broadcast variables in a multi-threaded environment, SparkContext._pickled_broadcast_vars becomes a shared resource. A race condition can occur when broadcast variables that are pickled from one thread get added to the shared _pickled_broadcast_vars and become part of the python command from another thread. This PR introduces a thread-safe pickled registry using thread local storage so that when python command is pickled (causing the broadcast variable to be pickled and added to the registry) each thread will have their own view of the pickle registry to retrieve and clear the broadcast variables used.

How was this patch tested?

Added a unit test that causes this race condition using another thread.

added regression test for multithreaded broadcast pickle
@BryanCutler
Copy link
Member Author

backport for 2.2

@HyukjinKwon
Copy link
Member

LGTM pending Jenkins tests. Thanks @BryanCutler, I just wanted to be sure if it passes the tests and careful of my vert first proper merge :)..

@HyukjinKwon
Copy link
Member

@BryanCutler mind adding something like [BRANCH-2.2] in the PR title?

@BryanCutler
Copy link
Member Author

Sure no prob. I can add that to the PR title too, but I don't think I've done that in past backports.

@BryanCutler BryanCutler changed the title [SPARK-12717][PYTHON] Adding thread-safe broadcast pickle registry [SPARK-12717][PYTHON][BRANCH-2.2] Adding thread-safe broadcast pickle registry Aug 2, 2017
@HyukjinKwon
Copy link
Member

Yea, that's not in the guide and not required IIRC but just a little suggestion by me.

@SparkQA
Copy link

SparkQA commented Aug 3, 2017

Test build #80185 has finished for PR 18823 at commit 67b14db.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class BroadcastPickleRegistry(threading.local):

asfgit pushed a commit that referenced this pull request Aug 3, 2017
… registry

## What changes were proposed in this pull request?

When using PySpark broadcast variables in a multi-threaded environment,  `SparkContext._pickled_broadcast_vars` becomes a shared resource.  A race condition can occur when broadcast variables that are pickled from one thread get added to the shared ` _pickled_broadcast_vars` and become part of the python command from another thread.  This PR introduces a thread-safe pickled registry using thread local storage so that when python command is pickled (causing the broadcast variable to be pickled and added to the registry) each thread will have their own view of the pickle registry to retrieve and clear the broadcast variables used.

## How was this patch tested?

Added a unit test that causes this race condition using another thread.

Author: Bryan Cutler <[email protected]>

Closes #18823 from BryanCutler/branch-2.2.
@HyukjinKwon
Copy link
Member

Merged into branch-2.2.

@BryanCutler
Copy link
Member Author

BryanCutler commented Aug 3, 2017 via email

@HyukjinKwon
Copy link
Member

@BryanCutler, I know you know this but a reminder to close this as merged into branches.

@BryanCutler BryanCutler closed this Aug 3, 2017
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
… registry

## What changes were proposed in this pull request?

When using PySpark broadcast variables in a multi-threaded environment,  `SparkContext._pickled_broadcast_vars` becomes a shared resource.  A race condition can occur when broadcast variables that are pickled from one thread get added to the shared ` _pickled_broadcast_vars` and become part of the python command from another thread.  This PR introduces a thread-safe pickled registry using thread local storage so that when python command is pickled (causing the broadcast variable to be pickled and added to the registry) each thread will have their own view of the pickle registry to retrieve and clear the broadcast variables used.

## How was this patch tested?

Added a unit test that causes this race condition using another thread.

Author: Bryan Cutler <[email protected]>

Closes apache#18823 from BryanCutler/branch-2.2.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants