This repository was archived by the owner on Oct 23, 2024. It is now read-only.

@susanxhuynh susanxhuynh commented Jun 5, 2018

Issue

Spark jobs with a shuffle fail when a CNI network is used. The shuffle block fetcher fails to connect to the Executor because the Executor IP address is set to 0.0.0.0.

What changes were proposed in this pull request?

When launching Executors, instead of hardcoding the Executor IP to "0.0.0.0", use the "bootstrap" utility to get the Executor's IP address. The utility returns the correct IP address within a CNI network.
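
The idea can be illustrated with a minimal Python sketch. This is not the code in this patch and not how the "bootstrap" utility works internally; `is_wildcard`, `discover_local_ip`, and `advertised_host` are hypothetical names, and the hostname-based lookup is only a stand-in for whatever bootstrap does. It shows why a wildcard address such as 0.0.0.0 is fine for binding but has to be replaced with a concrete address before peers are told where to connect.

```python
import ipaddress
import socket
from typing import Callable


def is_wildcard(host: str) -> bool:
    """Return True for unspecified addresses such as 0.0.0.0 or ::."""
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Not an IP literal (for example a hostname), so not a wildcard.
        return False


def discover_local_ip() -> str:
    """Hypothetical stand-in for the bootstrap lookup (assumption):
    resolve this machine's own hostname to an IPv4 address."""
    return socket.gethostbyname(socket.gethostname())


def advertised_host(
    configured: str,
    discover: Callable[[], str] = discover_local_ip,
) -> str:
    """Pick the address that peers should connect to.

    A wildcard is a valid bind address but not a valid destination,
    so it is swapped for a discovered, concrete address.
    """
    return discover() if is_wildcard(configured) else configured
```

For example, `advertised_host("0.0.0.0", lambda: "9.0.1.5")` returns `"9.0.1.5"`, while an explicitly configured address such as `"9.0.1.7"` is passed through unchanged. The addresses here are made up for illustration.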

How was this patch tested?

Manual test. Fails without this patch:
dcos spark run --submit-args="
--conf spark.cores.max=4 --conf spark.executor.cores=1
--conf spark.mesos.containerizer=mesos
--conf spark.mesos.network.name=dcos
https://xhuynh-dev.s3.amazonaws.com/shuffle.py"

@susanxhuynh susanxhuynh changed the title from "[WIP] Use bootstrap to get Executor IP in Executor command" to "[WIP][DCOS-37643] Use bootstrap to get Executor IP in Executor command" Jun 5, 2018
@akirillov
Closing this in favour of #44, which doesn't introduce a dependency on the bootstrap script inside Spark code.

@akirillov akirillov closed this Jan 10, 2019