Conversation

@marmbrus (Contributor) commented Jan 5, 2015

Currently Spark lets you set the IP address using SPARK_LOCAL_IP, but this address is handed to Akka only after a reverse DNS lookup. That makes it difficult to run Spark in Docker. You can already change the hostname that is used programmatically, but it would be nice to be able to do this with an environment variable as well.
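The lookup order this patch enables can be sketched roughly as follows. This is a hypothetical illustration, not Spark's actual code: an explicit SPARK_LOCAL_HOSTNAME wins outright (no DNS round-trip), otherwise we fall back to a reverse-DNS lookup on SPARK_LOCAL_IP, and finally to the JVM's own local host name.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;

// Hypothetical sketch (not Spark's actual implementation) of the
// hostname-resolution order described in the PR description.
public class LocalHostname {
    public static String resolve(Map<String, String> env) {
        // 1. Explicit override: no DNS lookup at all, which is what
        //    makes this friendly to Docker.
        String name = env.get("SPARK_LOCAL_HOSTNAME");
        if (name != null) {
            return name;
        }
        try {
            // 2. Reverse DNS on the configured IP (the pre-patch behavior).
            String ip = env.get("SPARK_LOCAL_IP");
            if (ip != null) {
                return InetAddress.getByName(ip).getHostName();
            }
            // 3. Default: the JVM's view of the local host name.
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost";
        }
    }
}
```

In a container you would then set `SPARK_LOCAL_HOSTNAME` to whatever name other containers can resolve, bypassing the reverse-DNS step entirely.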

@SparkQA

SparkQA commented Jan 5, 2015

Test build #25039 has started for PR 3893 at commit 85045b6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 5, 2015

Test build #25039 has finished for PR 3893 at commit 85045b6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25039/

Contributor

There is already an environment variable called SPARK_PUBLIC_DNS in the docs. It is used to override the default host name in some cases (though, confusingly, only in a smaller subset of them). I wonder if we should just fall back to SPARK_PUBLIC_DNS here and expand its scope slightly. We'd need to audit all of the cases where this is used, but that might be preferable to introducing another override. Alternatively, we could keep two overrides and explain that SPARK_PUBLIC_DNS takes precedence and applies only in certain cases.
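The "two overrides" idea above could be sketched as a simple precedence chain. The variable names come from the thread; the precedence rule itself is an assumption for illustration, not something the thread settled on:

```java
import java.util.Map;

// Hypothetical sketch of the two-override scheme discussed above:
// SPARK_PUBLIC_DNS takes precedence where it applies (e.g. UI links),
// then SPARK_LOCAL_HOSTNAME, then a caller-supplied default.
public class AdvertisedHost {
    public static String resolve(Map<String, String> env, String defaultHost) {
        String publicDns = env.get("SPARK_PUBLIC_DNS");
        if (publicDns != null) {
            return publicDns;       // externally visible name wins
        }
        String localName = env.get("SPARK_LOCAL_HOSTNAME");
        if (localName != null) {
            return localName;       // internal override
        }
        return defaultHost;         // whatever Spark would compute otherwise
    }
}
```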

Contributor

BTW - here is how it is defined now:

  <tr>
    <td><code>SPARK_PUBLIC_DNS</code></td>
    <td>Hostname your Spark program will advertise to other machines.</td>
  </tr>

In practice though it looks like it might only be used for the UI links.

Contributor

Also note that SPARK_LOCAL_HOSTNAME is actually the name we want to advertise internally, not externally, so I don't think it's the same as SPARK_PUBLIC_DNS in DBC's case.

For instance, the external DNS is ec2-x-x-x-x.us-west-2.compute.amazonaws.com and the internal one is ip-y-y-y-y.us-west-2.compute.internal.

@pwendell
Contributor

pwendell commented Jan 6, 2015

I put some thoughts on a new JIRA about how to clean this up overall in Spark. As for this patch, I'm fine to merge it, but it would be good if we did a proper clean-up of this for 1.3.

https://issues.apache.org/jira/browse/SPARK-5113

@pwendell
Contributor

Okay let's pull this in for now. Don't want to block this on SPARK-5113.

asfgit pushed a commit that referenced this pull request Jan 12, 2015
Current spark lets you set the ip address using SPARK_LOCAL_IP, but then this is given to akka after doing a reverse DNS lookup. This makes it difficult to run spark in Docker. You can already change the hostname that is used programmatically, but it would be nice to be able to do this with an environment variable as well.

Author: Michael Armbrust <[email protected]>

Closes #3893 from marmbrus/localHostnameEnv and squashes the following commits:

85045b6 [Michael Armbrust] Optionally read from SPARK_LOCAL_HOSTNAME

(cherry picked from commit a3978f3)
Signed-off-by: Patrick Wendell <[email protected]>
@asfgit asfgit closed this in a3978f3 Jan 12, 2015
@marmbrus marmbrus deleted the localHostnameEnv branch February 17, 2015 20:56
