[SPARK-5078] Optionally read from SPARK_LOCAL_HOSTNAME #3893
Conversation
Test build #25039 has started for PR 3893 at commit.

Test build #25039 has finished for PR 3893 at commit.

Test PASSed.
There is already an environment variable called SPARK_PUBLIC_DNS in the docs. This is used to override the default host name in some cases (however, confusingly, in a smaller subset of cases). I wonder if we should just fall back to SPARK_PUBLIC_DNS here and expand its scope slightly. We'd need to audit all of the cases where this is used, but that might be preferable to introducing another override. We could also just have two overrides here, and we explain that SPARK_PUBLIC_DNS takes precedence and is only used in certain cases.
BTW - here is how it is defined now:
<tr>
<td><code>SPARK_PUBLIC_DNS</code></td>
<td>Hostname your Spark program will advertise to other machines.</td>
</tr>
In practice though it looks like it might only be used for the UI links.
Also note that SPARK_LOCAL_HOSTNAME is actually the DNS we want to advertise internally, not externally, so I don't think it's the same as SPARK_PUBLIC_DNS in DBC's case.
For instance, the external DNS is ec2-x-x-x-x.us-west-2.compute.amazonaws.com and the internal one is ip-y-y-y-y.us-west-2.compute.internal.
I put some thoughts on a new JIRA about how to clean this up overall in Spark. As for this patch, I'm fine to merge it, but it would be good if we did a proper clean-up of this for 1.3.

Okay, let's pull this in for now. Don't want to block this on
Current Spark lets you set the IP address using SPARK_LOCAL_IP, but this is handed to Akka only after a reverse DNS lookup, which makes it difficult to run Spark in Docker. You can already change the advertised hostname programmatically, but it would be nice to be able to do this with an environment variable as well.

Author: Michael Armbrust <[email protected]>

Closes #3893 from marmbrus/localHostnameEnv and squashes the following commits:

85045b6 [Michael Armbrust] Optionally read from SPARK_LOCAL_HOSTNAME

(cherry picked from commit a3978f3)
Signed-off-by: Patrick Wendell <[email protected]>
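The resolution order the patch describes can be sketched as follows. This is a minimal Python sketch of the behavior discussed in the PR, not Spark's actual (Scala) implementation; the `local_hostname` helper is hypothetical, though the environment variable names come from the conversation above.

```python
import os
import socket

def local_hostname():
    """Sketch of the hostname resolution order this patch introduces.

    This mirrors the discussion above; it is not Spark's real code.
    """
    # 1. If SPARK_LOCAL_HOSTNAME is set, use it verbatim -- no DNS
    #    lookup at all, which is what makes Docker setups workable.
    hostname = os.environ.get("SPARK_LOCAL_HOSTNAME")
    if hostname:
        return hostname
    # 2. Otherwise, if SPARK_LOCAL_IP is set, reverse-resolve it to a
    #    hostname. This reverse lookup is the step that breaks inside
    #    Docker containers, where the IP may not resolve sensibly.
    ip = os.environ.get("SPARK_LOCAL_IP")
    if ip:
        return socket.gethostbyaddr(ip)[0]
    # 3. Fall back to the machine's own hostname.
    return socket.gethostname()
```

With `SPARK_LOCAL_HOSTNAME=worker-1.internal` exported, the helper returns that string directly and never touches DNS, which is the point of the patch.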