[SPARK-42769][K8S] Add SPARK_DRIVER_POD_IP env variable to executor pods
#40392
Hi, @HyukjinKwon. Could you review this PR when you have some time?
Title changed: `ENV_DRIVER_POD_IP env variable to executor pods` to `SPARK_DRIVER_POD_IP env variable to executor pods`
Hi @dongjoon-hyun, I think it's quite useful, but you raised a concern in #39160 (review).
Do you still have that concern now?

@pan3793. The goal of this PR is different from your PR's goal.
In addition, this propagates information from the driver pod to the executor pods rather than exposing the executor pods' internal information.

Hi, @viirya. Could you review this PR when you have some time?
val UI_PORT_NAME = "spark-ui"

// Environment Variables
val ENV_DRIVER_POD_IP = "SPARK_DRIVER_POD_IP"
Is it different from DRIVER_HOST_ADDRESS? I saw that K8s uses DRIVER_HOST_ADDRESS to derive the driver URL for the env var ENV_DRIVER_URL.
Thank you for the review, @viirya.
In K8s, DRIVER_HOST_ADDRESS is protected by DriverServiceFeatureStep here.
Line 109 in 7fccd67
val DRIVER_HOST_KEY = config.DRIVER_HOST_ADDRESS.key
Lines 38 to 40 in 7fccd67
require(kubernetesConf.getOption(DRIVER_HOST_KEY).isEmpty,
  s"$DRIVER_HOST_KEY is not supported in Kubernetes mode, as the driver's hostname will be " +
  "managed via a Kubernetes service.")
This is because we inject it systematically, like this:
Lines 67 to 68 in 7fccd67
val driverHostname = s"$resolvedServiceName.${kubernetesConf.namespace}.svc"
Map(DRIVER_HOST_KEY -> driverHostname,
However, when DNS doesn't work, we need the driver's IP, which has so far been unknown to the executor pods.
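To illustrate the point above, here is a minimal sketch of how code running in an executor pod might choose between the service hostname and the pod IP. The `DriverAddress` object and its methods are hypothetical and are not part of this PR or of Spark. Preferring the IP whenever the variable is set is an assumption made for the example, and the environment is passed in as a `Map` so the logic can be exercised without a real pod.

```scala
// Hypothetical helper, not Spark code: pick the host an executor could use to reach the driver.
object DriverAddress {
  // Same shape as the hostname quoted above: <service>.<namespace>.svc
  def serviceHostname(serviceName: String, namespace: String): String =
    s"$serviceName.$namespace.svc"

  // Assumption: use SPARK_DRIVER_POD_IP when it is present and non-empty,
  // otherwise fall back to the DNS-based service hostname.
  def resolve(env: Map[String, String], serviceName: String, namespace: String): String =
    env.get("SPARK_DRIVER_POD_IP")
      .filter(_.nonEmpty)
      .getOrElse(serviceHostname(serviceName, namespace))
}
```

Inside a real executor, `sys.env` could be passed as the `env` argument.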
viirya left a comment
It is not used in this PR. Will you make other changes to the K8s executor pods to use this env var?

Yes, correct, @viirya! Thank you for the approval.

Merged to master for Apache Spark 3.5.0.
… pods
### What changes were proposed in this pull request?
Like `SPARK_EXECUTOR_POD_IP`, this PR aims to add a new environment variable, `SPARK_DRIVER_POD_IP` (defined as the constant `ENV_DRIVER_POD_IP`), to all executor pods.
```bash
$ kubectl get pod pi-exec-1 -oyaml | grep -C1 SPARK_DRIVER_POD_IP
value: "0"
- name: SPARK_DRIVER_POD_IP
value: 10.1.0.99
```
### Why are the changes needed?
This is helpful for executor pods that need to connect to the driver pod via IP.
### Does this PR introduce _any_ user-facing change?
No, this is a new environment variable.
### How was this patch tested?
Pass the CIs with the newly added test case.
Closes apache#40392 from dongjoon-hyun/SPARK-42769.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Yikun Jiang <[email protected]>
…ARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}`
### Why are the changes needed?
We are using [virtual-kubelet](https://github.com/virtual-kubelet/virtual-kubelet) for Spark on Kubernetes, so Spark pods may be allocated across Kubernetes clusters.
We use the driver pod IP as the driver host (see apache/spark#40392), which is supported since Spark 3.5.
The Kubernetes context and namespace are virtual, so we cannot build the app URL from the Spark driver service.
Because the Spark driver pod IP is reachable in our use case, this PR builds the Spark app URL from the driver pod IP and the Spark UI port.
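As a rough illustration of the idea, the sketch below expands `{{NAME}}` placeholders in a URL pattern. The `AppUrlTemplate` object is hypothetical and is not the code changed in this PR. The `http://` scheme, the IP, and the port value are example assumptions.

```scala
// Hypothetical sketch, not Kyuubi code: expand {{KEY}} placeholders in a URL pattern.
object AppUrlTemplate {
  // Replace each {{KEY}} with its value; placeholders without a value are left untouched.
  def render(pattern: String, vars: Map[String, String]): String =
    vars.foldLeft(pattern) { case (acc, (key, value)) =>
      acc.replace(s"{{$key}}", value)
    }
}

object AppUrlTemplateExample {
  // Example values only: a pod IP and a UI port chosen for illustration.
  val url: String = AppUrlTemplate.render(
    "http://{{SPARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}",
    Map("SPARK_DRIVER_POD_IP" -> "10.1.0.99", "SPARK_UI_PORT" -> "4040"))
}
```

With these example values, `AppUrlTemplateExample.url` is `http://10.1.0.99:4040`.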
### How was this patch tested?
UT.
<img width="1532" height="626" alt="image" src="https://github.com/user-attachments/assets/5cb54602-9e79-40b7-b51c-0b873c17560b" />
<img width="710" height="170" alt="image" src="https://github.com/user-attachments/assets/6d1c9580-62d6-423a-a04f-dc6cdcee940a" />
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #7141 from turboFei/app_url_v2.
Closes #7141
1277952 [Wang, Fei] VAR
d15e6be [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/KubernetesApplicationOperation.scala
1535e00 [Wang, Fei] spark driver pod ip
Lead-authored-by: Wang, Fei <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>