@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Mar 13, 2023

What changes were proposed in this pull request?

Like SPARK_EXECUTOR_POD_IP, this PR aims to add a new environment variable SPARK_DRIVER_POD_IP (defined by the constant ENV_DRIVER_POD_IP) to all executor pods.

$ kubectl get pod pi-exec-1 -oyaml | grep -C1 SPARK_DRIVER_POD_IP
      value: "0"
    - name: SPARK_DRIVER_POD_IP
      value: 10.1.0.99

Why are the changes needed?

This helps executor pods connect to the driver pod via IP.
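
As a minimal sketch of how executor-side code might consume the variable, the helper below reads `SPARK_DRIVER_POD_IP` from an environment map and treats a missing or blank value as absent. The object and method names are hypothetical and not part of Spark; only the variable name comes from this PR.

```scala
// Hypothetical helper, not part of Spark: looks up the driver pod IP
// that this PR exposes to executor pods.
object DriverPodIp {
  // Environment variable name taken from this PR.
  val EnvName = "SPARK_DRIVER_POD_IP"

  // Returns the trimmed value, or None when the variable is unset or blank.
  // The env map defaults to the process environment and can be overridden in tests.
  def lookup(env: Map[String, String] = sys.env): Option[String] =
    env.get(EnvName).map(_.trim).filter(_.nonEmpty)
}
```

Returning an `Option` keeps the caller honest about pods started by older Spark versions, where the variable would not be set.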

Does this PR introduce any user-facing change?

No, this is a new environment variable.

How was this patch tested?

Pass the CIs with the newly added test case.

@dongjoon-hyun
Member Author

Hi, @HyukjinKwon . Could you review this PR when you have some time?

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-42769][K8S] Add ENV_DRIVER_POD_IP env variable to executor pods [SPARK-42769][K8S] Add SPARK_DRIVER_POD_IP env variable to executor pods Mar 13, 2023
@pan3793
Member

pan3793 commented Mar 13, 2023

... for some executor pods to connect driver pods via IP.

Hi @dongjoon-hyun, I think it's quite useful, but in #39160 (review), you raised a concern:

... I have a concern. Currently, Apache Spark uses K8s Service entity via DriverServiceFeatureStep to access Spark driver pod in K8s environment.

Do you still have that concern now?

@dongjoon-hyun
Member Author

@pan3793 .

The goal of this PR is different from your PR's goal.

  • Your PR tried to add SPARK_DRIVER_POD_NAME to Driver Pod to expose it to 3rd party pods.
  • This PR aims to add SPARK_DRIVER_POD_IP to Executor Pod in order to help internal communications between Spark executors and Spark driver.

In addition, this propagates information from the driver pod to the executor pods rather than exposing the executor pods' internal information.
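
The propagation described above could be sketched as follows. This is not the PR's actual implementation: Spark builds pods with the fabric8 Kubernetes client, whereas this dependency-free sketch models an env entry with a hypothetical `EnvVar` case class and a hypothetical `ExecutorEnv.withDriverPodIp` helper.

```scala
// Hypothetical, dependency-free model of a container env entry.
final case class EnvVar(name: String, value: String)

object ExecutorEnv {
  // Same string the PR binds to the constant ENV_DRIVER_POD_IP.
  val EnvDriverPodIp = "SPARK_DRIVER_POD_IP"

  // When the driver pod IP is known, drops any existing entry with the same
  // name and appends the new one; otherwise returns the env unchanged.
  def withDriverPodIp(env: Seq[EnvVar], driverPodIp: Option[String]): Seq[EnvVar] =
    driverPodIp match {
      case Some(ip) => env.filterNot(_.name == EnvDriverPodIp) :+ EnvVar(EnvDriverPodIp, ip)
      case None     => env
    }
}
```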

@dongjoon-hyun
Member Author

Hi, @viirya . Could you review this PR when you have some time?

val UI_PORT_NAME = "spark-ui"

// Environment Variables
val ENV_DRIVER_POD_IP = "SPARK_DRIVER_POD_IP"
Member

Is it different from DRIVER_HOST_ADDRESS? I see that K8s uses DRIVER_HOST_ADDRESS to derive the driver URL for the env var ENV_DRIVER_URL.

Member Author

Thank you for the review, @viirya .

In K8s, DRIVER_HOST_ADDRESS is protected by DriverServiceFeatureStep here.

require(kubernetesConf.getOption(DRIVER_HOST_KEY).isEmpty,
s"$DRIVER_HOST_KEY is not supported in Kubernetes mode, as the driver's hostname will be " +
"managed via a Kubernetes service.")

That's because we systematically inject it like this.

val driverHostname = s"$resolvedServiceName.${kubernetesConf.namespace}.svc"
Map(DRIVER_HOST_KEY -> driverHostname,

However, when DNS doesn't work, we need the driver's IP, which the executor pods have had no way to know so far.
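
To illustrate the DNS concern, here is a minimal sketch of a fallback from the service hostname to the pod IP. It is an assumption about how a consumer might use the variable, not Spark code: `DriverHost`, `choose`, and the injectable `canResolve` check are hypothetical names, and only the hostname format comes from the snippet quoted above.

```scala
// Hypothetical sketch, not Spark code.
object DriverHost {
  // Mirrors the service hostname format quoted above.
  def serviceHostname(serviceName: String, namespace: String): String =
    s"$serviceName.$namespace.svc"

  // Default check: true when the JVM can resolve the hostname via DNS.
  def resolvable(host: String): Boolean =
    scala.util.Try(java.net.InetAddress.getByName(host)).isSuccess

  // Prefers the service hostname; falls back to the pod IP only when the
  // hostname does not resolve and an IP is available.
  def choose(
      hostname: String,
      podIp: Option[String],
      canResolve: String => Boolean = host => resolvable(host)): String =
    if (canResolve(hostname)) hostname else podIp.getOrElse(hostname)
}
```

Passing `canResolve` as a parameter keeps the decision testable without real DNS lookups.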

Member

@viirya viirya left a comment

It is not used in this PR. Will you make other changes to the K8s executor pods to use this env var?

@dongjoon-hyun
Member Author

Yes, correct, @viirya ! Thank you for the approval.

@dongjoon-hyun
Member Author

Merged to master for Apache Spark 3.5.0.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-42769 branch March 13, 2023 17:10
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
… pods

Closes apache#40392 from dongjoon-hyun/SPARK-42769.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Yikun added a commit to Yikun/spark-docker that referenced this pull request Sep 14, 2023
Signed-off-by: Yikun Jiang <[email protected]>
Yikun added a commit to Yikun/spark-docker that referenced this pull request Sep 14, 2023
pan3793 added a commit to apache/kyuubi that referenced this pull request Jul 21, 2025
…ARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}`

### Why are the changes needed?

We use [virtual-kubelet](https://github.com/virtual-kubelet/virtual-kubelet) for Spark on Kubernetes, so Spark pods may be allocated across Kubernetes clusters.

We use the driver pod IP as the driver host (see apache/spark#40392), which is supported since Spark 3.5.

The Kubernetes context and namespace are virtual, so we cannot build the app URL from the Spark driver service.

The Spark driver pod IP is accessible in our use case, so this PR builds the Spark app URL from the Spark driver pod IP and the Spark UI port.
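
As a rough illustration of building such a URL, the sketch below substitutes `{{NAME}}` placeholders in a template. It is not Kyuubi's implementation: `AppUrl.render` is a hypothetical name, and the `http` scheme and port `4040` in the usage note are assumptions chosen for the example.

```scala
// Hypothetical sketch, not Kyuubi's implementation.
object AppUrl {
  // Replaces each {{NAME}} placeholder with its value from vars;
  // placeholders without a matching key are left untouched.
  def render(template: String, vars: Map[String, String]): String =
    vars.foldLeft(template) { case (acc, (name, value)) =>
      acc.replace("{{" + name + "}}", value)
    }
}
```

For example, rendering `http://{{SPARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}` with `SPARK_DRIVER_POD_IP -> 10.1.0.99` and `SPARK_UI_PORT -> 4040` gives `http://10.1.0.99:4040`.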

### How was this patch tested?

UT.

<img width="1532" height="626" alt="image" src="https://github.com/user-attachments/assets/5cb54602-9e79-40b7-b51c-0b873c17560b" />
<img width="710" height="170" alt="image" src="https://github.com/user-attachments/assets/6d1c9580-62d6-423a-a04f-dc6cdcee940a" />

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7141 from turboFei/app_url_v2.

Closes #7141

1277952 [Wang, Fei] VAR
d15e6be [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/engine/KubernetesApplicationOperation.scala
1535e00 [Wang, Fei] spark driver pod ip

Lead-authored-by: Wang, Fei <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
turboFei added a commit to turboFei/kyuubi that referenced this pull request Aug 27, 2025
…//{{SPARK_DRIVER_POD_IP}}:{{SPARK_UI_PORT}}`
