@ocworld
Contributor

@ocworld ocworld commented Apr 29, 2021

What changes were proposed in this pull request?

Support '--packages' in K8s cluster mode.

Why are the changes needed?

In Spark 3, '--packages' is not supported in K8s cluster mode. I expected to be able to manage dependencies with packages, as in Spark 2.

Spark 2.4.5

https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala

    if (!isMesosCluster && !isStandAloneCluster) {
      // Resolve maven dependencies if there are any and add classpath to jars. Add them to py-files
      // too for packages that include Python code
      val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies(
        args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath,
        args.ivySettingsPath)
      
      if (!StringUtils.isBlank(resolvedMavenCoordinates)) {
        args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates)
        if (args.isPython || isInternal(args.primaryResource)) {
          args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates)
        }
      } 
      
      // install any R packages that may have been passed through --jars or --packages.
      // Spark Packages may contain R source code inside the jar.
      if (args.isR && !StringUtils.isBlank(args.jars)) {
        RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose)
      }
    } 

Spark 3.0.2

https://github.com/apache/spark/blob/v3.0.2/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala

      if (!StringUtils.isBlank(resolvedMavenCoordinates)) {
        // In K8s client mode, when in the driver, add resolved jars early as we might need
        // them at the submit time for artifact downloading.
        // For example we might use the dependencies for downloading
        // files from a Hadoop Compatible fs eg. S3. In this case the user might pass:
        // --packages com.amazonaws:aws-java-sdk:1.7.4:org.apache.hadoop:hadoop-aws:2.7.6
        if (isKubernetesClusterModeDriver) {
          val loader = getSubmitClassLoader(sparkConf)
          for (jar <- resolvedMavenCoordinates.split(",")) {
            addJarToClasspath(jar, loader)
          }
        } else if (isKubernetesCluster) {
          // We need this in K8s cluster mode so that we can upload local deps
          // via the k8s application, like in cluster mode driver
          childClasspath ++= resolvedMavenCoordinates.split(",")
        } else {
          args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates)
          if (args.isPython || isInternal(args.primaryResource)) {
            args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates)
          }
        }
      }

Unlike Spark 2, Spark 3 does not add the resolved jars to args.jars in either of the K8s branches.
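
A minimal sketch of the difference between the two excerpts above. It is a simplified model, not the actual SparkSubmit code: the `Mode` and `Destination` types and the object name are hypothetical labels introduced only for illustration.

```scala
// Simplified model of the two quoted code paths; not the actual SparkSubmit code.
object PackagesRouting {
  // Hypothetical labels for where spark-submit is running.
  sealed trait Mode
  case object K8sClusterSubmit extends Mode // spark-submit on the client, K8s cluster mode
  case object K8sClusterDriver extends Mode // spark-submit logic running inside the driver pod
  case object Other extends Mode            // any mode that takes the final else branch

  // Hypothetical labels for where the resolved --packages jars end up.
  sealed trait Destination
  case object ArgsJars extends Destination          // merged into args.jars
  case object ChildClasspath extends Destination    // appended to childClasspath
  case object SubmitClassLoader extends Destination // added to the submit class loader

  // The 2.4.5 excerpt has no K8s-specific branch: resolved jars are merged into args.jars.
  def spark2(mode: Mode): Destination = ArgsJars

  // The 3.0.2 excerpt: both K8s branches bypass args.jars.
  def spark3(mode: Mode): Destination = mode match {
    case K8sClusterDriver => SubmitClassLoader
    case K8sClusterSubmit => ChildClasspath
    case Other            => ArgsJars
  }

  def main(args: Array[String]): Unit = {
    for (mode <- Seq(K8sClusterSubmit, K8sClusterDriver, Other)) {
      println(s"$mode: spark2=${spark2(mode)}, spark3=${spark3(mode)}")
    }
  }
}
```

In this model, only the `Other` branch of Spark 3 still reaches `args.jars`, which is the gap the PR description points at.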

Does this PR introduce any user-facing change?

Unlike Spark 2, the resolved jars are added in the driver rather than in the cluster-mode spark-submit.

This is because Spark 3 added a feature that uploads jars with the "file://" prefix to S3.
So if the resolved jars were added in spark-submit, every jar from the packages would be uploaded to S3. When I tested this, it was a very bad experience.
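
A rough sketch of why adding the resolved jars at submit time is costly. It assumes, for illustration only, that a jar counts as "local" (and would therefore be uploaded) when its URI scheme is "file" or absent; the real Spark upload logic is not reproduced here, and the object and function names are hypothetical.

```scala
import java.net.URI

// Illustration only: models the "file:// jars get uploaded" rule described above.
object LocalJarUpload {
  // Assumption: a jar is local when its URI has the "file" scheme or no scheme at all.
  def isLocal(jar: String): Boolean = {
    val scheme = new URI(jar).getScheme
    scheme == null || scheme == "file"
  }

  // Splits a comma-separated jar list and keeps the entries that would be uploaded.
  def jarsToUpload(jars: String): Seq[String] =
    jars.split(",").map(_.trim).filter(_.nonEmpty).filter(isLocal).toSeq

  def main(args: Array[String]): Unit = {
    // Hypothetical paths: two jars resolved into a local Ivy cache, one already remote.
    val jars = Seq(
      "file:///home/user/.ivy2/jars/a.jar",
      "file:///home/user/.ivy2/jars/b.jar",
      "s3a://bucket/libs/c.jar"
    ).mkString(",")
    jarsToUpload(jars).foreach(println)
  }
}
```

Every jar resolved from '--packages' lands in a local cache, so under this rule all of them would be uploaded if they were merged into the jar list at submit time.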

How was this patch tested?

I tested the code in my K8s environment.

@ocworld
Contributor Author

ocworld commented Apr 29, 2021

@AhnLab-OSS

@ocworld
Contributor Author

ocworld commented Apr 29, 2021

Can I see logs about failure of build and test?

@github-actions github-actions bot added the CORE label Apr 29, 2021
@dongjoon-hyun
Member

Thank you for your contribution, @ocworld.

@dongjoon-hyun
Member

ok to test

@dongjoon-hyun
Member

Can I see logs about failure of build and test?

Apache Spark GitHub Action is running in your GitHub Account.

[Screenshot]

If you click the failure link, you can see the error log in your repo. The current failure occurs because your master branch is too outdated. Please follow the guideline to sync your master branch.

Unable to detect the workflow run for testing the changes in your PR.

If you did not enable GitHub Actions in your forked repository, please enable it. See also Disabling or limiting GitHub Actions for a repository for more details.
It is possible your branch is based on the old master branch in Apache Spark, please sync your branch to the latest master branch. For example as below:
git fetch upstream
git rebase upstream/master
git push origin YOUR_BRANCH --force

In addition, I triggered the Jenkins build. We have that additional test machine for cases like this.

@SparkQA

SparkQA commented May 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42657/

@SparkQA

SparkQA commented May 2, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42657/

@ocworld
Contributor Author

ocworld commented May 2, 2021

@dongjoon-hyun Thank you for your test and reply.
Even after rebasing onto upstream, the build and test checks still fail.
The Jenkins build also failed, with the following logs.

Setting status of fe0a4b5625ffceaa6c81e9f63637214d08369642 to FAILURE with url https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42657/ and message: 'Build finished. '
FileNotFoundException means that the credentials Jenkins is using is probably wrong. Or the user account does not have write access to the repo.
org.kohsuke.github.GHFileNotFoundException: https://api.github.com/repos/apache/spark/statuses/fe0a4b5625ffceaa6c81e9f63637214d08369642 {"message":"Not Found","documentation_url":"https://docs.github.com/rest/reference/repos#create-a-commit-status"}
	at org.kohsuke.github.GitHubClient.interpretApiError(GitHubClient.java:492)
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:420)
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:363)
	at org.kohsuke.github.Requester.fetch(Requester.java:74)
	at org.kohsuke.github.GHRepository.createCommitStatus(GHRepository.java:1906)
	at org.jenkinsci.plugins.ghprb.extensions.status.GhprbSimpleStatus.createCommitStatus(GhprbSimpleStatus.java:283)
	at org.jenkinsci.plugins.ghprb.extensions.status.GhprbSimpleStatus.onBuildComplete(GhprbSimpleStatus.java:241)
	at org.jenkinsci.plugins.ghprb.GhprbBuilds.onCompleted(GhprbBuilds.java:205)
	at org.jenkinsci.plugins.ghprb.GhprbBuildListener.onCompleted(GhprbBuildListener.java:28)
	at hudson.model.listeners.RunListener.fireCompleted(RunListener.java:209)
	at hudson.model.Run.execute(Run.java:1952)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:97)
	at hudson.model.Executor.run(Executor.java:429)
Caused by: java.io.FileNotFoundException: https://api.github.com/repos/apache/spark/statuses/fe0a4b5625ffceaa6c81e9f63637214d08369642
	at sun.reflect.GeneratedConstructorAccessor1029.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1950)
	at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1945)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1944)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1514)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
	at org.kohsuke.github.GitHubHttpUrlConnectionClient$HttpURLConnectionResponseInfo.bodyStream(GitHubHttpUrlConnectionClient.java:197)
	at org.kohsuke.github.GitHubResponse$ResponseInfo.getBodyAsString(GitHubResponse.java:326)
	at org.kohsuke.github.GitHubResponse.parseBody(GitHubResponse.java:91)
	at org.kohsuke.github.Requester.lambda$fetch$1(Requester.java:74)
	at org.kohsuke.github.GitHubClient.createResponse(GitHubClient.java:461)
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:412)
	... 12 more
Caused by: java.io.FileNotFoundException: https://api.github.com/repos/apache/spark/statuses/fe0a4b5625ffceaa6c81e9f63637214d08369642
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1896)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
	at org.kohsuke.github.GitHubHttpUrlConnectionClient.getResponseInfo(GitHubHttpUrlConnectionClient.java:69)
	at org.kohsuke.github.GitHubClient.sendRequest(GitHubClient.java:400)
	... 12 more

@SparkQA

SparkQA commented May 2, 2021

Test build #138136 has finished for PR 32397 at commit fe0a4b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42658/

@SparkQA

SparkQA commented May 2, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42658/

@SparkQA

SparkQA commented May 2, 2021

Test build #138137 has finished for PR 32397 at commit 31783a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

@ocworld can you take a look at https://github.com/apache/spark/pull/32397/checks?check_run_id=2484554770 and enable GitHub Actions in your forked repository?

@ocworld
Contributor Author

ocworld commented May 4, 2021

@HyukjinKwon The GitHub Actions setting in my forked repository is already set to "Allow all actions".

@ocworld
Contributor Author

ocworld commented May 4, 2021

@HyukjinKwon

[Screenshot]

@HyukjinKwon
Member

Can you rebase and force push in this PR to retrigger the build?

@ocworld
Contributor Author

ocworld commented May 4, 2021

@HyukjinKwon I've already rebased and force-pushed, so everything is up to date now.
I'll retry after some other PRs have been merged.

@HyukjinKwon
Member

Did you do something like #32400 (comment)? GitHub Actions should run the build in your forked repository, but it seems that it does not.

@ocworld
Contributor Author

ocworld commented May 4, 2021

@HyukjinKwon Oh! You're right. The workflow in the Actions tab was not activated. It is enabled now. Thank you.

@SparkQA

SparkQA commented May 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42679/

@SparkQA

SparkQA commented May 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42679/

@SparkQA

SparkQA commented May 5, 2021

Test build #138158 has finished for PR 32397 at commit bc366a9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mridulm
Contributor

mridulm commented May 6, 2021

+CC @shardulm94, @xkrogen

Member

@dongjoon-hyun dongjoon-hyun left a comment

Could you elaborate in the PR description on how you tested this?

# How was this patch tested?
In my k8s environment, i tested the code.

The best way to protect this feature is to have test coverage in BasicTestsSuite or DepsTestsSuite. Do you think you can add a new test case, @ocworld?

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42903/

@ocworld
Contributor Author

ocworld commented May 11, 2021

@dongjoon-hyun I reviewed "BasicTestsSuite" and "DepsTestsSuite".

I think it is hard to add a test case there.

That is because the test would need to read spark.jars from the running application's Spark context on the driver pod. I don't have any idea how to do that right now.

Do you have any ideas for a test?
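
One possible shape for such a check, as a sketch only. It assumes the application running on the driver pod can obtain the comma-separated `spark.jars` value itself (for example via `SparkConf.get("spark.jars", "")`) and report which expected jar file names are absent; the object name, helper names, and sample jar names are hypothetical, and nothing here is taken from the Spark integration test suites.

```scala
// Sketch: given the value of spark.jars as seen inside the driver,
// report which expected jar file names are missing.
object SparkJarsCheck {
  // File-name part of each entry in a comma-separated jar list.
  def jarFileNames(sparkJars: String): Set[String] =
    sparkJars.split(",").map(_.trim).filter(_.nonEmpty)
      .map(path => path.substring(path.lastIndexOf('/') + 1))
      .toSet

  // Expected jar file names that do not appear in spark.jars.
  def missingJars(sparkJars: String, expected: Seq[String]): Seq[String] = {
    val present = jarFileNames(sparkJars)
    expected.filterNot(present.contains)
  }

  def main(args: Array[String]): Unit = {
    // In a real driver this string would come from the Spark configuration;
    // it is hard-coded here so the sketch runs without Spark on the classpath.
    val sparkJars = "file:///tmp/jars/example-a-1.0.jar,file:///tmp/jars/example-b-2.0.jar"
    val missing = missingJars(sparkJars, Seq("example-a-1.0.jar", "example-c-3.0.jar"))
    println(if (missing.isEmpty) "all expected jars present" else s"missing: ${missing.mkString(", ")}")
  }
}
```

An integration test could then inspect the driver log for the printed result, which avoids reaching into the running Spark context from the test harness.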

@SparkQA

SparkQA commented May 11, 2021

Test build #138380 has finished for PR 32397 at commit 76bdccc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 28, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43556/

@SparkQA

SparkQA commented Aug 1, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46433/

@SparkQA

SparkQA commented Aug 20, 2021

Test build #142659 has finished for PR 32397 at commit 554127b.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47160/

@SparkQA

SparkQA commented Aug 20, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47160/

@GaruGaru

GaruGaru commented Nov 2, 2021

Any update on this?

@ocworld ocworld changed the title from '[SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode' to '[WIP][SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode' on Feb 9, 2022
@grcanosa

Any news on when this is going to be merged?

@ocworld
Contributor Author

ocworld commented Feb 25, 2022

Any news on when this is going to be merged?

It is not merged yet. I've been trying to add a test.

@github-actions

github-actions bot commented Jun 6, 2022

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jun 6, 2022
@github-actions github-actions bot closed this Jun 7, 2022
@rahulbhatia2702

is there a solution for this?

@AlexandrePieroux

AlexandrePieroux commented Oct 20, 2022

is there a solution for this?

In my case, I just build the Docker image (from the official repo) with the dependencies added by hand.

Example:
FROM apache/spark-py:v3.3.0
ARG spark_uid=185
USER 0
RUN chown ${spark_uid} -R /opt/spark

then

RUN curl https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.1.2/spark-sql-kafka-0-10_2.12-3.1.2.jar --output /opt/spark/jars/spark-sql-kafka-0-10_2.12-3.1.2.jar

and at the end

USER ${spark_uid}

Still looking forward to the day this is included in a Spark release.
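
The download URL in the workaround above follows the standard Maven repository layout, so it can be derived from the same `groupId:artifactId:version` coordinate that would be passed to '--packages'. A minimal sketch, with a hypothetical object and function name:

```scala
// Builds the jar URL for a Maven coordinate using the standard repository layout:
// <repo>/<groupId with dots as slashes>/<artifactId>/<version>/<artifactId>-<version>.jar
object MavenJarUrl {
  val MavenCentral = "https://repo1.maven.org/maven2"

  def jarUrl(coordinate: String, repo: String = MavenCentral): String =
    coordinate.split(":") match {
      case Array(group, artifact, version) =>
        s"${repo}/${group.replace('.', '/')}/${artifact}/${version}/${artifact}-${version}.jar"
      case _ =>
        throw new IllegalArgumentException(
          s"Expected groupId:artifactId:version, got: $coordinate")
    }

  def main(args: Array[String]): Unit = {
    // Prints the same URL used in the curl command above.
    println(jarUrl("org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2"))
  }
}
```

Note that this covers only the single jar named by the coordinate. Unlike '--packages', downloading one jar by hand does not pull in its transitive dependencies, so those may need to be added to the image as well.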

@jbguerraz

@dongjoon-hyun Maybe this PR could be reopened and the Stale label removed, since this is still an issue? Thank you!

@holdenk
Contributor

holdenk commented Nov 28, 2022

The repo that submitted the PR is deleted so we can't re-open it. But feel free to recreate and go for it.

@ocworld
Contributor Author

ocworld commented Nov 29, 2022

I'll reopen the PR soon.

@ocworld
Contributor Author

ocworld commented Nov 29, 2022

@jbguerraz @holdenk The PR has been recreated as #38828.

@ulan-yisaev

RUN curl https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.1.2/spark-sql-kafka-0-10_2.12-3.1.2.jar --output /opt/spark/jars/spark-sql-kafka-0-10_2.12-3.1.2.jar

Thank you so much for the workaround, and it's a pity that this bug took two days of my life :)

@ocworld
Contributor Author

ocworld commented Jan 24, 2023

@jbguerraz @GaruGaru The PR (#38828) is merged now.

@jbguerraz

Thank you @ocworld :-)
