@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR cleans up the syntax and properly copies the protobuf assembly jar (currently it copies the connect assembly jar by mistake).

Why are the changes needed?

For consistent code style and a correct SBT build.

Does this PR introduce any user-facing change?

No, this isn't released yet.

How was this patch tested?

CI in this PR should test it out.

Contributor

@LuciferYang LuciferYang left a comment

+1, LGTM

@LuciferYang
Contributor

LuciferYang commented Oct 13, 2022

Hmm... should we add assemblyPackageScala and assemblyExcludedJars to the SparkProtobuf module too? The assembly jar is a little fat:

[debug] Including from cache: spark-tags_2.12-3.4.0-SNAPSHOT.jar
[debug] Including from cache: unused-1.0.0.jar
[debug] Including from cache: spark-protobuf_2.12-3.4.0-SNAPSHOT.jar
[debug] Including from cache: scala-collection-compat_2.12-2.2.0.jar
[debug] Including from cache: pmml-model-1.4.8.jar
[debug] Including from cache: protobuf-java-3.21.1.jar
[debug] Including from cache: guava-14.0.1.jar
[debug] Including from cache: scala-library-2.12.17.jar

For example, similar to the SparkConnect module:

    // Exclude `scala-library` from assembly.
    (assembly / assemblyPackageScala / assembleArtifact) := false,

    // Exclude `pmml-model-*.jar`, `scala-collection-compat_*.jar`,
    // `spark-tags_*.jar`, `guava-*.jar` and `unused-1.0.0.jar` from assembly.
    (assembly / assemblyExcludedJars) := {
      val cp = (assembly / fullClasspath).value
      cp filter { v =>
        val name = v.data.getName
        name.startsWith("pmml-model-") || name.startsWith("scala-collection-compat_") ||
          name.startsWith("spark-tags_") || name.startsWith("guava-") || name == "unused-1.0.0.jar"
      }
    },

Then the assembly jar includes only:

[debug] Including: spark-protobuf_2.12-3.4.0-SNAPSHOT.jar
[debug] Including from cache: protobuf-java-3.21.1.jar
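
As a rough check of the exclusion logic, the predicate from the proposed `assemblyExcludedJars` setting can be applied to the jar names in the first debug log. This is only a sketch in plain Scala, not sbt code: the object name `ExcludedJarsSketch` and the helper `isScalaLibrary` are hypothetical, and `isScalaLibrary` merely stands in for the effect of setting `assemblyPackageScala / assembleArtifact` to `false`.

```scala
object ExcludedJarsSketch {
  // Jar names copied from the debug log of the unfiltered assembly.
  val included: Seq[String] = Seq(
    "spark-tags_2.12-3.4.0-SNAPSHOT.jar",
    "unused-1.0.0.jar",
    "spark-protobuf_2.12-3.4.0-SNAPSHOT.jar",
    "scala-collection-compat_2.12-2.2.0.jar",
    "pmml-model-1.4.8.jar",
    "protobuf-java-3.21.1.jar",
    "guava-14.0.1.jar",
    "scala-library-2.12.17.jar"
  )

  // Same name-based predicate as the proposed assemblyExcludedJars filter.
  def isExcluded(name: String): Boolean =
    name.startsWith("pmml-model-") || name.startsWith("scala-collection-compat_") ||
      name.startsWith("spark-tags_") || name.startsWith("guava-") || name == "unused-1.0.0.jar"

  // Assumption: models the scala-library exclusion as a simple prefix check.
  def isScalaLibrary(name: String): Boolean = name.startsWith("scala-library-")

  // Jars that survive both exclusions.
  val remaining: Seq[String] =
    included.filterNot(n => isExcluded(n) || isScalaLibrary(n))

  def main(args: Array[String]): Unit =
    remaining.foreach(println)
}
```

Under these assumptions, only the `spark-protobuf` and `protobuf-java` jars remain, which matches the two entries in the second log.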

      ),

    - (assembly / test) := false,
    + (assembly / test) := { },
Contributor

just for my own education, is this the same: false -> { }?

Looks a bit like magic.

Member Author

Yeah, they are the same. I believe { } is much more common (googled a bit).
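
A likely reason the two forms behave the same is Scala's value discarding: when the expected type is `Unit`, a non-`Unit` expression such as `false` is evaluated and thrown away. This assumes `assembly / test` is a `Unit`-typed task, which the thread does not state explicitly. The sketch below is plain Scala rather than sbt, and `ValueDiscardSketch` is a hypothetical name.

```scala
object ValueDiscardSketch {
  // Expected type is Unit, so the compiler discards `false` and yields ().
  // Depending on compiler flags, this may emit a value-discard warning.
  val viaFalse: Unit = false

  // An empty block already has type Unit; nothing is discarded.
  val viaEmptyBlock: Unit = { }

  def main(args: Array[String]): Unit =
    println(viaFalse == viaEmptyBlock)
}
```

Both values are the single `Unit` value `()`, so `{ }` states the "do nothing" intent directly, while `false` only works because its value is dropped.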

@HyukjinKwon
Member Author

Let me get this in first in any event, since it works (and the build is technically broken without this change).

@HyukjinKwon
Member Author

Merged to master.

DeZepTup pushed a commit to DeZepTup/spark-custom that referenced this pull request Oct 31, 2022
Closes apache#38240 from HyukjinKwon/SPARK-40654.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
@HyukjinKwon HyukjinKwon deleted the SPARK-40654 branch January 15, 2024 00:49