Conversation

@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to force Avro's version to 1.7.7 in core to resolve the build failure below:

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error]     writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error]
```

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2770/consoleFull

Note that this is a hack and should be removed in the future.

How was this patch tested?

I only tested that this actually overrides the dependency.

I tried many ways but was unable to reproduce this locally. Sean tried the same steps and was also unable to reproduce it.

Please refer to the comments in #17477 (comment).

@SparkQA

SparkQA commented Apr 16, 2017

Test build #75837 has finished for PR 17651 at commit 56c1eba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

Perhaps add a reference to this in pom.xml so they both change together next time?

@HyukjinKwon
Member Author

I left a useless comment and then removed it (I had misunderstood). Yes, I will add a small comment.

@HyukjinKwon
Member Author

cc @srowen, could you check if it makes sense to you?

@SparkQA

SparkQA commented Apr 17, 2017

Test build #75843 has finished for PR 17651 at commit 0031804.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Apr 17, 2017

I'd be interested to know if this resolves the issue, and the only real way to find out is to merge it. It's not wrong, but it is hacky. That said, I don't have a better idea at the moment.

@HyukjinKwon
Member Author

Yes... I hope someone is able to reproduce this with the steps I took - #17477 (comment) - or confirm I did something wrong.

I am fine with leaving this open for a few more days to see if anyone can at least confirm this fixes the issue or has a better idea.

@HyukjinKwon
Member Author

FWIW, I at least checked that this overrides the dependency (after manually mismatching the versions between the pom and SBT), unless I have done something wrong.

@srowen
Member

srowen commented Apr 18, 2017

Merged to master to see what it does for the SBT 2.6 build

@asfgit closed this in d4f10cb Apr 18, 2017
@HyukjinKwon
Member Author

Hm, sorry, it seems this is not working either...

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error]     writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error]
```

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/lastBuild/

@HyukjinKwon
Member Author

I will open a PR to revert this.

@srowen
Member

srowen commented Apr 18, 2017

Hm. In addition to that, am I right that it would work to just not try to build scaladoc in SBT? That may be the best we can do in the short term.

@HyukjinKwon
Member Author

Yes, I think only the scaladoc build fails in SBT. Let me check if I can disable just that.

@HyukjinKwon
Member Author

HyukjinKwon commented Apr 18, 2017

@srowen, maybe this sounds (super) stupid, but what do you think about disabling the unidoc build only on that machine via environment variables?

I assume we are checking `AMPLAB_JENKINS_BUILD_TOOL` and `AMPLAB_JENKINS_BUILD_PROFILE` here: https://github.com/apache/spark/blob/master/dev/run-tests.py#L519-L520

Judging from the logs, the environment variables seem to be as below:

```
JAVA_HOME=/usr/java/jdk1.8.0_60
JAVA_7_HOME=/usr/java/jdk1.7.0_79
HADOOP_PROFILE=hadoop-2.6
HADOOP_VERSION=
SPARK_BRANCH=master
AMPLAB_JENKINS="true"

JAVA_HOME=/usr/java/jdk1.8.0_60
JAVA_7_HOME=/usr/java/jdk1.7.0_79
HADOOP_PROFILE=hadoop-2.7
HADOOP_VERSION=
SPARK_BRANCH=master
AMPLAB_JENKINS="true"

JAVA_HOME=/usr/java/jdk1.8.0_60
JAVA_7_HOME=/usr/java/jdk1.7.0_79
SPARK_BRANCH=master
AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.6
AMPLAB_JENKINS="true"

JAVA_HOME=/usr/java/jdk1.8.0_60
JAVA_7_HOME=/usr/java/jdk1.7.0_79
SPARK_BRANCH=master
AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.7
AMPLAB_JENKINS="true"

JENKINS_MASTER_HOSTNAME=amp-jenkins-master
JAVA_HOME=/usr/java/jdk1.8.0_60
JAVA_7_HOME=/usr/java/jdk1.7.0_79
```

So, probably, I could just add an if-else on `AMPLAB_JENKINS_BUILD_TOOL` and `AMPLAB_JENKINS_BUILD_PROFILE` as a band-aid fix.
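A minimal sketch of that band-aid, assuming those two variables follow the convention in the logs above; the helper name and the `print` placeholder are hypothetical, not the actual `run-tests.py` change:

```python
import os

def is_jenkins_sbt_hadoop_2_6():
    """True only when Jenkins explicitly builds with SBT and Hadoop 2.6."""
    return (os.environ.get("AMPLAB_JENKINS_BUILD_TOOL") == "sbt"
            and os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")

# Build Unidoc everywhere except on the failing Jenkins job; the print stands
# in for the real documentation-build step.
if not is_jenkins_sbt_hadoop_2_6():
    print("Building Unidoc API Documentation")
```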

@HyukjinKwon
Member Author

HyukjinKwon commented Apr 18, 2017

Or, do you maybe know an easy way to exclude Scaladoc? I searched a bit but could not find a short and clean way. To my knowledge, I would have to add some awkward logic in SparkBuild.scala to make it work fine locally and with the jekyll build, but be skipped in some environments.

@srowen
Member

srowen commented Apr 18, 2017

I see, so only disable scaladoc generation for the SBT + 2.6 build? That sounds even better. I don't know much about SBT, I'm afraid. If it's complex, maybe it's best to punt for now. But yes, that seems better if possible.

@HyukjinKwon
Member Author

HyukjinKwon commented Apr 18, 2017

Oh, actually, I meant something a little different, I guess. I meant disabling the SBT unidoc build only if `AMPLAB_JENKINS_BUILD_TOOL` and `AMPLAB_JENKINS_BUILD_PROFILE` are set to sbt and hadoop2.6 respectively. So, unidoc will not be skipped unless someone sets both environment variables (and I hope that is only the case on that machine), if I understood correctly.

Let me add this change to the revert PR (I guess it might be okay). I will cc you after looking into this more and pushing some commits.

asfgit pushed a commit that referenced this pull request Apr 19, 2017
…tly set in SBT build

## What changes were proposed in this pull request?

This PR proposes two things:

- Avoid Unidoc build only if Hadoop 2.6 is explicitly set in SBT build

  Due to different dependency resolution between SBT & Unidoc for an unknown reason, the documentation build fails on a specific machine & environment in Jenkins, but I was unable to reproduce it.

  So, this PR just checks the environment variable `AMPLAB_JENKINS_BUILD_PROFILE` that is set in the Hadoop 2.6 SBT builds against branches on Jenkins, and then disables the Unidoc build. **Note that the PR builder will still build it with Hadoop 2.6 & SBT.**

  ```
  ========================================================================
  Building Unidoc API Documentation
  ========================================================================
  [info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments:  -Phadoop-2.6 -Pmesos -Pkinesis-asl -Pyarn -Phive-thriftserver -Phive unidoc
  Using /usr/java/jdk1.8.0_60 as default JAVA_HOME.
  ...
  ```

  I checked the environment variables from the logs (the first part of each) as below:

  - **spark-master-test-sbt-hadoop-2.6** (this one is failing) - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  SPARK_BRANCH=master
  AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.6   <- I use this variable
  AMPLAB_JENKINS="true"
  ```
  - spark-master-test-sbt-hadoop-2.7 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  SPARK_BRANCH=master
  AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.7
  AMPLAB_JENKINS="true"
  ```

  - spark-master-test-maven-hadoop-2.6 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  HADOOP_PROFILE=hadoop-2.6
  HADOOP_VERSION=
  SPARK_BRANCH=master
  AMPLAB_JENKINS="true"
  ```

  - spark-master-test-maven-hadoop-2.7 - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/lastBuild/consoleFull

  ```
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  HADOOP_PROFILE=hadoop-2.7
  HADOOP_VERSION=
  SPARK_BRANCH=master
  AMPLAB_JENKINS="true"
  ```

  - PR builder - https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75843/consoleFull

  ```
  JENKINS_MASTER_HOSTNAME=amp-jenkins-master
  JAVA_HOME=/usr/java/jdk1.8.0_60
  JAVA_7_HOME=/usr/java/jdk1.7.0_79
  ```

  Judging from other logs in branch-2.1:

    - SBT & Hadoop 2.6 against branch-2.1 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.1-test-sbt-hadoop-2.6/lastBuild/consoleFull

      ```
      JAVA_HOME=/usr/java/jdk1.8.0_60
      JAVA_7_HOME=/usr/java/jdk1.7.0_79
      SPARK_BRANCH=branch-2.1
      AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.6
      AMPLAB_JENKINS="true"
      ```

    - Maven & Hadoop 2.6 against branch-2.1 https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.1-test-maven-hadoop-2.6/lastBuild/consoleFull

      ```
      JAVA_HOME=/usr/java/jdk1.8.0_60
      JAVA_7_HOME=/usr/java/jdk1.7.0_79
      HADOOP_PROFILE=hadoop-2.6
      HADOOP_VERSION=
      SPARK_BRANCH=branch-2.1
      AMPLAB_JENKINS="true"
      ```

  We have been using the same convention for those variables. They are actually used in the `run-tests.py` script - see https://github.com/apache/spark/blob/master/dev/run-tests.py#L519-L520 - and a short sketch of how they are read follows this list.

- Revert the previous try

  After #17651, it seems the build still fails in the SBT Hadoop 2.6 master build.

  I was unable to reproduce this - #17477 (comment) - and the reviewer was too. So, that change was merged since merging currently looked like the only way to verify it (as no one seems able to reproduce the failure).
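A rough sketch of how these variables could drive the skip (a reconstruction, not the exact patch; the `is_hadoop_version_2_6` expression also appears in the test section below, and the `"sbt"` default for the build tool is an assumption):

```python
import os

# Jenkins branch jobs export these variables; the PR builder leaves them unset.
build_tool = os.environ.get("AMPLAB_JENKINS_BUILD_TOOL", "sbt")  # default is assumed
is_hadoop_version_2_6 = os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6"

# Skip the Unidoc build only for the explicit SBT + Hadoop 2.6 branch job;
# the PR builder sets neither variable, so it still builds Unidoc.
should_build_unidoc = not (build_tool == "sbt" and is_hadoop_version_2_6)
print(should_build_unidoc)
```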

## How was this patch tested?

I only checked that `is_hadoop_version_2_6 = os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6"` works as expected, as below:

```python
>>> import collections
>>> os = collections.namedtuple('os', 'environ')(environ={"AMPLAB_JENKINS_BUILD_PROFILE": "hadoop2.6"})
>>> print(not os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")
False
>>> os = collections.namedtuple('os', 'environ')(environ={"AMPLAB_JENKINS_BUILD_PROFILE": "hadoop2.7"})
>>> print(not os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")
True
>>> os = collections.namedtuple('os', 'environ')(environ={})
>>> print(not os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE") == "hadoop2.6")
True
```
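(Here `collections.namedtuple` simply stands in for the real `os` module with a fixed `environ` mapping, so each case can be evaluated without touching the actual environment.)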

I tried many ways but was unable to reproduce this locally. Sean tried the same steps and was also unable to reproduce it.

Please refer to the comments in #17477 (comment).

Author: hyukjinkwon <[email protected]>

Closes #17669 from HyukjinKwon/revert-SPARK-20343.
asfgit pushed a commit that referenced this pull request Apr 19, 2017
…tly set in SBT build

(cherry picked from commit 3537876)
Signed-off-by: Sean Owen <[email protected]>
@HyukjinKwon deleted the SPARK-20343-sbt branch January 2, 2018 03:38
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…ailure in SBT Hadoop 2.6 master on Jenkins

## What changes were proposed in this pull request?

This PR proposes to force Avro's version to 1.7.7 in core to resolve the build failure below:

```
[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.6/core/src/main/scala/org/apache/spark/serializer/GenericAvroSerializer.scala:123: value createDatumWriter is not a member of org.apache.avro.generic.GenericData
[error]     writerCache.getOrElseUpdate(schema, GenericData.get.createDatumWriter(schema))
[error]
```

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2770/consoleFull

Note that this is a hack and should be removed in the future.

## How was this patch tested?

I only tested that this actually overrides the dependency.

I tried many ways but was unable to reproduce this locally. Sean tried the same steps and was also unable to reproduce it.

Please refer to the comments in apache#17477 (comment).

Author: hyukjinkwon <[email protected]>

Closes apache#17651 from HyukjinKwon/SPARK-20343-sbt.