Skip to content

Conversation

@jerryshao
Copy link
Contributor

Current Java file stream doesn't support custom key/value type because of loss of type information, details can be seen in SPARK-5297. Fix this problem by getting correct ClassTag from Class[_].

@SparkQA
Copy link

SparkQA commented Jan 19, 2015

Test build #25749 has started for PR 4101 at commit ec0131c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 19, 2015

Test build #25749 has finished for PR 4101 at commit ec0131c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25749/
Test PASSed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you'll have to keep and deprecate the existing method. At least, that removes the need for a Mima exclusion, and doesn't do any harm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to deprecate the old method or just to modify the old one, which is best? because the problem is that the old method actually has a bug, not outdated. If we just deprecate it, users can still call this with only warning, and this will lead to exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I doubt there are callers of the old method since, if we're right, it does not work at all. Still it is simple to retain and deprecate it to avoid removing a public API method.

@pwendell
Copy link
Contributor

So on this one, it took me a minute to figure out why this is breaking. The issue is that our workaround of casting to classtag[Any] doesn't work because we actually do need to pass a specific class when the NewHadoopRDD is created and down the chain Spark grabs this from the ClassTag:

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L765

In terms of removing this, from what I can tell, this definitely would immediately fail for anyone who wanted to use this at runtime since it will always assume Object is the input format. So it seems fine to just remove the old method. Would you be okay with removing it @srowen?

@pwendell
Copy link
Contributor

This LGTM in the current form, but awaiting feedback from @srowen.

@srowen
Copy link
Member

srowen commented Jan 20, 2015

@pwendell I'm OK with either approach. I agree that I can't see how this would work for anyone now, so removing the method only 'breaks' an API technically. I had thought it was merely simpler to deprecate rather than Mima-exclude, remove, and worry about it, but I have no objection to removing the method.

@pwendell
Copy link
Contributor

@jerryshao mind bringing this up to date?

@jerryshao
Copy link
Contributor Author

OK, will do.

@SparkQA
Copy link

SparkQA commented Jan 21, 2015

Test build #25861 has started for PR 4101 at commit e022ca3.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 21, 2015

Test build #25861 has finished for PR 4101 at commit e022ca3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25861/
Test FAILed.

@jerryshao
Copy link
Contributor Author

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Jan 21, 2015

Test build #25867 has started for PR 4101 at commit e022ca3.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Jan 21, 2015

Test build #25867 has finished for PR 4101 at commit e022ca3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25867/
Test PASSed.

@pwendell
Copy link
Contributor

Okay - I'll merge it as-is. I prefer removing the old ones because otherwise users might be confused about which to use. I'm pretty ambivalent though, but no strong opinions from @srowen either.

@pwendell
Copy link
Contributor

@jerryshao - this did not merge cleanly into Spark 1.2. Can you make a separate PR?

@asfgit asfgit closed this in 424d8c6 Jan 21, 2015
@srowen
Copy link
Member

srowen commented Jan 21, 2015

Feel free to remove it, I dont have a strong preference either.

@jerryshao
Copy link
Contributor Author

Thanks a lot :), I will backport this to 1.2 branch.

bomeng pushed a commit to Huawei-Spark/spark that referenced this pull request Jan 22, 2015
Current Java file stream doesn't support custom key/value type because of loss of type information, details can be seen in [SPARK-5297](https://issues.apache.org/jira/browse/SPARK-5297). Fix this problem by getting correct `ClassTag` from `Class[_]`.

Author: jerryshao <[email protected]>

Closes apache#4101 from jerryshao/SPARK-5297 and squashes the following commits:

e022ca3 [jerryshao] Add Mima exclusion
ecd61b8 [jerryshao] Fix Java fileInputStream type erasure problem
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants