-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29755][CORE] Provide @JsonDeserialize for Option[Long] in LogInfo & AttemptInfoWrapper #26397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc. @vanzin This is a follow-up patch of SPARK-28869. Thanks in advance. |
|
Test build #113258 has finished for PR 26397 at commit
|
|
Test build #113260 has finished for PR 26397 at commit
|
|
Why didn't unit tests catch this? (Or maybe: add a unit test?) |
|
Actually I have been trying to construct/modify UT to let it fail on master and pass with this patch, but still no luck. |
|
I added those annotations because I hit errors when I was modifying the SHS to use the disk store. Maybe you need to try rolling logs with the disk store to hit it. |
|
Thanks for the quick response! That's actually one of things what I tried - no luck. I'll try to shutdown and restore KVStore to see whether it triggers failure. |
|
OK, I added the test which fails on master branch and passes with the patch. I guess I found it earlier but checkForLogs() swallow the exception and log is not printed in console so I missed the exception. |
3bd2760 to
b3f3479
Compare
|
Test build #113296 has finished for PR 26397 at commit
|
|
Test build #113298 has finished for PR 26397 at commit
|
| // The issue happens only the value in Option is being unboxed. Simple comparison sometimes | ||
| // doesn't go though unboxing the value, hence ClassCastException is not occurred. | ||
| // Here we ensure unboxing to Long succeeds. Please refer SPARK-29755 for more details. | ||
| assert(BoxesRunTime.unboxToLong(opt.get) === expected.get) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you just use .toLong here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
scala> val i = Option(1)
i: Option[Int] = Some(1)
scala> val l = i.asInstanceOf[Option[Long]]
l: Option[Long] = Some(1)
scala> l.get.toLong
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
at scala.runtime.BoxesRunTime.unboxToLong(BoxesRunTime.java:105)
... 28 elided
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here opt really is an Option[Long], not a Option[Int]. You can't assign the latter to the former in Scala anyway. I feel like I'm missing the whole point here. Is the idea that Jackson is coming up with an Option[Int] when deserializing, without this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, that's the bug. Jackson deserializes the value "1" as an Option[Int] because it fits in an int, and sticks it into that variable using reflection, regardless of the type parameter (since at runtime Jackson can only see it's an Option, not an Option[Long]).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, but if the bug is fixed, then this really is an Option[Long]. Would handling the case that it isn't not cover up the bug if it existed? again I probably misunderstand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand your question.
As Jungtaek said, this test fails before the fix, because even if the code says "Option[Long]", Jackson actually put an "Option[Int]" in there.
So, as far as this test goes:
- fails without the fix
- works with the fix
Again, isn't that what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jackson-Scala FAQ page I've linked in description explains what is happening here and why users should apply the workaround. They couldn't find a good way to fix it in Jackson-Scala side.
Why I simply pick BoxesRunTime.unboxToLong directly is, that's the actual path where exception is thrown. (Please refer JIRA description - https://issues.apache.org/jira/browse/SPARK-29755)
Simple comparison (like opt === expect) didn't bring exception, so I wanted to make sure it definitely fails on current master, not probably fails.
I'd prefer to have a test that exercised this through the normal SHS operation
Yeah that was what I tried first, and checkLogs() swallowed the exception (so we can't catch CCE in caller side) and another unexpected error was coming when trying to verify. It might also reflect something is not working, but looks to be less clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW I was just commenting you can use opt.get.toLong to hit the bug without the fix, which looks a bit better than calling into BoxesRunTime, but either is fine really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it. No strong voice on either. I'll change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, .toLong hits the bug. So I was wondering what the deal is with BoxesRunTime, and though it was there to make Option[Int] -> long work. But then that would not fail the test. If that's not what it does, and it effectively does .toLong, then yeah we're on the same page: just use .toLong
| attemptId: Option[String], | ||
| fileSize: Long, | ||
| lastIndex: Option[Long], | ||
| @JsonDeserialize(contentAs = classOf[JLong]) lastIndex: Option[Long], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically the same approach used elsewhere right? if so that's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes right. There're other spots we already took care of.
|
Test build #113343 has finished for PR 26397 at commit
|
gaborgsomogyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double checked, the change solves the problem. Only minors found.
| import org.apache.spark.util.{Clock, JsonProtocol, ManualClock, Utils} | ||
| import org.apache.spark.util.logging.DriverLogger | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Why this needed?
| test("SPARK-29755 LogInfo should be serialized/deserialized by jackson properly") { | ||
| def assertSerDe(serializer: KVStoreScalaSerializer, info: LogInfo): Unit = { | ||
| val infoAfterSerDe = serializer.deserialize( | ||
| serializer.serialize(info), classOf[LogInfo]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: No break needed.
| val logPath: String, | ||
| val fileSize: Long, | ||
| val lastIndex: Option[Long], | ||
| @JsonDeserialize(contentAs = classOf[JLong]) val lastIndex: Option[Long], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe a break break between deser and the val would make it more readable.
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.spark.deploy.history |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: There are couple of unused imports at the beginning of the file. Maybe worth to clean them up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's occurred from here I'll remove. If not, it might make other PRs be broken, so I might take it carefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since FsHistoryProvider.scala is one of the main target of this development (I mean not the PR) it would be good to clean it up somewhere. What do you think where should we do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's OK to do with minor PR, but commiter could finally judge the worth. (treat my comment as 2 cents)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this resolution. Such way we separate the concerns. I've created #26436, in case of disagreement it can be just dropped.
|
|
||
| import scala.collection.JavaConverters._ | ||
| import scala.concurrent.duration._ | ||
| import scala.runtime.BoxesRunTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why this needed? Tests show the same result with and without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see now. This was coming from a previous code:
assert(BoxesRunTime.unboxToLong(opt.get) === expected.get)
|
Test build #113415 has finished for PR 26397 at commit
|
gaborgsomogyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, assuming the resolution offered in #26397 (comment) is accepted.
| if (expected.isEmpty) { | ||
| assert(opt.isEmpty) | ||
| } else { | ||
| // The issue happens only the value in Option is being unboxed. Here we ensure unboxing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/happens only the/happens only when the
|
Test build #113478 has finished for PR 26397 at commit
|
### What changes were proposed in this pull request? As it has been discussed in #26397 (comment) `FsHistoryProvider` import section has to be cleaned up. ### Why are the changes needed? Unused imports. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing unit tests. Closes #26436 from gaborgsomogyi/SPARK-29755. Authored-by: Gabor Somogyi <[email protected]> Signed-off-by: Sean Owen <[email protected]>
d1bc497 to
698a657
Compare
|
Test build #113521 has finished for PR 26397 at commit
|
|
Bump |
|
Merging to master. |
|
Thanks all for reviewing and merging! |
What changes were proposed in this pull request?
This patch adds
@JsonDeserializeannotation for the field which type isOption[Long]in LogInfo/AttemptInfoWrapper. It hits https://github.com/FasterXML/jackson-module-scala/wiki/FAQ#deserializing-optionint-and-other-primitive-challenges - other existing json models take care of this, but we missed to add annotation to these classes.Why are the changes needed?
Without this change, SHS will throw ClassNotFoundException when rebuilding App UI.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually tested.