[SPARK-48490][CORE][FOLLOWUP] Fix Unescape about MessageWithContext #47029
  def basicMsgWithEscapeCharMDC: LogEntry =
    log"This is a log message\nThis is a new line \t other msg"
    log"This is a log message" + LF + log"This is a new line " + TAB + log" other msg"
Although the above may look a bit ugly, it is so far the only approach I can find that meets the requirements and is relatively acceptable.
As for the unexpected behaviors that already exist in Spark, I will fix them together once this PR is completed, e.g.:
The root cause is that it is recognized as a string, not a character, in
   * Companion class for lazy evaluation of the MessageWithContext instance.
   */
  class LogEntry(messageWithContext: => MessageWithContext) {
    def message: String = StringEscapeUtils.unescapeJava(messageWithContext.message)
Should we maybe simply try-catch, and return the original string if it fails?
There are arguably fewer Windows users in any event.
and logging shouldn't fail the application anyway.
Hmm, the unescape may succeed, but the result does not meet expectations. For example:
println(StringEscapeUtils.unescapeJava("C:\\Users\\runneradmin\\AppData\\Local\\remp\\spark-9f4b7e74-ea9e-46fc-bde6-1449cc9f6e8e"))
We might expect it to print:
C:\Users\runneradmin\AppData\Local\remp\spark-9f4b7e74-ea9e-46fc-bde6-1449cc9f6e8e
Actually, its output is:
empspark-9f4b7e74-ea9e-46fc-bde6-1449cc9f6e8e
So it's better not to run the unescape operation at all, and to let users apply the method I wrote in the UT manually to get the expected behavior, although it does look a bit ugly.
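The trade-off in this thread can be illustrated with a minimal sketch. It is not Spark code: `unescapeOrOriginal` is a hypothetical helper, `StringContext.processEscapes` from the Scala standard library stands in for `StringEscapeUtils.unescapeJava` (the two differ in which escapes they reject), and `C:\temp\new` is a made-up path. Under those assumptions, it shows that a try ... catch fallback covers the case where unescaping throws, but not the case where unescaping succeeds and silently changes the text.

```scala
import scala.util.control.NonFatal

object UnescapeFallbackSketch {
  // Hypothetical helper, not Spark code: unescape, but return the input unchanged on failure.
  def unescapeOrOriginal(raw: String)(unescape: String => String): String =
    try unescape(raw)
    catch { case NonFatal(_) => raw }

  // Stand-in unescaper from the Scala standard library (an assumption for this sketch);
  // the PR under discussion uses StringEscapeUtils.unescapeJava instead.
  val unescape: String => String = s => StringContext.processEscapes(s)

  def main(args: Array[String]): Unit = {
    // "\U" is not a valid escape for processEscapes, so it throws and the fallback applies.
    val failing = "C:\\Users\\runneradmin"
    println(unescapeOrOriginal(failing)(unescape) == failing) // true

    // Made-up path: "\t" and "\n" are valid escapes, so nothing throws,
    // yet the path separators are silently replaced by a tab and a line feed.
    val silent = "C:\\temp\\new"
    println(unescapeOrOriginal(silent)(unescape) == "C:\temp\new") // true
  }
}
```

The second case is the one a fallback cannot detect, which is the concern raised above about unescaping log messages unconditionally.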
and logging shouldn't fail the application anyway.
Yes, a try ... catch can fix it temporarily, but it has the flaws I mentioned above.
Or should we do this (use try ... catch) first so that build_sparkr_window passes?
Hm.
I am fine with this approach, but I think we need at least @gengliangwang's look.
Yeah, let's wait for @gengliangwang's review.
Thanks!
@panbingkun I don't get it. How does this happen on Windows? We are passing
  private final val LF_CHAR: Char = '\n'
  private final val TAB_CHAR: Char = '\t'
  final val LF = MessageWithContext(s"$LF_CHAR", new java.util.HashMap[String, String]())
Wait, are you suggesting that we should use this constant in log messages? If so, I would prefer try...catch as @HyukjinKwon suggested.
- This is just to make it easier for developers to use commonly used escape characters, such as: line break - LF_CHAR, tab - TAB_CHAR.
  Of course, we can also use it this way:

- Additional explanation: if we write it as follows, it will exhibit unexpected behavior:

Because it is written this way, our framework treats "\n" as a string rather than a character.

That is to say: s"\n" != log"\n".message
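The inequality above can be reproduced without Spark. The sketch below defines a hypothetical interpolator, `rawJoin`, that joins `sc.parts` without processing escapes; it assumes the `log` interpolator reads the parts the same way. Scala hands a custom interpolator the raw text of the literal, so `\n` arrives as two characters (a backslash and `n`), while the built-in `s` interpolator processes it into a single line feed.

```scala
object InterpolatorEscapeSketch {
  // Hypothetical interpolator, not Spark's `log`: it joins the raw parts
  // and the arguments without processing any escape sequences.
  implicit class RawJoin(private val sc: StringContext) {
    def rawJoin(args: Any*): String = {
      val sb = new StringBuilder(sc.parts.head)
      args.zip(sc.parts.tail).foreach { case (arg, part) => sb.append(arg).append(part) }
      sb.toString
    }
  }

  def main(args: Array[String]): Unit = {
    val processed = s"\n"  // escapes processed by the `s` interpolator
    val raw = rawJoin"\n"  // raw part as written in the source

    println(processed.length) // 1
    println(raw.length)       // 2
    println(processed != raw) // true

    // Processing the escapes on the raw part recovers the single character.
    println(StringContext.processEscapes(raw) == processed) // true
  }
}
```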
    MessageWithContext(sb.toString(), context)
  }

  def log(c: Char): MessageWithContext = {
We have provided a new method for the character
@panbingkun I created a new approach in #47050
closing in favor of #47050
What changes were proposed in this pull request?
The PR aims to solve the issue encountered in the following scenarios:
#46897 (comment)
We force the execution of the StringEscapeUtils.unescapeJava operation on the message of MessageWithContext, which results in unexpected behavior.
Why are the changes needed?
Fix bug.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Pass GA.
Existing UT.
Was this patch authored or co-authored using generative AI tooling?
No.