-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-47602][CORE][K8S][FOLLOWUP] Improve structure logging for isExecutorIdleTimedOut #45849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -97,20 +97,23 @@ trait Logging { | |
| } | ||
|
|
||
| implicit class LogStringContext(val sc: StringContext) { | ||
| def log(args: MDC*): MessageWithContext = { | ||
| def log(args: Any*): MessageWithContext = { | ||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gengliangwang I'm not sure if there were discussions on the following case:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In fact, the above case has already been discussed
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, great, let hold on this one for a while
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah let's make every variable MDC for now. We can decide which context keys to show in the context by default once we got all the log keys.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changing a log4j configuration will be much easier than debating in logging entries of the code base.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
if so, this might not be a good case for the 1st round migration, IMO it's critical to inject a string variable(a large JSON-like pod spec) to the error message
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @pan3793 ok, we can skip this one |
||
| val processedParts = sc.parts.iterator | ||
| val sb = new StringBuilder(processedParts.next()) | ||
| val context = new java.util.HashMap[String, String]() | ||
|
|
||
| args.foreach { mdc => | ||
| sb.append(mdc.value.toString) | ||
| if (Logging.isStructuredLoggingEnabled) { | ||
| context.put(mdc.key.toString.toLowerCase(Locale.ROOT), mdc.value.toString) | ||
| } | ||
| args.foreach { | ||
| case mdc: MDC => | ||
| sb.append(mdc.value.toString) | ||
| if (Logging.isStructuredLoggingEnabled) { | ||
| context.put(mdc.key.toString.toLowerCase(Locale.ROOT), mdc.value.toString) | ||
| } | ||
|
|
||
| if (processedParts.hasNext) { | ||
| sb.append(processedParts.next()) | ||
| } | ||
| if (processedParts.hasNext) { | ||
| sb.append(processedParts.next()) | ||
| } | ||
| case any: Any => | ||
| sb.append(any) | ||
| } | ||
|
|
||
| MessageWithContext(sb.toString(), context) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -533,8 +533,10 @@ class ExecutorPodsAllocator( | |
| currentTime - creationTime > executorIdleTimeout | ||
| } catch { | ||
| case e: Exception => | ||
| logError(log"Cannot get the creationTimestamp of the pod: " + | ||
| log"${MDC(LogKey.POD_ID, state.pod)}", e) | ||
| logError(log"Cannot get the creationTimestamp of the pod " + | ||
| log"${MDC(LogKey.KUBERNETES_POD_NAME, state.pod.getMetadata.getName)} in namespace " + | ||
| log"${MDC(LogKey.KUBERNETES_NAMESPACE, state.pod.getMetadata.getNamespace)}. " + | ||
| log"Resource details: ${state.pod}", e) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes,
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Pod ObjectMeta
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gengliangwang
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes a simple POD_ID -> POD sounds good. Having the extra POD name nad namespace in the log for quick search is also fine.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I read the K8s docs and found there is a UID concept, but TBH, it's rare to use it ...
https://kubernetes.io/docs/concepts/overview/working-with-objects/names/ the most common cases to identify a pod are |
||
| true | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some thoughts for the
KUBERNETES_prefix,POD_IDshould be identical butNAMESPACEdoes not.for
CONTAINER_ID, maybe we should change it toYARN_CONTAINER_IDtooThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KUBERNETES->K8S?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I mentioned this in #45862
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for providing a Guideline