[SPARK-48629] Migrate the residual code to structured logging framework #46986
Conversation
In addition, I have also found similar issues in the

@panbingkun Could you update this one?

Okay, sure, let me do it.
  } else {
    logError(log"Application Master lost connection with driver! Shutting down. " +
-     log"${MDC(LogKeys.REMOTE_ADDRESS, remoteAddress)}")
+     log"${MDC(LogKeys.HOST_PORT, remoteAddress)}")
Why rename this one?
In the context of this code, it is called HOST_PORT, and we need to be consistent.
spark/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
Lines 862 to 873 in e972dae
  logInfo(log"Driver terminated or disconnected! Shutting down. " +
    log"${MDC(LogKeys.HOST_PORT, remoteAddress)}")
  finish(FinalApplicationStatus.SUCCEEDED, ApplicationMaster.EXIT_SUCCESS)
} else {
  logError(log"Driver terminated with exit code ${MDC(LogKeys.EXIT_CODE, exitCode)}! " +
    log"Shutting down. ${MDC(LogKeys.HOST_PORT, remoteAddress)}")
  finish(FinalApplicationStatus.FAILED, exitCode)
}
} else {
  logError(log"Application Master lost connection with driver! Shutting down. " +
    log"${MDC(LogKeys.REMOTE_ADDRESS, remoteAddress)}")
  finish(FinalApplicationStatus.FAILED, ApplicationMaster.EXIT_DISCONNECTED)
- logError(log"Query ${MDC(LogKeys.PRETTY_ID_STRING, prettyIdString)} " +
-   log"terminated with error", e)
+ logError(log"Query ${MDC(QUERY_ID, prettyIdString)} terminated with error", e)
Why rename this one?
Let's restore it; I misread it.
Let me double-check myself.
case object HOST extends LogKey
case object HOSTS extends LogKey
case object HOST_LOCAL_BLOCKS_SIZE extends LogKey
case object HOST_NAME extends LogKey
Let's unify HOST_NAME to HOST and remove HOST_NAME
case object HOSTS extends LogKey
case object HOST_LOCAL_BLOCKS_SIZE extends LogKey
case object HOST_NAME extends LogKey
case object HOST_NAMES extends LogKey
Let's unify HOST_NAMES to HOSTS and remove HOST_NAMES
  logInfo(log"Starting executor ID ${LogMDC(LogKeys.EXECUTOR_ID, executorId)}" +
-   log" on host ${LogMDC(HOST_NAME, executorHostname)}")
+   log" on host ${LogMDC(HOST, executorHostname)}")
Let's unify it as HOST. The general rule is:
HOST: only the host name, without the port, e.g. host_name
PORT: only the port, e.g. 10240
HOST_PORT: both host name and port, e.g. host_name:10240
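For illustration, a minimal sketch of how this naming rule plays out in log statements. It assumes Spark's structured-logging API (`log"..."` interpolator, `MDC`, `LogKeys`); the surrounding values (`hostname`, `port`) are hypothetical placeholders, not code from this PR:

```scala
// Sketch only, assuming org.apache.spark.internal.{MDC, LogKeys} and
// a class that mixes in Logging. Values are illustrative.
// HOST      -> bare host name,    e.g. "worker-1"
// PORT      -> bare port,         e.g. 10240
// HOST_PORT -> "host:port" pair,  e.g. "worker-1:10240"
logInfo(log"Registered executor on host ${MDC(LogKeys.HOST, hostname)} " +
  log"at port ${MDC(LogKeys.PORT, port)}")
logInfo(log"Driver available at ${MDC(LogKeys.HOST_PORT, s"$hostname:$port")}")
```

Keeping each key scoped this way means downstream log queries can filter on the host alone without parsing a `host:port` string.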
spark/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
Line 765 in e972dae
  log"on host ${MDC(LogKeys.HOST, executorHostname)} for " +
  val removedWorkers = masterEndpointRef.askSync[Integer](
    DecommissionWorkersOnHosts(hostnames))
- logInfo(log"Decommissioning of hosts ${MDC(HOST_NAMES, hostnames)}" +
+ logInfo(log"Decommissioning of hosts ${MDC(HOSTS, hostnames)}" +
Let's unify it as HOSTS, there is a detailed explanation below.
- logWarning(
-   log"Unknown StreamingQueryListener event: " +
-   log"${MDC(LogKeys.EVENT, event)}")
+ logWarning(log"Unknown StreamingQueryListener event: ${MDC(LogKeys.EVENT, event)}")
Make our code format look pretty
} catch {
  case e: Throwable =>
    abort()
    logError(log"Fail to commit changelog file ${MDC(LogKeys.FILE_NAME, file)} " +
Let's call it PATH, and pass e as a parameter to logError.
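A hedged sketch of what that suggestion yields: renaming the key to `PATH` and passing the `Throwable` as the second `logError` argument so the framework records the stack trace alongside the structured message. The exact message wording is illustrative:

```scala
// Sketch only, assuming Spark's structured-logging API; passing the
// Throwable as the second argument attaches the stack trace to the entry.
case e: Throwable =>
  abort()
  logError(log"Fail to commit changelog file ${MDC(LogKeys.PATH, file)}", e)
  throw e
```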
- log"""SparkContext did not initialize after waiting for
-      |${MDC(LogKeys.TIMEOUT, totalWaitTime)} ms.
-      | Please check earlier log output for errors. Failing the application.""".stripMargin)
+ logError(log"SparkContext did not initialize after waiting for " +
Make the code format pretty
if (operationLog == null) {
  LOG.error("Operation [ {} ] logging is enabled, " +
    "but its OperationLog object cannot be found.",
    MDC.of(LogKeys.OPERATION_HANDLE_IDENTIFIER$.MODULE$, opHandle.getHandleIdentifier()));
Let's abbreviate IDENTIFIER as ID
- val cmdStr = s"catalog : $catalogName, schemaPattern : $schemaName"
- val logMsg = s"Listing functions '$cmdStr, functionName : $functionName'"
- logInfo(log"${MDC(LogKeys.MESSAGE, logMsg)} with ${MDC(LogKeys.STATEMENT_ID, statementId)}")
+ val cmdMDC = log"catalog : ${MDC(LogKeys.CATALOG_NAME, catalogName)}, " +
Let's make them more reasonable
// Do not change cmdStr. It's used for Hive auditing and authorization.
val cmdStr = s"catalog : $catalogName, schemaPattern : $schemaName"
val logMsg = s"Listing functions '$cmdStr, functionName : $functionName'"
logInfo(log"${MDC(LogKeys.MESSAGE, logMsg)} with ${MDC(LogKeys.STATEMENT_ID, statementId)}")
Expand logMsg so each field gets its own MDC key, instead of wrapping the whole pre-rendered string in a single MESSAGE key.
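A sketch of that expansion, keeping `cmdStr` intact for Hive auditing while the log message carries per-field keys. The key names `SCHEMA_NAME` and `FUNCTION_NAME` are assumptions here, shown only to illustrate the shape:

```scala
// Sketch only: cmdStr stays unchanged (Hive auditing depends on it);
// the log message is rebuilt with one MDC key per field rather than a
// single pre-rendered MESSAGE string.
val cmdStr = s"catalog : $catalogName, schemaPattern : $schemaName"
val cmdMDC = log"catalog : ${MDC(LogKeys.CATALOG_NAME, catalogName)}, " +
  log"schemaPattern : ${MDC(LogKeys.SCHEMA_NAME, schemaName)}"
logInfo(log"Listing functions '" + cmdMDC +
  log", functionName : ${MDC(LogKeys.FUNCTION_NAME, functionName)}' " +
  log"with ${MDC(LogKeys.STATEMENT_ID, statementId)}")
```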
gengliangwang left a comment
Thank you for your meticulous work!
Merging to master

Thank you for your patient review. ❤️
What changes were proposed in this pull request?
The PR aims to migrate the residual code to the structured logging framework.
Why are the changes needed?
When I reviewed the Spark code, I found that some logs in some modules were not fully migrated to the structured logging framework; let's complete the unfinished work.
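For context, a minimal before/after sketch of what this migration looks like. It assumes Spark's structured-logging API (`log"..."` interpolator, `MDC`, `LogKeys`); the variable names are illustrative placeholders:

```scala
// Before: plain s-string interpolation; values are flattened into text
// and cannot be queried as fields in the log output.
logInfo(s"Starting executor $executorId on host $hostname")

// After: the log"..." interpolator plus MDC tags each value with a LogKey,
// so it survives as a structured, queryable field (e.g. in JSON logs).
logInfo(log"Starting executor ${MDC(LogKeys.EXECUTOR_ID, executorId)} " +
  log"on host ${MDC(LogKeys.HOST, hostname)}")
```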
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Pass GA.
Was this patch authored or co-authored using generative AI tooling?
No.