Skip to content

Conversation

@panbingkun
Copy link
Contributor

@panbingkun panbingkun commented Jun 14, 2024

What changes were proposed in this pull request?

The pr aims to migrate the residual code to structured logging framework.

Why are the changes needed?

When I reviewed the spark code, I found that some logs in the some module were not fully migrated to the structured logging framework, let's to complete unfinished work.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.

@panbingkun
Copy link
Contributor Author

In addition, I have also found similar issues in the core module, but currently there seems to be a PR https://issues.apache.org/jira/browse/SPARK-47579 still in progress, so to prevent conflicts, I skipped it.

@gengliangwang
Copy link
Member

@panbingkun Could you update this one?

@panbingkun
Copy link
Contributor Author

@panbingkun Could you update this one?

Okay, sure, let me do it.

@github-actions github-actions bot added CORE and removed PYTHON labels Jun 22, 2024
@panbingkun panbingkun changed the title [SPARK-48629] Migrate the remaining code to structured logging framework [SPARK-48629] Migrate the residual code to structured logging framework Jun 22, 2024
} else {
logError(log"Application Master lost connection with driver! Shutting down. " +
log"${MDC(LogKeys.REMOTE_ADDRESS, remoteAddress)}")
log"${MDC(LogKeys.HOST_PORT, remoteAddress)}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why renaming this one?

Copy link
Contributor Author

@panbingkun panbingkun Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the context of this code, it is called HOST_PORT, and we need to be consistent.

logInfo(log"Driver terminated or disconnected! Shutting down. " +
log"${MDC(LogKeys.HOST_PORT, remoteAddress)}")
finish(FinalApplicationStatus.SUCCEEDED, ApplicationMaster.EXIT_SUCCESS)
} else {
logError(log"Driver terminated with exit code ${MDC(LogKeys.EXIT_CODE, exitCode)}! " +
log"Shutting down. ${MDC(LogKeys.HOST_PORT, remoteAddress)}")
finish(FinalApplicationStatus.FAILED, exitCode)
}
} else {
logError(log"Application Master lost connection with driver! Shutting down. " +
log"${MDC(LogKeys.REMOTE_ADDRESS, remoteAddress)}")
finish(FinalApplicationStatus.FAILED, ApplicationMaster.EXIT_DISCONNECTED)


logError(log"Query ${MDC(LogKeys.PRETTY_ID_STRING, prettyIdString)} " +
log"terminated with error", e)
logError(log"Query ${MDC(QUERY_ID, prettyIdString)} terminated with error", e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why renaming this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's restore it, I misread it

@panbingkun
Copy link
Contributor Author

Let me double check myself.

case object HOST extends LogKey
case object HOSTS extends LogKey
case object HOST_LOCAL_BLOCKS_SIZE extends LogKey
case object HOST_NAME extends LogKey
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's unify HOST_NAME to HOST and remove HOST_NAME

case object HOSTS extends LogKey
case object HOST_LOCAL_BLOCKS_SIZE extends LogKey
case object HOST_NAME extends LogKey
case object HOST_NAMES extends LogKey
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's unify HOST_NAMES to HOSTS and remove HOST_NAMES


logInfo(log"Starting executor ID ${LogMDC(LogKeys.EXECUTOR_ID, executorId)}" +
log" on host ${LogMDC(HOST_NAME, executorHostname)}")
log" on host ${LogMDC(HOST, executorHostname)}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's unify it as HOST, the general rule is
HOST: Only host name, not including port, eg: host_name
Port: Just the port, eg: 10240
HOST_PORT: includes both host name and port, eg: host_name:10240

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log"on host ${MDC(LogKeys.HOST, executorHostname)} for " +

val removedWorkers = masterEndpointRef.askSync[Integer](
DecommissionWorkersOnHosts(hostnames))
logInfo(log"Decommissioning of hosts ${MDC(HOST_NAMES, hostnames)}" +
logInfo(log"Decommissioning of hosts ${MDC(HOSTS, hostnames)}" +
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's unify it as HOSTS, there is a detailed explanation below.

logWarning(
log"Unknown StreamingQueryListener event: " +
log"${MDC(LogKeys.EVENT, event)}")
logWarning(log"Unknown StreamingQueryListener event: ${MDC(LogKeys.EVENT, event)}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make our code format look pretty

} catch {
case e: Throwable =>
abort()
logError(log"Fail to commit changelog file ${MDC(LogKeys.FILE_NAME, file)} " +
Copy link
Contributor Author

@panbingkun panbingkun Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call it PATH, and make e as one of the parameters for logError

log"""SparkContext did not initialize after waiting for
|${MDC(LogKeys.TIMEOUT, totalWaitTime)} ms.
| Please check earlier log output for errors. Failing the application.""".stripMargin)
logError(log"SparkContext did not initialize after waiting for " +
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the code format pretty

if (operationLog == null) {
LOG.error("Operation [ {} ] logging is enabled, " +
"but its OperationLog object cannot be found.",
MDC.of(LogKeys.OPERATION_HANDLE_IDENTIFIER$.MODULE$, opHandle.getHandleIdentifier()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's abbreviate IDENTIFIER as ID

val cmdStr = s"catalog : $catalogName, schemaPattern : $schemaName"
val logMsg = s"Listing functions '$cmdStr, functionName : $functionName'"
logInfo(log"${MDC(LogKeys.MESSAGE, logMsg)} with ${MDC(LogKeys.STATEMENT_ID, statementId)}")
val cmdMDC = log"catalog : ${MDC(LogKeys.CATALOG_NAME, catalogName)}, " +
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make them more reasonable

// Do not change cmdStr. It's used for Hive auditing and authorization.
val cmdStr = s"catalog : $catalogName, schemaPattern : $schemaName"
val logMsg = s"Listing functions '$cmdStr, functionName : $functionName'"
logInfo(log"${MDC(LogKeys.MESSAGE, logMsg)} with ${MDC(LogKeys.STATEMENT_ID, statementId)}")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expand logMsg

@panbingkun
Copy link
Contributor Author

@gengliangwang
It's ready.

@panbingkun panbingkun marked this pull request as ready for review June 24, 2024 03:28
@panbingkun panbingkun requested a review from gengliangwang June 24, 2024 03:38
Copy link
Member

@gengliangwang gengliangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your meticulous work!

@gengliangwang
Copy link
Member

Merging to master

@panbingkun
Copy link
Contributor Author

Thank you for your meticulous work!

Thank you for your patient review. ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants