[SPARK-48592][INFRA] Add scala style check for logging message inline variables #46947
scalastyle-config.xml
Outdated
</check>

<check customId="logInlineVariable" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
  <parameters><parameter name="regex">log(?:Info|Warning|Error)\(s".*(\$|\+\s*[^\s"]).*"\)</parameter></parameters>
We need to detect `.format(..)` as well.
Also, we need to exclude the testing code.
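To illustrate the `.format` gap raised here, the following is a minimal sketch that runs the regex from the proposed rule against a few sample source lines. The object name, the helper `flagged`, and the sample lines are hypothetical, and the sketch assumes the checker searches for the pattern anywhere in the text rather than requiring a full-line match.

```scala
object LogInlineVariableRegexDemo {
  // Regex copied verbatim from the proposed logInlineVariable rule.
  private val rule =
    """log(?:Info|Warning|Error)\(s".*(\$|\+\s*[^\s"]).*"\)""".r

  // True when the pattern is found somewhere in the given source line.
  def flagged(line: String): Boolean = rule.findFirstIn(line).isDefined

  def main(args: Array[String]): Unit = {
    // Hypothetical sample lines; plain triple-quoted strings keep `$` literal.
    val samples = Seq(
      """logInfo(s"Started task $taskId")""",               // inline variable
      """logInfo("Started task")""",                        // constant message
      """logInfo("Started task %s".format(taskId))""",      // .format call
      """logInfo(log"Started ${MDC(TASK_ID, taskId)}")"""   // log interpolator
    )
    samples.foreach(line => println(s"${flagged(line)}\t$line"))
  }
}
```

Only the first sample is flagged. The `.format` variant does not start with `s"`, so a rule of this shape would need a second pattern to catch it.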
Similar rules also need to be added to dev/checkstyle.xml to check Java code
Also, need to exclude the testing code.
This might be difficult: scalastyle-maven-plugin currently has no mechanism similar to Java checkstyle's SuppressionFilter for ignoring certain rules in specific files or code blocks. Disabling a rule in the scalastyle configuration is global and cannot be targeted at specific files or code blocks. After defining the check rule, we may only be able to skip the relevant checks in a way similar to the following:
// scalastyle:off logInlineVariable
logInfo(s"....")
// scalastyle:on logInlineVariable
|
@gengliangwang It might be aggressive to ban all I suppose that the values defined in LogKeys are public APIs, and they should not be removed after 4.0.0 is released. Well, the current list might need to polish, for example, Values of LogKeys like |
|
I strongly suggest reconsidering SPARK-47688 and being conservative on defining also cc @cloud-fan @LuciferYang |
|
@pan3793 I appreciate the feedback and concerns raised regarding the migration of variables in our new structured logging framework for Apache Spark. I'd like to address these concerns and provide some clarity on our approach.
Why Migrate All Variables?
The primary reason for migrating all variables in our log messages is to avoid ongoing debates about which specific keys should be included as log keys. By disallowing the inlining of variables in log messages and ensuring all variables are explicitly named and migrated, we achieve a consistent and comprehensive logging structure. This approach ensures uniformity across all log entries.
Current Status of Migration
The migration process is nearly complete, which means we've already invested significant effort into ensuring that our log entries adhere to the new framework. Moving forward, as long as we follow the new logging rules, we maintain control over all the variables in our logs. This control is crucial for effective log analysis and troubleshooting.
Managing Variable Overload
To address concerns about the potential overload of variables for users, we have a couple of solutions:
Our goal with these changes is to enhance the clarity, consistency, and utility of our logs. |
|
@gengliangwang thanks for the summary.
Major concerns come from public API exposure. Spark is conservative for deleting deprecated public API, for example,
IMO the debates are valuable, I do think NOT all variables are suitable for the |
LogKey is totally internal. It is under the package
Yes, we should improve this. But this is orthogonal to what this PR tries to do. With the enforcement of making variables as MDC, we can remind developers to put variables as MDC or verbose MDC in new log entries. |
b33c37e to
47710c9
Compare
### What changes were proposed in this pull request? This PR migrates Scala logging to comply with the scala style changes in [#46979](#46947) ### Why are the changes needed? This makes development and PR review of the structured logging migration easier. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Tested by ensuring `dev/scalastyle` checks pass ### Was this patch authored or co-authored using generative AI tooling? No Closes #46980 from asl3/logging-migrationscala. Authored-by: Amanda Liu <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
val scalaStyleOnCompileConfig: String = {
  val in = "scalastyle-config.xml"
  val base_style_config = "scalastyle-config.xml"
  val prod_style_config = "scalastyle-config-prod.xml"
- Is this a convention? Does something named `-prod` only check the format of a specific directory?
- It is necessary to synchronize the modifications to the scalastyle-maven-plugin in the parent pom.xml.
The idea here: scalastyle-config-prod.xml is for production code, which means it won't check the style of tests.
But I can't tell the difference of the target files from the PR changes...
### What changes were proposed in this pull request? This PR makes additional Scala logging migrations to comply with the scala style changes in #46947 ### Why are the changes needed? This makes development and PR review of the structured logging migration easier. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Tested by ensuring dev/scalastyle checks pass ### Was this patch authored or co-authored using generative AI tooling? No Closes #47256 from asl3/morestructuredloggingmigrations. Authored-by: Amanda Liu <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
### What changes were proposed in this pull request? This PR makes additional Scala logging migrations to comply with the scala style changes in #46947 ### Why are the changes needed? This makes development and PR review of the structured logging migration easier. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Tested by ensuring dev/scalastyle checks pass ### Was this patch authored or co-authored using generative AI tooling? No Closes #47275 from asl3/formatstructuredlogmigrations. Lead-authored-by: Amanda Liu <[email protected]> Co-authored-by: Gengliang Wang <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
### What changes were proposed in this pull request? This PR migrates `src/main/scala/org/apache/spark/util/logging/FileAppender.scala` to comply with the scala style changes in #46947 ### Why are the changes needed? This makes development and PR review of the structured logging migration easier. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Tested by ensuring dev/scalastyle checks pass ### Was this patch authored or co-authored using generative AI tooling? No Closes #47394 from asl3/asl3/migratenewfiles. Authored-by: Amanda Liu <[email protected]> Signed-off-by: Gengliang Wang <[email protected]>
|
I am closing this one since we already merged #47239 |
What changes were proposed in this pull request?
This PR bans logging messages that use logInfo, logWarning, or logError and contain variables without an MDC wrapper.
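As a rough illustration of what "variables with an MDC wrapper" means, here is a self-contained sketch of a `log` string interpolator that accepts only key-tagged arguments. Everything in it (`MDC`, `LogEntry`, `TaskId`, the interpolator itself) is a hypothetical stand-in written for this example, not Spark's actual classes, whose signatures may differ.

```scala
object StructuredLogSketch {
  // Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
  final case class MDC(key: String, value: Any)
  final case class LogEntry(message: String, context: Map[String, String])

  // Hypothetical log key, playing the role of an entry in a key registry.
  val TaskId: String = "task_id"

  // Adds a log"..." interpolator whose arguments must all be MDC values,
  // so every variable in the message carries an explicit key.
  implicit class LogInterpolator(sc: StringContext) {
    def log(args: MDC*): LogEntry = {
      val values = args.map(_.value.toString)
      // Interleave the literal parts with the rendered values.
      val message =
        sc.parts.zipAll(values, "", "").map { case (p, v) => p + v }.mkString
      LogEntry(message, args.map(a => a.key -> a.value.toString).toMap)
    }
  }

  // The variable is wrapped, so it appears in both the text and the context.
  def example(taskId: Long): LogEntry =
    log"Started task ${MDC(TaskId, taskId)}"

  def main(args: Array[String]): Unit = println(example(42L))
}
```

In this sketch a bare variable such as `log"Started task $taskId"` would not compile, because the interpolator only accepts `MDC` arguments. The style check discussed in this PR targets the complementary case of plain `s"..."` messages.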
Why are the changes needed?
This makes development and PR review of the structured logging migration easier.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Manual test, verified it will throw errors on invalid logging messages.
Was this patch authored or co-authored using generative AI tooling?
No