From d9badc48915ef7e27b4b90afb29dbc5eeb887d60 Mon Sep 17 00:00:00 2001
From: Gengliang Wang
Date: Wed, 3 Apr 2024 12:59:30 -0700
Subject: [PATCH 1/2] Guidelines for the Structured Logging Framework

---
 .../main/scala/org/apache/spark/internal/README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 common/utils/src/main/scala/org/apache/spark/internal/README.md

diff --git a/common/utils/src/main/scala/org/apache/spark/internal/README.md b/common/utils/src/main/scala/org/apache/spark/internal/README.md
new file mode 100644
index 000000000000..1c4cdad77eaa
--- /dev/null
+++ b/common/utils/src/main/scala/org/apache/spark/internal/README.md
@@ -0,0 +1,13 @@
+# Guidelines for the Structured Logging Framework
+
+## LogKey
+
+LogKeys serve as identifiers for mapped diagnostic contexts (MDC) within logs. Follow these guidelines when adding new LogKeys:
+* Define all structured logging keys in `LogKey.scala`, and sort them alphabetically for ease of search.
+* Use `UPPER_SNAKE_CASE` for key names.
+* Key names should be both simple and broad, yet include specific identifiers like `STAGE_ID`, `TASK_ID`, and `JOB_ID` when needed for clarity. For instance, use `MAX_ATTEMPTS` as a general key instead of creating separate keys for each scenario such as `EXECUTOR_STATE_SYNC_MAX_ATTEMPTS` and `MAX_TASK_FAILURES`. This balances simplicity with the detail needed for effective logging.
+* Avoid abbreviations in names. Use `APPLICATION_ID` instead of `APP_ID`.
+
+## Exceptions
+
+To ensure logs are compatible with Spark SQL and log analysis tools, avoid `Exception.printStackTrace()`. Use `logError`, `logWarning`, and `logInfo` methods from the `Logging` trait to log exceptions, maintaining structured and parsable logs.
From 33968dc0febe57fa66e8a327848480a34ca6cf9f Mon Sep 17 00:00:00 2001
From: Gengliang Wang
Date: Wed, 3 Apr 2024 13:33:51 -0700
Subject: [PATCH 2/2] encourage using abbreviations in names

---
 common/utils/src/main/scala/org/apache/spark/internal/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/utils/src/main/scala/org/apache/spark/internal/README.md b/common/utils/src/main/scala/org/apache/spark/internal/README.md
index 1c4cdad77eaa..ed3d77333806 100644
--- a/common/utils/src/main/scala/org/apache/spark/internal/README.md
+++ b/common/utils/src/main/scala/org/apache/spark/internal/README.md
@@ -6,7 +6,7 @@ LogKeys serve as identifiers for mapped diagnostic contexts (MDC) within logs. F
 * Define all structured logging keys in `LogKey.scala`, and sort them alphabetically for ease of search.
 * Use `UPPER_SNAKE_CASE` for key names.
 * Key names should be both simple and broad, yet include specific identifiers like `STAGE_ID`, `TASK_ID`, and `JOB_ID` when needed for clarity. For instance, use `MAX_ATTEMPTS` as a general key instead of creating separate keys for each scenario such as `EXECUTOR_STATE_SYNC_MAX_ATTEMPTS` and `MAX_TASK_FAILURES`. This balances simplicity with the detail needed for effective logging.
-* Avoid abbreviations in names. Use `APPLICATION_ID` instead of `APP_ID`.
+* Use abbreviations in names if they are widely understood, such as `APP_ID` for APPLICATION_ID, and `K8S` for KUBERNETES.
 
 ## Exceptions
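The guidelines in the patches above can be illustrated with a short Scala sketch. This is a hypothetical, simplified example of what a call site following these rules might look like; the class `RetryHandler`, the method `reportFailure`, and the exact signatures of `MDC` and the `log` interpolator are assumptions for illustration, not the definitive Spark API:

```scala
// Hypothetical sketch, not actual Spark source. It illustrates:
// reusing the generic MAX_ATTEMPTS key instead of a scenario-specific one,
// and logging exceptions via logError rather than printStackTrace().
import org.apache.spark.internal.{Logging, MDC}
import org.apache.spark.internal.LogKeys.{APP_ID, MAX_ATTEMPTS}

class RetryHandler extends Logging {
  def reportFailure(appId: String, attempts: Int, e: Throwable): Unit = {
    // Each MDC(key, value) pair becomes a searchable field in the
    // structured log output, keyed by the LogKey name.
    logError(log"Application ${MDC(APP_ID, appId)} failed after " +
      log"${MDC(MAX_ATTEMPTS, attempts)} attempts", e)
  }
}
```

Because the keys are shared, generic identifiers, downstream tools can query for `MAX_ATTEMPTS` across all components rather than matching one ad-hoc key per call site.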