Closed
Changes from 1 commit
Commits
24 commits
2c1d5d8
Prototype
itholic Mar 4, 2024
376fc46
Merge branch 'master' of https://github.com/apache/spark into error_c…
itholic Apr 1, 2024
174a929
Merge branch 'master' of https://github.com/apache/spark into error_c…
itholic Apr 2, 2024
8ab1edf
Support query context testing and add UTs
itholic Apr 2, 2024
5906852
Merge branch 'master' of https://github.com/apache/spark into error_c…
itholic Apr 3, 2024
f3a7bd4
resolve comments
itholic Apr 3, 2024
bbaa399
Add JIRA pointer for testing
itholic Apr 3, 2024
b9f54f1
Silence the linter
itholic Apr 3, 2024
c8d98ea
Adjusted comments
itholic Apr 3, 2024
ef7f1df
Merge branch 'master' of https://github.com/apache/spark into error_c…
itholic Apr 4, 2024
cc52aab
Update displayed string and add comment for PySparkCurrentOrigin
itholic Apr 5, 2024
9c323d4
Using queue to ensure multiple call sites can be logged in order and …
itholic Apr 5, 2024
f5ad1c4
remove unnecessary comment
itholic Apr 5, 2024
4f12dc7
Extends Origin and WithOrigin to PySpark context support
itholic Apr 8, 2024
001c71e
Reusing fn for PySpark logging
itholic Apr 9, 2024
daa08cd
Add document for extended PySpark specific logging functions
itholic Apr 9, 2024
92faffe
remove unused code
itholic Apr 9, 2024
2514afb
Merge branch 'master' of https://github.com/apache/spark into error_c…
itholic Apr 9, 2024
672c176
Address None properly
itholic Apr 9, 2024
1304c2b
Simplifying
itholic Apr 9, 2024
ff4037b
Merge branch 'master' of https://github.com/apache/spark into error_c…
itholic Apr 10, 2024
1d8df34
Respect spark.sql.stackTracesInDataFrameContext
itholic Apr 10, 2024
95f7848
Add captureStackTrace to remove duplication
itholic Apr 10, 2024
1dd53ed
pysparkLoggingInfo -> pysparkErrorContext
itholic Apr 10, 2024
Add document for extended PySpark specific logging functions
itholic committed Apr 9, 2024
commit daa08cdeabc766d98b5a0d583ebe0cb2e73e92ae
32 changes: 25 additions & 7 deletions sql/core/src/main/scala/org/apache/spark/sql/Column.scala
@@ -171,19 +171,37 @@ class Column(val expr: Expression) extends Logging {
    Column.fn(name, this, lit(other))
  }

  // For PySpark logging
  private def fn(
      name: String, pysparkLoggingInfo: java.util.Map[String, String]): Column = {
    withOrigin(Some(pysparkLoggingInfo)) {
      Column.fn(name, this)
    }
  }
  /**
   * A version of the `fn` method specifically designed for binary operations in PySpark
   * that require logging information.
   * This method is used when the operation involves another Column.
   *
   * @param name The name of the operation to be performed.
   * @param other The other Column involved in the operation.
   * @param pysparkLoggingInfo A map containing logging information such as the fragment and
   *                           call site from PySpark.
   * @return A Column resulting from the operation.
   */
  private def fn(
      name: String, other: Column, pysparkLoggingInfo: java.util.Map[String, String]): Column = {
    withOrigin(Some(pysparkLoggingInfo)) {
      Column.fn(name, this, other)
    }
  }

cloud-fan (Contributor) commented on this overload, Apr 15, 2024:

@HyukjinKwon This probably can't cover all the cases, and we may need to add more overloads for certain functions that require non-expression parameters, but it shouldn't be many.

I think it's better than using ThreadLocal, which can be quite fragile for passing values between Python and JVM.
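To make the data flow concrete, here is a minimal, hypothetical sketch of how a caller might assemble the pysparkLoggingInfo map these overloads accept. The key names ("fragment" and "callSite") are assumptions for illustration, guided by the scaladoc above; the real keys are defined on the PySpark side of this change.

    // Hedged illustration only: builds the kind of java.util.Map the overloads
    // above accept. The key names are assumptions, not confirmed by this diff.
    import java.util.{HashMap => JHashMap}

    val pysparkLoggingInfo = new JHashMap[String, String]()
    pysparkLoggingInfo.put("fragment", "__eq__")        // hypothetical: the DataFrame API being called
    pysparkLoggingInfo.put("callSite", "example.py:42") // hypothetical: the user code location

    // A caller inside Column.scala could then invoke the private overload:
    //   fn("==", otherColumn, pysparkLoggingInfo)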

  /**
   * A version of the `fn` method for binary operations that involve a value other than a Column
   * and require PySpark logging information.
   * This method is used to apply an operation with any value, converting it to a Column
   * if necessary.
   *
   * @param name The name of the operation to be performed.
   * @param other The value to be used in the operation, which will be converted to a
   *              Column if not already one.
   * @param pysparkLoggingInfo A map containing logging information such as the fragment and
   *                           call site from PySpark.
   * @return A Column resulting from the operation.
   */
  private def fn(
      name: String, other: Any, pysparkLoggingInfo: java.util.Map[String, String]): Column = {
    withOrigin(Some(pysparkLoggingInfo)) {
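As context for the value-to-Column conversion this overload's scaladoc describes, here is a small self-contained sketch using the public lit helper, consistent with the Column.fn(name, this, lit(other)) call earlier in this file. The session setup and column names are illustrative only.

    // Sketch of the value-to-Column conversion described in the scaladoc.
    // `lit` wraps a plain value in a literal Column; passing an existing
    // Column through `lit` returns it unchanged.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, lit}

    val spark = SparkSession.builder().master("local[1]").appName("lit-sketch").getOrCreate()
    import spark.implicits._

    val df = Seq(1, 5, 20).toDF("value")
    df.select(col("value") > lit(10)).show()  // same result as col("value") > 10
    spark.stop()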