Closed
Changes from 1 commit
Commits
51 commits
19b9a68
Stub implementation and a test
MaxGekk Sep 15, 2018
90832f9
Saving all plans to file
MaxGekk Sep 15, 2018
673ae56
Output attributes
MaxGekk Sep 15, 2018
fbde812
Output whole stage codegen
MaxGekk Sep 15, 2018
dca19d3
Reusing codegenToOutputStream
MaxGekk Sep 15, 2018
66351a0
Code de-duplication
MaxGekk Sep 15, 2018
2ee75bc
Do not truncate fields
MaxGekk Sep 15, 2018
9b2a3e6
Moving the test up because previous one leaved a garbage
MaxGekk Sep 15, 2018
51c196e
Removing string interpolation in the test
MaxGekk Sep 16, 2018
c66a616
Getting Hadoop's conf from session state
MaxGekk Sep 16, 2018
ed57c8e
Using java.io.Writer
MaxGekk Sep 16, 2018
ce2c086
Using java.io.Writer
MaxGekk Sep 16, 2018
37326e2
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Sep 17, 2018
7abf14c
Using StringWriter
MaxGekk Sep 17, 2018
d1188e3
Removing unneeded buffering and flushing
MaxGekk Sep 17, 2018
71ff7d1
Code de-duplication among toString and toFile
MaxGekk Sep 17, 2018
ac94a86
Using StringBuilderWriter and fix tests
MaxGekk Sep 18, 2018
f2906d9
Do not change maxFields so far
MaxGekk Sep 18, 2018
d3fede1
Added tests
MaxGekk Sep 18, 2018
c153838
Using StringBuilderWriter in treeString
MaxGekk Sep 18, 2018
6fe08bf
Propagating numFields to truncatedString
MaxGekk Sep 18, 2018
3324927
Bug fix + test
MaxGekk Sep 18, 2018
d63f862
Bug fix: passing maxFields to simpleString
MaxGekk Sep 18, 2018
24dbbba
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Sep 18, 2018
deb5315
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Sep 24, 2018
7fd88d3
Passing parameters by names
MaxGekk Sep 24, 2018
732707a
Getting file system from file path
MaxGekk Sep 24, 2018
3a133ae
Using the buffered writer
MaxGekk Sep 24, 2018
7452b82
Removing default value for maxFields in simpleString
MaxGekk Sep 24, 2018
4ec5732
Removing unnecessary signature of truncatedString
MaxGekk Sep 24, 2018
be16175
Minor improvement - passing maxFields by name
MaxGekk Sep 25, 2018
90ff7b5
Moving truncatedString out of core
MaxGekk Sep 25, 2018
2ba6624
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Sep 26, 2018
1fcfc23
Adding SQL config to control maximum number of fields
MaxGekk Sep 27, 2018
2bf11fc
Adding Spark Core config to control maximum number of fields
MaxGekk Sep 27, 2018
5e2d3a6
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Sep 27, 2018
bd331c5
Revert indentations
MaxGekk Sep 27, 2018
3cf564b
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Oct 11, 2018
2375064
Making writeOrError multi-line
MaxGekk Oct 11, 2018
8befa13
Removing core config: spark.debug.maxToStringFields
MaxGekk Oct 11, 2018
28795c7
Improving description of spark.sql.debug.maxToStringFields
MaxGekk Oct 11, 2018
a246db4
Limit number of fields in structs too
MaxGekk Oct 11, 2018
d4da29b
Description of simpleString of TreeNode.
MaxGekk Oct 11, 2018
41b57bc
Added description of maxFields param of truncatedString
MaxGekk Oct 11, 2018
28cce2e
Fix typo
MaxGekk Oct 11, 2018
e4567cb
Passing maxField
MaxGekk Oct 11, 2018
9b72104
Fix for the warning
MaxGekk Oct 11, 2018
9f1d11d
Merge branch 'master' into plan-to-file
MaxGekk Oct 12, 2018
76f4248
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Oct 31, 2018
f7de26d
Merge remote-tracking branch 'origin/master' into plan-to-file
MaxGekk Nov 5, 2018
bda6ac2
Merge branch 'master' into plan-to-file
MaxGekk Nov 5, 2018
Adding Spark Core config to control maximum number of fields
MaxGekk committed Sep 27, 2018
commit 2bf11fcb1c22d7118c2bcb24cc279494ea953482
10 changes: 10 additions & 0 deletions core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -633,4 +633,14 @@ package object config {
.stringConf
.toSequence
.createWithDefault(Nil)

private[spark] val MAX_TO_STRING_FIELDS =
ConfigBuilder("spark.debug.maxToStringFields")
.internal()
.doc("Maximum number of fields of sequence-like entries that can be converted to strings " +
Contributor

What is a sequence-like entry?

Member Author

I don't know how else to describe all the kinds of classes to which the parameter applies. If you have better wording, you are welcome to suggest it.

Member Author

I am going to change it to "Maximum number of fields of a tree node that can be ..." for the SQL config.

"in debug output. Any elements beyond the limit will be dropped and replaced by a" +
""" "... N more fields" placeholder. The config will be removed in Spark 3.0.""")
.intConf
.checkValue(v => v > 0, "The value should be a positive integer.")
.createWithDefault(25)
}
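
For illustration, the truncation that the doc string describes can be sketched in plain Scala. This is a minimal sketch, not Spark's own helper: the object `TruncationSketch`, the method `truncate`, and the default separator are hypothetical names chosen for this example, and it assumes the first `maxFields` elements are kept (the real helper may count the placeholder differently). Only the positive-value check and the placeholder wording come from the diff above.

```scala
object TruncationSketch {
  // Hypothetical helper for illustration only; not part of Spark.
  // Keeps at most `maxFields` elements and replaces the rest with a
  // "... N more fields" placeholder, as the config doc string describes.
  def truncate[T](fields: Seq[T], maxFields: Int, sep: String = ", "): String = {
    // Mirrors the checkValue constraint on the config entry.
    require(maxFields > 0, "The value should be a positive integer.")
    if (fields.length <= maxFields) {
      // Nothing to drop: render every field.
      fields.mkString(sep)
    } else {
      // Assumption: the first maxFields elements survive; the remainder is summarized.
      val dropped = fields.length - maxFields
      (fields.take(maxFields).map(_.toString) :+ s"... $dropped more fields").mkString(sep)
    }
  }
}
```

With the default limit of 25, a schema of 100 columns would render its first 25 fields followed by a single placeholder under this sketch.
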
@@ -23,6 +23,7 @@ import java.util.concurrent.atomic.AtomicBoolean

import org.apache.spark.SparkEnv
import org.apache.spark.internal.Logging
import org.apache.spark.internal.config
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types.{NumericType, StringType}
@@ -174,15 +175,14 @@ package object util extends Logging {
/**
* The performance overhead of creating and logging strings for wide schemas can be large. To
* limit the impact, we bound the number of fields to include by default. This can be overridden
* by setting the 'spark.debug.maxToStringFields' conf in SparkEnv.
* by setting the 'spark.debug.maxToStringFields' conf in SparkEnv or by setting the SQL config
* `spark.sql.debug.maxToStringFields`.
*/
val DEFAULT_MAX_TO_STRING_FIELDS = 25

private[spark] def maxNumToStringFields = {
private[spark] def maxNumToStringFields: Int = {
val legacyLimit = if (SparkEnv.get != null) {
Contributor

Just for context, why do you want to retain the legacy behavior? It is probably fine to break it.

Member Author

Since the old config wasn't well documented and was likely used mostly for debugging, I think we can remove it in Spark 3.0. Initially the PR targeted Spark 2.4, and removing a public config in a minor version could potentially break user apps. If you are OK with removing it, I will do that.

Member Author

I removed the old core config and left only the SQL config.

SparkEnv.get.conf.getInt("spark.debug.maxToStringFields", DEFAULT_MAX_TO_STRING_FIELDS)
SparkEnv.get.conf.get(config.MAX_TO_STRING_FIELDS)
} else {
DEFAULT_MAX_TO_STRING_FIELDS
config.MAX_TO_STRING_FIELDS.defaultValue.get
}
val sqlConfLimit = SQLConf.get.maxToStringFields
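
The null check on the environment in this hunk can also be written with `Option`. The following is a minimal sketch under stated assumptions: `Env`, `LimitLookupSketch`, and `DefaultMaxFields` are hypothetical stand-ins rather than Spark classes, and the sketch covers only the legacy lookup with its fallback to the default. It makes no claim about how the legacy limit and the SQL config limit are combined, since that part of the diff is not shown.

```scala
object LimitLookupSketch {
  // Hypothetical stand-in for an environment that may not be initialized yet.
  final case class Env(conf: Map[String, String])

  // Same default value as in the diff above.
  val DefaultMaxFields = 25

  // Reads the legacy key if the environment exists and the key is set;
  // otherwise falls back to the default.
  def legacyLimit(env: Env): Int =
    Option(env) // None when env is null
      .flatMap(_.conf.get("spark.debug.maxToStringFields"))
      .map(_.toInt)
      .getOrElse(DefaultMaxFields)
}
```

Wrapping the possibly-null reference in `Option` keeps the fallback in one expression instead of an explicit `if`/`else`.
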
