[SPARK-48602][SQL] Make csv generator support different output style with spark.sql.binaryOutputStyle #46956
What changes were proposed in this pull request?
In SPARK-47911, we introduced a universal BinaryFormatter to make binary output consistent
across all clients, such as beeline, spark-sql, and spark-shell, for both primitive and nested binaries.
Unfortunately, to_csv and the CSV writer have interceptors for binary output that are hard-coded to use SparkStringUtils.getHexString. This PR makes that output configurable as well.
Why are the changes needed?
feature parity
Does this PR introduce any user-facing change?
Yes, spark.sql.binaryOutputStyle now works for CSV as well, but the existing (as-is) behavior is kept.
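To make the idea of a configurable binary output style concrete, here is a minimal standalone sketch in plain Python. It is not Spark's implementation: the function names `format_binary` and `to_csv_row` are hypothetical, and the style names and their exact renderings (`HEX_DISCRETE`, `HEX`, `BASE64`, `UTF-8`) are assumptions made for illustration. The `HEX_DISCRETE` branch is meant to resemble a bracketed, space-separated hex dump of the kind a hard-coded hex formatter would produce.

```python
import base64
import csv
import io


def format_binary(data: bytes, style: str = "HEX_DISCRETE") -> str:
    """Render bytes as text in the requested style (style names are assumed)."""
    if style == "HEX_DISCRETE":
        # Bracketed, space-separated hex bytes, e.g. [53 70]
        return "[" + " ".join("%02X" % b for b in data) + "]"
    if style == "HEX":
        # One contiguous upper-case hex string
        return data.hex().upper()
    if style == "BASE64":
        return base64.b64encode(data).decode("ascii")
    if style == "UTF-8":
        # Interpret the bytes as text; invalid sequences are replaced
        return data.decode("utf-8", errors="replace")
    raise ValueError("unknown binary output style: " + style)


def to_csv_row(values, style: str = "HEX_DISCRETE") -> str:
    """Write one CSV row, formatting any bytes value with the chosen style."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="")
    writer.writerow(
        [format_binary(v, style) if isinstance(v, bytes) else v for v in values]
    )
    return buf.getvalue()
```

For example, `to_csv_row([1, b"Spark"], "HEX")` renders the binary column as a hex string, while calling it without a style falls back to the bracketed form, mirroring the idea that the prior behavior stays in place unless a style is chosen.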
How was this patch tested?
new tests
Was this patch authored or co-authored using generative AI tooling?
no