Skip to content

Conversation

heyihong
Copy link
Contributor

@heyihong heyihong commented Sep 19, 2025

What changes were proposed in this pull request?

This PR fixes the handling of null literal values in LiteralValueProtoConverter for Spark Connect. The main changes include:

  1. Reordered pattern matching: Moved null and Option handling to the top of the pattern matching in toLiteralProtoBuilderInternal to ensure null values are processed before other type-specific logic.

  2. Fixed converter logic: Updated the getScalaConverter method to properly handle null values by checking hasNull before applying type-specific conversion logic.

Why are the changes needed?

The previous implementation had several issues with null value handling:

  1. Incorrect null processing order: Null values were being processed after type-specific logic, which could lead to exceptions.

  2. Missing null checks in converters: The converter functions didn't properly check for null values before applying type-specific conversion logic.

Does this PR introduce any user-facing change?

Yes. This PR fixes a bug where null values in literals (especially in arrays and maps) were not being properly handled in Spark Connect. Users who were experiencing issues with null value serialization in complex types should now see correct behavior.

How was this patch tested?

build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite -- -z SPARK-53553"
build/sbt "connect/testOnly *LiteralExpressionProtoConverterSuite"
build/sbt "connect-client-jvm/testOnly org.apache.spark.sql.PlanGenerationTestSuite"
build/sbt "connect/testOnly org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite"

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.5.11

…toConverter

This PR fixes the handling of null literal values in `LiteralValueProtoConverter` for Spark Connect. The main changes include:

1. **Added proper null value handling**: Created a new `setNullValue` method that correctly sets null values in proto literals with appropriate data type information.

2. **Reordered pattern matching**: Moved null and Option handling to the top of the pattern matching in `toLiteralProtoBuilderInternal` to ensure null values are processed before other type-specific logic.

3. **Fixed converter logic**: Updated the `getScalaConverter` method to properly handle null values by checking `hasNull` before applying type-specific conversion logic.

The previous implementation had several issues with null value handling:

1. **Incorrect null processing order**: Null values were being processed after type-specific logic, which could lead to exceptions.

2. **Missing null checks in converters**: The converter functions didn't properly check for null values before applying type-specific conversion logic.

**Yes**. This PR fixes a bug where null values in literals (especially in arrays and maps) were not being properly handled in Spark Connect. Users who were experiencing issues with null value serialization in complex types should now see correct behavior.

`build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite -- -z SPARK-53553"`
`build/sbt "connect/testOnly *LiteralExpressionProtoConverterSuite"`
`build/sbt "connect-client-jvm/testOnly org.apache.spark.sql.PlanGenerationTestSuite"`
`build/sbt "connect/testOnly org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite"`

Generated-by: Cursor 1.5.11

Closes apache#52310 from heyihong/SPARK-53553.

Authored-by: Yihong He <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@heyihong heyihong changed the title [SPARK-53553][CONNECT] Fix handling of null values in LiteralValueProtoConverter [SPARK-53553][CONNECT][4.0] Fix handling of null values in LiteralValueProtoConverter Sep 19, 2025
@heyihong
Copy link
Contributor Author

@cloud-fan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant