Skip to content

Conversation

scovich
Copy link
Contributor

@scovich scovich commented Sep 25, 2025

Which issue does this PR close?

Rationale for this change

While working on #8438, I noticed that the enum variants of ShreddingState actually made the code (a lot) more complex than if I just referenced the (optional) value and typed_value columns directly. That made me wonder if ShreddingState would be better as a simple two-field struct.

What changes are included in this PR?

Change ShreddingState to a two-field struct and update the few call sites that noticed.

While we're at it, improve the docs about how shredding works.

Are these changes tested?

Existing tests cover what is mostly an internal change

Are there any user-facing changes?

ShreddingState is pub and changed from enum to struct.

@github-actions github-actions bot added parquet Changes to the parquet crate parquet-variant parquet-variant* crates labels Sep 25, 2025
@scovich
Copy link
Contributor Author

scovich commented Sep 25, 2025

CC @alamb @klion26 who almost certainly have opinions about this change

/// Create a new `ShreddingState` from the given `value` and `typed_value` fields
///
/// Note you can create a `ShreddingState` from a &[`StructArray`] using
/// `ShreddingState::try_from(&struct_array)`, for example:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShreddingState constructor became infallible some time ago, and it changed to impl From instead of TryFrom at the same time. But this example kept working because of a blanket impl TryFrom in rust that leverages From under the hood.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this code is simpler -- let's go with this one

Thank you @scovich and @mbrobbel

/// | non-NULL | non-NULL | The value is present and is a partially shredded object |
///
///
/// Applying the above rules to entire columns, we obtain the following:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for preserving this information

@scovich scovich changed the title Simpler shredding state [Variant] Simpler shredding state Sep 26, 2025
@scovich
Copy link
Contributor Author

scovich commented Sep 26, 2025

Any reason not to merge?

@mbrobbel mbrobbel merged commit 92005ba into apache:main Sep 26, 2025
19 of 20 checks passed
@alamb
Copy link
Contributor

alamb commented Sep 26, 2025

🚀

///
/// | value | typed_value | Meaning |
/// |--------|-------------|---------|
/// | -- | -- | **Missing**: The value is always missing; only valid for shredded object fields |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this comment!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet Changes to the parquet crate parquet-variant parquet-variant* crates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants