Merged
Update name in output builder
alamb committed Aug 7, 2025
commit 2355247e0f20f1d7fcb71bfa6f7c43d70a7d6e7d
2 changes: 1 addition & 1 deletion parquet-variant-compute/src/variant_get/mod.rs
@@ -54,7 +54,7 @@ pub fn variant_get(input: &ArrayRef, options: GetOptions) -> Result<ArrayRef> {
         ShreddingState::Typed {
             metadata,
             typed_value,
-        } => output_builder.fully_shredded(variant_array, metadata, typed_value),
+        } => output_builder.typed(variant_array, metadata, typed_value),
         ShreddingState::Unshredded { metadata, value } => {
             output_builder.unshredded(variant_array, metadata, value)
         }
2 changes: 1 addition & 1 deletion parquet-variant-compute/src/variant_get/output/mod.rs
Contributor

Currently, the output builder seems to be fully column-oriented -- it assumes that all values for each leaf column are extracted in a tight loop. This can work for primitive builders, but nested builders will quickly run into pathing and efficiency problems.

I think we'll need to do something similar to the JSON builder, with a row-oriented approach where each level of a nested builder receives an already-constructed Variant for the current row and does a field extract for each child builder; child builders can then cast the result directly or recurse further as needed (based on their own type). And then the top-level builder call would construct a Variant for each row to kick-start the process.

But see the other comment -- to the extent that the shredding aligns nicely, we can hoist a subset of this per-row pathing out of the append method and into columnar pathing in the builder's constructor and finalizer.
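
To make the row-oriented idea above concrete, here is a minimal sketch. Everything in it is hypothetical and for illustration only: `ToyVariant`, `ToyColumn`, `RowBuilder`, `Int64RowBuilder`, `StructRowBuilder` and `build_rows` are toy stand-ins, not the `parquet-variant` types or the `OutputBuilder` trait in this PR. It only shows the shape of the recursion: each nested builder receives the current row's value, does a field extract per child, and leaf builders cast directly.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for an already-decoded variant value (NOT the real `Variant`).
#[derive(Debug, Clone, PartialEq)]
enum ToyVariant {
    Int(i64),
    Object(BTreeMap<String, ToyVariant>),
}

impl ToyVariant {
    // Field extract: returns the named child only if this value is an object.
    fn get_field(&self, name: &str) -> Option<&ToyVariant> {
        match self {
            ToyVariant::Object(fields) => fields.get(name),
            _ => None,
        }
    }
}

// Hypothetical stand-in for a finished output array.
#[derive(Debug, PartialEq)]
enum ToyColumn {
    Int64(Vec<Option<i64>>),
    Struct {
        validity: Vec<bool>,
        children: Vec<(String, ToyColumn)>,
    },
}

// Row-oriented builder: receives one value (or a missing value) per row.
trait RowBuilder {
    fn append(&mut self, value: Option<&ToyVariant>);
    fn finish(&mut self) -> ToyColumn;
}

// Leaf builder: casts the value directly; missing or mistyped values become null.
#[derive(Default)]
struct Int64RowBuilder {
    values: Vec<Option<i64>>,
}

impl RowBuilder for Int64RowBuilder {
    fn append(&mut self, value: Option<&ToyVariant>) {
        self.values.push(match value {
            Some(ToyVariant::Int(i)) => Some(*i),
            _ => None,
        });
    }

    fn finish(&mut self) -> ToyColumn {
        ToyColumn::Int64(std::mem::take(&mut self.values))
    }
}

// Nested builder: extracts one field per child and lets each child cast or recurse.
struct StructRowBuilder {
    validity: Vec<bool>,
    children: Vec<(String, Box<dyn RowBuilder>)>,
}

impl RowBuilder for StructRowBuilder {
    fn append(&mut self, value: Option<&ToyVariant>) {
        // Only objects count as valid struct rows in this sketch.
        let object = value.filter(|v| matches!(**v, ToyVariant::Object(_)));
        self.validity.push(object.is_some());
        for (name, child) in self.children.iter_mut() {
            // Every child gets exactly one append per row so lengths stay aligned.
            child.append(object.and_then(|v| v.get_field(name.as_str())));
        }
    }

    fn finish(&mut self) -> ToyColumn {
        ToyColumn::Struct {
            validity: std::mem::take(&mut self.validity),
            children: self
                .children
                .iter_mut()
                .map(|(name, child)| (name.clone(), child.finish()))
                .collect(),
        }
    }
}

// Top-level call: hands each row's value to the root builder to kick-start the process.
fn build_rows(rows: &[ToyVariant], builder: &mut dyn RowBuilder) -> ToyColumn {
    for row in rows {
        builder.append(Some(row));
    }
    builder.finish()
}
```

In this sketch all pathing happens per row inside `append`. The hoisting mentioned above would move part of that work into the builder's constructor and `finish` when the shredded columns already line up with the requested fields; that optimization is not shown here.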

Contributor Author

I think this aligns with my thinking and is nicely described -- I will use it as the base for the next set of tickets perhaps

@@ -44,7 +44,7 @@ pub(crate) trait OutputBuilder {
     ) -> Result<ArrayRef>;

     /// output for a perfectly shredded variant array
-    fn fully_shredded(
+    fn typed(
         &self,
         variant_array: &VariantArray,
         metadata: &BinaryViewArray,
@@ -126,7 +126,7 @@ impl<'a, T: ArrowPrimitiveVariant> OutputBuilder for PrimitiveOutputBuilder<'a,
         Ok(Arc::new(array))
     }

-    fn fully_shredded(
+    fn typed(
         &self,
         _variant_array: &VariantArray,
         _metadata: &BinaryViewArray,
2 changes: 1 addition & 1 deletion parquet-variant-compute/src/variant_get/output/variant.rs
@@ -79,7 +79,7 @@ impl<'a> OutputBuilder for VariantOutputBuilder<'a> {
         Ok(Arc::new(array_builder.build()))
     }

-    fn fully_shredded(
+    fn typed(
         &self,
         variant_array: &VariantArray,
         // TODO(perf): can reuse the metadata field here to avoid re-creating it