-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[Variant] Add variant_get and Shredded VariantArray
#8021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 1 commit
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
3ba9fe0
[Variant] Strawman / infrastructure for variant_get of shredded values
alamb f458ec5
Apply suggestions from code review
alamb 52cf52b
Merge remote-tracking branch 'apache/main' into alamb/variant_shreddi…
alamb 6b66650
remove completed todo comment
alamb 924155c
Move validation logic to `ShreddingState::try_new`
alamb 5cb1068
Apply suggestions from code review
alamb 9c22f85
cleanup
alamb eee199e
Avoid panics in production code
alamb 264356b
Merge remote-tracking branch 'apache/main' into alamb/variant_shreddi…
alamb 336a0a3
comments
alamb 016cee1
Merge remote-tracking branch 'apache/main' into alamb/variant_shreddi…
alamb 3e8820b
Fix up merge
alamb 55a7ae6
Update parquet-variant-compute/src/variant_array.rs
alamb 85e7194
Merge branch 'alamb/variant_shredding_strawman' of github.com:alamb/a…
alamb 6e38490
fix compilation
alamb 0ccd0f8
update docs
alamb f8b2df4
Rename ShreddingState to conform better to spec
alamb 2355247
Update name in output builder
alamb File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev
Previous commit
Update name in output builder
- Loading branch information
commit 2355247e0f20f1d7fcb71bfa6f7c43d70a7d6e7d
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the output builder seems to be fully column-oriented -- it assumes that all values for each leaf column are extracted in a tight loop. This can work for primitive builders, but nested builders will quickly run into pathing and efficiency problems.
I think we'll need to do something similar to the JSON builder, with a row-oriented approach where each level of a nested builder receives an already-constructed
Variantfor the current row and does a field extract for each child builder; child builders can then cast the result directly or recurse further as needed (based on their own type). And then the top-level builder call would construct aVariantfor each row to kick-start the process.But see the other comment -- to the extend that the shredding aligns nicely, we can hoist a subset of this per-row pathing of the append method up to columnar pathing of the builder's constructor and finalizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this aligns with my thinking and is nicely described -- I will use it as the base for the next set of tickets perhaps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed
variant_get#8083