Skip to content

Conversation

@scottsand-db
Copy link
Collaborator

@scottsand-db scottsand-db commented Sep 16, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

How was this patch tested?

Does this PR introduce any user-facing changes?

@scottsand-db scottsand-db changed the title done first pass [Kernel] Vibe-coded prototype: Kernel postCommitSnapshot, UC publish (aka backfill) operation, with E2E test Sep 16, 2025
* @return the highest contiguous version that was successfully published, or -1 if no commits
* were published or this is a filesystem-managed table
*/
default long publish(Engine engine, Snapshot snapshot) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just be: long publish(Engine engine, List<ParsedLogData> ratifiedCommits).

Then we don't need Snapshot::getRatifiedCommits.

But then we need some additional metadata to pass into publish, just like how we have CommitMetadata --> perhaps we could have PublishMetadata

* this is not a time-travel-by-timestamp query
*/
private SnapshotQueryContext(
protected SnapshotQueryContext(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allisonport-db I think we might want to refactor this to be more extensible FYI

* @throws UnsupportedOperationException if the implementation doesn't support file copying
* @since 3.4.0
*/
default void copy(String srcPath, String destPath, boolean overwrite) throws IOException {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be in Kernel-APi?

scottsand-db added a commit that referenced this pull request Oct 1, 2025
## 🥞 Stacked PR
Use this [link](https://github.com/delta-io/delta/pull/5262/files) to
review incremental changes.
-
[**stack/kernel_post_commit_snapshot_1_log_segment_extension**](#5262)
[[Files changed](https://github.com/delta-io/delta/pull/5262/files)]
-
[stack/kernel_post_commit_snapshot_2_snapshot_extension](#5280)
[[Files
changed](https://github.com/delta-io/delta/pull/5280/files/94155a89873b25094372f2de53e3c9346e7e6296..4b5dd1cec83f66a4d7ae70461d9ad479b3263944)]

---------
#### Which Delta project/connector is this regarding?

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [X] Kernel
- [ ] Other (fill in here)

## Description

This PR adds two new internal LogSegment APIs that will help us create
updated LogSegments after a transaction.

This is PR 1 as part of the post-commit-snapshot effort.

For the e2e prototype that has already derisked all of this, please see
#5213

## How was this patch tested?

New UTs.

## Does this PR introduce _any_ user-facing changes?

No.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant