Merged
fix lint
HonahX committed Jun 4, 2024
commit 984ca4185e4b85c9c4f2d9610bc58d6d0067cb04
2 changes: 1 addition & 1 deletion mkdocs/docs/configuration.md
@@ -72,7 +72,7 @@ Iceberg tables support table properties to configure table behavior.
<!-- prettier-ignore-start -->

!!! note "Fast append"
- PyIceberg default to the [fast append](https://iceberg.apache.org/spec/#snapshots) which ignores `commit.manifest*` and does not merge manifests on writes. To make table commit respect `commit.manifest*`, use [`merge_append`](api.md#write-support) instead.
+ PyIceberg default to the [fast append](https://iceberg.apache.org/spec/#snapshots) which ignores `commit.manifest*` and does not merge manifests on writes. To make table merge manifests on writes and respect `commit.manifest*`, use [`merge_append`](api.md#write-support) instead.

<!-- prettier-ignore-end -->

21 changes: 12 additions & 9 deletions pyiceberg/table/__init__.py
@@ -430,7 +430,12 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT)

def merge_append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT) -> None:
"""
- Shorthand API for appending a PyArrow table to a table transaction.
+ Shorthand API for appending a PyArrow table to a table transaction and merging manifests on write.
+
+ The manifest merge behavior is controlled by table properties:
+ - commit.manifest.target-size-bytes
+ - commit.manifest.min-count-to-merge
+ - commit.manifest-merge.enabled

Args:
df: The Arrow dataframe that will be appended to overwrite the table
@@ -1392,7 +1397,12 @@ def append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT)

def merge_append(self, df: pa.Table, snapshot_properties: Dict[str, str] = EMPTY_DICT) -> None:
"""
- Shorthand API for appending a PyArrow table to the table.
+ Shorthand API for appending a PyArrow table to a table transaction and merging manifests on write.
+
+ The manifest merge behavior is controlled by table properties:
+ - commit.manifest.target-size-bytes
+ - commit.manifest.min-count-to-merge
+ - commit.manifest-merge.enabled

Args:
df: The Arrow dataframe that will be appended to overwrite the table
@@ -3070,13 +3080,6 @@ def __init__(
TableProperties.MANIFEST_MERGE_ENABLED_DEFAULT,
)

- def _deleted_entries(self) -> List[ManifestEntry]:
- """To determine if we need to record any deleted manifest entries.
-
- In case of an append, nothing is deleted.
- """
- return []
-
def _process_manifests(self, manifests: List[ManifestFile]) -> List[ManifestFile]:
"""To perform any post-processing on the manifests before writing them to the new snapshot.

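
For readers unfamiliar with the three `commit.manifest*` properties named in the new docstrings, here is a minimal, self-contained sketch of how such settings could drive a merge decision. It is not PyIceberg's implementation: the `Manifest` class, the `plan_manifest_merge` function, and the sizes used are all hypothetical, and the real logic applies the minimum count with more nuance than this whole-list check.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Manifest:
    """Hypothetical stand-in for a manifest file; only its size matters here."""

    name: str
    size_bytes: int


def plan_manifest_merge(
    manifests: List[Manifest],
    target_size_bytes: int,  # plays the role of commit.manifest.target-size-bytes
    min_count_to_merge: int,  # plays the role of commit.manifest.min-count-to-merge
    merge_enabled: bool,  # plays the role of commit.manifest-merge.enabled
) -> List[List[Manifest]]:
    """Group manifests into bins; each bin represents one output manifest.

    A single-element bin means the manifest is kept as-is; a longer bin means
    its members would be rewritten into one merged manifest.
    """
    # Assumption: with merging disabled or too few manifests, nothing is merged,
    # which mirrors what a fast append does.
    if not merge_enabled or len(manifests) < min_count_to_merge:
        return [[m] for m in manifests]

    bins: List[List[Manifest]] = []
    current: List[Manifest] = []
    current_size = 0
    for manifest in manifests:
        # Close the current bin when adding this manifest would exceed the target size.
        if current and current_size + manifest.size_bytes > target_size_bytes:
            bins.append(current)
            current, current_size = [], 0
        current.append(manifest)
        current_size += manifest.size_bytes
    if current:
        bins.append(current)
    return bins


# Five hypothetical 3 MB manifests packed against an 8 MB target.
manifests = [Manifest(f"m{i}", 3_000_000) for i in range(5)]
plan = plan_manifest_merge(manifests, 8_000_000, min_count_to_merge=2, merge_enabled=True)
print([[m.name for m in group] for group in plan])
```

Passing `merge_enabled=False` returns every manifest in its own bin, which corresponds to the fast-append behavior described in the `configuration.md` note.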