Description
Common-prefix listing that includes uncommitted changes on a branch (i.e. the first screen shown in the UI when clicking on a repo) is actually O(n), where n is the number of uncommitted changes.
The problem here is recognizing common prefixes that were deleted.
Given an underlying commit and a list of uncommitted changes, the way to tell whether a common prefix still exists is to make sure that not all entries within it are tombstoned in the list of uncommitted changes.
See this: https://github.com/treeverse/lakeFS/blob/v0.42.0/pkg/graveler/combined_iterator.go#L87
This is the iterator that combines a set of changes (iterA) and an underlying commit (iterB). Assuming prefix "a/" contains 1M entries in the underlying commit and 999,999 tombstones in the diff, this iterator will scan all of them on a single call to Next().
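To make the cost concrete, here is a minimal sketch of the same idea. It is not the lakeFS implementation: `change`, `prefixExists`, and the slice-based inputs are hypothetical stand-ins for the two iterators, and the example is scaled down from 1M entries to 1,000. It overlays sorted uncommitted changes on sorted committed keys and counts how many entries must be visited before it can say whether the prefix still exists.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// change is a hypothetical uncommitted entry: a key plus a tombstone flag.
type change struct {
	key       string
	tombstone bool
}

// prefixExists reports whether any live entry remains under prefix once the
// sorted uncommitted changes are overlaid on the sorted committed keys.
// It also returns how many entries were visited before it could answer.
func prefixExists(committed []string, changes []change, prefix string) (bool, int) {
	// Seek both sorted lists to the first key >= prefix.
	i := sort.SearchStrings(committed, prefix)
	j := sort.Search(len(changes), func(k int) bool { return changes[k].key >= prefix })
	visited := 0
	for {
		hasC := i < len(committed) && strings.HasPrefix(committed[i], prefix)
		hasU := j < len(changes) && strings.HasPrefix(changes[j].key, prefix)
		switch {
		case !hasC && !hasU:
			// Both sides are exhausted for this prefix: everything was deleted.
			return false, visited
		case hasU && (!hasC || changes[j].key <= committed[i]):
			// The uncommitted entry comes first, or overrides the same committed key.
			visited++
			if hasC && changes[j].key == committed[i] {
				i++
			}
			if !changes[j].tombstone {
				return true, visited
			}
			j++ // tombstone: skip it and keep scanning
		default:
			// Committed key with no overriding change: the prefix is alive.
			visited++
			return true, visited
		}
	}
}

func main() {
	const n = 1000 // scaled down from the 1M entries in the example above
	committed := make([]string, n)
	changes := make([]change, 0, n-1)
	for k := 0; k < n; k++ {
		committed[k] = fmt.Sprintf("a/%04d", k)
		if k < n-1 {
			// Tombstone every key except the last one.
			changes = append(changes, change{key: committed[k], tombstone: true})
		}
	}
	exists, visited := prefixExists(committed, changes, "a/")
	fmt.Println(exists, visited)
}
```

With n-1 tombstones, the loop has to walk past every one of them before reaching the single surviving key, so the work to emit one common prefix grows linearly with the number of uncommitted changes under it.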
Not sure if the current data model allows doing anything better, but it is probably worth exploring and understanding the tradeoffs of doing something "better".