@nicktobey
Contributor

@nicktobey nicktobey commented Oct 23, 2025

This change allows users to add foreign key relations on nonlocal tables (table names that match entries in dolt_nonlocal_tables and resolve to tables on other branches).

The biggest obstacle was that the foreign key verification logic operates on the DB directly, but resolving the references in the dolt_nonlocal_tables table requires a Dolt SQL engine. Because we now use the engine for most operations, the engine is guaranteed to exist, but it wasn't obvious how to allow the storage code to access the engine in a way that didn't break encapsulation or create dependency cycles.

This PR accomplishes this by creating a new interface, doltdb.TableResolver, whose method GetDoltDBTableInsensitiveWithRoot resolves table names at a supplied root value. The Dolt Session can instantiate this object and pass it into the DB layer.

I'm not thrilled about adding the extra confusingly similar methods to doltdb.Database, but hopefully the differences between them are clear.
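
For readers following along, here is a minimal sketch of the layering described above: the storage layer declares the resolver interface, and the session layer supplies the implementation. Every type and function below (Table, RootValue, sessionResolver, verifyParentExists, the short ResolveTable method) is a simplified, hypothetical stand-in rather than Dolt's actual code, and the order in which non-local and local tables are consulted is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// --- Storage layer (stand-in for package doltdb) ---

// Table and RootValue are simplified, hypothetical stand-ins.
type Table struct{ Branch string }
type RootValue map[string]*Table

// TableResolver is declared where it is consumed, so storage code never
// has to import the session or engine packages.
type TableResolver interface {
	ResolveTable(root RootValue, name string) (trueName string, tbl *Table, found bool, err error)
}

// verifyParentExists is a hypothetical storage-level foreign key check.
// It depends only on the interface, not on the type that implements it.
func verifyParentExists(r TableResolver, root RootValue, parent string) error {
	_, _, found, err := r.ResolveTable(root, parent)
	if err != nil {
		return err
	}
	if !found {
		return errors.New("foreign key parent not found: " + parent)
	}
	return nil
}

// --- Session layer (stand-in for the Dolt session) ---

// sessionResolver can see other branches, which the storage code cannot.
type sessionResolver struct {
	nonlocal map[string]string    // lowercased table name -> branch holding it
	branches map[string]RootValue // branch name -> root value
}

// lookup finds a table in a root by case-insensitive name.
func lookup(root RootValue, name string) (string, *Table, bool) {
	for n, t := range root {
		if strings.EqualFold(n, name) {
			return n, t, true
		}
	}
	return "", nil, false
}

// ResolveTable checks the non-local mapping first, then the supplied root.
// That ordering is an assumption of this sketch.
func (s sessionResolver) ResolveTable(root RootValue, name string) (string, *Table, bool, error) {
	if branch, ok := s.nonlocal[strings.ToLower(name)]; ok {
		other, known := s.branches[branch]
		if !known {
			return "", nil, false, fmt.Errorf("unknown branch %q", branch)
		}
		if n, t, found := lookup(other, name); found {
			return n, t, true, nil
		}
	}
	n, t, found := lookup(root, name)
	return n, t, found, nil
}

func main() {
	resolver := sessionResolver{
		nonlocal: map[string]string{"parent": "main"},
		branches: map[string]RootValue{"main": {"Parent": {Branch: "main"}}},
	}
	featureRoot := RootValue{"child": {Branch: "feature"}}

	// The parent table lives on another branch but still resolves.
	fmt.Println(verifyParentExists(resolver, featureRoot, "PARENT"))
	// A table that exists nowhere produces an error.
	fmt.Println(verifyParentExists(resolver, featureRoot, "missing"))
}
```

Because the interface lives with its consumer, the storage package needs no import of the session package, which is how a dependency cycle is avoided in this sketch.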

@coffeegoddd
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
bd1b42e ok 5937471
version total_tests
bd1b42e 5937471
correctness_percentage
100.0

@coffeegoddd
Contributor

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
d9076bd ok 5937471
version total_tests
d9076bd 5937471
correctness_percentage
100.0

@coffeegoddd
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
3f583e6 ok 5937471
version total_tests
3f583e6 5937471
correctness_percentage
100.0

@coffeegoddd
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
4b44b80 ok 5937471
version total_tests
4b44b80 5937471
correctness_percentage
100.0

Member

@zachmu zachmu left a comment

Overall seems fine, the table resolution logic is getting away from us but I don't know if you need to address that right now.

Testing should be shored up a bit.

I can't see any obvious reason this is breaking Doltgres. I would expect that some place resolve.Table() was getting called is now not being called, but I didn't see any place that was happening. If you have a link to specific failures I might have better advice.

// This is useful because the user-backed system table dolt_nonlocal_tables allows table names to resolve to
// tables on other refs, but sqle.Database is necessary to resolve those refs.
type TableResolver interface {
GetDoltDBTableInsensitiveWithRoot(ctx *sql.Context, root RootValue, tblName TableName) (trueTableName TableName, table *Table, found bool, err error)
Member

This is kind of a mouthful, what about just Resolve or ResolveTable

Contributor Author

Because doltdb.Database implements it and I needed the method name to distinguish itself from all of the other table name resolution methods. And it's not infeasible that we might need to add other methods to this interface that differ in subtle ways (although we should make attempts not to do so.)

But you're right that this interface shouldn't pay for the Database type's sins. I made a wrapper type in doltdb to implement the interface instead, and renamed the interface.
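
A minimal sketch of the wrapper approach mentioned here, under stated assumptions: the wrapper name databaseResolver, the short method name ResolveTable, and the simplified signatures are all hypothetical, and only the long method name is borrowed from the excerpt above.

```go
package main

import (
	"fmt"
	"strings"
)

// Table and RootValue are simplified, hypothetical stand-ins.
type Table struct{ Rows int }
type RootValue map[string]*Table

// TableResolver is the narrow interface that consumers depend on.
type TableResolver interface {
	ResolveTable(root RootValue, name string) (trueName string, tbl *Table, found bool)
}

// Database stands in for a large type with several similar lookup methods,
// where long, explicit names are needed to tell them apart.
type Database struct{}

// GetDoltDBTableInsensitiveWithRoot does a case-insensitive lookup in root.
// The signature is simplified for this sketch.
func (Database) GetDoltDBTableInsensitiveWithRoot(root RootValue, name string) (string, *Table, bool) {
	for n, t := range root {
		if strings.EqualFold(n, name) {
			return n, t, true
		}
	}
	return "", nil, false
}

// databaseResolver is a hypothetical wrapper: it adapts Database to the
// interface so the interface can keep a short method name.
type databaseResolver struct{ db Database }

func (r databaseResolver) ResolveTable(root RootValue, name string) (string, *Table, bool) {
	return r.db.GetDoltDBTableInsensitiveWithRoot(root, name)
}

// Compile-time check that the wrapper satisfies the interface.
var _ TableResolver = databaseResolver{}

func main() {
	var resolver TableResolver = databaseResolver{db: Database{}}
	name, _, found := resolver.ResolveTable(RootValue{"Orders": {Rows: 3}}, "orders")
	fmt.Println(name, found)
}
```

With a wrapper like this, the interface's method name no longer has to be unique among the methods of the wrapped type.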

}
trueTableName = TableName{
Name: trueTableNameString,
Schema: tblName.Schema,
Member

Probably should be trueTableName.Schema. I don't know if ResolveTableName actually case-corrects the schema name but it probably should.

Contributor Author

ResolveTableName only returns the table name string, and looking at the current implementation, having it case-correct the schema and return the normalized schema may require additional changes to the rootValueStorage interface. I agree it should probably both case-correct the schema name and return a TableName, and I think migrating to using TableName everywhere is important, but that seems like it would be a larger refactor that's probably better saved for a separate PR.
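
To make the suggested end state concrete, here is a rough sketch of a resolver that case-corrects both the schema and the table name and returns a full TableName. The TableName struct mirrors the two fields visible in the excerpt; the function resolveTableName and its slice-based lookup are hypothetical and do not reflect the rootValueStorage interface.

```go
package main

import (
	"fmt"
	"strings"
)

// TableName mirrors the Name and Schema fields seen in the excerpt above.
type TableName struct {
	Name   string
	Schema string
}

// resolveTableName is a hypothetical helper. It matches both parts
// case-insensitively and returns the stored spelling of each, so callers
// never carry forward the user-supplied casing.
func resolveTableName(known []TableName, want TableName) (TableName, bool) {
	for _, k := range known {
		if strings.EqualFold(k.Schema, want.Schema) && strings.EqualFold(k.Name, want.Name) {
			return k, true
		}
	}
	return TableName{}, false
}

func main() {
	known := []TableName{{Name: "Orders", Schema: "Public"}}
	got, ok := resolveTableName(known, TableName{Name: "orders", Schema: "PUBLIC"})
	fmt.Printf("%+v %v\n", got, ok)
}
```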

}
}

func (db Database) getNonlocalTableNames(ctx *sql.Context, root doltdb.RootValue) (nonlocalTableNames []string, error error) {
Member

Method comment

Contributor Author

Added

}

-func (db Database) getAllTableNames(ctx *sql.Context, root doltdb.RootValue, includeGeneratedSystemTables bool, includeRootObjects bool) ([]string, error) {
+func (db Database) getAllTableNames(ctx *sql.Context, root doltdb.RootValue, includeGeneratedSystemTables bool, includeRootObjects bool, includeNonlocalTables bool) ([]string, error) {
Member

Don't you use a type alias for includeNonlocalTables elsewhere?

Contributor Author

Fixed
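
For context, a sketch of what a named boolean type for this flag can look like. It uses a defined type with underlying type bool, which is one reading of "type alias" here; the constant names and the simplified allTableNames function are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// includeNonlocalTables is a defined boolean type named after the
// parameter in the diff. The constants below are hypothetical.
type includeNonlocalTables bool

const (
	withNonlocalTables    includeNonlocalTables = true
	withoutNonlocalTables includeNonlocalTables = false
)

// allTableNames is a simplified, hypothetical stand-in for getAllTableNames.
func allTableNames(local, nonlocal []string, nl includeNonlocalTables) []string {
	names := append([]string{}, local...)
	if nl {
		names = append(names, nonlocal...)
	}
	return names
}

func main() {
	local := []string{"child"}
	nonlocal := []string{"parent"}

	// Call sites name the option instead of passing a bare true or false.
	fmt.Println(allTableNames(local, nonlocal, withNonlocalTables))
	fmt.Println(allTableNames(local, nonlocal, withoutNonlocalTables))

	// A variable of plain bool type would need an explicit conversion:
	//   var flag bool = true
	//   allTableNames(local, nonlocal, flag) // does not compile
}
```

The benefit is mostly readability at call sites that already take several adjacent boolean arguments.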

return d.provider
}

func (d *DoltSession) GetTableResolver(ctx *sql.Context, dbName string) (doltdb.TableResolver, error) {
Member

Not clear why this needs to live here.

Contributor Author

It requires the session provider?

I inlined it into the GetTableResolver helper function in the same file and got rid of the explicit cast.

}

func (t *WritableDoltTable) getDoltTableForFK(ctx *sql.Context, root doltdb.RootValue, lwrTableName string, sqlFk sql.ForeignKeyConstraint) (refTbl *doltdb.Table, err error) {
_, refTbl, nonlocalTableExists, err := t.db.getNonlocalDoltDBTable(ctx, root, doltdb.TableName{Name: lwrTableName, Schema: sqlFk.ParentSchema})
Member

Not sure if this is the right time for this but food for thought:

The logic for resolving tables is getting more and more complex and special-cased. It would be good to encapsulate it better, and to return some kind of TableMetadata object to inform interested callers about things like "this is a system table", "this table was resolved with a non-local override", etc.

Contributor Author

100% agree. This is only going to get messier and would benefit from some kind of general interface.

I don't think this PR is the right place for that, but it might make sense to do before the next time we have to make things even more complicated. I can ruminate on approaches.
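
As one possible starting point for that discussion, here is a rough sketch of a single resolution entry point that returns metadata alongside the name. Nothing here exists in Dolt: ResolvedTable, its fields, and resolve are hypothetical, and treating a "dolt_" name prefix as the marker for a system table is a simplifying assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ResolvedTable is a hypothetical result type that tells callers how a
// name was resolved, not just what it resolved to.
type ResolvedTable struct {
	Name          string
	IsSystemTable bool
	NonlocalRef   string // non-empty when resolved through a non-local override
}

// resolve is a hypothetical single entry point for table resolution.
// nonlocal maps lowercased table names to the ref that holds them.
func resolve(name string, local []string, nonlocal map[string]string) (ResolvedTable, bool) {
	lower := strings.ToLower(name)
	if ref, ok := nonlocal[lower]; ok {
		return ResolvedTable{Name: lower, NonlocalRef: ref}, true
	}
	for _, n := range local {
		if strings.EqualFold(n, name) {
			isSystem := strings.HasPrefix(strings.ToLower(n), "dolt_")
			return ResolvedTable{Name: n, IsSystemTable: isSystem}, true
		}
	}
	return ResolvedTable{}, false
}

func main() {
	local := []string{"child", "dolt_ignore"}
	nonlocal := map[string]string{"parent": "main"}

	for _, name := range []string{"Parent", "CHILD", "dolt_ignore", "missing"} {
		res, found := resolve(name, local, nonlocal)
		fmt.Printf("%s: found=%v %+v\n", name, found, res)
	}
}
```

A caller such as the foreign key code could then branch on NonlocalRef instead of calling a separate non-local lookup first.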

[[ "$output" =~ "Cannot commit changes on more than one branch / database" ]] || false
}

@test "nonlocal: test foreign keys" {
Member

These are good but it would be nice to have some engine tests for this capability as well.

Member

For engine tests, I would expect to see e.g. tests that foreign key constraints are correctly enforced and that they resolve to the correct root, etc.

Contributor Author

Added

@coffeegoddd
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
0f9d804 ok 5937471
version total_tests
0f9d804 5937471
correctness_percentage
100.0

@coffeegoddd
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
ba3fbd0 ok 5937471
version total_tests
ba3fbd0 5937471
correctness_percentage
100.0

@nicktobey nicktobey force-pushed the nicktobey/nonlocal-fks branch from 21158f5 to 0e2ddb4 Compare October 29, 2025 18:48
@coffeegoddd
Contributor

@nicktobey DOLT

comparing_percentages
100.000000 to 100.000000
version result total
0e2ddb4 ok 5937471
version total_tests
0e2ddb4 5937471
correctness_percentage
100.0

@nicktobey nicktobey merged commit ce73118 into main Oct 29, 2025
22 of 27 checks passed
@github-actions

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
|---|---|---|---|---|---|---|
| batching | LOAD DATA | 10000 | 1 | 0.05 | 2 | |
| batching | batch sql | 10000 | 1 | 0.07 | 2 | |
| batching | by line sql | 10000 | 1 | 0.08 | 1.63 | |
| blob | 1 blob | 200000 | 1 | 0.91 | 4.31 | 4.25 |
| blob | 2 blobs | 200000 | 1 | 0.88 | 4.85 | 4.95 |
| blob | no blob | 200000 | 1 | 0.89 | 2.99 | 2.78 |
| col type | datetime | 200000 | 1 | 0.83 | 2.87 | 2.83 |
| col type | varchar | 200000 | 1 | 0.69 | 3.94 | 3.68 |
| config width | 2 cols | 200000 | 1 | 0.74 | 3.07 | 3.12 |
| config width | 32 cols | 200000 | 1 | 1.9 | 2.93 | 2.81 |
| config width | 8 cols | 200000 | 1 | 0.96 | 3.04 | 3.29 |
| pk type | float | 200000 | 1 | 0.83 | 2.86 | 2.94 |
| pk type | int | 200000 | 1 | 0.81 | 3.05 | 2.85 |
| pk type | varchar | 200000 | 1 | 1.54 | 1.86 | 2.19 |
| row count | 1.6mm | 1600000 | 1 | 5.71 | 3.36 | 3.35 |
| row count | 400k | 400000 | 1 | 1.44 | 3.26 | 3.28 |
| row count | 800k | 800000 | 1 | 2.85 | 3.31 | 3.34 |
| secondary index | four index | 200000 | 1 | 3.66 | 1.55 | 1.45 |
| secondary index | no secondary | 200000 | 1 | 0.88 | 3.06 | 2.86 |
| secondary index | one index | 200000 | 1 | 1.12 | 2.93 | 2.92 |
| secondary index | two index | 200000 | 1 | 1.98 | 2.07 | 1.82 |
| sorting | shuffled 1mm | 1000000 | 0 | 5.25 | 3.15 | 3.17 |
| sorting | sorted 1mm | 1000000 | 1 | 5.37 | 3.11 | 3.08 |

@github-actions

@coffeegoddd DOLT

| name | detail | mean_mult |
|---|---|---|
| dolt_blame_basic | system table | 1.19 |
| dolt_blame_commit_filter | system table | 2.89 |
| dolt_commit_ancestors_commit_filter | system table | 0.64 |
| dolt_commits_commit_filter | system table | 1.04 |
| dolt_diff_log_join_from_commit | system table | 2.9 |
| dolt_diff_log_join_to_commit | system table | 2.96 |
| dolt_diff_table_from_commit_filter | system table | 1.2 |
| dolt_diff_table_to_commit_filter | system table | 1.2 |
| dolt_diffs_commit_filter | system table | 1.06 |
| dolt_history_commit_filter | system table | 1.45 |
| dolt_log_commit_filter | system table | 1.04 |

@github-actions

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
|---|---|---|---|---|
| adds_only | 60000 | 0 | 0 | 0.86 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 4.49 |
| deletes_only | 0 | 60000 | 0 | 2.15 |
| updates_only | 0 | 0 | 60000 | 2.8 |

@Hydrocharged Hydrocharged deleted the nicktobey/nonlocal-fks branch December 15, 2025 06:49