Indirectly caused schema changes result in confusingly reporting direct modifications #4523

@georgesittas

The mapping_schema model attribute is propagated throughout the DAG after the initial project loading stage. This means that changing the projections of a model may update the mapping_schema of its children. Because this attribute is used to compute columns_to_types, which is included in the data hash computation, the data hash of a child can change even though its definition was never edited, and the child is then confusingly reported as "directly modified".

This behavior can be easily reproduced using the example project:

$ sqlmesh init duckdb
$ sqlmesh plan --no-prompts --auto-apply
...
$ cat > models/incremental_model.sql  # append '1' to the 'id' column in the SELECT list (id -> id1)
MODEL (name sqlmesh_example.incremental_model,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column event_date
  ),
  start '2020-01-01',
  cron '@daily',
  grain (id, event_date)
);

SELECT
  id1,
  item_id,
  event_date,
FROM
  sqlmesh_example.seed_model
WHERE
  event_date BETWEEN @start_date AND @end_date

$ sqlmesh plan  # the full model is confusingly reported as "directly modified"
...
Models:
└── Directly Modified:
    ├── sqlmesh_example.incremental_model
    └── sqlmesh_example.full_model
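
The mechanism described above can be illustrated with a small, self-contained sketch. This is not sqlmesh's implementation: the helper names (`resolve_columns_to_types`, `data_hash`), the simplified child query, and the column types are all hypothetical. It only shows how a hash that includes upstream-derived column types changes when the parent's projections change, even though the child's own query text stays identical.

```python
import hashlib
import json

# Hypothetical child model: its query text never changes in this example.
CHILD_QUERY = "SELECT id, item_id FROM sqlmesh_example.incremental_model"
CHILD_PROJECTIONS = ["id", "item_id"]
PARENT = "sqlmesh_example.incremental_model"


def resolve_columns_to_types(projections, parent, mapping_schema):
    """Look up each projected column's type in the parent's schema.

    Columns missing from the parent's schema resolve to "UNKNOWN".
    """
    parent_columns = mapping_schema.get(parent, {})
    return {col: parent_columns.get(col, "UNKNOWN") for col in projections}


def data_hash(query_text, columns_to_types):
    """Toy data hash: covers the query text and the resolved column types."""
    payload = json.dumps(
        {"query": query_text, "columns_to_types": columns_to_types},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Parent schema before and after renaming 'id' to 'id1' (types are assumed).
schema_before = {PARENT: {"id": "INT", "item_id": "INT", "event_date": "DATE"}}
schema_after = {PARENT: {"id1": "INT", "item_id": "INT", "event_date": "DATE"}}

types_before = resolve_columns_to_types(CHILD_PROJECTIONS, PARENT, schema_before)
types_after = resolve_columns_to_types(CHILD_PROJECTIONS, PARENT, schema_after)

hash_before = data_hash(CHILD_QUERY, types_before)
hash_after = data_hash(CHILD_QUERY, types_after)

# The child's query is untouched, yet its hash differs because the
# propagated schema changed underneath it.
print("child hash changed:", hash_before != hash_after)
```

In this toy setup, a comparison based only on the data hash cannot tell an edit to the child apart from a schema change inherited from its parent, which matches the symptom in the plan output above.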

Labels: Improvement (improves existing functionality)
