Conversation
Force-pushed from 0c58f64 to 1f76d48
Co-authored-by: Vincent Chan <vchan@users.noreply.github.com>
```python
        self._mapping = mapping

    @property
    def mapping(self) -> t.Dict[str, str]:
```
[Nitpick] TBH I really dislike the name `mapping`. Given its name and the type signature, it can be anything at all, and it's impossible to tell without reading the docs (if they are even available). Can we be more specific? Like `physical_tables_to_model`, `model_tables`, or `model_to_table_mapping`, etc.
Changed to `model_tables`.
```python
    def mapping(self) -> t.Dict[str, str]:
        """Mapping of model name to physical table name.

        If a snapshot has not been versioned yet, its view name will be returned.
```
Why return view? So that local evaluation works?
Yeah, in case you haven't pushed a snapshot yet (you can run evaluate before plan).
```python
        return self.engine_adapter.fetchdf(query)


class Context(ExecutionContext):
```
The way we use the base class here is quite sketchy and can lead to unintended consequences. For example, we never invoke the base class constructor and rely only on method overriding, hoping it does the right thing. As the code evolves, custom initialization could be added to the ExecutionContext constructor that wouldn't be part of Context.
I'd rather have an ABC for this with two concrete implementations. Also, we may want to create a context package, since this module is pretty huge already.
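A possible shape for the ABC suggestion, as a sketch. The names (`BaseExecutionContext`, `RemoteExecutionContext`, `LocalContext`) and the single `model_tables` member are illustrative stand-ins, not the actual classes in this PR:

```python
import abc
import typing as t


class BaseExecutionContext(abc.ABC):
    """Shared interface; no constructor behavior to accidentally inherit."""

    @property
    @abc.abstractmethod
    def model_tables(self) -> t.Dict[str, str]:
        """Mapping of model name to physical table name."""


class RemoteExecutionContext(BaseExecutionContext):
    """Concrete implementation that receives a precomputed mapping."""

    def __init__(self, model_tables: t.Dict[str, str]) -> None:
        self._model_tables = model_tables

    @property
    def model_tables(self) -> t.Dict[str, str]:
        return self._model_tables


class LocalContext(BaseExecutionContext):
    """Concrete implementation that starts empty and is populated locally."""

    def __init__(self) -> None:
        self._model_tables: t.Dict[str, str] = {}

    @property
    def model_tables(self) -> t.Dict[str, str]:
        return self._model_tables
```

With an ABC, forgetting to implement the shared surface fails loudly at instantiation time instead of silently falling through to base-class behavior.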
```python
            context, start=start, end=end, latest=latest, **kwargs
        )
        if self.kind == ModelKind.INCREMENTAL:
            assert self.time_column
```
I don't think this is helpful. Shouldn't this be a ConfigError? Just a heads-up: I was going to work on our validation sequence (configuration + model definitions) holistically soon.
This is only for mypy.
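For context, this is the standard type-narrowing pattern the reply is referring to: the `assert` narrows `Optional[str]` to `str` for mypy at that point in the code, rather than acting as runtime config validation. A minimal sketch with a hypothetical `Model` stub:

```python
import typing as t


class Model:
    """Hypothetical stub; the real model carries many more fields."""

    def __init__(self, time_column: t.Optional[str]) -> None:
        self.time_column = time_column

    def filter_expr(self) -> str:
        # Without this assert, mypy flags time_column as possibly None below.
        # It narrows Optional[str] -> str; it is not user-facing validation.
        assert self.time_column
        return f"{self.time_column} BETWEEN @start AND @end"
```

Whether a missing `time_column` should instead raise a ConfigError earlier, during configuration validation, is the open question in the thread.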
```python
        if pyspark and isinstance(df, pyspark.sql.DataFrame):
            self.convert_to_time_column(end)
            df = df.where(
```
This made me realize something: how do we handle time zones? As far as I understand, our start/end macros always return UTC. Spark functions, on the other hand, use the local time zone by default unless UTC is set explicitly in the session config (https://spark.apache.org/docs/latest/sql-ref-syntax-aux-conf-mgmt-set-timezone.html). Is this something we need to take care of, or is it the user's responsibility?
I think it might be OK because all of our timestamps are UTC-aware.
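The reason UTC-aware timestamps sidestep the session-time-zone issue: aware datetimes compare by absolute instant, regardless of the offset they are expressed in. A minimal Python illustration (separately, per the linked docs, Spark's session time zone can be pinned via the `spark.sql.session.timeZone` config if needed):

```python
from datetime import datetime, timedelta, timezone

# The same instant expressed in two different offsets.
utc = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)
pst = datetime(2022, 1, 1, 4, 0, tzinfo=timezone(timedelta(hours=-8)))

# Aware datetimes compare by absolute instant, so these are equal.
assert utc == pst

# Naive datetimes have no such anchor; comparing naive to aware raises TypeError.
naive = datetime(2022, 1, 1, 12, 0)
```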
```python
        latest: TimeLike,
        snapshots: t.Dict[str, Snapshot],
        limit: int = 0,
        snapshots: t.Optional[t.Dict[str, Snapshot]] = None,
```
Shall we get rid of snapshots here as well and just provide mapping upstream?
I think it's more convenient this way, so callers don't need to form the mapping themselves.
Also, looking at this code, I realized: does Spark implement running audits yet?
You mean Airflow? Unless they are invoked as part of the evaluation, I don't think so.
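A sketch of the alternative being discussed, forming the model-to-table mapping once upstream from the snapshots dict instead of passing `snapshots` down. The `Snapshot` fields and the table-naming scheme below are hypothetical stand-ins for illustration only:

```python
import typing as t
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical stand-in; the real Snapshot carries full versioning info."""

    name: str
    version: t.Optional[str]

    def table_name(self) -> str:
        # Unversioned snapshots fall back to their view name,
        # mirroring the behavior discussed earlier in this thread.
        if self.version is None:
            return f"{self.name}__view"
        return f"{self.name}__{self.version}"


def to_model_tables(snapshots: t.Dict[str, Snapshot]) -> t.Dict[str, str]:
    """Form the model-name -> physical-table mapping once, upstream."""
    return {name: snap.table_name() for name, snap in snapshots.items()}
```

The trade-off raised in the thread: keeping the `snapshots` parameter is more convenient for callers, while passing only the mapping keeps the downstream signature minimal.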
Force-pushed from 769f29c to 7fbf430
I changed the internal representation of the time format to be Python.