Add logic for recognizing jinja code in models by georgesittas · Pull Request #28 · SQLMesh/sqlmesh

georgesittas · 2022-12-08T16:41:57Z

Posting this so I can get some early feedback; I plan to work on / refine it soon. Some points for discussion:

Do we want to support jinja in the model's metadata? If not, we might be able to parse just this part of the file in order to use the fields inside Model.load (e.g. name is needed to instantiate a model).
How do we want to handle model validation? As far as I can understand, the plan is to store SQL that may contain jinja as a string (see JinjaModel), so validation needs to happen every time we're about to render / execute it? Am I missing something here?
What about jinja context? Do we already have a rough plan of where we'll store the needed information to render correctly?

By the way, a simple way I found to render jinja code from a string is the following (for reference):

>>> from jinja2 import Environment, BaseLoader
>>> template = Environment(loader=BaseLoader()).from_string("Hello {{ name }}")
>>> template.render(**{"name": "foo"})
'Hello foo'

I'm happy to follow a different direction if this doesn't make much sense.

tobymao · 2022-12-08T16:50:00Z

this may not be necessary, maybe just render jinja and see if the outputs are different

I'm skeptical as to whether this is going to add more overhead than a regex search, but we can try.

ok, then only run that after the regex search
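The ordering suggested here (a cheap regex search first, and the render comparison only when the regex matches) could be sketched roughly as below. The PR diff refers to a `JINJA_RE` pattern, but its definition is not shown in this thread, so the regex here is an assumption. The helper name `contains_jinja` and its `render` parameter are also hypothetical.

```python
import re
from typing import Callable, Optional

# Assumed pattern: matches Jinja's default delimiters for expressions,
# statements and comments. The actual JINJA_RE in the PR is not shown.
JINJA_RE = re.compile(r"\{\{.*?\}\}|\{%.*?%\}|\{#.*?#\}", re.DOTALL)


def contains_jinja(text: str, render: Optional[Callable[[str], str]] = None) -> bool:
    """Return True if the text appears to contain Jinja code.

    The regex runs first because it is cheap. The optional render
    comparison only runs when the regex has already found a match.
    """
    if JINJA_RE.search(text) is None:
        return False
    if render is None:
        return True
    # Rendering changed the text, so Jinja actually did something with it.
    return render(text) != text
```

Passing the renderer in as a callable keeps the check independent of how the Jinja environment ends up being configured.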

tobymao · 2022-12-08T16:50:36Z

yea, i think we need to parse the Model first for other reasons as well, like getting the dialect

tobymao · 2022-12-08T16:51:00Z

i would render the jinja once to validate it, but not validate every time

tobymao · 2022-12-08T16:51:24Z

what information is needed to render? can you store it in the context?

crericha · 2022-12-08T18:00:36Z

Do we want to support jinja in the model's metadata? If not, we might be able to parse just this part of the file in order to use the fields inside Model.load (e.g. name is needed to instantiate a model).

Are you referring to the config() method at the top of dbt models, which is wrapped in jinja? If so, my hope was to render that to obtain the config.

georgesittas · 2022-12-08T20:52:29Z

what information is needed to render? can you store it in the context?

We can have things like

select * from events where event_type = '{{ var("event_type") }}'

So I guess we'd need to use context information to render the jinja correctly (e.g. by scanning a .yml file with all the user's configs or something).
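As a rough illustration of what that context could look like, the sketch below renders the query above by passing a `var` callable into the template. The `PROJECT_VARS` dict stands in for whatever would be read from the user's config file. It, the value `"click"`, and this `var` implementation are assumptions for the example, not something sqlmesh or dbt is confirmed to do.

```python
from jinja2 import Environment, StrictUndefined

# Hypothetical project-level variables, e.g. loaded from a user's .yml file.
PROJECT_VARS = {"event_type": "click"}


def var(name, default=None):
    """Look up a variable by name, falling back to an optional default."""
    if name in PROJECT_VARS:
        return PROJECT_VARS[name]
    if default is not None:
        return default
    raise KeyError(f"Undefined variable: {name}")


# StrictUndefined makes references to missing names fail loudly
# instead of silently rendering as an empty string.
env = Environment(undefined=StrictUndefined)
query = "select * from events where event_type = '{{ var(\"event_type\") }}'"
rendered = env.from_string(query).render(var=var)
print(rendered)
```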

Are you referring to the config() method at the top of dbt models, which is wrapped in jinja? If so, my hope was to render that to obtain the config.

Good point. Yeah, I guess that won't be very hard to support. Another question related to this is: are we going to assume that jinja => dbt project? Under this assumption, we can always just call a parse function for this config section you're referring to, otherwise it might be a bit trickier.

georgesittas · 2022-12-08T20:54:56Z

i would render the jinja once to validate it, but not validate every time

You mean like @ model load time, right? So, we'd validate the model using the rendered SQL but still store the original jinja string (possibly w/out the model meta)?

tobymao · 2022-12-08T20:56:12Z

i would render the jinja once to validate it, but not validate every time

You mean like @ model load time, right? So, we'd validate the model using the rendered SQL but still store the original jinja string (possibly w/out the model meta)?

yea, shouldn't store model meta
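A minimal sketch of that flow, assuming a hypothetical `load_model` helper and `LoadedModel` container (neither is from the PR): the query is rendered once at load time, the rendered SQL is handed to a validator, and only the original jinja string is kept.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

from jinja2 import Environment, StrictUndefined


@dataclass(frozen=True)
class LoadedModel:
    """Hypothetical container that keeps the raw jinja query, not the rendered SQL."""

    name: str
    raw_query: str


def load_model(
    name: str,
    raw_query: str,
    context: Dict[str, Any],
    validate: Callable[[str], None],
) -> LoadedModel:
    """Render once to validate, then store the unrendered jinja string."""
    env = Environment(undefined=StrictUndefined)
    rendered = env.from_string(raw_query).render(**context)
    validate(rendered)  # expected to raise if the rendered SQL is invalid
    return LoadedModel(name=name, raw_query=raw_query)
```

The validator is passed in as a callable so the sketch does not assume any particular validation logic.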

eakmanrq · 2022-12-08T21:53:34Z

Another question related to this is: are we going to assume that jinja => dbt project?

I think we will have users that like Jinja and if we couple Jinja with dbt then they will say "Well I need to write dbt models so I can use Jinja". So I think ideally we would support Jinja for all models (not just dbt) but not sure on the technical complexity of that.

tobymao · 2022-12-08T21:56:13Z

right, jinja != dbt

crericha · 2022-12-08T21:56:58Z

Are you referring to the config() method at the top of dbt models, which is wrapped in jinja? If so, my hope was to render that to obtain the config.

Good point. Yeah, I guess that won't be very hard to support. Another question related to this is: are we going to assume that jinja => dbt project? Under this assumption, we can always just call a parse function for this config section you're referring to, otherwise it might be a bit trickier.

My thought is that I'd call render at load time to get the config information and transform it into sqlmesh model metadata (combining it with any general config from the yaml files). Maybe render could take an optional dict of jinja overrides, so that I can pass in a lambda for config and handle it myself, e.g. {"config": myConfigMethod}. We need to think a little about how best to handle ref and source.

I agree with @eakmanrq and @tobymao regarding Jinja support for all models, regardless of whether they are dbt models or not.
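The override idea could look something like the sketch below. The `render` function, its `overrides` parameter, and `capture_config` are hypothetical names for illustration. The only real API used is jinja2's `Environment.from_string(...).render(...)`, which accepts arbitrary keyword arguments as template variables.

```python
from typing import Any, Dict, Optional

from jinja2 import Environment


def render(source: str, overrides: Optional[Dict[str, Any]] = None) -> str:
    """Render a jinja string, exposing any overrides as template globals."""
    return Environment().from_string(source).render(**(overrides or {}))


captured: Dict[str, Any] = {}


def capture_config(**kwargs: Any) -> str:
    """Record the config(...) arguments and emit nothing into the SQL."""
    captured.update(kwargs)
    return ""


source = '{{ config(materialized="table", schema="analytics") }}\nselect 1 as id'
sql = render(source, {"config": capture_config}).strip()
print(captured)
print(sql)
```

After rendering, `captured` holds the keyword arguments from the config call, which could then be merged with project-level config to build the model metadata.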

georgesittas · 2022-12-08T22:07:50Z

I see, yeah these points sound reasonable. So then ideally we'd want sqlmesh users to be able to use both our macro system and jinja, but for the latter we won't create any special functionality (like how dbt uses ref, for example), since we already have a "sqlmesh" way to do things.

Although, if we allow jinja usage throughout sqlglot, does that mean that we also need to have a dedicated config file (e.g. a yaml or something similar), so that the user can provide kwargs to be used in their jinja code? Does that make sense?

crericha · 2022-12-08T22:12:32Z

For ref and source, we'll need to replace those with the sqlmesh naming either on load or when the query is rendered. Does anyone have opinions on when that should happen?
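For reference, here is one way the replacement might work if it is done through the jinja context, whichever point it happens at. The name mappings and the `ref` / `source` implementations are purely illustrative assumptions, since the thread does not settle how sqlmesh names would be derived.

```python
from jinja2 import Environment

# Hypothetical mappings from dbt-style names to sqlmesh-style table names.
MODEL_NAMES = {"stg_events": "db.stg_events"}
SOURCE_NAMES = {("raw", "events"): "raw_db.events"}


def ref(name: str) -> str:
    """Resolve a model reference to its (assumed) sqlmesh name."""
    return MODEL_NAMES[name]


def source(source_name: str, table_name: str) -> str:
    """Resolve a source reference to its (assumed) sqlmesh name."""
    return SOURCE_NAMES[(source_name, table_name)]


query = (
    "select * from {{ ref('stg_events') }} "
    "join {{ source('raw', 'events') }} using (id)"
)
rendered = Environment().from_string(query).render(ref=ref, source=source)
print(rendered)
```

Doing this on load would mean storing the resolved names, while doing it at render time would keep the original ref and source calls in the stored query.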

crericha · 2022-12-09T17:32:12Z

+                file_contents = file.read()
+
+                if JINJA_RE.search(file_contents):
+                    expressions = [JinjaModel(this=file_contents)]


Why is a jinja model needed?

It's not needed per se. The idea for using it was that in this way we can still represent the query as a SQLGlot expression. An alternative would be to store a raw string for jinja queries, but then we'd have to take that new representation into account in places where query is expected to be an expression.

This is still in an early phase, though. As we're moving on, it might make sense to actually change this, so I'm not arguing that this is the way to go.

I believe render will return a sqlglot expression (jinja or no jinja), no?

Yes, that's correct -- maybe JinjaModel is not necessary after all. I'll think about it and continue working on this PR soon.

georgesittas · 2022-12-09T19:19:35Z

Closing this PR so that @tobymao can do a fresh first pass on this task. I plan to help with it afterwards.

tobymao reviewed Dec 8, 2022


Add logic for recognizing jinja code in models

fdd8934

georgesittas force-pushed the jo/jinja_rendering branch from ff1990d to fdd8934 Compare December 9, 2022 16:47

crericha reviewed Dec 9, 2022


georgesittas marked this pull request as draft December 9, 2022 17:58

georgesittas closed this Dec 9, 2022

georgesittas deleted the jo/jinja_rendering branch December 9, 2022 19:19

4 participants
