Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
jtcohen6
left a comment
Nice work on this!! I took it for a spin and left some comments/questions. Depending on the complexity involved, I'm open to chatting through what's a blocker to merging, and what could be a follow-on UX improvement for later.
```protobuf
}

// Q042
message CompiledNode {
```
Why are we recreating this as a new event type (with a new code)? The additional fields aren't a breaking change, right?
I initially thought this would need to be updated to account for the changes we made last week to our event/proto system (#7190), but I was mistaken about what had changed in that PR.
They're not; I just ran out of space in the numbering and wanted to move them to the right spot before shipping.
```python
if self.output_format == "json":
    if self.is_inline:
        return json.dumps({"compiled": self.compiled}, indent=2)
    else:
        return json.dumps({"node": self.node_name, "compiled": self.compiled}, indent=2)
```
[Not blocking] It makes sense to me that we'd want to support similar arguments for both compile and show, including --output. The current JSON output does feel a bit inconsistent between them:
```
10:54:04 {
  "node": "my_sql_model",
  "compiled": "select 1 as id"
}
...
10:54:17 Previewing node 'my_sql_model':
[{"id": 1.0}]
```
I don't have a very strong feeling about what to show here. The most important use case for JSON-formatted output is programmatic consumers of the show result set.
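For illustration, one consistent shape could wrap both commands' output the same way. This is a minimal sketch; the field names ("node", "compiled", "preview") are my own assumptions, not the actual dbt implementation:

```python
import json

# Hypothetical unified payload for compile/show JSON output: always include
# the node name and compiled SQL, and add a "preview" key only when a result
# set exists (i.e., for show).
def render_json(node_name, compiled_sql, preview_rows=None):
    payload = {"node": node_name, "compiled": compiled_sql}
    if preview_rows is not None:
        payload["preview"] = preview_rows
    return json.dumps(payload, indent=2)

print(render_json("my_sql_model", "select 1 as id", [{"id": 1.0}]))
```

With a shape like this, programmatic consumers can parse both commands' output with the same code path and check for the optional "preview" key.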
Good catch!
This is a regression, fixed with a new test.
```python
execution_time=0,
message=None,
adapter_response=adapter_response.to_dict(),
agate_table=execute_result,
```
Is this (agate_table) how we'd recommend that programmatic invocations access the result set of the show command?
```python
>>> from dbt.cli.main import dbtRunner
>>> dbt = dbtRunner()
>>> results, success = dbt.invoke(['show', '--select', 'my_sql_model', '--output', 'json'])
>>> results[0].agate_table.print_table()
| id |
| -- |
|  1 |
# do some io.StringIO() hacking to get JSON value
```

If someone has requested --output json, I think the show task should also return the JSON-formatted output to the caller, one way or another. (I understand that the result will also be available in the logs, but this feels like something the result object should really contain directly!)
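The io.StringIO() hacking mentioned above can be sketched like this. FakeTable is a stand-in I'm inventing to keep the example self-contained; it loosely mimics a table object with column_names, rows, and a to_json that writes to a file-like object (not the real agate API):

```python
import io
import json

# Made-up stand-in for a result's agate_table, for illustration only.
class FakeTable:
    def __init__(self, column_names, rows):
        self.column_names = column_names
        self.rows = rows

    def to_json(self, f):
        # Serialize each row as a dict keyed by column name.
        json.dump([dict(zip(self.column_names, row)) for row in self.rows], f)

table = FakeTable(["id"], [[1]])
buf = io.StringIO()  # capture JSON in memory instead of writing to a file
table.to_json(buf)
print(buf.getvalue())
```

Having to route a result set through an in-memory file buffer like this is exactly the friction the comment above is pointing at; returning the JSON on the result object directly would avoid it.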
I added a test that illustrates this behavior.
Docs issue created at dbt-labs/docs.getdbt.com#3097
Created a backlog ticket to investigate why the lengths of the Agate tables are wrong in tests: #7234. Results are correct during 🎩.
@ChenyuLInx I've confirmed in Jaffle Shop that neither …
ChenyuLInx
left a comment
Looks good overall! When I tested locally, if I try to show a wide model (one with many columns), I get this:
```
00:20:46 Running with dbt=1.5.0-b4
00:20:46 Found 5 models, 20 tests, 0 snapshots, 0 analyses, 312 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics, 0 groups
00:20:46
00:20:46 Concurrency: 1 threads (target='dev')
00:20:46
00:20:46 Previewing node 'orders':
| order_id | customer_id | order_date | status    | credit_card_amount | coupon_amount | ... |
| -------- | ----------- | ---------- | --------- | ------------------ | ------------- | --- |
| 1        | 1           | 2018-01-01 | returned  | 10                 | 0             | ... |
| 2        | 3           | 2018-01-02 | completed | 20                 | 0             | ... |
| 3        | 94          | 2018-01-04 | completed | 0                  | 1             | ... |
| 4        | 50          | 2018-01-05 | completed | 0                  | 25            | ... |
| 5        | 64          | 2018-01-05 | completed | 0                  | 0             | ... |
```
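The elided columns in the output above suggest the table printer has a fixed column budget. A toy sketch of that behavior (the function name and the budget of six columns are my own assumptions, not dbt's actual implementation):

```python
# Toy table formatter: show at most max_columns columns, then an "..." marker,
# which is one plausible way a wide model ends up truncated in the preview.
def preview_lines(headers, rows, max_columns=6):
    clipped = len(headers) > max_columns
    shown = headers[:max_columns] + (["..."] if clipped else [])
    lines = [" | ".join(shown)]
    for row in rows:
        cells = [str(v) for v in row[:max_columns]] + (["..."] if clipped else [])
        lines.append(" | ".join(cells))
    return lines

print("\n".join(preview_lines(
    ["order_id", "customer_id", "order_date", "status", "credit_card_amount", "coupon_amount", "amount"],
    [[1, 1, "2018-01-01", "returned", 10, 0, 10]],
)))
```

If this is the mechanism, a --max-columns style flag (hypothetical) or a larger default budget would let wide models print fully.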
```python
def test_second_ephemeral_model(self, project):
    run_dbt(["deps"])
    (results, log_output) = run_dbt_and_capture(
        ["show", "--inline", models__second_ephemeral_model]
```
Thanks for adding so many ephemeral model tests! Do we have one that tests a model that has a ref to an ephemeral model?
Yup, the second ephemeral model is an ephemeral model that references another ephemeral model.
```python
assert get_lines("first_model") == ["select 1 as fun"]
assert any("_test_compile as schema" in line for line in get_lines("second_model"))


@pytest.mark.skip("Investigate flaky test #7179")
```
Can we try to prioritize this issue before the 1.5 release? It feels like something we'd want to have coverage on.
```python
matched_results = []


fire_event(CompileComplete())
for result in matched_results:
```
@jtcohen6 did we agree that firing one event for each selected node is a good idea? What if we accidentally select 1000 nodes?
@ChenyuLInx The node name explicitly needs to be in the selector, so this situation is pretty unlikely. E.g., the selector my_model+ will be filtered down to just my_model.
If I had my druthers, I'd prefer we do this in a way that kept more of this logic within dbt's selection syntax, e.g. "we only want to support the FQN selector method."
As it is, we're going to apply the selection syntax, and then filter it down to only preview the nodes whose names explicitly appear in the --select arg. This means the show command won't support yaml selectors, tag:-based selection (even if only one node has that tag), etc. We also won't be showing a log message explaining why, if a node is selected, it's not being previewed.
```
$ dbt show -s tag:doesnt_exist
17:26:05 Running with dbt=1.5.0-b4
17:26:05 Found 1 model, 0 tests, 0 snapshots, 1 analysis, 420 macros, 0 operations, 1 seed file, 0 sources, 0 exposures, 0 metrics, 0 groups
17:26:05 The selection criterion 'tag:doesnt_exist' does not match any nodes
17:26:05
17:26:05 Nothing to do. Try checking your model configs and model specification args

$ dbt show -s tag:one_node_has_this_tag
17:26:20 Running with dbt=1.5.0-b4
17:26:21 Found 1 model, 0 tests, 0 snapshots, 1 analysis, 420 macros, 0 operations, 1 seed file, 0 sources, 0 exposures, 0 metrics, 0 groups
17:26:21
17:26:21 Concurrency: 5 threads (target='dev')
17:26:21
```
In terms of the "happy path" for the show command, the intent is to only show resource(s) explicitly specified in the --select syntax. Could I ask that we at least fire an event here, if a set of nodes is returned from the selection syntax, and then filtered out because the node's name wasn't explicitly passed to --select?
```
$ dbt show -s tag:one_node_has_this_tag
17:26:20 Running with dbt=1.5.0-b4
17:26:21 Found 1 model, 0 tests, 0 snapshots, 1 analysis, 420 macros, 0 operations, 1 seed file, 0 sources, 0 exposures, 0 metrics, 0 groups
17:26:21 The selection criterion 'tag:one_node_has_this_tag' selected one node, but the 'show' command will only preview models explicitly named in the 'select' argument
17:26:21
17:26:21 Concurrency: 5 threads (target='dev')
17:26:21
```
If we can do that, then this is fine by me.
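The filter-then-warn behavior requested above could be sketched roughly like this. The function name and warning text are hypothetical, not dbt's real API; in dbt the print would be a fire_event call with an appropriate event type:

```python
# Hypothetical sketch: after the selection syntax runs, keep only nodes whose
# names were explicitly passed to --select, and warn about anything dropped.
def filter_show_nodes(selected_node_names, select_args):
    explicit = set(select_args)
    kept = [n for n in selected_node_names if n in explicit]
    dropped = [n for n in selected_node_names if n not in explicit]
    for name in dropped:
        # In dbt this would be something like fire_event(ShowNodeFiltered(...)).
        print(f"'{name}' was selected but not explicitly named in --select; skipping preview")
    return kept

print(filter_show_nodes(["my_model", "tagged_model"], ["my_model"]))
```

The point of the warning is that a user running `dbt show -s tag:foo` gets an explanation for the silent no-op instead of an empty preview.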
stu-k
left a comment
One comment about using the new postflight decorator. Otherwise, after looking at the following, nothing stands out:
- Command was created in cli/main.py
- New params were created in cli/params.py
- Events in events/types.proto look correct
- Messages for events in events/types.py look correct
- Tests for both compile and show make sense

Will leave for someone else to approve.
@ChenyuLInx @jtcohen6 The issue I discussed during standup was due to some bad local configuration; recreating from scratch resolved it.
ChenyuLInx
left a comment
Two last things: one is that preview doesn't print all columns when there are many of them (see my last review comment); the other is the selection behavior mentioned below.
```python
matched_results = []


fire_event(CompileComplete())
for result in matched_results:
```
When just Examples: vs …
@Mathyoub I played around with this a little; unfortunately there isn't a way to do this without making …
ChenyuLInx
left a comment
We talked about this live; there's an issue where a seed node is still included in show, but we can't show it. A follow-up issue will be created.
```
$ dbt show --select +orders
19:52:09 Running with dbt=1.5.0-b5
19:52:10 Found 5 models, 20 tests, 0 snapshots, 0 analyses, 309 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics, 0 groups
19:52:10
19:52:10 Concurrency: 3 threads (target='dev')
19:52:10
Error: Traceback (most recent call last):
  File "/Users/chenyuli/git/dbt-core/core/dbt/cli/requires.py", line 100, in wrapper
    result, success = func(*args, **kwargs)
  File "/Users/chenyuli/git/dbt-core/core/dbt/cli/main.py", line 345, in show
    results = task.run()
  File "/Users/chenyuli/git/dbt-core/core/dbt/task/runnable.py", line 438, in run
    result = self.execute_with_hooks(selected_uids)
  File "/Users/chenyuli/git/dbt-core/core/dbt/task/runnable.py", line 401, in execute_with_hooks
    res = self.execute_nodes()
  File "/Users/chenyuli/git/dbt-core/core/dbt/task/runnable.py", line 350, in execute_nodes
    self.run_queue(pool)
  File "/Users/chenyuli/git/dbt-core/core/dbt/task/runnable.py", line 259, in run_queue
    self._raise_set_error()
  File "/Users/chenyuli/git/dbt-core/core/dbt/task/runnable.py", line 240, in _raise_set_error
    raise self._raise_next_tick
dbt.exceptions.DbtRuntimeError: Runtime Error
  Database Error in seed raw_payments (seeds/raw_payments.csv)
    can't execute an empty query
```
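One way the follow-up issue could be addressed is to filter out nodes with no compiled SQL before previewing, since a seed has nothing to execute. This is a minimal sketch with made-up names (nodes as plain dicts with a compiled_code key), not dbt's actual node objects:

```python
# Hypothetical guard: only preview nodes that actually have compiled SQL,
# so a seed (which compiles to nothing) never reaches the database as an
# empty query.
def previewable(nodes):
    return [n for n in nodes if n.get("compiled_code", "").strip()]

nodes = [
    {"name": "orders", "compiled_code": "select * from raw_orders"},
    {"name": "raw_payments", "compiled_code": ""},  # a seed: nothing to execute
]
print([n["name"] for n in previewable(nodes)])
```

A filtered-out node could additionally get a log line explaining why it was skipped, consistent with the selection-warning discussion earlier in the thread.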
resolves #7207
resolves #7179
resolves #6359

Description

Checklist
- changie new to create a changelog entry