[CT-2259] Avoid serialization error of Undefined with JSON-formatted logs #7108

@jtcohen6

Motivation: Commands should succeed or error consistently, regardless of the --log-format being used. This is a prerequisite for changing how dbt-core is deployed in dbt Cloud.

Let's say I, as a user, try to log a variable that I have not defined:

-- models/my_model.sql

{{ log("I am here", info=True) }}

{{ log(unknown_variable_name_here, info=True) }}

{{ log("I am now here", info=True) }}

select 1 as id

Rather than raising an exception, dbt renders unknown_variable_name_here to the Jinja type Undefined. This is fine on all versions with text-formatted logging. Unfortunately, on older versions run with JSON-formatted logging, it yields this error:

{
  "code": "Z028",
  "data": {
    "msg": "Object of type Undefined is not JSON serializable"
  },
  "invocation_id": "99db7ab4-cc63-4a0e-bb1f-30741bed654c",
  "level": "error",
  "log_version": 2,
  "msg": "\u001b[33mObject of type Undefined is not JSON serializable\u001b[0m",
  "pid": 41046,
  "thread_name": "MainThread",
  "ts": "2023-03-02T20:55:45.427322Z",
  "type": "log_line"
}

The good news is, this is handled gracefully in v1.4+, thanks to changes we made to log serialization:

{
  "data": {
    "msg": "",
    "node_info": {
      "materialized": "view",
      "meta": {},
      "node_finished_at": null,
      "node_name": "my_failing_model",
      "node_path": "my_failing_model.sql",
      "node_started_at": "2023-03-02T20:56:46.648597",
      "node_status": "compiling",
      "resource_type": "model",
      "unique_id": "model.test.my_failing_model"
    }
  },
  "info": {
    "category": "",
    "code": "M011",
    "extra": {},
    "invocation_id": "a2e17f17-5036-4b0a-848f-80fe2c1283dd",
    "level": "info",
    "msg": "",
    "name": "JinjaLogInfo",
    "pid": 41211,
    "thread": "Thread-1",
    "ts": "2023-03-02T20:56:46.653561Z"
  }
}

So we just need a fix for older versions under critical support (v1.1, v1.2, v1.3).

Proposed fix

It's a bit ugly, but it's also very narrow in scope. Let's update the log context method to handle the case where the user passes in an object that renders to type Undefined, and replace it with an empty string:

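        import jinja2

        # An undefined Jinja variable arrives here as jinja2.Undefined, which
        # the JSON logger cannot serialize; log an empty string instead.
        if isinstance(msg, jinja2.Undefined):
            msg = ""
        if info:
            fire_event(MacroEventInfo(msg=str(msg)))
        else:
            fire_event(MacroEventDebug(msg=str(msg)))
        return ""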

The effect is identical for text-formatted logs, and avoids serialization errors for JSON-formatted logs.
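The failure and the guard can be reproduced outside dbt. This is a minimal sketch, not dbt-core's code: `FakeUndefined` is a hypothetical stand-in for `jinja2.Undefined` (which likewise renders as an empty string), and `to_json_line` and `sanitize` are assumed, simplified helpers standing in for the JSON log serializer and the proposed check.

```python
import json


class FakeUndefined:
    """Hypothetical stand-in for jinja2.Undefined: renders as an empty string."""

    def __str__(self) -> str:
        return ""


def to_json_line(msg) -> str:
    # Assumed, simplified serializer: dumps the message object as-is.
    return json.dumps({"data": {"msg": msg}})


def sanitize(msg):
    # The narrow guard from the proposed fix: swap the undefined object for "".
    if isinstance(msg, FakeUndefined):
        msg = ""
    return msg


def unguarded_error() -> str:
    # Serializing the raw object raises TypeError; return its message.
    try:
        to_json_line(FakeUndefined())
    except TypeError as exc:
        return str(exc)
    return ""
```

Without the guard, `unguarded_error()` returns a message of the same shape as the Z028 error above; with it, `to_json_line(sanitize(FakeUndefined()))` serializes an empty `msg`, matching what text-formatted logs already print.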

(In general, we should really stringify whatever the user has passed us - this would also solve for #6568)
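As a rough illustration of that more general approach, the sketch below stringifies every message before it reaches the serializer. `log_payload` is a hypothetical helper, not a dbt-core function, and the payload shape is simplified.

```python
import json
from datetime import date


def log_payload(msg) -> str:
    # Stringify unconditionally so any user-supplied object can be serialized.
    return json.dumps({"msg": str(msg)})


# Non-string values no longer reach json.dumps as raw objects.
date_line = log_payload(date(2023, 3, 2))
none_line = log_payload(None)
```

The trade-off is that the log carries each object's string representation rather than a structured value.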

Acceptance criteria

  • Avoid serialization error for Undefined object being passed into {{ log() }} on v1.1, v1.2, v1.3
  • Add a test for this behavior, on affected versions and going forward

Labels: bug, logging
