@jackyhu-db jackyhu-db commented Jul 16, 2025

Motivation

Databricks adds a RunAsync option (default false) to all Thrift operations. When it is set to true, the operation runs asynchronously at the backend and its status can be polled by calling getStatus, which helps the Databricks backend better manage capacity and load from client requests. It also avoids unnecessary retries of a Thrift operation (e.g. TExecuteStatementReq) when the warehouse is stopped or unavailable. When RunAsync is false, the backend returns a 503 error and the client has to retry the operation until the warehouse is up, which generates many queries with 503 errors in the Databricks query history and consumes more resources. When RunAsync=true, the server returns 200 with query state PENDING and the client polls the status until the warehouse is up, which generates only one query in the query history.
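To illustrate the polling flow described above, here is a minimal sketch in C#. The `QueryState` enum, the `RunAsyncSketch` class, and the `getStatus` delegate are hypothetical stand-ins invented for this example; they are not the driver's or Thrift's actual types, and a real client would wait between polls.

```csharp
using System;

// Hypothetical query states; not the Thrift-generated enum.
enum QueryState { Pending, Running, Finished }

static class RunAsyncSketch
{
    // Calls getStatus until it reports a state other than Pending,
    // or until maxPolls calls have been made. Returns the number of calls.
    // A real client would sleep between calls; the delay is omitted here.
    public static int PollWhilePending(Func<QueryState> getStatus, int maxPolls)
    {
        int polls = 0;
        while (polls < maxPolls)
        {
            polls++;
            if (getStatus() != QueryState.Pending)
            {
                break;
            }
        }
        return polls;
    }
}
```

The point of the sketch is that a single submitted statement is followed by status polls only, rather than by repeated submissions of the statement itself.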

Change

  • Add a connection parameter adbc.databricks.enable_run_async_thrift, default false (it will be changed to true later)
  • Set RunAsync of TExecuteStatementReq from the above connection parameter (RunAsyncInThrift) in DatabricksStatement:SetStatementProperties
  • Fix a bug in BaseDatabricksReader by adding a null check on statement.DirectResults.ResultSet
  • Fix the test case TestVarcharExceptionDataDatabricks in StringValuesTest: when the error is in DirectResult.status (the case for RunAsync=true), the driver throws the error with DisplayMessage (see here) instead of Message, and DisplayMessage does not include some internal error info such as the backend exception class.
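The null check in the third bullet can be sketched as follows. The `ResultSet` and `DirectResults` classes below are simplified, hypothetical stand-ins that model only the two members involved, and `HasResultSetWithNoMoreRows` is a name made up for this example; the driver's actual types and surrounding logic may differ.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for the Thrift-generated types.
class ResultSet
{
    public bool HasMoreRows { get; set; }
}

class DirectResults
{
    // May be null, e.g. when the direct results carry no result set.
    public ResultSet? ResultSet { get; set; }
}

static class NullCheckSketch
{
    // True only when a result set is present and it reports no more rows.
    // The null-conditional access short-circuits to null when directResults
    // is null, so the right-hand operand is never evaluated in that case.
    public static bool HasResultSetWithNoMoreRows(DirectResults? directResults)
    {
        return directResults?.ResultSet != null && !directResults.ResultSet.HasMoreRows;
    }
}
```

With only a check on the outer object, a null `ResultSet` would cause a `NullReferenceException` when `HasMoreRows` is read.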

Test

  • Run all the E2E tests under csharp/test/Drivers/Databricks/E2E with Connection.RunAsyncInThrift both on and off

@github-actions github-actions bot added this to the ADBC Libraries 20 milestone Jul 16, 2025
@jackyhu-db
Contributor Author

@CurtHagenlocher can you add @jadewang-db as the reviewer before merging this.

@CurtHagenlocher
Contributor

> @CurtHagenlocher can you add @jadewang-db as the reviewer before merging this.

Only official Arrow committers can be added as reviewers, I'm afraid, but I can certainly wait until @jadewang-db signs off.

Contributor

@birschick-bq birschick-bq left a comment

Looks good. A few small suggestions.

@jackyhu-db jackyhu-db requested a review from birschick-bq July 16, 2025 22:30
  this.isLz4Compressed = isLz4Compressed;
  this.statement = statement;
- if (statement.DirectResults != null && !statement.DirectResults.ResultSet.HasMoreRows)
+ if (statement.DirectResults?.ResultSet? != null && !statement.DirectResults.ResultSet.HasMoreRows)
Contributor

Suggested change
- if (statement.DirectResults?.ResultSet? != null && !statement.DirectResults.ResultSet.HasMoreRows)
+ if (statement.DirectResults?.ResultSet != null && !statement.DirectResults.ResultSet.HasMoreRows)

Contributor

@birschick-bq birschick-bq left a comment

Thanks.

@CurtHagenlocher CurtHagenlocher merged commit 5338294 into apache:main Jul 16, 2025
6 checks passed
CurtHagenlocher pushed a commit that referenced this pull request Aug 5, 2025
…lt true (#3232)

## Motivation

In PR #3171, the `RunAsync` option
in `TExecuteStatementReq` was added and exposed via the connection parameter
`adbc.databricks.enable_run_async_thrift`, but it is not enabled by
default. It is turned on by default in other Databricks drivers, so we
should turn it on by default in ADBC as well.

## Change
- Set `DatabricksConnection:_runAsyncInThrift` default value to `true`

## Test
- PBI PQTest with all the test cases