-
Notifications
You must be signed in to change notification settings - Fork 163
feat(csharp/test/Drivers/Databricks): Enable RunAsync option in TExecuteStatementReq #3171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@CurtHagenlocher can you add @jadewang-db as the reviewer before merging this. |
Only official Arrow committers can be added as reviewers, I'm afraid, but I can certainly wait until @jadewang-db signs off. |
birschick-bq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks goods. A few small suggestions.
| this.isLz4Compressed = isLz4Compressed; | ||
| this.statement = statement; | ||
| if (statement.DirectResults != null && !statement.DirectResults.ResultSet.HasMoreRows) | ||
| if (statement.DirectResults?.ResultSet? != null && !statement.DirectResults.ResultSet.HasMoreRows) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (statement.DirectResults?.ResultSet? != null && !statement.DirectResults.ResultSet.HasMoreRows) | |
| if (statement.DirectResults?.ResultSet != null && !statement.DirectResults.ResultSet.HasMoreRows) |
birschick-bq
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
…lt true (#3232) ## Motivation In PR #3171, `RunAsync` option in `TExecuteStatementReq` was added and exposed via connection parameter `adbc.databricks.enable_run_async_thrift`, but it is not enabled by default. This is turned on by default in other Databricks drivers, we should turn in on by default in ADBC as well. ## Change - Set `DatabricksConnection:_runAsyncInThrift` default value to `true` ## Test - PBI PQTest with all the test cases
Motivation
Databricks adds
RunAsync(default is false) option for all the thrift operations, when it is set astrue, the operation runs async at the backend and the status can be polled by calling getStatus, it helps Databricks backend to better manage capacity and load on client requests. Furthermore, it can avoid the unnecessary retry on thrift operation (e.g. TExecuteStatementReq) when the warehouse is stopped or unavailable, the backend will returns 503 error whenRunAsyncis false and client has to retry on this thrift operation till the warehouse is up, this generates lots of queries with 503 errors in the Databricks query history and consume more resources. WhenRunAsync=true, server will return 200 with query statePENDING, the client will poll the status till the warehouse is up, this only generates one query in the query history.Change
adbc.databricks.enable_run_async_thrift, default isfalse(it will be changed totruelater)RunAsyncofTExecuteStatementReqwith above connection parameter (RunAsyncInThrift) inDatabricksStatement:SetStatementPropertiesBaseDatabricksReaderby addingnullcheck onstatement.DirectResults.ResultSetTestVarcharExceptionDataDatabricksinStringValuesTest: when error is in theDirectResult.status(case forRunAsync=true), it throw the error withDisplayMessage(see here) instead of theMessageandDisplayMessagedoes not include some internal error info such as error exception class at the backend.Test
csharp/test/Drivers/Databricks/E2EwhenConnection.RunAsyncInThriftis both on and off