Programming the Power BI Tabular Object Model (TOM) to connect to Databricks tables and automate the generation of a tabular model.
- .NET SDK 6.0
- Power BI Desktop (runs only on Windows)
- Azure Databricks workspace (Premium tier is recommended, as it makes connecting from Power BI Desktop easier, but not required; a Premium tier trial SKU can be used)
- A running Azure Databricks cluster with Databricks Runtime 9.1 LTS (⚠️ I've had problems with Runtime version 10.x)
- For using the online service: a Power BI Pro/Premium subscription in the same tenant as the Azure Databricks subscription.
This demo uses the Azure Databricks TPC-H sample dataset. The dataset is available out of the box in Databricks, both as Delta Parquet files and as tables in the Hive metastore (`samples` database).
The TPC-H dataset contains 8 tables related by foreign keys, with up to 30M rows per table in the sample data; however, the foreign key definitions themselves are not available in Databricks.
The program uses a file `datamodel.json` in a custom format, containing the definition of the tabular model (tables, columns, and relationships) and its mapping to Databricks tables and columns.
Note: In the TPC-H dataset, the relationship from lineitem to partsupp is a foreign key over two columns. As the tabular model only supports single-column relationships, only the first column (partkey) is used.
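To give an idea of the shape of such a file, here is a purely illustrative sketch; the field names below are hypothetical, and the actual schema is defined by the `datamodel.json` shipped with the repo.

```json
{
  "tables": [
    {
      "name": "partsupp",
      "sourceTable": "partsupp",
      "columns": [
        { "name": "ps_partkey", "sourceColumn": "ps_partkey", "dataType": "int64" },
        { "name": "ps_availqty", "sourceColumn": "ps_availqty", "dataType": "int64" }
      ]
    }
  ],
  "relationships": [
    {
      "fromTable": "partsupp", "fromColumn": "ps_partkey",
      "toTable": "part", "toColumn": "p_partkey"
    }
  ]
}
```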
The images below compare the partsupp table in Databricks and Power BI Desktop:
| Databricks | Power BI Desktop |
|---|---|
| ![]() | ![]() |
The code project (`GenerateDatabricksTOM`) uses the Analysis Management Objects (AMO) .NET Core client library to connect to a tabular server and modify the Tabular Object Model (TOM) definition. The same executable can be used:
- As a client to the Power BI Online Service XML for Analysis (XMLA) endpoint, modifying an online Power BI dataset.
- As a Power BI Desktop external tool, acting as a client to Power BI Desktop's embedded Analysis Services Tabular server.
The program reads a JSON file containing the detailed schema of the tabular model to be created: its tables and columns, their mapping to the source Databricks tables, and the relationships to be created between tables. It connects to the Tabular server and modifies the model according to the JSON file.
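In outline, the TOM manipulation looks like the sketch below, using the `Microsoft.AnalysisServices.Tabular` (AMO) library. The server, dataset, table, and column names are placeholders, and the M partition expression is illustrative rather than the exact query the project generates.

```csharp
using Microsoft.AnalysisServices.Tabular;

// Connect to a tabular server: a Power BI XMLA endpoint (powerbi://...)
// or Power BI Desktop's embedded instance (localhost:port).
var server = new Server();
server.Connect("Data Source=powerbi://api.powerbi.com/v1.0/myorg/MyWorkspace");
Model model = server.Databases.GetByName("MyDataset").Model;

// Define a table backed by an M (Power Query) partition reading from Databricks.
var table = new Table { Name = "partsupp" };
table.Columns.Add(new DataColumn
{
    Name = "ps_partkey",
    DataType = DataType.Int64,
    SourceColumn = "ps_partkey"
});
table.Partitions.Add(new Partition
{
    Name = "partsupp",
    Source = new MPartitionSource
    {
        // Illustrative placeholder for the Databricks connector query.
        Expression = "let Source = ... in Source"
    }
});
model.Tables.Add(table);

// TOM relationships are single-column, hence the limitation noted earlier.
// Assumes a "part" table has already been added to the model.
model.Relationships.Add(new SingleColumnRelationship
{
    Name = "partsupp_part",
    FromColumn = table.Columns["ps_partkey"],
    FromCardinality = RelationshipEndCardinality.Many,
    ToColumn = model.Tables["part"].Columns["p_partkey"],
    ToCardinality = RelationshipEndCardinality.One
});

// Push the modified model definition to the server.
model.SaveChanges();
```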
In this approach, the program modifies a Dataset in the Power BI Online Service, replacing its tabular object model definition.
- A Power BI workspace must be created with the XMLA endpoint set up for read/write.
⚠️ Note that "Connecting to a My Workspace by using the XMLA endpoint is currently not supported" as of the time of writing.- Create a Power BI workspace and follow the XMLA endpoint setup instructions to enable the endpoint and obtain the endpoint URL (starting with
powerbi://).
- A dataset must be uploaded first for the program to modify, since the XMLA endpoint does not allow creating new datasets. The XMLA endpoint also does not allow setting data source credentials, so these must be configured manually.
- In Power BI Desktop, connect to your Databricks workspace and connect to any table from the `samples` database. Which data is used is unimportant, since we will overwrite it with our program.
- Publish the report to the workspace you have configured with the XMLA endpoint. Browse the workspace in the online service to confirm that a report and a dataset with the same name have been created.
Run the program. In Visual Studio, use command line arguments similar to:
ENDPOINT DATASET ../../../datamodel.json adb-0000000000.0.azuredatabricks.net sql/protocolv1/o/00000/000-0000-000
Replace `ENDPOINT` with your workspace's XMLA endpoint (starting with `powerbi://`), and `DATASET` with your published dataset name. Adapt the Databricks connection strings based on your Databricks cluster settings: Configuration → Advanced options → JDBC/ODBC (`Server Hostname` and `HTTP Path`).
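For example, assuming a workspace named MyWorkspace and a published dataset named MyDataset (both hypothetical names):

```
powerbi://api.powerbi.com/v1.0/myorg/MyWorkspace MyDataset ../../../datamodel.json adb-0000000000.0.azuredatabricks.net sql/protocolv1/o/00000/000-0000-000
```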
After the program completes successfully, connect to the dataset using either the online service or Power BI Desktop.
In the online service, navigate to your Dataset. Note that the pane on the right shows the tables that have been created.
Use the option Create a report → Start from scratch to create a report interactively.
In Power BI Desktop, connect to your Dataset and create a report.
⚠️ With either approach, the user cannot modify the queries, table definitions or relationships. They are therefore limited to consuming the tabular model as created by the program.
This approach is usable for quickly viewing the data, but the user cannot transform the data to reach additional insights.
Our executable can run as a Power BI Desktop external tool, acting as a client to Power BI Desktop's embedded Analysis Services Tabular server. The server (localhost port) and database are passed by Power BI Desktop when the tool is run.
In this simple demonstrator, the Databricks server connection strings must be hardcoded in the external tool definition file.
Compile the program.
dotnet build
Update the `modelgen.pbitool.json`:
- Adapt the paths to your installation path (replacing the `C:\\path\\to` strings).
- Use the paths in your Databricks cluster settings: Configuration → Advanced options → JDBC/ODBC (`Server Hostname` and `HTTP Path`) to update the value in the `arguments` string.
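For reference, `.pbitool.json` files follow the external tool definition format documented by Microsoft; the sketch below is illustrative (the actual file ships with the repo). The `%server%` and `%database%` tokens are substituted by Power BI Desktop with the address of its embedded tabular server and the model database name.

```json
{
  "version": "1.0.0",
  "name": "Model Generator",
  "description": "Generate a tabular model from datamodel.json",
  "path": "C:\\path\\to\\GenerateDatabricksTOM.exe",
  "arguments": "%server% %database% C:\\path\\to\\datamodel.json adb-0000000000.0.azuredatabricks.net sql/protocolv1/o/00000/000-0000-000",
  "iconData": "image/png;base64,..."
}
```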
Copy the file to your `C:\Program Files (x86)\Common Files\Microsoft Shared\Power BI Desktop\External Tools` directory.
Restart Power BI Desktop if it's already open.
In Power BI Desktop:
- Click Home → Enter Data and enter some random data, then click Load. This step serves no purpose, but is required, as otherwise the next step fails with an error.
- Click External Tools → Model Generator. This populates the model and relationships from `datamodel.json`.
- When prompted for Databricks credentials, select Azure Active Directory.
- You can now use the data in report visuals.
⚠️ Can't populate an empty model (error `unknown variable or function Partition_guid`; see workaround with Home → Enter Data above)
⚠️ Can't save the resulting file (the banner `There are pending changes in your queries that haven't been applied` appears, and clicking `Apply` reverts the model updates)
This approach is usable for quick experimentation, but not for saving or sharing the results, since the resulting file cannot be saved or uploaded.







