Introduction
In FME 2024.0, the Databricks Reader Database Connection introduced the option to authenticate using OAuth. This article explains, step by step, how to configure OAuth authentication for your Azure Databricks Workspace so that you can use this authentication type.
To configure OAuth authentication for Azure Databricks using FME, you need to perform several steps. These include creating an app registration in the Azure Portal, setting up a service principal in your Databricks Workspace, and enabling permissions for the service principal.
For instructions on how to configure your Databricks Reader connection with the OAuth authentication type, please refer to the Databricks: Add Database Connection documentation.
Step-by-step Instructions
Creating an App Registration in Azure Portal
- Go to the App registration page in your Azure Portal
- Register a new application in App registrations
- Set the account type to Accounts in this organizational directory only (Single tenant)
- Set the Redirect URI to a Web Application
- Set the URL to http://localhost
- Click Register
- Once the application is created, make a note of the Application (Client) ID. You’ll need this later
- Go to the Authentication tab (on the left side menu)
- Under Implicit grant and hybrid flows, check the “Access tokens (used for implicit flows)” box and click Save
- Next, go to the API Permissions tab
- Click Add a permission
- Click on the APIs my organization uses tab
- Type “AzureDatabricks” in the search box
- Select user_impersonation and then select Add permissions
- Go to the Certificates & secrets tab
- Create a new Client Secret and save the client secret (value) somewhere safe. You can only copy this secret once. After that, it will be hidden and will have to be recreated if lost
- Go back to the Overview tab. Click on Endpoints to pull up the list of endpoints for your organization. Keep a copy of the OAuth 2.0 token endpoint (v2). This will be used for the Token Endpoint URL in the connection in FME.
- Your Azure App registration is now complete. You should have values for the following parameters after completing the previous steps.
- Token Endpoint URL
- Client ID
- Client Secret
- Scope - For Microsoft Azure-hosted Databricks clusters, the scope value must be set to:
2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default
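To see how these four values fit together, here is a minimal Python sketch that builds a token request from them. It assumes the OAuth 2.0 client credentials grant, which this article does not state explicitly; the function name `build_token_request` is hypothetical and is not part of FME. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope value for Azure-hosted Databricks, as given in this article.
AZURE_DATABRICKS_SCOPE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"


def build_token_request(token_endpoint_url, client_id, client_secret,
                        scope=AZURE_DATABRICKS_SCOPE):
    """Build (but do not send) a form-encoded token request.

    Assumption: the client credentials grant is used. FME may send a
    different request internally.
    """
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    return Request(
        token_endpoint_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. A successful OAuth 2.0 token response is a JSON document with an `access_token` field, which can be a quick way to confirm the App registration values before entering them in FME.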
Adding a Service Principal to the Databricks Workspace
Now that the App registration has been configured in the Azure Portal, Databricks needs to know which App registration to take permissions from in order to perform OAuth authentication. To do so, configure a service principal in your Databricks Workspace that points to the App registration.
- Go to your Databricks workspace/instance in a web browser
- Click on the User Icon in the top right corner and select Settings
- Go to Identity and access
- Under Service principals, click Manage
- Click Add service principal
- Select Add new
- Set Management to Microsoft Entra ID managed
- Enter the Application/Client ID from your Azure App registration
- Set any value for the Service principal name
- Click Add
You’ve now successfully linked your App registration to your Databricks workspace. For more information on Service Principals in Databricks (Azure), take a look at the Microsoft documentation.
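If you prefer to script this step rather than use the Settings page, the following Python sketch shows the request body such a call might carry. It assumes the Databricks workspace SCIM endpoint for service principals (`/api/2.0/preview/scim/v2/ServicePrincipals`); check the path and fields against the current Databricks REST API reference before relying on them. The helper names are hypothetical, and the workspace host passed in is a placeholder.

```python
import json

# SCIM schema identifier for service principals (assumed from the
# Databricks SCIM API; verify against the current API reference).
SCIM_SP_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"


def service_principal_payload(application_id, display_name):
    """Body for creating a service principal linked to an App registration.

    application_id is the Application (Client) ID from the Azure Portal.
    """
    return {
        "schemas": [SCIM_SP_SCHEMA],
        "applicationId": application_id,
        "displayName": display_name,
    }


def scim_service_principals_url(workspace_host):
    """URL to POST the payload to; workspace_host is a placeholder."""
    return f"https://{workspace_host}/api/2.0/preview/scim/v2/ServicePrincipals"


# Serialize the body the way it would be sent in a POST request.
body = json.dumps(service_principal_payload("<application-id>", "fme-reader"))
```

The POST itself would need an `Authorization: Bearer` header carrying a token from a workspace admin, which is outside the scope of this sketch.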
Configuring Service Principal Permissions in Databricks
Once the service principal has been added to the Databricks Workspace, permissions may need to be set at the Cluster & Catalog level to allow the service principal to gain access to data stored in Catalogs.
Configuring the Service Principal Role
The User role needs to be provided to the service principal explicitly in order for the service principal to have permissions to perform functions in Databricks.
- In the Service Principals page, select the Service Principal you just created
- Go to the Permissions Tab
- Click on the three dots next to the default principal and select Edit
- Under Role, click the drop-down menu and add Service principal: User
- Click Save to save these permissions.
Configuring Catalog Permissions for the Service Principal
- Next, go to your Catalog Explorer and navigate to the catalog you want to access via the Databricks Reader
- Go to the Permissions tab
- Click Grant, type in the name of the service principal you added, and select ALL PRIVILEGES
- Once added, you may need to wait a few minutes for the permissions to apply
- The Cluster you are connecting with will also need to be set to Unrestricted Policy with Shared Access Mode. It’s recommended that you use the latest Databricks runtime version.
- You are now set up to use OAuth authentication with Databricks (Azure)
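The catalog grant above can also be expressed as a SQL statement and run from a Databricks notebook or SQL editor. The sketch below only builds the statement text. It assumes Unity Catalog, where a service principal is referenced by its application ID in backticks; the function name is hypothetical.

```python
def grant_all_privileges_sql(catalog, application_id):
    """Return a GRANT statement giving a service principal full catalog access.

    Assumption: Unity Catalog identifies the service principal by its
    Application (Client) ID, quoted with backticks.
    """
    def quote(identifier):
        # Escape embedded backticks by doubling them.
        return "`" + identifier.replace("`", "``") + "`"

    return (
        f"GRANT ALL PRIVILEGES ON CATALOG {quote(catalog)} "
        f"TO {quote(application_id)};"
    )
```

ALL PRIVILEGES mirrors the step above. If the Reader only needs to read data, a narrower grant may be enough, but which privileges FME requires is not covered in this article.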
Additional Requirements
Below are some of the additional requirements when using OAuth authentication with the Databricks Reader in FME.
- FME Version Requirement - You must be using FME 2024.0 or greater to access the OAuth authentication type for the Databricks Reader Database Connection.
- Databricks Cluster Requirement - Your cluster must be running Databricks Runtime 13.3 LTS or above.
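As a quick way to check the cluster requirement, the sketch below compares a runtime version string against the 13.3 minimum. It assumes the version is reported in a form such as `13.3.x-scala2.12`; that format and the function name are assumptions made for illustration.

```python
import re

# Minimum Databricks Runtime (major, minor) stated in this article.
MIN_RUNTIME = (13, 3)


def runtime_meets_minimum(version_string, minimum=MIN_RUNTIME):
    """Return True if the leading major.minor version is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized runtime version: {version_string!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuples compare element by element, so (14, 0) >= (13, 3) is True.
    return (major, minor) >= minimum
```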
Creating a Databricks Database Connection in FME using Azure OAuth
These instructions describe how to create a Databricks Reader Database Connection in FME using Azure OAuth authentication. They assume that you have already completed the previous steps in this article.
- Add a Reader to your workspace
- Set the Format to Databricks (JDBC)
- For Connection, click the drop-down and select Add Database Connection…
- In the window that opens, set the Server Hostname & HTTP Path based on the Cluster you want to connect with.
On the Cluster Details page in your Databricks workspace, scroll down to the bottom and expand the Advanced Options. Click on the JDBC/ODBC tab and you will be presented with the parameters required to create a Databricks database connection. See the screenshot below for the corresponding parameters & where to find their values.
- Change the Authentication type to OAuth
- Click the drop-down next to Databricks OAuth Connection and select Add Web Connection…
- In the window that opens, enter the following values. If you are missing values for any of these parameters, please refer back to the Creating an App Registration in Azure Portal section of this article.
- Token Endpoint URL: <your endpoint copied from the Azure App registration>
- Client ID: <your Client ID from the Azure App registration>
- Client Secret: <your Client Secret from the Azure App registration>
- Scope: 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default
Note: This scope value represents the programmatic ID of Databricks on Azure.
- Press OK to accept the parameters and close the Databricks OAuth Connection window.
- Enter the name of the Catalog you want to read from. You can click the ellipsis (...) to select a Catalog as well.
- Press Test… to test the connection. After successfully testing the connection, select Save.
You have now created a valid Databricks Reader Database Connection using OAuth Authentication. If you have trouble authenticating via OAuth, check that you have set the appropriate permissions on the Service Principal and that you have met the Additional Requirements.
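One more thing to check when authentication fails is the Token Endpoint URL itself. Readers of this article have reported copying the authorization endpoint (ending in `/authorize`) instead of the token endpoint (ending in `/token`). The following sketch flags that mistake. The function name and the exact checks are illustrative assumptions, not an FME feature.

```python
from urllib.parse import urlparse


def check_token_endpoint(url):
    """Return a list of likely problems with a Token Endpoint URL.

    An empty list means no obvious problem was found. It does not
    guarantee that the endpoint is valid for your tenant.
    """
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("URL should use https")
    path = parsed.path.rstrip("/")
    if path.endswith("/authorize"):
        problems.append(
            "this is the authorization endpoint; use the one ending in /token"
        )
    elif not path.endswith("/oauth2/v2.0/token"):
        problems.append("path should end in /oauth2/v2.0/token")
    return problems
```

The identifier in the URL path is your tenant ID, not your subscription ID. This sketch cannot tell the two apart, so confirm the value on the Endpoints page of your App registration.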
Comments
We struggled a bit with this one. I don't have permissions to use Azure Portal to find the following:
“Token Endpoint URL: <your endpoint copied from the Azure App registration>”
so I just assumed that our Token Endpoint would be:
https://login.microsoftonline.com/<subscription#>/oauth2/v2.0/authorize like in the example above. It did not work, and we actually had to use the following instead: https://login.microsoftonline.com/<subscription#>/oauth2/v2.0/token
Hi kjetilpettersso, thanks for notifying us of the error. I've changed the wording & screenshots to direct users to use the token endpoint.
I also did some additional testing and found that some other permissions might need to be provided for the service principal. This has been added under the “Configuring Service Principal Permissions in Databricks” section.
- Dan M
I noticed that my post contains an error: the Token endpoint URL should not include your Subscription ID but rather your home Tenant ID. I suggest updating your KB accordingly.