Connect Databricks Delta Lake

Connect your Databricks Delta Lake source to Bigeye.

1. Allow Bigeye's IP address

If you've set up network policies that restrict which IP addresses can communicate with your Databricks Delta Lake instance, modify these policies to allow the Bigeye IP address.

Bigeye makes calls to your warehouse from the following static IP:

35.163.65.120
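If the restriction is enforced through Databricks workspace IP access lists (an assumption; your policy may instead live in a cloud firewall or security group), the allow-list entry can be sketched as a request body for the Databricks IP access lists REST API. The function name and the "bigeye" label are illustrative, and nothing is sent over the network.

```python
import ipaddress
import json

BIGEYE_STATIC_IP = "35.163.65.120"  # the static IP listed above


def build_allow_list_payload(label="bigeye", ips=(BIGEYE_STATIC_IP,)):
    """Build a JSON body for POST /api/2.0/ip-access-lists.

    The target endpoint is an assumption about how your workspace
    restricts traffic; adapt the payload if you use another mechanism.
    """
    # Validate each address so a typo fails here rather than at the API.
    checked = [str(ipaddress.ip_address(ip)) for ip in ips]
    return {"label": label, "list_type": "ALLOW", "ip_addresses": checked}


if __name__ == "__main__":
    print(json.dumps(build_allow_list_payload(), indent=2))
```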

2. Give Bigeye access to Databricks

2.1 Generate an access token in the Databricks UI

An access token is needed for Bigeye to connect to Databricks. We recommend that you create a service principal and generate an access token for that service principal.

  1. Create a service principal in the Databricks Account Console by going to Admin Settings -> Identity and Access -> Manage Service Principals -> Add service principal. Note the service principal's application ID.
  2. Give access token permissions to the service principal by following these instructions from Databricks.
  3. Generate an access token for the service principal by following these instructions from Databricks.
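For teams that prefer to script the last step, here is a minimal sketch that prepares a request to the Databricks token management API to create a token on behalf of the service principal. It assumes that endpoint is enabled for your workspace (availability can vary by cloud and configuration) and that you hold a workspace admin token. The function name, the 90-day lifetime, and the comment text are illustrative choices, not Bigeye requirements.

```python
import json
import urllib.request


def build_token_request(workspace_url, admin_token, application_id,
                        lifetime_seconds=90 * 24 * 3600, comment="bigeye"):
    """Prepare (but do not send) an on-behalf-of token request."""
    url = (workspace_url.rstrip("/")
           + "/api/2.0/token-management/on-behalf-of/tokens")
    body = {
        "application_id": application_id,      # noted when the principal was created
        "lifetime_seconds": lifetime_seconds,  # illustrative default: 90 days
        "comment": comment,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the prepared request to urllib.request.urlopen would send it. Treat the response as a secret, since it is expected to contain the new token value.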

2.2 Grant permissions to Bigeye's service principal

2.2.1 Unity Catalog

To grant access to all existing and future tables within a catalog:

GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<application_id>`;
GRANT USE_SCHEMA ON CATALOG <catalog_name> TO `<application_id>`;
GRANT SELECT ON CATALOG <catalog_name> TO `<application_id>`;

To grant access to specific tables:

GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<application_id>`;
GRANT USE_SCHEMA ON SCHEMA <catalog_name>.<schema_name> TO `<application_id>`;
GRANT SELECT ON TABLE <catalog_name>.<schema_name>.<table_name> TO `<application_id>`;
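When many tables are involved, writing these statements by hand gets error-prone. The sketch below generates the same three kinds of GRANT statements for a list of tables. The helper name and the example identifiers are hypothetical, and it assumes plain identifiers that need no quoting.

```python
def table_grants(application_id, tables):
    """Return GRANT statements giving read access to specific tables.

    `tables` is an iterable of (catalog, schema, table) tuples.
    Assumes identifiers are simple names that do not need backticks.
    """
    principal = f"`{application_id}`"
    statements = []
    seen = set()

    def add(statement):
        # Catalog- and schema-level grants repeat across tables; emit each once.
        if statement not in seen:
            seen.add(statement)
            statements.append(statement)

    for catalog, schema, table in tables:
        add(f"GRANT USE_CATALOG ON CATALOG {catalog} TO {principal};")
        add(f"GRANT USE_SCHEMA ON SCHEMA {catalog}.{schema} TO {principal};")
        add(f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {principal};")
    return statements


if __name__ == "__main__":
    # Hypothetical application ID and table names, for illustration only.
    for line in table_grants("1234-abcd", [("main", "sales", "orders"),
                                           ("main", "sales", "customers")]):
        print(line)
```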

2.2.2 Hive Metastore

GRANT READ_METADATA, USAGE, SELECT ON CATALOG <catalog_name> TO `<application_id>`;

3. Gather connection details from Delta Lake

Log in to your Delta Lake account.

  1. Click Compute in the sidebar.
  2. Choose a cluster to connect to.
  3. Navigate to Advanced Options.
  4. Click the JDBC/ODBC tab.
  5. Copy the Server Hostname, Port, and HTTP Path values.
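Before entering these values in Bigeye, it can help to sanity-check them. The sketch below normalizes the three copied values and, optionally, runs a trivial query with the access token. The names ConnectionDetails, normalize, and check_connection are hypothetical, and check_connection assumes the databricks-sql-connector package is installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionDetails:
    host: str
    port: int
    http_path: str


def normalize(host, port, http_path):
    """Clean up values copied from the JDBC/ODBC tab."""
    host = host.strip()
    # The Host field wants a bare hostname, so drop a pasted URL scheme.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    http_path = http_path.strip()
    if not host or not http_path:
        raise ValueError("hostname and HTTP path must not be empty")
    return ConnectionDetails(host, port, http_path)


def check_connection(details, access_token):
    """Run SELECT 1 against the cluster (needs databricks-sql-connector)."""
    from databricks import sql  # imported lazily; optional dependency

    # The connector's default port is used here; details.port is not passed.
    with sql.connect(server_hostname=details.host,
                     http_path=details.http_path,
                     access_token=access_token) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchall()
```

If check_connection succeeds with the service principal's token, the same hostname, HTTP path, and token should be usable in the Bigeye source configuration.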

4. Add Delta Lake as a data source in Bigeye

Log in to your Bigeye account.

On the Catalog page, click Add Source and then select Databricks from the Choose a data source section. Click Next to configure the connection to your database.

On the Configure source modal that opens, enter the following details:

Field Name    Description
Vendor        Databricks Delta Lake
Name          The identifying name of the data source in Bigeye.
Host          The hostname from step 3.
Port          The port from step 3.
Username      The HTTP path from step 3.
Password      The access token for the service principal from step 2.1.3.

5. Next Steps

After you've configured the source, Bigeye loads and profiles your tables. It can take up to 24 hours for the profiling to complete and your autometrics and autothresholds to populate. See how to deploy autometrics in the Getting Started guide.

