Mount ADLS Gen2 to a cluster using a service principal and OAuth 2.0

Create service principal
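
A minimal sketch of this step with the Azure CLI, assuming you are already logged in with `az login`; the display name `databricks-sp` is an example, not a required value:

```shell
# Register an Azure AD application and create a service principal for it.
az ad sp create-for-rbac --name databricks-sp
# The JSON output contains:
#   appId    -> use as <application-id>
#   password -> the client secret; store it in a Databricks secret scope
#   tenant   -> use as <directory-id>
```

Store the `password` in a secret scope rather than pasting it into a notebook.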

Role assignment to service principal
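
A sketch of the role assignment with the Azure CLI, assuming the service principal from the previous step; the angle-bracket values come from your own subscription. Storage Blob Data Contributor is the built-in role that grants read/write data-plane access on the storage account:

```shell
# Grant the service principal data access on the storage account.
az role assignment create \
  --assignee <application-id> \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
```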

Mount data lake storage to Databricks Cluster

The scripts below are based on a Python notebook.

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "<application-id>",
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>",key="<service-credential-key-name>"),
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-id>/oauth2/token"}

# Optionally, you can add <directory-name> to the source URI of your mount point.
dbutils.fs.mount(
  source = "abfss://<file-system-name>@<storage-account-name>.dfs.core.windows.net/",
  mount_point = "/mnt/<mount-name>",
  extra_configs = configs)

Out[53]: True
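
The configuration dictionary above follows a fixed pattern, so it can be assembled with a small helper. The function name and arguments here are illustrative, not part of the Databricks API; in a real notebook the secret would still come from `dbutils.secrets.get`:

```python
def build_oauth_configs(application_id: str, client_secret: str, directory_id: str) -> dict:
    """Build the extra_configs dict for an ABFS OAuth mount.

    client_secret should be fetched from a secret scope, e.g.
    dbutils.secrets.get(scope="<scope-name>", key="<service-credential-key-name>").
    """
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": application_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        # The token endpoint is always the tenant-specific v1.0 OAuth endpoint.
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }

configs = build_oauth_configs("<application-id>", "<client-secret>", "<directory-id>")
print(configs["fs.azure.account.oauth2.client.endpoint"])
# → https://login.microsoftonline.com/<directory-id>/oauth2/token
```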

dbutils.fs.ls('/mnt/data/')

Out[56]: [FileInfo(path='dbfs:/mnt/data/Customer.csv', name='Customer.csv', size=196514)]

%fs ls abfss://rawdata@databricksdatastorage.dfs.core.windows.net

Appendix

Azure Data Lake Storage Gen2
