Azure Databricks Security Best Practices and Threat Model
1. Introduction
Databricks has worked with thousands of customers to securely deploy the Databricks platform with the appropriate security features to meet their architecture requirements. While many organizations deploy security differently, there are guidelines and features that are commonly used by organizations that need a high level of security.
This document details the security features typically deployed by most high-security organizations, and reviews the largest risks and the risks that customers ask about most often. It then provides a security configuration reference linked to our documentation, and closes with a collection of recommended resources.
This document is focused on the Azure Databricks platform and assumes the use of the Premium tier. It also contains
some Terraform examples.
The phrase hybrid PaaS applies because most customers deploy a data plane (virtual network and compute) in a cloud service provider account that is owned by the customer, and so is single-tenant, while the multi-tenant control plane runs within a Microsoft-managed account. Customers get the benefits of a PaaS platform with the option to keep their data processing clusters within their own environment.
The phrase general-purpose data-agnostic means that unlike a pure SaaS service, Databricks doesn’t know what data
your teams process with the Azure Databricks platform. The actual code, business logic, and the datasets are provided by
your teams. You won’t find recommendations like “truncate user IDs” because we don’t know what data you’re
analyzing.
If you’re new to the Azure Databricks platform, start with an overview of the architecture and a review of common security questions before moving into specific recommendations. You’ll find both in our Security and Trust Center and in the security and trust overview whitepaper.
Importantly, these are recommendations based on the configurations we see from our customers, and following them doesn’t guarantee that you will be “secure.” Please review them in the context of your overall enterprise security to determine what is required to secure your deployment and your data.
Most deployments
The following typical configurations are part of most enterprise production Azure Databricks deployments. If you are a small data science team of a few people, you may not need to deploy all of these. If Databricks is likely to become a key part of your business, or if you are analyzing sensitive data, we recommend that you review them all.
● Ensure that all workspaces have secure cluster connectivity (No Public IP / NPIP) enabled
● Evaluate whether multiple workspaces are required for segmentation
● Check that your storage accounts are encrypted and that public access is blocked
● Deploy Azure Databricks into your own Azure virtual network (VNet injection) for increased control over the network environment; even if you do not need this now, this option increases the chances of future success with your initial workspace (a Terraform sketch follows this list)
● Use IP access lists
● Use multi-factor authentication
● Separate accounts with admin privileges from day-to-day user accounts
● Run production workloads with service principals
● Configure Azure Databricks diagnostic log delivery
● Configure maximum token lifetimes for future tokens using token management
● Configure admin console settings according to your organization’s needs
● Use Unity Catalog to provide fine-grained access control and centralized governance controls
● Educate users to avoid storing production datasets in DBFS
● Back up the notebooks stored in the control plane, or store your notebooks in Git repos
● Store and use secrets securely in Databricks or using a third-party service
● Consider whether to implement network protections for data exfiltration
● Restart clusters on a regular schedule so that the latest patches are applied
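To make the networking items above concrete, here is a minimal Terraform sketch of a Premium-tier workspace with VNet injection and secure cluster connectivity. All names and address ranges are illustrative, and the resource group, virtual network, container ("private") subnet, and NSG associations are assumed to be defined elsewhere in your configuration.

    # Host subnet delegated to Azure Databricks; a matching container
    # ("private") subnet with the same delegation is assumed elsewhere.
    resource "azurerm_subnet" "public" {
      name                 = "snet-databricks-host" # placeholder
      resource_group_name  = azurerm_resource_group.dbx.name
      virtual_network_name = azurerm_virtual_network.dbx.name
      address_prefixes     = ["10.10.1.0/24"]

      delegation {
        name = "databricks"
        service_delegation {
          name = "Microsoft.Databricks/workspaces"
          actions = [
            "Microsoft.Network/virtualNetworks/subnets/join/action",
            "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
            "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
          ]
        }
      }
    }

    resource "azurerm_databricks_workspace" "this" {
      name                = "dbx-prod" # placeholder
      resource_group_name = azurerm_resource_group.dbx.name
      location            = azurerm_resource_group.dbx.location
      sku                 = "premium" # this guide assumes the Premium tier

      custom_parameters {
        no_public_ip        = true # secure cluster connectivity (NPIP)
        virtual_network_id  = azurerm_virtual_network.dbx.id
        public_subnet_name  = azurerm_subnet.public.name
        private_subnet_name = azurerm_subnet.private.name
        public_subnet_network_security_group_association_id  = azurerm_subnet_network_security_group_association.public.id
        private_subnet_network_security_group_association_id = azurerm_subnet_network_security_group_association.private.id
      }
    }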
Highly-secure deployments
In addition to the configurations typical to all deployments, the following configurations are often used in highly-secure
Azure Databricks deployments. While these are common, not all highly-secure environments use all of these settings. We
recommend incorporating these items and the threat model in the following section alongside your existing security
practices.
● Evaluate whether customer-managed encryption keys are needed for the control plane, data plane storage, and data plane disks, for control over data at rest (requires Premium tier; a Terraform sketch follows this list)
● Keep users and groups up to date using SCIM
● Use IP access lists and/or front-end Private Link
● Configure back-end (data plane to control plane) Private Link connectivity
● Implement network protections for data exfiltration
● Use Azure Active Directory tokens for remote authentication
● Evaluate whether your datasets require blob versioning, soft deletes, and other data protection strategies
● Evaluate whether your workflow requires using Git repos or CI/CD
● Plan for and deploy a disaster recovery site if you have strong continuity requirements
● Encourage the use of clusters that support user isolation
● Configure cluster policies to enforce data access patterns and control costs
● Evaluate tagging to monitor and manage chargeback and cost control
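For the customer-managed key item, the sketch below shows the relevant arguments on the workspace resource, as exposed by recent versions of the azurerm provider. It assumes the Key Vault keys already exist and that the necessary Key Vault access policies have been granted; the root DBFS key itself is applied with a separate resource (azurerm_databricks_workspace_root_dbfs_customer_managed_key).

    # Customer-managed keys for managed services (control plane data such as
    # notebooks) and for the managed disks attached to clusters.
    resource "azurerm_databricks_workspace" "secure" {
      name                = "dbx-secure" # placeholder
      resource_group_name = azurerm_resource_group.dbx.name
      location            = azurerm_resource_group.dbx.location
      sku                 = "premium"

      customer_managed_key_enabled          = true # prepare CMK for workspace (DBFS root) storage
      managed_services_cmk_key_vault_key_id = azurerm_key_vault_key.managed_services.id
      managed_disk_cmk_key_vault_key_id     = azurerm_key_vault_key.managed_disks.id
    }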
This section addresses common questions about these risks, discusses probabilities, and provides mitigation strategies.

Account takeover

Probability
Without proper protections, account takeover would be an attractive strategy for an attacker. Fortunately, it is easy to apply protection strategies that dramatically reduce the risk.
Mitigation strategies
1. Strongly authenticate users. The best defense against account takeover is strong authentication. Azure Databricks recommends the following best practices:
● Leverage multi-factor authentication (MFA) with Azure AD Conditional Access
2. Implement automation that removes or disables employee accounts when people leave your company:
● Use SCIM for user de-provisioning and group management
3. Restrict network access. Like other SaaS or PaaS services, Databricks does not require that users log in from a specific network (such as your office or VPN) unless you enable that configuration. If an account were compromised, unrestricted network access would make it easier for an attacker to reach Databricks. Mitigate this risk via the following steps:
● Ensure that all workspaces have secure cluster connectivity (No Public IP / NPIP) enabled
● Use Azure AD Conditional Access
● Use IP access lists (a Terraform sketch follows this list)
● Configure private network connectivity between users and the control plane
4. Monitor user activities to detect anomalies (such as logins at unusual times, or simultaneous remote logins):
● Configure Azure Databricks diagnostic log delivery
5. Manage personal access tokens for REST API authentication. A personal access token is a proxy for the user who generated it, carrying that user's full privileges. Tokens should be controlled and protected as closely as you would protect a user’s credentials. This is possible with:
● Token management (recommended for most deployments)
● Azure AD tokens (recommended for highly-secure deployments)
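As a sketch of the IP access list mitigation above, using the Databricks Terraform provider (the label and CIDR range are placeholders):

    # IP access lists must be enabled at the workspace level before any
    # individual list takes effect.
    resource "databricks_workspace_conf" "ip_acl" {
      custom_config = {
        "enableIpAccessLists" = "true"
      }
    }

    resource "databricks_ip_access_list" "corp" {
      label        = "corporate-egress" # placeholder
      list_type    = "ALLOW"
      ip_addresses = ["203.0.113.0/24"] # placeholder: your office/VPN ranges
      depends_on   = [databricks_workspace_conf.ip_acl]
    }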
Data exfiltration
Risk description
If a valid user or an attacker can log into the environment, a common action on objective is to exfiltrate sensitive data
from the environment, after which the attacker might store it, sell it, or ransom it.
Probability
Generally, the probability of this attack category is low because it presumes either a malicious insider or account
compromise. However, if there is a malicious insider or compromised account, attackers would likely attempt to exfiltrate
data.
Mitigation strategies
1. Network protections
Ensure that all workspaces have Secure Cluster Connectivity (No Public IP / NPIP) Enabled
Implement network exfiltration protections to limit where data can be sent from Azure Databricks
2. Control data access
Avoid storing production data in DBFS as it is accessible via API and CLI
Use Private Link and Managed Identities to access your data and prevent all other access with storage
firewalls
3. Use data exfiltration settings within the Admin Console to prevent simple methods for exfiltration
Accidental data exposure or deletion

Probability

There is significant variability in the likelihood and the impact of individual errors in this category, but most security professionals identify this as a significant potential risk.
Mitigation strategies
1. Back up data and code
● Enable blob versioning, soft deletes, and other data recovery strategies
● Back up the notebooks in your environment
2. Use software development lifecycle (SDLC) processes to control what code is executed
● Store production code in Git, which you can access via the Databricks Repos feature
● Use a CI/CD process that pushes only authorized code to production
3. Make sure your users have only the access they need
● Use SCIM for user de-provisioning and group management
● Monitor Azure Databricks diagnostic logs to identify which users are logging in and what types of clusters they’re configuring (a Terraform sketch follows this list)
● Use clusters that support user isolation
4. Ensure that all workspaces have secure cluster connectivity (No Public IP / NPIP) enabled
5. Deploy data exfiltration protections; the protection they provide against accidental insider exposure is similar to that provided against a malicious attacker
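To illustrate the diagnostic log monitoring above, this Terraform sketch sends a subset of Azure Databricks log categories to a Log Analytics workspace. The Log Analytics resource is assumed to exist, and the category names shown are a commonly documented subset that you should verify against the current Azure list.

    resource "azurerm_monitor_diagnostic_setting" "dbx" {
      name                       = "dbx-diagnostics"
      target_resource_id         = azurerm_databricks_workspace.this.id
      log_analytics_workspace_id = azurerm_log_analytics_workspace.ops.id # assumed to exist

      # A few of the available categories; others include jobs, dbfs and secrets.
      enabled_log { category = "accounts" }
      enabled_log { category = "clusters" }
      enabled_log { category = "notebook" }
      enabled_log { category = "workspace" }
    }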
Resource abuse
Risk description
Azure Databricks can deploy large amounts of compute power. As such, it could be a valuable target for crypto mining if a
customer’s user account were compromised.
Probability
This has not been a prominent activity in practice, but customers will sometimes bring up this concern.
Mitigation strategies
1. Native Azure protections
● Use Azure service quotas to limit the resources that can be deployed
● Regularly monitor usage data in Azure and in Azure Databricks
● Regularly monitor Azure activity logs to identify abnormal provisioning activity
2. Databricks protections
● Ensure that all workspaces have secure cluster connectivity (No Public IP / NPIP) enabled
● Use cluster policies to limit the maximum size and type of a cluster
● Limit which users can create clusters (a sketch follows this list)
● Control the libraries that can be used in the environment to limit the risk of compromised libraries
3. Monitoring controls
● Monitor utilization of your deployment with Overwatch
● Monitor the Azure Databricks diagnostic logs to identify which users are logging in, and what types of clusters they’re configuring
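One way to limit which users can create clusters is to withhold the cluster-creation entitlement at the group level, as in this Databricks Terraform provider sketch; the group name is a placeholder.

    # Members of this group cannot create arbitrary clusters or pools; instead,
    # grant them CAN_USE on a cluster policy (see the cluster policies section
    # later in this document).
    resource "databricks_group" "analysts" {
      display_name               = "analysts" # placeholder
      allow_cluster_create       = false
      allow_instance_pool_create = false
    }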
Compromise of the Databricks platform

Probability
Databricks has an extremely strong security program that manages the risk of such an incident – see our Security and Trust Center for an overview of the program and the security features in the Databricks product. However, the risk for any company is never completely eliminated.
Azure Databricks is a Microsoft product, and the control plane is running in a Microsoft-managed subscription, so
customers should also consider the Microsoft security program whilst evaluating this risk. Please see the Microsoft
Security and Trust Center for more information and reach out to your Azure representative as required.
Mitigation strategies
1. Databricks controls
● Ensure that all workspaces have secure cluster connectivity (No Public IP / NPIP) enabled
● Monitor the Azure Databricks diagnostic logs to identify the activities of Databricks employees who provide support to your deployment
● Consider enabling Customer Lockbox for Microsoft Azure
2. Azure controls
● Monitor Azure activity logs to identify abnormal provisioning activity
3. Process controls
● Review the Databricks security documentation
4. Prepare “break-glass” customer controls for use in the event of an active compromise
● Disable a Databricks workspace and render applicable data unreadable by revoking the customer-managed key for managed services (not guaranteed to be a reversible operation)
For the highest-security environments, Databricks also advocates, where possible, the use of physical authentication tokens such as FIDO2 keys, which augment traditional multi-factor authentication by requiring interaction with a physical token that cannot be remotely compromised.
It’s important to note that Azure Active Directory conditional access applies at the point of authentication with Azure
AD. It is not enforced for users who have already authenticated with Azure AD and subsequently change networks, or
who are using alternative methods of authentication such as Personal Access Tokens. Therefore, for comprehensive
network access controls Databricks recommends that customers combine Azure Active Directory conditional access with
the use of IP access lists and/or Azure Private Link.
It’s important to note that, as part of the Azure RBAC model, users who are given Contributor or higher permissions on the resource group for a deployed Azure Databricks workspace automatically become administrators when they log in to that workspace. Therefore, the same considerations outlined above should be applied to Azure portal users too.
As per the section above, Databricks recommends that customers combine Azure Active Directory conditional access
with the use of IP access lists and/or Azure Private Link for comprehensive control over which networks their workspaces
can be accessed from.
Between Databricks users and the control plane, Private Link provides strong controls that limit the source for inbound
requests. If a company already routes traffic through an Azure environment, they can use Private Link so that the
communication between users and the Azure Databricks control plane does not traverse public IP addresses.
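A front-end Private Link connection can be sketched as follows, assuming a user-side (hub or transit) VNet with a dedicated private-endpoint subnet; the private DNS zone for privatelink.azuredatabricks.net, which the endpoint also requires, is omitted here.

    resource "azurerm_private_endpoint" "frontend" {
      name                = "pe-dbx-frontend" # placeholder
      location            = azurerm_resource_group.hub.location
      resource_group_name = azurerm_resource_group.hub.name
      subnet_id           = azurerm_subnet.private_endpoints.id # user-side transit subnet

      private_service_connection {
        name                           = "dbx-ui-api"
        private_connection_resource_id = azurerm_databricks_workspace.this.id
        subresource_names              = ["databricks_ui_api"] # front end: users to control plane
        is_manual_connection           = false
      }
    }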
Token management
Customers can use the Token Management API or UI controls to enable or disable personal access tokens (PATs) for REST
API authentication, limit the users who are allowed to use PATs, set the maximum lifetime for new tokens, and manage
existing tokens. Highly-secure customers typically provision a maximum token lifetime for new tokens for a workspace.
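A minimal sketch of that configuration with the Databricks Terraform provider is below. The key names follow the workspace configuration API as commonly documented, and both the names and the 90-day value should be checked against the current documentation and your own policy.

    resource "databricks_workspace_conf" "tokens" {
      custom_config = {
        "enableTokensConfig"   = "true" # PATs stay enabled, but are managed
        "maxTokenLifetimeDays" = "90"   # placeholder: cap new tokens at 90 days
      }
    }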
Ensure that all workspaces have Secure Cluster Connectivity (No Public IP / NPIP) Enabled
With secure cluster connectivity enabled, customer virtual networks have no inbound open ports from external networks
and Databricks cluster nodes have no public IP addresses. Secure cluster connectivity is also known as No Public IP
(NPIP). Databricks recommends this configuration for all Azure Databricks workspaces because it significantly reduces the
attack surface and hardens the security posture.
Deploy Azure Databricks in your own Azure virtual network (VNet injection)
The default deployment of Azure Databricks is a fully managed service on Azure: all data plane resources, including a VNet that all clusters will be associated with, are deployed to a locked resource group. If you require network customization, however, you can deploy Azure Databricks data plane resources in your own virtual network, and Databricks recommends this approach for nearly all deployments. Even if you do not need network customization now, you may in the future, and this option increases the chances of future success with your initial workspace.
When designing your virtual network, you also need to consider your egress architecture. An Azure NAT gateway is
recommended for most deployments, but if data exfiltration protection is required you may need to consider a different
egress architecture.
Important note: while it is possible to deploy Databricks with public IP addresses, Databricks recommends our Secure
cluster connectivity (No Public IP / NPIP) model and it is very unusual for a security-conscious customer to prefer public
IPs.
The TLS connections between the control plane and the data plane cannot be intercepted, so it is not possible to use a technology like SSL/TLS inspection: the custom TLS certificate that inspection would require cannot be pre-loaded onto the Azure Databricks VHD, which is built once for all customers.
● You don't need to manage credentials. Credentials aren’t even accessible to you.
● You can use managed identities to authenticate to any resource that supports Azure AD authentication.
● Managed identities can be used at no extra cost.
Databricks recommends using Unity Catalog and Managed Identities to access data stored in your Azure storage
accounts. Once you have granted this access, Databricks recommends configuring storage firewalls to prevent access
from untrusted networks (note that the data plane is a trusted network and should be granted access via a private or
service endpoint). See (Recommended) Configure trusted access to Azure Storage based on your managed identity for
more details.
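A storage firewall that denies everything except the Databricks data plane subnets might look like the following sketch. The storage account and subnet references are assumptions carried over from the earlier examples, and the subnets need the Microsoft.Storage service endpoint enabled.

    resource "azurerm_storage_account_network_rules" "lake" {
      storage_account_id = azurerm_storage_account.lake.id # assumed data lake account
      default_action     = "Deny"                          # block untrusted networks
      bypass             = ["AzureServices"]               # per the trusted-access guidance above
      virtual_network_subnet_ids = [
        azurerm_subnet.public.id,  # Databricks host subnet
        azurerm_subnet.private.id, # Databricks container subnet
      ]
    }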
The admin console includes settings that control common exfiltration paths, for example (a configuration sketch follows this list):
● Export notebooks or cells containing code and partial interactive query results
● Download notebook results
● Block notebook clipboard features
● Download MLflow run artifacts
● Block application attacks via iFrames and cross-site scripting
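A hedged sketch of some of those settings via the Databricks Terraform provider follows; verify the key names against the current workspace configuration documentation before relying on them.

    resource "databricks_workspace_conf" "exfiltration" {
      custom_config = {
        "enableExportNotebook"         = "false" # block notebook and cell export
        "enableNotebookTableClipboard" = "false" # block copying results to the clipboard
        "enableResultsDownloading"     = "false" # block downloading notebook results
      }
    }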
Enable blob versioning, soft deletes and other data protection features
Azure Storage provides a number of features that allow you to back up and recover your data if needed. Consider the various options available and apply them as necessary to achieve your target recovery point objective (RPO); a Terraform sketch follows the list:
● Resource locks
● Blob versioning
● Soft deletes for containers
● Soft deletes for blobs
● Storage redundancy
● Delta clones
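Several of these options can be enabled directly on the storage account, as in the sketch below (names and retention periods are placeholders). Note that blob versioning is not currently supported on accounts with a hierarchical namespace (ADLS Gen2), so weigh it against the other options for such accounts.

    resource "azurerm_storage_account" "data" {
      name                     = "exampledatasa" # placeholder (must be globally unique)
      resource_group_name      = azurerm_resource_group.dbx.name
      location                 = azurerm_resource_group.dbx.location
      account_tier             = "Standard"
      account_replication_type = "GZRS" # one of the storage redundancy choices

      blob_properties {
        versioning_enabled = true                       # blob versioning
        delete_retention_policy { days = 30 }           # soft delete for blobs
        container_delete_retention_policy { days = 30 } # soft delete for containers
      }
    }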
Monitor utilization of your deployment with Overwatch

Overwatch was built to enable Databricks customers, employees, and partners to quickly and easily understand operations within Databricks deployments. As enterprise adoption increases, there’s an ever-growing need for strong governance. Overwatch aims to enable users to quickly answer questions and then drill down to make effective operational changes.
Most use cases are not strictly security-focused, but improved visibility strengthens any security team. For example, if a PyPI library you incorporate were compromised by crypto miners, you would be grateful to have a tool that excels at troubleshooting heavy utilization within your Databricks deployment.
● migrate is a tool to migrate a workspace one time. It uses the Databricks CLI/API in the background.
● databricks-sync is a tool that has been used for multi-cloud migrations, as well as disaster recovery synchronization of workspaces. It uses the Terraform provider to synchronize incremental changes.
● You can run either tool from a command line or from a notebook.
CI/CD tools commonly used with Azure Databricks include:
● Azure DevOps
● Jenkins
SCIM (System for Cross-domain Identity Management) allows you to sync users and groups from Azure Active Directory
to Azure Databricks. There are three major benefits of this approach:
1. When you remove a user, the user is automatically removed from Databricks.
2. Users can also be disabled temporarily via SCIM. Customers have used this capability when they believe that an account may be compromised and need to investigate.
3. Groups are automatically synchronized.
Please refer to the documentation for detailed instructions on how to configure SCIM for Azure Databricks.
For the storage accounts that you manage, it is your responsibility to ensure that they are protected according to your requirements. See the Azure security baseline for Storage for an exhaustive list of example controls.
Add a customer-managed key for managed services to control encryption of data in the control plane. Azure Databricks requires access to this key for ongoing operations. You can revoke access to the key to prevent Azure Databricks from accessing encrypted data within the control plane (or in our backups). This is a “nuclear option” in which the workspace ceases to function, but it provides an emergency control for extreme situations.
Add a customer-managed key for the managed disks that are attached to a cluster. Azure Databricks requires access to
this key for ongoing operations, but a customer-managed key helps meet compliance requirements and allows you to
revoke access if required.
Cloud provider audit logs, such as the Azure Monitor activity log, provide a great mechanism for observing the behavior of Azure Databricks in the data plane. They provide visibility into:
● Virtual machine (VM) creation, to help identify bitcoin mining and also to control billing.
● Subscription-level events such as API calls, to help identify account compromise.
These activity logs can be joined with resource logs, such as the Azure Databricks diagnostic logs, NSG flow and other network logs, and Azure Active Directory logs, for a 360-degree view of what’s happening within your Azure account, as well as a historical baseline of what “normal behavior” looks like. See the Azure documentation for more information.
Please refer to the documentation for more details about Customer Lockbox for Microsoft Azure.
Data teams are often stuck defining permissions at a coarse-grained level. Unity Catalog provides a modern approach to granular access control, with centralized policy, auditing, and lineage tracking, all integrated into your Azure Databricks workflow. With Unity Catalog, you can:
● Manage fine-grained access controls with ease
● Unify and secure the data search experience
● Enhance query performance at any scale
● Automate real-time data lineage
● Securely share data across organizations with Delta Sharing
Use clusters that support user isolation

The following types of clusters enforce user isolation, so that users with different privilege levels can coexist on the same cluster:
● SQL Warehouses
● Any cluster that does not have its access mode set to “No isolation shared”
If you are using the legacy Manage Clusters UI within the data science or data engineering workspaces, the following cluster types likewise enforce user isolation:
● High concurrency clusters with table access control lists (Table ACLs clusters for short)
● High concurrency clusters with credential passthrough
Clusters with user isolation enforce that each user runs as a different non-privileged user account on the cluster host. Languages are also limited to those that can be implemented in an isolated manner (SQL and Python), and Spark APIs must be on an allowlist of those we believe to be isolation-safe.
SQL Warehouses also enforce user isolation and have similar safety features, though implemented in a mechanism
specific to the SQL workloads run on these clusters.
Customers with more stringent security requirements can enforce cluster policies that do not allow standard clusters to
be created within the environment.
If you need standard clusters, workspaces with Unity Catalog enabled can allow users to create single-user clusters. Administrators of workspaces without Unity Catalog can create clusters themselves and use cluster ACLs to control which users are permitted to attach notebooks.
Cluster policies
Azure Databricks administrators can control many aspects of the clusters that are spun up, including available instance
types, Databricks versions, and the size of instances by using cluster policies. Admins can enforce some Spark
configuration settings. Admins can configure multiple cluster policies, allowing certain groups of users to create small
clusters, some groups of users to create large clusters, and other groups to only use existing clusters. For detailed
recommendations and discussion of cluster policies, see our announcement blog post.
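As a sketch (the policy name, limits, instance types, and group are placeholders), a policy that caps cluster size, restricts instance types, and forces auto-termination can be granted to one group via CAN_USE:

    resource "databricks_cluster_policy" "small_clusters" {
      name = "small-clusters" # placeholder
      definition = jsonencode({
        "autoscale.max_workers" = {
          "type"     = "range",
          "maxValue" = 10 # cap cluster size
        },
        "node_type_id" = {
          "type"   = "allowlist",
          "values" = ["Standard_DS3_v2", "Standard_DS4_v2"] # permitted VM sizes
        },
        "autotermination_minutes" = {
          "type"   = "fixed",
          "value"  = 60, # force auto-termination
          "hidden" = true
        }
      })
    }

    resource "databricks_permissions" "can_use_policy" {
      cluster_policy_id = databricks_cluster_policy.small_clusters.id
      access_control {
        group_name       = "analysts" # placeholder group
        permission_level = "CAN_USE"
      }
    }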
Cluster ACLs allow you to specify which users can attach a notebook to a given cluster. Note that if a user shares a notebook that is already attached to a standard mode cluster, the recipient will also be able to execute code on that cluster. This does not apply to clusters that enforce user isolation: SQL Warehouses, high concurrency clusters with table ACLs, and high concurrency clusters with credential passthrough. Customers who use Unity Catalog can also enable single-user clusters to enforce isolation.
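Cluster ACLs can be managed the same way; the sketch below assumes an existing databricks_cluster resource and group.

    # Only members of this group can attach notebooks to the cluster; other
    # users get no access unless granted separately.
    resource "databricks_permissions" "cluster_acl" {
      cluster_id = databricks_cluster.shared.id # assumed cluster resource
      access_control {
        group_name       = "data-engineers" # placeholder group
        permission_level = "CAN_ATTACH_TO"
      }
    }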
Controlling libraries
By default, Databricks allows customers to install Python, R, or Scala libraries from the standard public repositories, such as PyPI, CRAN, or Maven.
Those who are concerned about supply-chain attacks can host their own repositories and then configure Azure Databricks to use those instead, blocking access to other sources of libraries. Documentation for doing so is outside the scope of this document, but reach out to your Databricks team for assistance as required.
It’s important to note that even if customers use Azure Key Vault to store their secrets, access controls still need to be
defined within Azure Databricks. This is because the same service identity is used to retrieve the secret for all users of an
Azure Databricks workspace.
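For example (the scope, Key Vault, and principal names are placeholders), a Key Vault-backed secret scope still needs a Databricks-side ACL before non-admin users can read it:

    # Creating a Key Vault-backed scope requires the provider to authenticate
    # with Azure AD rather than a PAT.
    resource "databricks_secret_scope" "kv" {
      name = "keyvault-backed" # placeholder
      keyvault_metadata {
        resource_id = azurerm_key_vault.secrets.id # assumed Key Vault
        dns_name    = azurerm_key_vault.secrets.vault_uri
      }
    }

    # Without this ACL, workspace users cannot read the scope, even though the
    # secrets themselves live in Key Vault.
    resource "databricks_secret_acl" "readers" {
      scope      = databricks_secret_scope.kv.name
      principal  = "data-engineers" # placeholder group
      permission = "READ"
    }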
Customers are responsible for making sure that clusters are restarted periodically. Azure Databricks does not live-patch running systems; when a cluster is restarted and newer system images or containers are available, it automatically uses the latest available versions.
6. Resources
Many different capabilities have been discussed in this document, with documentation links where possible. Organizations that prioritize high security can go beyond what is in this document. Here are additional resources to help you learn more:
1. Request the Enterprise Security Guide and compliance documentation from your Databricks account team.
2. Review the security features in the Security and Trust Center, along with the overall documentation about the
Databricks security and compliance programs.
3. The Security and Trust Overview Whitepaper provides an overview of the Databricks architecture and platform
security practices.
4. Documentation articles:
a. Security guide - Azure Databricks
b. Azure security baseline for Azure Databricks
c. Enterprise security for Azure Databricks
5. Blog: Azure Databricks Security Best Practices
a. See also our blog on Data Exfiltration Protection with Azure Databricks
6. Whitepaper: Data Plane Host Security Summary (request from your Databricks Account Team)
7. Documentation article: Customer support access, including customer-approved workspace login
8. More Information
For more information about Azure Databricks security, please see the Enterprise Security Guide and Azure Databricks
Docs. Your Azure Databricks representatives will be happy to assist you with copies of these documents.