Databricks Security Basics — Tokens, Users & Groups
You’re back at ShopWave, our fictional retail company.
You’ve set up your first notebooks, clusters, and dashboards. Everything seems perfect, until your manager asks:
“How do we make sure only the right people can access sensitive data?”
Welcome to the world of Databricks Security.
🛡️ Why Security Matters in Databricks
Databricks houses valuable business data, including:
- Customer PII
- Sales transactions
- Payment info
- ML models
- Inventory forecasts
Without proper security:
- Analysts might accidentally access restricted tables
- Notebooks could be shared outside the team
- Jobs and pipelines could be modified by unauthorized users
Security in Databricks ensures access is controlled, data is protected, and compliance is maintained.
👤 Users — Who Can Log In?
A user is anyone with a Databricks account.
Each user has:
- Login credentials (email/password, SSO)
- Assigned roles
- Permissions to access workspace resources
At ShopWave:
- Alice is a data engineer
- Bob is a data scientist
- Carol is a business analyst
Each has different privileges according to their role.
👥 Groups — Organize Users Efficiently
Instead of assigning permissions individually, Databricks uses groups:
- Engineers group → full cluster and notebook access
- Analysts group → read access to dashboards and tables
- Data scientists group → access to ML features and Delta tables
Benefits:
- Easier management for large teams
- Consistent access policies
- Quick onboarding of new employees
ShopWave creates groups for each department to simplify security management.
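Group management can be automated through the Databricks SCIM API. The sketch below, assuming the SCIM Groups endpoint `/api/2.0/preview/scim/v2/Groups` and an illustrative group name and workspace URL, builds a group-creation request without sending it:

```python
# Sketch: creating a group via the Databricks SCIM API.
# Host, token, and group name are placeholders, not real credentials.
import json
import urllib.request

def scim_group_payload(display_name: str) -> dict:
    """Minimal SCIM 2.0 payload for a new group."""
    return {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
        "displayName": display_name,
    }

req = urllib.request.Request(
    "https://example.cloud.databricks.com/api/2.0/preview/scim/v2/Groups",
    data=json.dumps(scim_group_payload("shopwave-analysts")).encode(),
    headers={"Authorization": "Bearer dapi-xxxxxxxx",  # placeholder token
             "Content-Type": "application/scim+json"},
    method="POST",
)
# urllib.request.urlopen(req) would create the group in a real workspace.
```

With one call per department, onboarding a new analyst becomes a single group-membership update instead of a pile of individual permission grants.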
🔑 Personal Access Tokens — Programmatic Access
Sometimes, scripts or notebooks need to access Databricks without a password.
Enter personal access tokens:
- Used for API access
- Can be time-limited
- Can be revoked at any time
Example use cases at ShopWave:
- CI/CD pipelines fetching notebooks
- Automated ETL jobs reading Delta tables
- External apps running queries via the Databricks REST API
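In practice, a token is sent as a `Bearer` header on each REST call. A minimal sketch, using the documented Clusters API endpoint `/api/2.0/clusters/list` and placeholder host and token values:

```python
# Sketch: authenticating to the Databricks REST API with a personal
# access token. Host and token below are placeholders.
import urllib.request

def authed_request(host: str, token: str, endpoint: str) -> urllib.request.Request:
    """Build a GET request carrying the token as a Bearer header."""
    return urllib.request.Request(
        f"{host.rstrip('/')}{endpoint}",
        headers={"Authorization": f"Bearer {token}"},
    )

req = authed_request(
    "https://example.cloud.databricks.com",  # placeholder workspace URL
    "dapi-xxxxxxxx",                          # placeholder token
    "/api/2.0/clusters/list",
)
# urllib.request.urlopen(req) would perform the call against a real workspace.
print(req.full_url)  # → https://example.cloud.databricks.com/api/2.0/clusters/list
```

Because the token travels in a header rather than a login form, the same pattern works from CI/CD runners, ETL jobs, or any external app.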
🏛️ Access Control Levels
Databricks provides layered access control:
| Level | Description |
|---|---|
| Workspace | Who can see notebooks, folders, repos |
| Cluster | Who can start, edit, or terminate clusters |
| Data / Tables | Who can read, write, or manage Delta tables |
| Jobs | Who can create, schedule, or run jobs |
| Account-level | Admins controlling global workspace settings |
ShopWave enforces the principle of least privilege: each user gets only the access needed for their job.
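The Data / Tables layer is typically expressed with Databricks SQL `GRANT` statements. This sketch only composes the statements as strings so it can run anywhere; the table and group names are illustrative, and in a notebook each statement would be executed with `spark.sql(...)`:

```python
# Sketch: table-level grants as Databricks SQL, composed in Python.
# Table and group names are illustrative examples.
def grant_statement(privilege: str, table: str, principal: str) -> str:
    """Compose a table-level GRANT for a group principal."""
    return f"GRANT {privilege} ON TABLE {table} TO `{principal}`"

for stmt in (
    grant_statement("SELECT", "sales.transactions", "analysts"),   # read-only
    grant_statement("MODIFY", "sales.transactions", "engineers"),  # write access
):
    print(stmt)
```

Granting to groups rather than individual users keeps this layer consistent with the group-based model above.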
🔐 Security Best Practices
- Use groups instead of individual permissions
- Enable SSO (Single Sign-On) for authentication
- Rotate personal access tokens regularly
- Audit workspace activity using Unity Catalog logs
- Enforce multi-factor authentication (MFA)
- Apply table-level and row-level security for sensitive data
Following these practices prevents accidental leaks and ensures compliance.
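Token rotation can itself be scripted against the Databricks Token API: create a short-lived replacement with `POST /api/2.0/token/create`, then revoke the old one with `POST /api/2.0/token/delete`. The sketch below builds both requests with placeholder host, token, and `token_id` values and never sends them:

```python
# Sketch: rotating a personal access token via the Databricks Token API.
# Host, tokens, and token_id are placeholders.
import json
import urllib.request

HOST = "https://example.cloud.databricks.com"  # placeholder workspace URL

def token_api_request(path: str, token: str, body: dict) -> urllib.request.Request:
    """Build an authenticated POST request against the Token API."""
    return urllib.request.Request(
        f"{HOST}{path}",
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# New token limited to 30 days; a real response includes the token value.
create_req = token_api_request(
    "/api/2.0/token/create", "dapi-xxxxxxxx",
    {"lifetime_seconds": 30 * 24 * 3600, "comment": "ETL job, rotated monthly"},
)
# Revoke the old token by the token_id reported by /api/2.0/token/list.
delete_req = token_api_request(
    "/api/2.0/token/delete", "dapi-xxxxxxxx",
    {"token_id": "OLD_TOKEN_ID"},
)
# urllib.request.urlopen(...) would perform each call against a real workspace.
```

Setting `lifetime_seconds` at creation time means a forgotten token expires on its own, which complements the revoke-when-done habit in the recap below.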
🧠 Story Recap — ShopWave Security in Action
- Alice (Engineer) runs ETL jobs on clusters → belongs to Engineers Group
- Bob (Data Scientist) trains ML models → belongs to Data Scientists Group
- Carol (Analyst) queries dashboards → belongs to Analysts Group
- Tokens are issued for API automation → securely revoked when done
- Admin monitors access → ensures everyone follows least privilege
Result: ShopWave keeps data safe, while teams remain productive.
🏁 Quick Summary
- Users are individual Databricks accounts; Groups manage access collectively.
- Personal Access Tokens allow secure programmatic access.
- Access control layers include workspace, clusters, tables, and jobs.
- Security best practices: SSO, MFA, auditing, least privilege, and token rotation.
- Proper security ensures data protection, compliance, and team productivity.
🚀 Coming Next
👉 Databricks DBFS — Internal File System Explained