Get Certified: Azure Databricks Platform Architect — Secure Lakehouse Design, Networking & Governance Deep Dive

Databricks Architect Cross-Cloud Cheat Sheet

Core Cloud Differences

AWS

• Identity: IAM Role / Instance Profile
• Storage: S3
• Storage access model: IAM Role
• Region definition: VPC
• Networking: VPC + Subnets
• Query federation: Redshift, port 5439
• Workspace setup: Bucket + IAM Role

Azure

• Identity: Service Principal / Managed Identity
• Storage: ADLS Gen2
• Storage access model: Access Connector
• Region definition: Workspace configuration
• Networking: VNet + Subnets
• Query federation: Synapse / SQL DB
• Workspace setup: Resource Group + Region

GCP

• Identity: Service Account
• Storage: GCS Bucket
• Storage access model: Service Account
• Region definition: Subnet
• Networking: VPC + Subnets
• Query federation: BigQuery
• Workspace setup: VPC + Principal

Identity and Access

• AWS uses IAM Role with assume-role and cross-account access
• Azure uses Service Principal or Managed Identity
• GCP uses Service Account with direct binding
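To make the AWS assume-role model concrete, here is a minimal sketch of the trust policy that lets another AWS account assume a role. The helper name `build_trust_policy`, the account ID `111122223333`, and the external ID are placeholders chosen for illustration; the real values come from your own Databricks account setup.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build an IAM trust policy that lets another AWS account assume this role.

    Both arguments are placeholders here (assumptions for illustration).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account that is allowed to assume the role (cross-account access).
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID guards against the confused-deputy problem.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


policy = build_trust_policy("111122223333", "example-external-id")
print(json.dumps(policy, indent=2))
```

Azure and GCP have no direct equivalent of this document: access is granted by assigning a role to the Managed Identity or binding a role to the Service Account.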

Architect Tip

• AWS = role assumption model
• Azure = Microsoft Entra ID (formerly Azure AD) identity model
• GCP = simple service account model

Exam Traps

• AWS uses IAM roles, not service accounts
• GCP uses service accounts, not IAM roles
• Azure uses Microsoft Entra ID (Azure AD) identities, not AWS-style IAM

Networking and Region

• AWS region is tied to the VPC
• Azure region is defined during workspace setup
• GCP region is determined by subnet

Memory Trick

• If the question says “subnet determines region”, the answer is GCP

Exam Mapping

• AWS = VPC
• Azure = Workspace configuration
• GCP = Subnet

Storage Access Pattern

• AWS uses IAM Role to access S3
• Azure uses Access Connector to access ADLS
• GCP uses Service Account to access GCS
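Each storage service also has its own path scheme, which is often the quickest way to tell which cloud a question is about. The sketch below builds the path for each cloud; the function name `storage_url` and the sample bucket, container, and account names are hypothetical.

```python
def storage_url(cloud: str, container: str, path: str = "", account: str = "") -> str:
    """Return the cloud storage URL for a bucket/container and optional path.

    `container` is the S3 bucket, ADLS Gen2 container, or GCS bucket name.
    `account` is the Azure storage account name and is only used for Azure.
    """
    path = path.strip("/")
    suffix = f"/{path}" if path else ""
    cloud = cloud.lower()
    if cloud == "aws":
        return f"s3://{container}{suffix}"
    if cloud == "azure":
        if not account:
            raise ValueError("Azure paths need a storage account name")
        return f"abfss://{container}@{account}.dfs.core.windows.net{suffix}"
    if cloud == "gcp":
        return f"gs://{container}{suffix}"
    raise ValueError(f"Unknown cloud: {cloud!r}")


for args in [
    ("aws", "my-bucket", "raw/events"),
    ("azure", "raw", "events", "mystorageacct"),
    ("gcp", "my-bucket", "raw/events"),
]:
    print(storage_url(*args))
```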

Memory Hack

• AWS = Role
• Azure = Connector
• GCP = Service Account

Permission Mapping

• AWS grants access using IAM policy
• Azure grants access using RBAC role
• GCP grants access using IAM binding

Query Federation

• Query federation is read-only
• Data stays in the source system
• No ETL is required
• Federation means querying without moving data

Cloud Mapping

• AWS = Redshift
• Azure = Synapse / SQL DB
• GCP = BigQuery

Exam Traps

• Federation is not ETL
• Federation is not write-enabled
• Federation does not copy data into Databricks
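As a rough illustration of federation on AWS, the sketch below renders the two SQL statements typically used to expose a Redshift database as a read-only foreign catalog: one to create the connection and one to create the catalog on top of it. The helper name, the secret key names `redshift-user` and `redshift-password`, and all sample values are assumptions; check the statement options against the Lakehouse Federation documentation for your runtime.

```python
def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal, escaping single quotes."""
    return "'" + value.replace("'", "''") + "'"


def redshift_federation_sql(
    connection: str,
    catalog: str,
    host: str,
    database: str,
    secret_scope: str,
    port: int = 5439,  # Redshift default port, as listed in the cheat sheet
) -> list[str]:
    """Render CREATE CONNECTION and CREATE FOREIGN CATALOG statements."""
    create_connection = (
        f"CREATE CONNECTION {connection} TYPE redshift\n"
        "OPTIONS (\n"
        f"  host {sql_literal(host)},\n"
        f"  port {sql_literal(str(port))},\n"
        # Credentials are read from a secret scope rather than written in plaintext.
        f"  user secret({sql_literal(secret_scope)}, 'redshift-user'),\n"
        f"  password secret({sql_literal(secret_scope)}, 'redshift-password')\n"
        ")"
    )
    create_catalog = (
        f"CREATE FOREIGN CATALOG IF NOT EXISTS {catalog}\n"
        f"USING CONNECTION {connection}\n"
        f"OPTIONS (database {sql_literal(database)})"
    )
    return [create_connection, create_catalog]


statements = redshift_federation_sql(
    connection="redshift_conn",
    catalog="redshift_sales",
    host="example-cluster.example.us-east-1.redshift.amazonaws.com",
    database="sales",
    secret_scope="federation",
)
print(";\n\n".join(statements) + ";")
```

Neither statement moves data: the foreign catalog only registers metadata, and queries against it are pushed to the source system.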

Unity Catalog

• Unity Catalog is cloud-agnostic
• The hierarchy is Metastore → Catalog → Schema → Table (or view, volume, function)
• The top-level object is the Metastore; objects are addressed as catalog.schema.table
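In queries, the hierarchy shows up as a three-part name. The small helper below, with a hypothetical name and sample identifiers, builds that name and backtick-quotes each part so that unusual characters do not break the SQL.

```python
def full_table_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level Unity Catalog name: catalog.schema.table."""

    def quote(part: str) -> str:
        # Backticks inside an identifier are escaped by doubling them.
        return "`" + part.replace("`", "``") + "`"

    return ".".join(quote(part) for part in (catalog, schema, table))


name = full_table_name("main", "sales", "orders")
print(f"SELECT * FROM {name} LIMIT 10")
```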

Unity Catalog Functions

• Governance
• Permissions
• Lineage
• Cross-workspace sharing

Cloud-Specific Difference

• AWS uses IAM roles
• Azure uses Microsoft Entra ID (Azure AD) + Access Connector
• GCP uses Service Account

External Location Setup

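• Create Storage Credential (wraps the IAM role, Access Connector, or service account)
• Create External Location (storage path + storage credential)
• Grant access on the external location, then create the Catalog or tables that use it
• A Connection is only needed for query federation, not for external locations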
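Once a storage credential exists, the external location itself can be declared in SQL. The sketch below renders the two statements involved: one to create the location and one to grant read access. The function name and every identifier in the example call are hypothetical.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote a SQL identifier."""
    return "`" + name.replace("`", "``") + "`"


def external_location_sql(location: str, url: str, credential: str, reader: str) -> list[str]:
    """Render SQL that creates an external location and grants read access.

    Assumes the storage credential named `credential` already exists.
    """
    create = (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {quote_ident(location)}\n"
        f"URL '{url}'\n"
        f"WITH (STORAGE CREDENTIAL {quote_ident(credential)})"
    )
    grant = (
        f"GRANT READ FILES ON EXTERNAL LOCATION {quote_ident(location)} "
        f"TO {quote_ident(reader)}"
    )
    return [create, grant]


location_statements = external_location_sql(
    location="raw_events",
    url="s3://my-bucket/raw/events",
    credential="my_storage_credential",
    reader="data-engineers",
)
print(";\n\n".join(location_statements) + ";")
```

The same statements apply on Azure and GCP; only the URL scheme and the identity behind the storage credential change.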

Cost Optimization

• Use autoscaling
• Select the right instance type
• Apply tagging
• Use serverless where available
• Use job clusters for ephemeral workloads
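Several of these levers meet in the cluster definition of a job. The sketch below assembles a job cluster spec with autoscaling bounds and cost-attribution tags, using field names from the Databricks Clusters API (`spark_version`, `node_type_id`, `autoscale`, `custom_tags`). The helper name, runtime version, instance type, and tag values are placeholders, not recommendations.

```python
import json


def job_cluster_spec(
    spark_version: str,
    node_type_id: str,
    min_workers: int,
    max_workers: int,
    tags: dict[str, str],
) -> dict:
    """Build an ephemeral job cluster spec with autoscaling and tags."""
    if min_workers < 0 or min_workers > max_workers:
        raise ValueError("Require 0 <= min_workers <= max_workers")
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # Autoscaling lets the cluster shrink when the workload is light.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Tags propagate to cloud resources for cost attribution.
        "custom_tags": dict(tags),
    }


spec = job_cluster_spec(
    spark_version="15.4.x-scala2.12",  # placeholder runtime version
    node_type_id="i3.xlarge",          # placeholder AWS instance type
    min_workers=1,
    max_workers=4,
    tags={"team": "data-platform", "cost-center": "1234"},
)
print(json.dumps(spec, indent=2))
```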

Cloud Cost Lens

• AWS = instance profile and billing visibility
• Azure = budget policies and cost management
• GCP = project-level billing and tagging

Databricks Data Platform Stack

• Ingestion = Lakeflow Connect
• Transformation = Spark Declarative Pipelines
• Orchestration = Lakeflow Jobs
• Governance = Unity Catalog
• Performance = Photon + Caching + Z-Order

Workspace Infrastructure Requirements

AWS

• Requires S3 bucket
• Requires cross-account IAM role
• Uses bucket + IAM role pattern

Azure

• Requires workspace name
• Requires resource group
• Requires region
• A custom VNet (VNet injection) is optional; the default deployment uses a Databricks-managed VNet

GCP

• Requires VPC
• Requires IAM principal
• Uses VPC + principal pattern

Architect Decision Model

Step 1: Identify the cloud

• AWS
• Azure
• GCP

Step 2: Map the problem

• Access data → check identity model
• Deploy workspace → check infrastructure requirement
• Networking → check where region is defined
• Governance → apply Unity Catalog rules
• Integration → identify the required connector or service

Step 3: Apply the cloud-specific answer

Example: How to access storage?

• AWS → IAM Role
• Azure → Access Connector
• GCP → Service Account
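The three steps can be read as a two-level lookup: the problem picks a dimension, and the cloud picks the answer within it. The sketch below encodes that idea using the values from this cheat sheet; the dictionary keys and the function name `answer` are made up for illustration.

```python
# Step 3 data: cloud-specific answers, taken from the cheat sheet.
CLOUD_ANSWERS = {
    "aws": {
        "storage_access": "IAM Role",
        "workspace": "Bucket + IAM Role",
        "region": "VPC",
        "federation": "Redshift",
    },
    "azure": {
        "storage_access": "Access Connector",
        "workspace": "Resource Group + Region",
        "region": "Workspace configuration",
        "federation": "Synapse / SQL DB",
    },
    "gcp": {
        "storage_access": "Service Account",
        "workspace": "VPC + Principal",
        "region": "Subnet",
        "federation": "BigQuery",
    },
}

# Step 2 data: which dimension each kind of problem maps to.
PROBLEM_TO_DIMENSION = {
    "access storage": "storage_access",
    "deploy workspace": "workspace",
    "networking": "region",
    "integration": "federation",
}


def answer(cloud: str, problem: str) -> str:
    """Apply the decision model: identify the cloud, map the problem, look up the answer."""
    cloud_key = cloud.strip().lower()
    if cloud_key not in CLOUD_ANSWERS:
        raise ValueError(f"Unknown cloud: {cloud!r}")
    problem_key = problem.strip().lower()
    if problem_key == "governance":
        # Unity Catalog is cloud-agnostic, so the answer does not depend on the cloud.
        return "Unity Catalog"
    if problem_key not in PROBLEM_TO_DIMENSION:
        raise ValueError(f"Unknown problem: {problem!r}")
    return CLOUD_ANSWERS[cloud_key][PROBLEM_TO_DIMENSION[problem_key]]


for cloud in ("AWS", "Azure", "GCP"):
    print(cloud, "->", answer(cloud, "access storage"))
```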

Golden Memory Block

• AWS = IAM Role + VPC + S3
• Azure = Service Principal + Region + Access Connector
• GCP = Service Account + Subnet + Bucket

Final Exam Hack

• AWS thinks in roles
• Azure thinks in identities and connectors
• GCP thinks in service accounts and subnets
