Get Certified: GCP Databricks Platform Architect — Lakehouse Design, Governance & Real-Time Pipelines on Google Cloud

GCP Databricks Platform Architect Flashcards

🔐 Identity & Access

Flashcard 1
Q: What enables Databricks to access GCP services?
A: Service account attached to clusters

Flashcard 2
Q: What is required for connecting to Google-managed services?
A: Enable API + attach a privileged service account

Flashcard 3
Q: What is identity federation used for?
A: Centralized user/group management at the account level, typically synced from an IdP (e.g., Microsoft Entra ID, formerly Azure AD, or Okta)

Flashcard 4
Q: Who grants privileges on data objects?
A: Data object owner
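
To make the ownership card concrete, here is a minimal sketch of the kind of GRANT statement an object owner would run. The helper names, the table `main.sales.orders`, and the group `analysts` are hypothetical; the code only builds the SQL string and does not connect to a workspace.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote one identifier, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def grant_statement(privileges, securable_type, securable_name, principal):
    """Build a Unity Catalog style GRANT statement as a string.

    securable_name is a dotted name such as 'main.sales.orders'
    (a made-up example, not taken from the flashcards).
    """
    privs = ", ".join(p.upper() for p in privileges)
    target = ".".join(quote_ident(part) for part in securable_name.split("."))
    return (
        f"GRANT {privs} ON {securable_type.upper()} {target} "
        f"TO {quote_ident(principal)}"
    )


stmt = grant_statement(["select", "modify"], "table", "main.sales.orders", "analysts")
print(stmt)
```

In a notebook, the resulting string could be passed to `spark.sql(...)` by the owner of the table.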

🗂️ Storage & Data Access

Flashcard 5
Q: Where are storage credentials created?
A: Workspace → Catalog Explorer (formerly Data Explorer)

Flashcard 6
Q: What is required to create a metastore in GCP?
A: Cloud storage bucket

Flashcard 7
Q: Who gets permission when granting access to external storage?
A: Service account generated by the metastore

Flashcard 8
Q: What is an external location?
A: A securable object that pairs a cloud storage path with a storage credential, giving controlled access to part of a storage bucket
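
As a rough illustration of the external location card, the sketch below assembles the SQL that registers a bucket path against a storage credential and then grants read access on it. The names `raw_zone`, `gcs_cred`, `data_engineers`, and the bucket URL are placeholders, and the functions are hypothetical helpers that only build strings.

```python
def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Build a CREATE EXTERNAL LOCATION statement for a GCS path."""
    # Assumption: on GCP the location points at a Cloud Storage path.
    if not url.startswith("gs://"):
        raise ValueError("expected a gs:// URL for a GCP external location")
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )


def grant_read_files_sql(location: str, principal: str) -> str:
    """Build a GRANT that lets a principal read files under the location."""
    return f"GRANT READ FILES ON EXTERNAL LOCATION `{location}` TO `{principal}`"


create_stmt = create_external_location_sql(
    "raw_zone", "gs://example-bucket/raw", "gcs_cred"
)
grant_stmt = grant_read_files_sql("raw_zone", "data_engineers")
print(create_stmt)
print(grant_stmt)
```

Before the location works, the service account behind the storage credential still needs access to the bucket on the GCP side, as Flashcard 7 notes.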

🌐 Networking & Architecture

Flashcard 9
Q: What defines regionality in GCP Databricks?
A: Subnet

Flashcard 10
Q: What is a standalone VPC?
A: VPC in same project as workspace resources

Flashcard 11
Q: What registers a VPC in Databricks?
A: Network configuration

Flashcard 12
Q: What are prerequisites for workspace creation in a custom VPC?
A: VPC + principal with appropriate permissions

🔐 Security & Encryption

Flashcard 13
Q: What does encryption key configuration do?
A: Registers Cloud KMS key for Databricks

Flashcard 14
Q: Can encryption keys be rotated?
A: Yes

Flashcard 15
Q: What resources can be encrypted?
A: Root bucket, system bucket, cluster disks

🔄 Data Federation & Integration

Flashcard 16
Q: What is query federation?
A: Query external systems without moving data

Flashcard 17
Q: Is query federation read/write or read-only?
A: Read-only

Flashcard 18
Q: What is required for BigQuery federation?
A: Connection to BigQuery + foreign catalog
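
The two prerequisites for BigQuery federation can be sketched as a pair of SQL statements: one creating the connection, one creating the foreign catalog on top of it. This is my understanding of the Lakehouse Federation syntax and should be checked against the current docs. The connection name, catalog name, and secret scope/key are all hypothetical, and reading the service account key through `secret(...)` is an assumption about how the key is stored.

```python
def bigquery_federation_sql(connection, catalog, secret_scope, secret_key):
    """Return the two statements needed before querying BigQuery in place.

    Assumption: the service account key JSON lives in a Databricks secret
    identified by (secret_scope, secret_key).
    """
    create_connection = (
        f"CREATE CONNECTION IF NOT EXISTS `{connection}` TYPE bigquery "
        f"OPTIONS (GoogleServiceAccountKeyJson "
        f"secret('{secret_scope}', '{secret_key}'))"
    )
    create_catalog = (
        f"CREATE FOREIGN CATALOG IF NOT EXISTS `{catalog}` "
        f"USING CONNECTION `{connection}`"
    )
    # Order matters: the catalog references the connection by name.
    return [create_connection, create_catalog]


statements = bigquery_federation_sql("bq_conn", "bq_catalog", "gcp", "bq_sa_key")
for s in statements:
    print(s)
```

Once both exist, BigQuery tables would be addressed read-only through the foreign catalog using the usual three-level names.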

⚙️ Compute & Cost Optimization

Flashcard 19
Q: How can you reduce compute cost?
A: Autoscaling + right-sized instance types + tagging (for cost attribution)
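
A minimal sketch of how those three levers show up in a cluster definition, written as the kind of payload the Clusters API accepts. The `spark_version` and `node_type_id` values are placeholders to be replaced with ones available in your workspace, and the tag keys are made up.

```python
import json


def autoscaling_cluster_spec(name, min_workers, max_workers, tags):
    """Build a cluster spec dict with autoscaling, auto-termination, and tags."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("need 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "15.4.x-scala2.12",  # placeholder runtime version
        "node_type_id": "n2-standard-4",      # placeholder GCP machine type
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "autotermination_minutes": 30,        # stop paying for idle clusters
        "custom_tags": dict(tags),            # used to attribute cost later
    }


spec = autoscaling_cluster_spec(
    "etl-nightly", 1, 4, {"team": "data-eng", "cost_center": "1234"}
)
print(json.dumps(spec, indent=2))
```

Tags do not lower the bill on their own; they make it possible to see which team or job is driving spend.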

Flashcard 20
Q: What helps optimize query performance?
A: Photon engine + caching + clustering (Z-order, liquid clustering)
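
The two clustering options named above correspond to different SQL statements. The sketch below builds both for comparison; the table and column names are hypothetical, and the two approaches are alternatives for a given table rather than something to combine.

```python
def clustering_statements(table: str, columns: list) -> dict:
    """Return the Z-order and liquid clustering statements for one table."""
    cols = ", ".join(columns)
    return {
        # Z-ordering is applied when OPTIMIZE rewrites the table's files.
        "zorder": f"OPTIMIZE {table} ZORDER BY ({cols})",
        # Liquid clustering is declared on the table via clustering keys.
        "liquid": f"ALTER TABLE {table} CLUSTER BY ({cols})",
    }


stmts = clustering_statements("main.sales.orders", ["customer_id", "order_date"])
print(stmts["zorder"])
print(stmts["liquid"])
```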

🔗 Unity Catalog & Governance

Flashcard 21
Q: What does Unity Catalog provide?
A: Centralized governance, permissions, lineage

Flashcard 22
Q: What is the Unity Catalog hierarchy?
A: Catalog → Schema → Table (or view, volume, function)

Flashcard 23
Q: What sits at the top of the data hierarchy?
A: Metastore
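
In practice the hierarchy shows up as three-level names of the form `catalog.schema.table`, all living under one metastore. Here is a small sketch that parses and re-quotes such a name; the class and the example name `main.sales.orders` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableName:
    catalog: str
    schema: str
    table: str

    @classmethod
    def parse(cls, dotted: str) -> "TableName":
        """Split 'catalog.schema.table' into its three levels."""
        parts = dotted.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.table, got {dotted!r}")
        return cls(*parts)

    def qualified(self) -> str:
        """Return the fully qualified, backtick-quoted name."""
        return ".".join(f"`{p}`" for p in (self.catalog, self.schema, self.table))


name = TableName.parse("main.sales.orders")
print(name.catalog, name.schema, name.table)
print(name.qualified())
```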

🚀 Data Engineering & Pipelines

Flashcard 24
Q: What tool handles ingestion in Databricks?
A: Lakeflow Connect

Flashcard 25
Q: What is Spark Declarative Pipelines (SDP)?
A: Declarative ETL framework for batch and streaming

Flashcard 26
Q: What handles orchestration in Databricks?
A: Lakeflow Jobs
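
To show how orchestration ties ingestion and pipelines together, here is a sketch of a two-task job definition in the shape of a Jobs API payload: an ingestion notebook followed by a pipeline run that depends on it. The notebook path and pipeline ID are placeholders, and compute settings are left out on the assumption that they are configured separately.

```python
def two_step_job(name: str, ingest_notebook: str, pipeline_id: str) -> dict:
    """Build a job spec where 'transform' runs only after 'ingest' succeeds."""
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {"notebook_path": ingest_notebook},
            },
            {
                "task_key": "transform",
                # The dependency edge is what makes this an orchestrated flow.
                "depends_on": [{"task_key": "ingest"}],
                "pipeline_task": {"pipeline_id": pipeline_id},
            },
        ],
    }


job = two_step_job(
    "nightly-orders",
    "/Workspace/etl/ingest_orders",   # hypothetical notebook path
    "<pipeline-id>",                   # placeholder, fill in a real ID
)
for task in job["tasks"]:
    deps = [d["task_key"] for d in task.get("depends_on", [])]
    print(task["task_key"], "depends on", deps)
```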
