There's no denying Databricks is powerful for large-scale analytics and AI workloads. The flip side is that getting a new data analyst productive takes real training, and concepts like Delta Lake and cluster tuning aren't obvious. Strong platform, but the onboarding effort is genuinely significant.
Powerful at scale
Steep learning curve
Databricks delivers genuinely impressive performance on our large ETL and analytics pipelines, and Delta Lake has made our data reliable. I'm happy with it overall. The one thing to manage carefully is compute spend, since idle clusters and oversized jobs can quietly run up costs.
Reliable Delta Lake, fast pipelines
Needs careful compute cost control
The shared notebook experience in Databricks made our data team far more collaborative, and the lakehouse approach removed a lot of duplication between our warehouse and data lake. MLflow integration for tracking experiments is handy. Knocking one star off only because cost monitoring could be clearer.
Great collaboration, MLflow built in
Cost visibility could improve
Databricks brought our data engineering and machine learning onto one lakehouse, so our analysts and ML teams finally collaborate in the same notebooks. Spark performance on big jobs is strong. My only minor gripe is that cluster startup times can leave you waiting before an interactive session is ready.
Unified data and ML platform
Cluster spin-up can be slow
We process huge volumes through Databricks daily and it handles them without complaint, while the AI and ML tooling has accelerated our data science roadmap. It's an excellent unified platform. Small gripe: occasional UI sluggishness and the odd quirk in job scheduling keep it just short of perfect.
Handles huge volumes, strong ML tooling
Occasional UI and scheduling quirks