Posts

Featured

Micro-Batch Streaming in Databricks

I’ve mentioned it before, but it’s worth repeating: Databricks truly stands out as the top platform for rapidly moving from data ingestion through transformation to actionable insights (forgive my enthusiasm!).

With Spark, you get the flexibility of both batch and structured streaming, and with Delta Live Tables (DLT), Databricks simplifies building secure, testable data pipelines. Using DLT’s declarative framework, you define the transformations and Databricks handles the orchestration, automatically maintaining the dependencies between tables.

DLT also offers built-in data quality features, known as “expectations,” which can be applied in both SQL and Python. For an even richer development experience, you can integrate additional tools such as dbt into your pipeline.

Databricks Jobs further streamlines orchestration, letting you schedule and manage multiple data pipelines effortlessly. For data governance, Unity Catalog offers a unified approach. Meanw...
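As a taste of the declarative style described above, here is a minimal sketch of a DLT table with an expectation in Python. The source table name raw_orders and the quality rule are hypothetical placeholders; the @dlt.table and @dlt.expect_or_drop decorators are DLT’s standard API for defining tables and data quality expectations.

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Orders with a basic data quality check applied")
    @dlt.expect_or_drop("valid_amount", "amount > 0")  # rows failing the rule are dropped
    def clean_orders():
        # dlt.read() lets DLT track the dependency on raw_orders automatically
        return dlt.read("raw_orders").withColumn("ingested_at", F.current_timestamp())

Because the table is declared rather than scheduled by hand, DLT infers the dependency graph and runs clean_orders whenever raw_orders is updated.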

Latest Posts

Databricks Structured Streaming with Azure Event Hubs

Administer Your Databricks Account with Terraform

Building a Conversational Retrieval System with Langchain and OpenAI GPT-3.5

Unlock the Power of Your Microsoft Documents with Azure AI Document Intelligence and LangChain

Getting Started with Delta Sharing in Azure Databricks

Leveraging the Common Data Model (CDM) in Azure Data Lake

Efficient Data Movement with Azure Data Factory

Harness the Power of Terraform to Provision Azure Databricks with Unity Catalog

Identity Objects in Microsoft Entra ID