Announcing Delta Lake 3.1.0 on Apache Spark™ 3.5: Try out the latest release today!

Build Lakehouses with Delta Lake

Delta Lake is an open-source storage framework that enables building a format agnostic Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, Hive, Snowflake, Google BigQuery, Athena, Redshift, Databricks, Azure Fabric and APIs for Scala, Java, Rust, and Python.

Get Started

GitHub

Releases

Roadmap

Open

Community driven, rapidly expanding integration ecosystem

Simple

One format to unify your ETL, Data warehouse, ML in your lakehouse

Production Ready

Battle tested in over 10,000+ production environments

Platform Agnostic

Use with any query engine on any cloud, on-prem, or locally

The Latest

High-Performance Querying on Massive Delta Lake Tables with Daft

Delta Lake - State of the Project - Part 2

Delta Lake Announces Pandas Enhancement: Real Pandas to Optimize Data Lakehouse Performance

Delta Lake - State of the Project - Part 1

Delta Lake 3.1.0

Key Features

ACID Transactions

Protect your data with serializability, the strongest level of isolation

Scalable Metadata

Handle petabyte-scale tables with billions of partitions and files with ease

Time Travel

Access/revert to earlier versions of data for audits, rollbacks, or reproduce

Open Source

Community driven, open standards, open protocol, open discussions

Unified Batch/Streaming

Exactly once semantics ingestion to backfill to interactive queries

Schema Evolution / Enforcement

Prevent bad data from causing data corruption

Audit History

Delta Lake log all change details providing a fill audit trail

DML Operations

SQL, Scala/Java and Python APIs to merge, update and delete datasets

Delta Lake:
The Definitive Guide

Early Release (Raw & Unedited) sponsored by Databricks

Download

Read the Lakehouse Storage Systems Whitepapers

These whitepapers dive into the features of Lakehouse storage systems and compare Delta Lake, Apache Hudi, and Apache Iceberg. They also explain the benefits of Lakehouse storage systems and show key performance benchmarks.

Read the whitepaper

Analyzing and Comparing Lakehouse Storage Systems

Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics

Delta Lake: high-performance ACID table storage over cloud object stores

Organizations that have contributed to Delta Lake

Together we have made Delta Lake the most widely used lakehouse format in the world!

Join the Delta Lake Community

Delta Lake is supported by more than 190 developers from over 70 organizations across multiple repositories.
Chat with fellow Delta Lake users and contributors, ask questions and share tips.

Check out Last Week in a Byte newsletter: for the latest Delta events...a week late!

Project Governance

Delta Lake is an independent open-source project and not controlled by any single company. To emphasize this we joined the Delta Lake Project in 2019, which is a sub-project of the Linux Foundation Projects. Within the project, we make decisions based on these rules.