Reliable Data Lakes at Scale

Get Started
Delta Lake is an open-source storage layer that brings ACID
transactions to Apache Spark™ and big data workloads.


Key Features

ACID Transactions: Serializable transactions ensure that readers never see inconsistent data, even with concurrent writers.
Scalable Metadata Handling: Leverages Spark's distributed processing power to handle the metadata of petabyte-scale tables with billions of files.
Time Travel (data versioning): Data versioning enables rollbacks, full historical audit trails, and reproducible machine learning experiments.
Open Format: All data is stored in Apache Parquet, an open columnar format.
Unified Batch and Streaming Source and Sink: A Delta Lake table is both a batch table and a streaming source and sink.
Schema Enforcement: Automatically rejects writes that do not match the table's schema, preventing bad records from corrupting the table.
Schema Evolution: Supports explicit changes to a table's schema as the data changes.
Audit History: The transaction log records details about every change made to the data.
Updates and Deletes: Supports update, delete, and merge (upsert) operations, enabling use cases such as change data capture and streaming upserts.
100% Compatible with Apache Spark API: Works with existing Spark code with minimal changes.
Instead of parquet…

dataframe
   .write
   .format("parquet")
   .save("/data")

… simply say delta:

dataframe
   .write
   .format("delta")
   .save("/data")
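Under the hood, Delta Lake provides its ACID guarantees through an ordered transaction log of JSON entries (the _delta_log directory) stored next to the Parquet data files: a table version exists only once its log entry is fully published, and "time travel" means replaying the log up to an earlier version. The sketch below is a deliberately simplified toy illustration of that idea in plain Python, not the real Delta protocol; all names in it (ToyDeltaLog, files_as_of) are hypothetical.

```python
import json
import os
import tempfile


class ToyDeltaLog:
    """Toy append-only transaction log, loosely inspired by Delta Lake's
    _delta_log. A simplified sketch -- NOT the real Delta protocol."""

    def __init__(self, table_path):
        self.log_dir = os.path.join(table_path, "_log")
        os.makedirs(self.log_dir, exist_ok=True)

    def commit(self, added_files):
        """Record a new table version atomically: write the entry to a
        temp file, then rename it into place, so readers see either the
        whole commit or nothing."""
        version = len(os.listdir(self.log_dir))
        entry = os.path.join(self.log_dir, f"{version:020d}.json")
        tmp = entry + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"add": added_files}, f)
        os.rename(tmp, entry)  # rename is atomic on POSIX filesystems
        return version

    def files_as_of(self, version):
        """'Time travel': reconstruct the list of data files at a past
        version by replaying log entries 0..version in order."""
        files = []
        for v in range(version + 1):
            with open(os.path.join(self.log_dir, f"{v:020d}.json")) as f:
                files.extend(json.load(f)["add"])
        return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as table:
        log = ToyDeltaLog(table)
        log.commit(["part-0.parquet"])  # version 0
        log.commit(["part-1.parquet"])  # version 1
        print(log.files_as_of(0))  # the table as it looked at version 0
        print(log.files_as_of(1))  # the current table
```

The real Delta protocol adds checkpoints, remove actions, schema metadata, and optimistic concurrency control on top of this basic log-replay idea, but the versioned, atomically published log is what makes features like time travel and audit history possible.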
Together, the features of Delta Lake improve both the manageability and the performance of working with data in cloud object storage. They enable a "lakehouse" paradigm that combines the key features of data warehouses and data lakes: standard DBMS management functions usable against low-cost object stores.

Organizations using and contributing to Delta Lake

Thousands of companies are processing exabytes of data per month with Delta Lake.

To add your organization here, email us at info@delta.io.

Join the Delta Lake Community

Communicate with fellow Delta Lake users and contributors, ask questions and share tips
Slack Channel | Google Group


Project Governance

Delta Lake is an independent open-source project and is not controlled by any single company. To emphasize this, Delta Lake joined LF Projects, part of the Linux Foundation, in 2019.


Within the project, we make decisions based on these rules.

Copyright © 2020 Delta Lake, a Series of LF Projects, LLC. For web site terms of use, trademark policy and other project policies please see https://lfprojects.org.