Community-driven, rapidly expanding integration ecosystem
One format to unify your ETL, data warehouse, and ML in your lakehouse
Battle-tested in more than 10,000 production environments
Use with any query engine on any cloud, on-prem, or locally
Protect your data with serializability, the strongest level of isolation
Handle petabyte-scale tables with billions of partitions and files with ease
Access or revert to earlier versions of data for audits, rollbacks, or reproducibility
Community-driven, open standards, open protocol, open discussions
Exactly-once semantics, from ingestion to backfill to interactive queries
Schema Evolution / Enforcement
Prevent bad data from causing data corruption
Delta Lake logs all change details, providing a full audit trail
SQL, Scala/Java and Python APIs to merge, update and delete datasets
Read the Lakehouse Whitepaper
Together, the features of Delta Lake improve both the manageability and performance of working with data in cloud storage objects, and enable a “lakehouse” paradigm that combines the key features of data warehouses and data lakes: standard DBMS management functions usable against low-cost object stores.
Pipeline using separate storage systems
Using Delta Lake for both stream and table storage
Organizations that have contributed to Delta Lake
Together we have made Delta Lake the most widely used lakehouse format in the world!
Delta Lake is an independent open-source project and is not controlled by any single company. To emphasize this, we joined the Delta Lake Project in 2019, which is a sub-project of the Linux Foundation Projects. Within the project, we make decisions based on these rules.