Delta Lake Blogs

Building a more efficient data infrastructure for machine learning with Open Source using Delta Lake, Amazon SageMaker, and EMR

By Vedant Jain, Denny Lee

In this blog, we’ll explore how connecting Delta Lake, Amazon SageMaker Studio, and Amazon EMR can simplify the end-to-end workflow required to support data engineering and data science projects.

How to Delete Rows from a Delta Lake Table

By Matthew Powers

This post teaches you how to delete rows from a Delta Lake table and how the operation is implemented under the hood.

Delta Lake Constraints and Checks

By Matthew Powers

This post shows how to add constraints to your Delta table to prevent certain types of values from being appended.

Delta Lake Schema Enforcement

By Matthew Powers

This post teaches you about schema enforcement in Delta Lake and why it's better than what's offered by data lakes.

Why PySpark append and overwrite write operations are safer in Delta Lake than Parquet tables

By Matthew Powers

This post shows you why PySpark overwrite operations are safer with Delta Lake and how the different save mode operations are implemented under the hood.

How to Create Delta Lake Tables

By Matthew Powers

This post shows you how to create Delta Lake tables with Python, SQL, and PySpark.

How to Version Your Data with pandas and Delta Lake

By Matthew Powers

This post shows you how to version your pandas datasets and the benefits you'll enjoy with versioned data.

Sharing a Delta Table’s Change Data Feed with Delta Sharing 0.5.0

By Will Girten

We are excited to announce the release of Delta Sharing 0.5.0.

How to Rollback a Delta Lake Table to a Previous Version with Restore

By Matthew Powers

This post shows you how to roll back Delta Lake tables to previous versions with restore.

Converting from Parquet to Delta Lake

By Matthew Powers

This post shows how to convert a Parquet table to a Delta Lake table.

Why we migrated to a Data Lakehouse on Delta Lake for T-Mobile Data Science and Analytics Team

By Robert Thompson, Geoff Freeman

In this post, we discuss how and why we migrated from databases and data lakes to a data lakehouse on Delta Lake. Our lakehouse architecture allows reading and writing of data without blocking and scales out linearly. Business partners can easily adopt advanced analytics and derive new insights. These new insights promote innovation across disparate workstreams and solidify the decentralized approach to analytics taken by T-Mobile.