Delta Lake Blogs
Running ML Workflows with Delta Lake and Ray
March 23, 2023 by Jim Hibbard
This post explains how you can read Delta Lake tables with the Ray compute framework.
How to Convert from CSV to Delta Lake
March 22, 2023 by Matthew Powers
This post explains how to convert from a CSV data lake to Delta Lake, which offers much better features.
Getting started contributing to Delta Lake Spark
March 7, 2023 by Nick Karpov
This post explains the full development loop with the Delta Lake Spark connector. You'll learn how to retrieve and navigate the codebase, make changes, and package and debug custom builds.
New features in the Python deltalake 0.7.0 release of delta-rs
February 27, 2023 by Will Jones, Matthew Powers
This post explains the new features in the deltalake 0.7.0 release.
Delta Lake Merge
February 14, 2023 by Nick Karpov
This post shows how to use MERGE with Delta tables.
Delta Lake Schema Evolution
February 8, 2023 by Matthew Powers
This post shows how to enable schema evolution in Delta tables and when this is a good option.
Delta Lake Time Travel
February 1, 2023 by Matthew Powers
This post shows how to time travel between different versions of a Delta table.
Delta Lake Small File Compaction with OPTIMIZE
January 25, 2023 by Matthew Powers
This post shows how to compact small files in Delta tables with OPTIMIZE.
Adding and Deleting Partitions in Delta Lake tables
January 18, 2023 by Matthew Powers, Ryan Zhu
This post shows how to add partitions to and remove partitions from Delta Lake tables.
Delta Lake Vacuum Command
January 3, 2023 by Matthew Powers, Nick Karpov
This blog post explains how to vacuum files marked for deletion from storage with the Delta Lake Vacuum command.
Reading Delta Lake Tables into Polars DataFrames
December 22, 2022 by Matthew Powers, Chitral Verma
This post shows how to read Delta Lake tables into Polars DataFrames.
Building a more efficient data infrastructure for machine learning with Open Source using Delta Lake, Amazon SageMaker, and EMR
December 13, 2022 by Vedant Jain, Denny Lee
In this blog, we’ll explore how connecting Delta Lake, Amazon SageMaker Studio, and Amazon EMR can simplify the end-to-end workflow required to support data engineering and data science projects.
Data Sharing across Government Agencies using Delta Sharing
December 8, 2022 by Li Yu, Mubashir Kazia, Jon D. Ceanfaglione, Prabha Rajendran, Purushotam Shrestha, Shawn A. Benjamin
This post shows how government agencies are sharing data with Delta Sharing.