Delta Lake’s transaction log brings reliability, performance, and ACID-compliant transactions to data lakes. But exactly how does it accomplish this? Working through concrete examples, we will take a close look at how Delta manages and uses the transaction log to supercharge data lakes. In this tech talk you will learn: - […]
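As a rough illustration of the idea behind the transaction log, the sketch below replays a directory of ordered JSON commit files to work out which data files make up the table at a given version. It is a simplified toy model in plain Python, not Delta Lake's implementation. The helper names `commit` and `live_files` are hypothetical, the actions carry only a `path` field (real `add` and `remove` actions hold more metadata), and checkpoints are ignored.

```python
import json
import tempfile
from pathlib import Path


def commit(log_dir: Path, version: int, actions: list) -> None:
    """Write one commit as a zero-padded, newline-delimited JSON file (hypothetical helper)."""
    path = log_dir / f"{version:020d}.json"
    # Mode "x" fails if the file already exists. This is a local stand-in for
    # the rule that only one writer may create a given version.
    with open(path, "x") as f:
        for action in actions:
            f.write(json.dumps(action) + "\n")


def live_files(log_dir: Path, as_of=None) -> set:
    """Replay commits in version order and return the data files in the table."""
    files = set()
    # Zero-padded names sort lexicographically in version order.
    for path in sorted(log_dir.glob("*.json")):
        if as_of is not None and int(path.stem) > as_of:
            break  # stop early to reconstruct an older version
        for line in path.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


log_dir = Path(tempfile.mkdtemp()) / "_delta_log"
log_dir.mkdir()

# Version 0 adds two files; version 1 rewrites one of them.
commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}},
                    {"add": {"path": "part-1.parquet"}}])
commit(log_dir, 1, [{"remove": {"path": "part-0.parquet"}},
                    {"add": {"path": "part-2.parquet"}}])

current = live_files(log_dir)            # state after all commits
version_0 = live_files(log_dir, as_of=0) # state as of the first commit
```

Because each commit is a single file that either exists or does not, readers never see a half-applied change, and stopping the replay early yields an older snapshot of the table.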
Take a walk through the daily struggles of a data engineer in this presentation as we cover what is truly needed to create robust end-to-end Big Data solutions.
This tutorial goes through many features of Delta Lake, including schema enforcement and schema evolution, interoperability between batch and streaming workloads, time travel, and DML commands like Delete and Merge. It was originally given at Spark Summit 2019 Europe and is available in both Scala and Python.
We will demonstrate, on Apache Spark™ 2.4.3, how to use Python and the new Python APIs in Delta Lake 0.4.0 within the context of an on-time flight performance scenario. We will show how to upsert and delete data, query old versions of data with time travel, and vacuum older versions for cleanup. This tutorial includes […]
This guide helps you quickly explore the main features of Delta Lake. It provides code snippets that show how to read from and write to Delta tables from interactive, batch, and streaming queries.
This is the notebook primer for the Delta Lake workshop featuring Delta Lake and MLflow. It is also used extensively for the Delta Lake Hands-on Labs.
Within the project, we make decisions based on these rules.