March 2023 – Insights

Delta Lake is an open-source storage layer that brings ACID transactions to data lakes. A Spark DataFrame can be saved as a Delta table simply by specifying the format as “delta” on write. Similarly, a saved Delta table can be read back by specifying the format as “delta” on read.
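As a minimal sketch of the write and read described here, the two helpers below wrap the standard Spark DataFrame writer and reader calls. The function names, the `mode` default, and the example path are hypothetical and chosen only for illustration; the sketch also assumes a SparkSession that already has the Delta Lake package configured.

```python
def save_as_delta(df, path, mode="overwrite"):
    """Write a Spark DataFrame to `path` in Delta format.

    `df` is assumed to be a pyspark.sql.DataFrame; `mode` is any
    standard Spark save mode ("overwrite", "append", ...).
    """
    df.write.format("delta").mode(mode).save(path)


def read_delta(spark, path):
    """Load the Delta table stored at `path` as a DataFrame.

    `spark` is assumed to be a Delta-enabled SparkSession.
    """
    return spark.read.format("delta").load(path)
```

With such a session in hand, `save_as_delta(spark.range(5), "/tmp/delta/numbers")` followed by `read_delta(spark, "/tmp/delta/numbers")` would round-trip a small table through the hypothetical path.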

The longer we use Delta, the more likely we are to hit a case where the incoming data has a schema that differs slightly from the schema of the target Delta table. Schemas, like most things, evolve over time, so this is a very common scenario.
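One way to handle this scenario is Delta Lake's `mergeSchema` write option, which lets an append add columns that the target table does not have yet. The text above does not say which mechanism is used, so treat the following as an illustrative sketch under that assumption; the helper name is hypothetical.

```python
def append_with_schema_merge(df, path):
    """Append `df` to the Delta table at `path`, allowing new columns.

    Assumes `df` is a pyspark.sql.DataFrame and that Delta Lake is
    configured. With mergeSchema enabled, columns present in `df` but
    missing from the target table are added to the table schema
    instead of causing the write to fail.
    """
    (
        df.write.format("delta")
        .mode("append")
        .option("mergeSchema", "true")
        .save(path)
    )
```

Without the option, an append whose schema does not match the target table is rejected by Delta's schema enforcement, which is why the setting is opt-in per write.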


Schema Evolution With Delta