Databricks Fully Open Sources Delta Lake
Databricks has announced that it will contribute the entirety of its Delta Lake storage framework to the Linux Foundation and open source all Delta Lake APIs as part of the Delta Lake 2.0 release.
The Delta Lake framework makes it possible to build a “Lakehouse architecture” on top of data lakes. It provides ACID transactions for concurrency control, scalable metadata handling, and unified streaming and batch data processing. The new 2.0 release of Delta Lake features improved query performance as well as general improvements for writing large-scale performance benchmarks.
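To make the concurrency-control claim concrete: Delta Lake records table changes in an ordered transaction log and resolves conflicting writers optimistically. The sketch below is a toy illustration of that general idea only, not Delta Lake's implementation or API. The class name `ToyCommitLog`, the file naming, and the action format are all hypothetical, and it assumes a local filesystem where exclusive file creation is atomic.

```python
import json
import os
import tempfile


class ToyCommitLog:
    """Toy commit log: one numbered JSON file per committed version.

    Hypothetical illustration of optimistic concurrency control;
    not Delta Lake's actual protocol or API.
    """

    def __init__(self, directory):
        self.directory = directory

    def _path(self, version):
        # Zero-padded names keep the commit files in version order.
        return os.path.join(self.directory, f"{version:020d}.json")

    def latest_version(self):
        """Return the highest committed version, or -1 if the log is empty."""
        versions = [
            int(name.split(".")[0])
            for name in os.listdir(self.directory)
            if name.endswith(".json")
        ]
        return max(versions, default=-1)

    def try_commit(self, read_version, actions):
        """Try to commit `actions` as version read_version + 1.

        Returns True on success, False if another writer already
        claimed that version number.
        """
        target = self._path(read_version + 1)
        try:
            # O_EXCL makes creation fail if the file already exists,
            # so only one writer can claim a given version number.
            fd = os.open(target, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        with os.fdopen(fd, "w") as f:
            json.dump(actions, f)
        return True


def demo():
    with tempfile.TemporaryDirectory() as d:
        log = ToyCommitLog(d)
        snapshot = log.latest_version()  # both writers start from the same snapshot
        first = log.try_commit(snapshot, [{"add": "part-0.parquet"}])
        second = log.try_commit(snapshot, [{"add": "part-1.parquet"}])
        # The losing writer re-reads the log and retries on the new version.
        retry = log.try_commit(log.latest_version(), [{"add": "part-1.parquet"}])
        return first, second, retry, log.latest_version()
```

Calling `demo()` has two writers race for the same version; one wins, the other detects the conflict and retries against the updated log. A real system would also check whether the winning commit actually conflicts with the loser's changes before retrying, which this sketch omits.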
Databricks also released MLflow 2.0, which includes a new Pipelines feature to accelerate and simplify ML model deployments. The company additionally introduced Spark Connect, a client-server interface that allows applications on virtually any device to use Apache Spark, and Project Lightspeed, a next-generation Spark streaming engine.