Mooncake acquisition boosts Databricks database capabilities

This story was updated on Oct. 2.

Flush with cash from a spate of massive funding rounds, Databricks on Wednesday continued its acquisition spree with the purchase of database startup Mooncake Labs.

Financial terms of the deal were not disclosed.

The acquisition of Mooncake closely follows Databricks’ May purchase of Neon, a cloud-based database platform built on PostgreSQL. Buying Neon marked an expansion for Databricks, adding open-source database capabilities that can serve as a foundation for agentic AI development. Now, Neon’s technology forms the infrastructure of Lakebase, a fully managed PostgreSQL database that combines operational database capabilities with lakehouse architecture.

Acquiring Mooncake adds new capabilities to Lakebase aimed at further enabling Databricks customers to easily develop agents and other AI applications trained on the proprietary data they store in the Databricks Data Intelligence Platform.

Multi-agent systems rely on near real-time data to be effective. Without real-time data pipelines, change data capture and other capabilities that keep agents current, agentic systems fail.

However, online transaction processing (OLTP) databases traditionally sit outside data platforms, requiring developers, engineers, autonomous agents and other users to extract, transform and load (ETL) data from their data platforms into OLTP databases to inform AI and analytics applications. Each new workload, meanwhile, adds more data that needs to be moved, integrated and governed.

Mooncake’s combination of a PostgreSQL database and lakehouse architecture aims to eliminate the expensive, time-consuming need to build and maintain ETL pipelines. Instead, it is designed to enable users to run AI, analytics and transactional workloads by mirroring changes — providing an exact replica in real time — between PostgreSQL databases and Apache Iceberg and Delta tables in their lakehouse so agents can immediately discover and reason on fresh, governed data.
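As a rough illustration of what mirroring changes means in general, the following minimal Python sketch replays a stream of row-level change events against a replica so the copy stays in step with the source. It is not Mooncake's or Databricks' implementation; the names `ChangeEvent` and `apply_changes` and the sample rows are hypothetical, and an in-memory dictionary stands in for a lakehouse table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change captured from a source table (hypothetical shape)."""
    op: str                      # "insert", "update" or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row contents; unused for deletes


def apply_changes(replica: Dict[int, dict], events: List[ChangeEvent]) -> Dict[int, dict]:
    """Replay change events in order so the replica matches the source."""
    for event in events:
        if event.op in ("insert", "update"):
            # Store a copy of the row so later edits to the event do not leak in.
            replica[event.key] = dict(event.row or {})
        elif event.op == "delete":
            # Ignore deletes for keys the replica never saw.
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return replica


# Illustrative change stream: two inserts, one update, one delete.
source_changes = [
    ChangeEvent("insert", 1, {"status": "new"}),
    ChangeEvent("insert", 2, {"status": "new"}),
    ChangeEvent("update", 1, {"status": "shipped"}),
    ChangeEvent("delete", 2),
]

mirror = apply_changes({}, source_changes)
print(mirror)
```

In this toy version, the replica only ever sees changes, not full reloads, which is the general idea behind replacing batch ETL with continuous synchronization.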

As a result, Mooncake’s acquisition is a significant move for Databricks, according to Sanjeev Mohan, founder and principal of analyst firm SanjMo.

“It is a great acquisition for Databricks, which is building out this new category of a lakebase,” he said. “This gives agents the advantage of being able to see the latest data with minimal latency in any format that the agent needs. This enhances their vision of delivering a lakebase.”

Based in San Francisco, Databricks was one of the pioneers of the data lakehouse. Following OpenAI’s November 2022 launch of ChatGPT, which represented a significant improvement in generative AI (GenAI) and sparked surging investments in building AI tools, Databricks expanded to include an AI development environment that now enables customers to build and deploy agents.

Competitors including rival Snowflake, tech giants such as AWS and Microsoft, and specialists including Alation and Domo have similarly broadened their scope to include AI development.

Additive capabilities

The acquisition of Mooncake, though valuable, is just the latest purchase for Databricks.

To date, the data platform vendor has raised nearly $22 billion in funding, including $10 billion in a single round in December 2024. Using that capital, Databricks has expanded through acquisitions to provide a comprehensive environment for building and deploying AI, including agents.

Databricks’ acquisition spree started in earnest with its June 2023 acquisition of MosaicML, which now provides the foundation for the Mosaic AI platform for AI development. Other acquisitions include Arcion, Einblick and Lilac AI to add complementary AI development capabilities, BladeBridge to add data migration capabilities that aid AI development, and Tabular to provide support for open-source Iceberg tables that can be used when building AI tools.

Like Mohan, Donald Farmer, founder and principal of TreeHive Strategy, noted that Mooncake adds valuable capabilities to the initial PostgreSQL technology Databricks acquired when it bought Neon.

In particular, Mooncake’s focus on AI-driven transactional workloads, which require real-time synchronization between PostgreSQL OLTP and Databricks’ lakehouse, augments Neon’s PostgreSQL scaling and cloud-native design, according to Farmer.

“This is probably quite complementary to Neon,” he said. “While Neon strengthens Databricks’ Postgres-as-a-Service and open-source cloud, Mooncake targets next-generation OLTP alongside AI and analytics. So, there are good moves here for agentic AI and real-time analytics.”

However, there is some risk involved in Databricks’ acquisition strategy, Farmer continued.

While Databricks now possesses a full PostgreSQL framework and has added a litany of other capabilities to aid AI development, piecing together all the technology that has been acquired could be complex and result in a fragmented infrastructure.

“Strategically, we have to wonder about this buying spree,” Farmer said. “Is it all positive, or is Databricks just buying up talent and emerging tech to pre-empt competition because they can? That’s an expensive move that may result in a fragmented, poorly integrated, confusing infrastructure. We’ll know in a couple of years if this has been a rush to inorganic growth, or a carefully executed strategic plan.”

Meanwhile, though Databricks didn’t have to spend over $1 billion to acquire Mooncake, a 2024 startup, as it did to buy each of MosaicML, Tabular and Neon, the acquisition could prove to be one of its most significant if Mooncake’s technology works as intended, according to Mohan.

Many of Databricks’ acquisitions added complementary capabilities that broadened a standard AI development environment. Mooncake provides capabilities that, if successful, enable Databricks to build an innovative product that combines PostgreSQL database capabilities with a lakehouse architecture to fuel agentic AI systems.

“Databricks is on to something,” Mohan said. “The concept of Lakebase is new and we don’t know how successful it will be. It’s not established, it’s not a done deal — lakehouses are, but Lakebase is not — so it is too early to know if Lakebase is going to have legs. If it does, Databricks gets a huge benefit with this acquisition.”

Meanwhile, because of its potential to establish a new format for feeding AI and analytics applications, Databricks’ development of Lakebase and acquisition of Mooncake could force Snowflake and other competitors to react, he continued.

In the spring, just as Databricks bought Neon to add PostgreSQL database capabilities, Snowflake acquired Crunchy Data to do the same — AWS, Google Cloud, Microsoft, Oracle and other tech giants already provide PostgreSQL databases. Now, with Lakebase potentially making agents more effective, Mohan noted that vendors aiming to equip customers with the most advanced systems for agentic AI will need to add capabilities similar to what Mooncake provides.

However, to do so, Databricks’ rivals may need to replicate Mooncake’s technology through internal development rather than acquisition, given that there aren’t other startups developing the same technology.

“I’m sure Snowflake is following this news,” Mohan said. “However, what Mooncake is doing is not a generally accepted, widely available concept like PostgreSQL. I don’t think others have caught on, so [Databricks rivals] will need to decide how to react to this because [lakebases] are not an established concept.”

Regarding Databricks’ own choice to buy Mooncake rather than attempt to replicate its technology through internal development, adding Mooncake founders Zhou Sun, Cheng Chen and Pranav Aurora to Databricks was key, according to Nikita Shamgunov, vice president of engineering at Databricks.

“Mooncake’s founders are an integral part of [advancing Lakebase],” Shamgunov, who was co-founder and CEO of Neon, said. “They have deep expertise in PostgreSQL internals, distributed systems and open table formats and experience building products that bridge the gap between transactional workloads and large-scale analytics.”

Meanwhile, Lakebase itself is Databricks’ attempt to develop a new type of database that provides the speed and performance that AI workloads require, he continued.

“With both the Neon and Mooncake acquisitions, we’re doubling down on our bet to reinvent the database for AI,” Shamgunov said. “Lakebase … brings operational data to the lakehouse to unify operational and analytical data and support agent workloads. The Mooncake acquisition complements this by making operational data readily available for analytics and AI workloads.”

Looking ahead

Over the course of one week, Databricks formed a partnership with OpenAI that makes OpenAI’s proprietary models natively available in Databricks’ Data Intelligence Platform, launched an AI-powered suite to combat cyberattacks and completed its latest acquisition.

Over the final months of 2025 and into 2026, continuing to simplify AI development will be a focus for Databricks, according to Hanlin Tang, the vendor’s chief technology officer for neural networks.

Mohan suggested that one way Databricks could make it easier for users to build AI applications is to improve its data integration capabilities, perhaps with yet another acquisition, given Databricks’ history and that Salesforce recently acquired Informatica.

In addition, helping the data management industry develop a standard for semantic modeling — defining data to make it easier to discover and operationalize for AI — would be a wise move for Databricks, Mohan continued.

Snowflake, Salesforce, dbt Labs, ThoughtSpot and a host of other vendors recently formed a consortium to develop an open standard for semantic modeling. However, Databricks is not a participant. Neither is AtScale, a semantic modeling specialist that competes with dbt Labs.

“First, social media was rife with data mesh versus data fabric, then that got upended by Iceberg versus Delta, and now we are going to have semantic wars breaking out,” Mohan said. “My suggestion would be not to go there and waste time, and to have a common semantic standard. It’s a big area of contention.”

Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than 25 years of experience. He covers analytics and data management.
