New features make Google’s Spanner a database for AI
Google Cloud on Thursday unveiled a series of new capabilities for its Spanner database designed to enable development and deployment of AI applications.
The new features, which the tech giant introduced during a user conference in Tokyo and are now in preview, include Spanner Graph to add graph processing, vector search and full-text search.
Spanner, which Google Cloud first made generally available as Cloud Spanner in 2017, is a multimodal database that has historically supported structured data. To develop AI models and applications — including generative AI — unstructured data is also critical.
Vector search, full-text search and graph processing all help application developers discover and operationalize unstructured data. As a result, the new capabilities — once generally available — will make Spanner an AI database in addition to its other multimodal capabilities in what Kevin Petrie, an analyst at BARC U.S., said is a significant move.
Enterprises are no longer simply combining their data with large language models to enable generative AI exploration and analysis. Instead, they are developing their own generative AI applications that work in concert with one another. That requires features such as those Google Cloud is adding to Spanner.
“Google’s announcement signals a critical trend in the market,” Petrie said. “AI is a multi-faceted, multi-model endeavor. Companies are not implementing GenAI language models or other types of models in isolation. They are building applications in which multiple models complement one another. In this context, you need an AI database.”
In addition to the new capabilities in Spanner, Google Cloud unveiled new features for its Bigtable database to aid developers as well as new pricing options for its Cloud SQL for SQL Server database.
Developing an AI database
Generative AI has the potential to transform business. When combined with an organization’s proprietary data, large language models such as Google’s Gemini and OpenAI’s GPT models enable users to model, query and analyze data using true natural language.
In turn, by enabling the use of natural language to work with data, generative AI lets non-technical workers use complex analytics and data management platforms that previously required coding skills or data literacy training they did not have. In addition, true natural language enables data experts such as developers and engineers to be more efficient by reducing coding requirements and other manual tasks that occupy much of their time.
As a result, many vendors have made it a priority to develop generative AI tools such as text-to-code translators and AI-powered assistants.
Some enterprises, however, want more. They want to develop their own generative AI applications, tools that understand their business and can be used in conjunction with one another to drive decision-making.
To do so, they need not only access to their data but also a way to easily and efficiently find the right data to train an individual model. That’s where databases can support AI development.
Technologies such as graph processing, vector search and text search enable data discovery for AI models and applications, which includes both structured data such as financial records and unstructured data such as text, images and audio files.
In response, vector search has become a core component of many databases over the past year, with AWS, Databricks and Oracle among the many vendors that have added vector search capabilities to help deliver the relevant data needed to train generative AI models and applications.
Now, Google Cloud is not just working to add vector search to its Spanner database but also doing so in concert with other technologies that make data retrieval more efficient.
“Operational data is critical to bridging the gap between foundation models and truly delivering the promise of AI in the enterprise,” said Andi Gutmans, Google Cloud’s general manager and vice president for databases, during a media briefing on July 26. “A big focus of ours is to enhance our databases to make sure they can deliver the best, most contextually relevant data to enterprise applications.”
Graph technology differs from traditional relational database technology by enabling data points to simultaneously connect to an unlimited number of related data points rather than just one other data point at a time. Consequently, it speeds up the discovery of data that can be used together to inform an application.
Spanner Graph is a graph processing feature designed to enable developers to use Graph Query Language (GQL), the ISO standard for graph databases, along with SQL to discover and query connected data.
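To make the idea of connected data concrete, here is a minimal Python sketch of a graph traversal. It is not Spanner Graph or GQL syntax. The node names, the `EDGES` table and the `connected` helper are all hypothetical, chosen only to show how following edges surfaces related records in one pass.

```python
from collections import deque

# Hypothetical toy graph: each node maps to the nodes it links to directly.
EDGES = {
    "customer:ana": ["order:1001", "order:1002"],
    "order:1001": ["product:router"],
    "order:1002": ["product:cable"],
    "product:router": ["guide:router-setup"],
    "product:cable": [],
    "guide:router-setup": [],
}


def connected(start, max_hops):
    """Return nodes reachable from start within max_hops edges, breadth-first."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in EDGES.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, hops + 1))
    return found


# Everything linked to one customer within three hops: orders, products, guides.
related = connected("customer:ana", max_hops=3)
print(related)
```

In a relational schema, the same question would typically take one join per hop. The traversal above follows however many links exist from each node, which is the property the paragraph describes.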
Vector search likewise enables similarity searches that surface many related data points or datasets at once for use in training models and applications, while full-text search enables users to search large numbers of documents simultaneously to find relevant data.
“Combining full-text search and vector similarity search capabilities makes perfect sense,” Petrie said.
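As a rough illustration of what combining the two search types can look like, the Python sketch below ranks documents by a weighted mix of a keyword-overlap score and a cosine-similarity score. It is not Spanner's API. The documents, the three-dimensional embeddings, the `alpha` weight and every function name are assumptions made up for the example. A real system would take embeddings from a model and use the database's own indexes.

```python
import math
import re

# Hypothetical documents with made-up 3-dimensional embeddings.
DOCS = [
    {"id": "guide-router",
     "text": "Router setup guide: connect the router and reset the password",
     "vec": [0.9, 0.1, 0.0]},
    {"id": "guide-cable",
     "text": "Cable guide: choosing the right ethernet cable",
     "vec": [0.2, 0.9, 0.1]},
    {"id": "faq-billing",
     "text": "Billing FAQ: invoices, refunds and payment methods",
     "vec": [0.0, 0.1, 0.9]},
]


def tokens(text):
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_score(query, text):
    """Fraction of query terms that appear in the text (a crude full-text match)."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query, query_vec, alpha=0.5):
    """Rank document ids by alpha * keyword score + (1 - alpha) * vector score."""
    scored = [
        (alpha * keyword_score(query, d["text"])
         + (1 - alpha) * cosine(query_vec, d["vec"]), d["id"])
        for d in DOCS
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# The query vector is also invented; it stands in for an embedded user question.
ranking = hybrid_search("reset router password", [0.8, 0.2, 0.1])
print(ranking)
```

The keyword score rewards exact term matches, while the vector score can still rank a document that shares meaning but not wording. Blending them with a single weight is only one simple way to combine the signals.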
One potential real-world example of an enterprise using graph processing, vector search and full-text search (also known as semantic search) within the same database to develop an AI application is customer service, he continued.
The application could be trained to find the right product guide for a customer by using keyword matching, summarize the guide using natural language and then converse with the customer with generative AI. In addition, with machine learning, the application could potentially recommend additional products based on that conversation and the customer’s purchase history.
Vespa is another database that enables multiple search types to facilitate AI development, Petrie noted.
Like Petrie, Doug Henschen, an analyst at Constellation Research, said that the additions of Spanner Graph, vector search and full-text search are important because of what they add to the database’s pre-existing multimodal capabilities.
Beyond the capabilities themselves, also important are new pricing options for Spanner that add transparency and better enable customers to control their cloud spending.
Spanner Editions provides tier-based pricing at Standard, Enterprise and Enterprise Plus levels. The new search capabilities are available to Enterprise and Enterprise Plus users, though the vendor did not publicize what it costs to use each edition.
“Spanner Graph is clearly the headline, as it fills a gap that Google had in its portfolio,” Henschen said. “But the Spanner Graph feature name falls short of telling the full story, which is that Spanner is becoming a multi-capable, high-scale database offering SQL, graph, full-text search and vector search capabilities through the new Spanner Enterprise and Enterprise Plus editions.”
In addition, by combining different search types in one service within Spanner, Google Cloud is differentiating itself from other database providers such as AWS and Oracle that separate each service within their databases, Henschen continued.
“I see it as an attractive and compelling combination of capabilities,” he said. “But there’s still room for best-of-breed Google partners, such as Neo4j, which offers a dedicated graph database with vector embedding and search capabilities.”
Beyond the new search features in Spanner aimed at enabling AI development, Google Cloud also unveiled the following new database features:
- Bigtable SQL support to enable customers to use any of more than 100 SQL functions to develop applications.
- Bigtable distributed counters to simplify embedded application development.
- An Enterprise Plus edition for Cloud SQL for SQL Server that aims to provide more cost certainty for SQL Server users just as Spanner Editions does for Spanner users.
- Hosting for Oracle database services, including Exadata and Autonomous Database services, in concert with the recent formation of a strategic partnership between Google Cloud and Oracle.
Petrie noted that the partnership between Google Cloud and Oracle is both interesting and odd given that the two are rivals. However, with Oracle likely generating most of its profit from database services rather than its cloud strategy, he said it makes sense for Oracle to enable customers to deploy databases on Google Cloud’s infrastructure.
Next steps
By adding multiple search types that turn Spanner into a database for facilitating AI development, Google Cloud is making an innovative move, according to Petrie.
“Google has a significant advantage given its massive resources and long expertise with text search and analytics,” he said.
There is, however, still room for specialized database vendors whose platforms enable customers to carry out many of the same tasks while not tying them to a single data ecosystem such as the Google Cloud Platform, he continued.
Henschen, meanwhile, said that despite what Google Cloud has done to make Spanner a database for AI development, including adding new pricing options, it has customers who use the database for reasons other than AI.
Some use the database, which provides global scalability, for its multi-region and geo-partitioning capabilities but are suddenly being classified as Enterprise Plus customers due to the amount of computational storage they use across regions, according to Henschen. As a result, those users are not served by the database’s new pricing tiers.
“I’d like to see another edition whereby customers who are only interested in Spanner for its multi-region and geo-partitioning capabilities don’t have to choose the highest-priced edition if they’re not interested in using the new graph, text-search and vector-search capabilities,” Henschen said.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.