Vast Data on Thursday launched SyncEngine, a new feature that combines cataloging, migrating and preparing data to make it faster and easier to feed AI pipelines with relevant data.
The tool is now part of the Vast Data AI Operating System (OS) at no additional cost to customers.
While model complexity and lack of GPUs are reasons many AI projects never make it into production, the readiness of data is another. Data is often distributed across numerous systems, isolated in SaaS applications, and not cleansed or validated well enough to be usable.
With SyncEngine, Vast Data aims to ensure that an enterprise’s data is prepared for AI. Because the new feature combines data integration and preparation for AI pipelines, it is valuable for Vast Data customers, according to Michael Ni, an analyst at Constellation Research.
“SyncEngine tackles one of the most overlooked barriers in enterprise AI, which is fragmented, inaccessible data,” he said. “By collapsing cataloging, migration and pipeline prep into the Vast AI OS, it gives enterprises one vendor — versus multiple — to deliver scattered files and SaaS apps into AI-ready intelligence.”
In addition, because SyncEngine combines capabilities that normally require more than one tool, it is differentiated from traditional data catalog offerings, Ni continued.
“SyncEngine looks like a data catalog, but it’s more than that,” he said. “Most catalogs stop at metadata. Vast folds in high-speed migration and vectorization, so you don’t just discover data, you make it AI-ready in one step. That tight integration with the Vast AI OS sets it apart from standalone catalogs.”
New capabilities
Despite enterprises continuing to increase their investments in AI development, AI projects still fail more than 80% of the time, according to some estimates. While Vast Data’s new feature doesn’t eradicate hindrances such as bias and lack of infrastructure resources, it does address problems like poor data quality and lack of relevant data.
Motivation for developing SyncEngine came from observing the trouble customers were having discovering and feeding distributed data into AI pipelines, according to Aaron Chaisson, Vast Data’s vice president of product and solutions marketing.
“The impetus was a direct response to the ‘last mile’ problem hindering customer AI strategies,” he said. “While the Vast AI OS provides automated AI data pipeline services, customers were challenged with finding and then moving all of their distributed data sources into the Vast pipeline.”
Meanwhile, Stephen Catanzano, an analyst at Enterprise Strategy Group, now part of Omdia, joined Ni in calling SyncEngine a compelling addition to Vast Data’s AI OS because it directly addresses one of the main barriers to successful AI development.
“Vast’s SyncEngine is a significant addition for customers that want to eliminate the ‘last mile’ problem of data fragmentation that has become one of the biggest constraints in AI deployment,” he said. “Allowing organizations to unify scattered data … without additional costs or complex third-party tools is critical.”
SyncEngine includes the following features:
Data migration capabilities built for the massive files and data sets that AI models and applications require in order to reduce the occurrence of AI hallucinations.
Metadata indexing, enabling enterprises to catalog hundreds of trillions of files.
Throughput levels that are limited only by the source and target systems.
Parallel processing for input/output workloads to reduce bottlenecks and improve performance.
Combined, the features enable users to build a catalog that connects data from disparate systems such as object storage and enterprise applications, migrate and synchronize data at scale, and ultimately speed the process of feeding AI pipelines.
That combination, rather than any one or two individual capabilities, is what is most significant about SyncEngine, according to Ni.
“Enterprises continue to struggle to find, contextualize and feed their petabytes of siloed data into [AI] workflows,” he said. “Vast folds those steps into the OS, eliminating the hand-coded, often brittle, tool chains and making [data] searchable and AI-ready.”
By folding the different steps into its AI OS — leading to high performance but also possibly fears regarding vendor lock-in — Vast Data is taking a different approach to feeding AI pipelines than many other data management vendors, Ni continued.
For example, Snowflake and Databricks lay governance and intelligence on top of data while separating compute from storage. Collibra and Informatica, meanwhile, excel at metadata management but don’t specialize in data migration and preparation capabilities.
“Vast is deliberately separating itself from the pack by collapsing categories … into one AI operating system,” Ni said. “This consolidation delivers high performance and low-latency pipelines for real-time and agentic AI, but it also challenges buyers to embrace a single vendor spanning so much ground.”
Catanzano likewise suggested that Vast Data is taking a unique approach to data management and AI pipeline development.
“Vast differentiates by offering a unified platform that combines storage, database and compute capabilities specifically optimized for AI workloads, rather than retrofitting traditional storage systems for modern AI requirements,” he said.
Meanwhile, regarding SyncEngine, Catanzano highlighted the significance of metadata indexing.
“Metadata indexing stands out as particularly valuable because it enables cataloging and searching hundreds of trillions of files within the Vast DataBase,” he said.
Looking ahead
Over the remainder of 2025, Vast Data plans to continue building an OS for AI that includes InsightEngine and AgentEngine in addition to SyncEngine, according to Chaisson.
Specific initiatives include adding a Model Context Protocol tool set for AgentEngine, he said.
Other initiatives Vast Data could take to improve its AI OS include developing industry-specific tools with prebuilt models, as well as adding more partnerships to create a broader ecosystem, according to Catanzano.
“Developing deeper integrations with popular AI development frameworks and cloud services [could] create a more seamless experience across hybrid environments,” he said.
Ni, meanwhile, suggested that to remain competitive with other data platform vendors, Vast Data needs to expand beyond focusing on fast AI pipelines to provide semantic modeling and governance capabilities.
“Enterprises don’t just want data in motion,” he said. “They want data whose meaning is understood, trusted and governed to be able to scale to drive real business decisions.”
Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than 25 years of experience. He covers analytics and data management.