Enterprise AI is only as good as the data that is available to a model.
In the past, enterprises largely relied on structured data. With the rapid adoption of generative AI, enterprises are increasingly aiming to consume vastly larger amounts of unstructured data. Unstructured data, by definition, doesn’t have structure and can be in any number of formats. For enterprises, that can be a challenge, as the quality of unstructured data is often unknown. Data quality can refer to accuracy, knowledge gaps, duplication and other issues that impact the utility of data.
Data quality tools, long used for structured data, are now expanding to unstructured data for enterprise AI. One such vendor is Anomalo, which has been developing its data quality platform for structured data for several years. Today the company announced an expansion of its platform to better support unstructured data quality monitoring.
Anomalo’s co-founder and CEO Elliot Shmukler believes that his company’s technology can have a strong impact in organizations.
“We believe that by eliminating data quality issues, we can accelerate at least 30% of gen AI deployments,” Shmukler told VentureBeat in an exclusive interview.
He noted that enterprises abandon some AI projects after the proof-of-concept stage. The root issue lies in poor data quality, large data gaps and the fact that enterprise data is not ready for gen AI consumption.
“We believe using Anomalo’s unstructured monitoring could accelerate typical gen AI projects in the Enterprise by as much as a year,” Shmukler said. “This is due to the ability to very quickly understand, profile and ultimately curate the data that these projects rely on.”
Alongside the product update, Anomalo announced a $10 million extension of its Series B funding first announced on Jan. 23, bringing the round up to $82 million.
Why data quality matters for enterprise AI
Unlike traditional structured data, unstructured content presents unique data quality challenges for AI applications.
“Because it’s unstructured data, anything could be in there,” Shmukler emphasized. “It could be personally identifiable information, people’s emails, names, social security numbers… there could be proprietary sec …