
DataHub Cloud v1 aims to boost analytics agent accuracy

Sat, 30th May 2026

DataHub has launched DataHub Cloud v1, a release intended to give analytics agents trusted context for enterprise data. It is designed to improve answer accuracy for tools such as Databricks Genie and Snowflake Intelligence.

The software sits between analytics agents and data stored in warehouses, lakes and other systems. It ingests and organises metadata, documentation, query history and operational signals so agents can draw on broader, more current information before generating responses.

Many analytics agents still produce incorrect answers because they lack enough context on metric definitions, data lineage, freshness and business usage, according to DataHub. The new release is intended to address that by serving as a central layer for data meaning and provenance across an organisation.

A customer example cited by DataHub pointed to a sharp improvement in benchmark performance. "Starting with Snowflake metadata alone, our analytics agent answered about half of our benchmark questions correctly," said Ronald Angel, Product Manager, Data Platform, Miro.

"After layering in DataHub Cloud as our context platform, including data product documentation, cross-source context and business meaning derived from our query history, we nearly doubled accuracy from close to 50% to around 90%," Angel said.

Another user said the system surfaced information that otherwise would have remained hard to find. "Every system in the enterprise holds context, but DataHub Cloud is the platform built to unify it, and that makes it the natural foundation for AI agents," said Björn Barrefors, Metadata Management Lead, ICA.

"We have already seen it surface institutional knowledge that analysts would never have found on their own, while also flagging known data quality issues before the query ran. The next step is putting that context layer behind every data question our business asks," Barrefors said.

Product features

The release introduces four main elements: Context Ingestion, Context Intelligence, Context Hub and Context Activation. Together, they are designed to collect information from structured and unstructured sources, convert it into a searchable semantic index, allow expert review, and make that context available to agents and workflows.

Context Ingestion targets organisations where definitions and operational details are spread across multiple systems and documents. DataHub says it can build a unified graph from catalogues, metric definitions in tools such as dbt and Power BI, and material from workplace documentation systems including Notion and Confluence.

Context Intelligence focuses on query history. It turns past analyst activity into a structured semantic index so that, when an agent receives a question, it can retrieve not only schema information but also prior query patterns, joins, filters and aggregation logic used for similar questions.
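As a rough illustration of the idea, and not DataHub's actual implementation, the sketch below indexes a few past queries and retrieves the closest one for a new question. The `QueryPattern` class, the sample history, the word-overlap similarity measure and the regex-based table extraction are all assumptions made for this example; a production system would presumably use a proper SQL parser and semantic embeddings.

```python
import re
from dataclasses import dataclass


@dataclass
class QueryPattern:
    """One entry in a hypothetical query-history index."""
    description: str  # what the analyst was asking (assumed to be available)
    sql: str          # the SQL the analyst actually ran

    @property
    def tables(self) -> set[str]:
        # Naive extraction of table names that follow FROM or JOIN.
        return set(
            re.findall(r"\b(?:from|join)\s+([a-z_][\w.]*)", self.sql, re.IGNORECASE)
        )


def tokens(text: str) -> set[str]:
    # Lowercased alphanumeric words; a stand-in for real semantic matching.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: str, b: str) -> float:
    # Jaccard overlap between the word sets of two strings.
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(question: str, history: list[QueryPattern], top_k: int = 1) -> list[QueryPattern]:
    # Rank past queries by how closely their description matches the question.
    ranked = sorted(
        history, key=lambda p: similarity(question, p.description), reverse=True
    )
    return ranked[:top_k]


# Made-up history, for illustration only.
HISTORY = [
    QueryPattern(
        "monthly revenue by region",
        "SELECT r.name, SUM(o.amount) FROM orders o "
        "JOIN regions r ON o.region_id = r.id "
        "WHERE o.status = 'paid' GROUP BY r.name",
    ),
    QueryPattern(
        "active users per week",
        "SELECT week, COUNT(DISTINCT user_id) FROM events "
        "WHERE event_type = 'login' GROUP BY week",
    ),
]

best = retrieve("What was revenue by region last month?", HISTORY)[0]
print(best.description)
print(sorted(best.tables))
```

In this sketch the agent would receive the matched query's joins, filters and aggregation alongside the schema, rather than having to infer them from table and column names alone.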

According to DataHub, that approach differs from systems that rely on developers manually defining semantic models in advance. The company argues that using existing query history lets organisations improve results without a lengthy implementation period.

Context Hub is designed as a workspace where domain specialists can review, approve and enrich AI-generated context. It also lets users test how changes to definitions or context would affect text-to-SQL results before publishing them more widely.

Context Activation extends that information to agents and workflows through APIs, software development tools and prebuilt skills. DataHub also says that giving agents pre-validated context rather than raw schema alone can reduce the number of tokens needed to answer questions, potentially lowering inference costs.

Market backdrop

The launch comes as software suppliers and data teams look for ways to make AI agents more dependable in business settings. The broader issue is not only whether an agent can generate an answer, but whether it can show that the answer is based on current, approved and widely understood data.

Industry analysts increasingly describe context as a central issue in enterprise AI. DataHub cited Gartner research saying a robust context layer is foundational to AI success, while Kevin Petrie, Vice President of Research, BARC, said context engineering depends on both AI-curated and human-curated knowledge drawn from usage patterns, business definitions, semantic meaning and historical accuracy.

DataHub's technical pitch centres on combining machine-derived context with expert input. "We turn years of query history into a living knowledge base, fuse in real-time operational signals and compound it with every expert correction from the field," said Shirshanka Das, Co-Founder and Chief Technology Officer, DataHub.

"Every change is timestamped and versioned; agents don't just know the right answer, they know why it changed. That's auditable context, and it's how agents stop hallucinating and start earning trust," Das said.

DataHub was founded by the creators of the open-source DataHub project, which the company says has more than 15,000 contributors and is used by thousands of organisations. Its backers include Bessemer Venture Partners, LinkedIn and 8VC.

