TDWI Research: AI Impact Starts with Data Foundations -- Virtualization Review

By David Ramel
06/16/2026

TDWI Research's new 2026 Blueprint report argues that the main divide between enterprises getting broad business value from AI and those still stuck in pilots is not simply model choice. It is the condition of the data foundation beneath those AI systems.

The report, titled "TDWI Blueprint Report | Building an AI-Ready Data Foundation," was authored by Fern Halper, Ph.D., TDWI vice president of research. Its central finding is that organizations reporting the greatest AI impact have stronger architectural, governance and operational capabilities than lower-impact organizations. TDWI is a research and education organization that provides training, insights and best practices for data, analytics and AI professionals.

"Although many organizations have achieved localized successes, the findings in this Blueprint suggest that long-term AI success depends on the strength of the underlying data foundation," Halper says. She explains how fragmented data environments, inconsistent governance, weak semantic alignment, and poor data accessibility become major constraints as AI initiatives move from experimentation into production.

On the report download page, TDWI makes the same point: long-term AI success depends on the strength of the underlying data foundation as generative AI, copilots and agentic systems move from experimentation into production.

The report defines an AI-ready data foundation as the integrated set of capabilities that transforms raw, fragmented data into governed, contextualized and accessible assets that can be used reliably to build, deploy and scale AI applications. That includes ingestion, integration, pipelines, flexible architectures, metadata, lineage, semantic context, governance and access controls.

High-Impact Organizations Treat Data as Table Stakes
The report segments respondents into high-, moderate- and low-impact groups based on reported AI business impact. Among high-impact organizations, 58% said the data foundation is "absolutely required" for successful AI, while another 37% said it is important but not sufficient alone. TDWI summarizes that as 95% of high-impact organizations viewing the data foundation as either absolutely required or important.

Figure: How Important Is the Data Foundation for Successful AI? *(source: TDWI).*

The difference becomes more striking when TDWI compares high-impact organizations with lower-impact groups. Only 18% of moderate-impact respondents and 17% of low-impact respondents said the data foundation is absolutely required. Low-impact organizations were also more likely to report the data foundation as a current constraint, at 21%, compared with 1% of high-impact respondents.

TDWI also reports that organizations achieving the greatest AI impact are more likely to see the data foundation as a competitive differentiator. Among high-impact respondents, more than 85% described the data foundation as a major or moderate differentiator, and more than 40% explicitly called it a major differentiator.

Unstructured Data Moves to the Center
Another key finding is the growing importance of unstructured data. The report says generative and agentic AI are pushing organizations beyond traditional data warehousing approaches toward architectures that can parse, contextualize, govern and retrieve documents, emails, chat transcripts and multimedia content at scale.

In a chart on active AI data types, 64% of respondents said they are using structured transactional data in AI systems, while 62% said they are using documents such as PDFs, contracts and policies. Emails and chat transcripts were each cited by 41% of respondents. The report calls this shift "quite striking" and links it to generative and agentic AI use cases.

Figure: What Types of Data Are Actively Used in AI Systems? *(source: TDWI).*

The report also lists the AI use cases currently delivering value. Document processing and intelligent extraction ranked first at 59%, followed by internal knowledge assistants at 54% and process automation at 52%. Software development productivity and decision support for executives were each cited by 43% of respondents.

Figure: Which AI Use Cases Are Currently Delivering Value? *(source: TDWI).*

Semantics and Governance Emerge as Core Requirements
TDWI's report says AI-ready data cannot be limited to storage and access. AI systems also need business context, relationship awareness and governance. The report states that "AI agents and applications cannot reliably act on raw, loosely defined data; they require data that is explicitly tied to business concepts and definitions."

The report identifies semantic models, enterprise taxonomies, business glossaries, ontologies, metadata and context engineering as increasingly important. It says modern semantic layers translate technical data structures into business terms such as revenue, shipment volume or fiscal period, allowing people and AI systems to work from the same definitions.
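As a rough illustration of what such a semantic layer does, the minimal Python sketch below binds each business term to one technical definition, so a person or an AI agent asking for "revenue" always resolves to the same query. The table names, column names and the `to_sql` helper are hypothetical, invented here for illustration; they do not come from the TDWI report or from any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    """A business term bound to a single technical definition (hypothetical)."""
    name: str
    table: str
    expression: str


# Hypothetical tables and columns, chosen only for illustration.
SEMANTIC_LAYER = {
    "revenue": Metric("revenue", "fact_orders", "SUM(net_amount_usd)"),
    "shipment volume": Metric(
        "shipment volume", "fact_shipments", "COUNT(DISTINCT shipment_id)"
    ),
}


def to_sql(term: str, group_by: Optional[str] = None) -> str:
    """Resolve a business term to SQL using the shared definition."""
    key = term.strip().lower()
    if key not in SEMANTIC_LAYER:
        # Refuse to guess: an undefined term has no governed meaning.
        raise KeyError(f"no governed definition for {term!r}")
    metric = SEMANTIC_LAYER[key]
    alias = metric.name.replace(" ", "_")
    if group_by:
        return (
            f"SELECT {group_by}, {metric.expression} AS {alias} "
            f"FROM {metric.table} GROUP BY {group_by}"
        )
    return f"SELECT {metric.expression} AS {alias} FROM {metric.table}"


if __name__ == "__main__":
    print(to_sql("Revenue"))
    print(to_sql("shipment volume", group_by="fiscal_period"))
```

The design choice worth noting is that the lookup fails loudly for an undefined term instead of letting the caller improvise a definition, which is one simple way to keep people and AI systems on the same definitions.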

Governance is another major differentiator. In TDWI's comparison of governance capabilities, 63% of high-impact organizations reported policy enforcement at the data layer, compared with 32% of moderate-impact organizations and 28% of low-impact organizations. The report states that "scalable AI requires governance to be architectural, observable, and embedded into the data foundation."

Figure: Governance Capability *(source: TDWI).*

High-impact organizations were also more likely to report integrated identity and role-based access control frameworks, at 40%, compared with 7% of low-impact organizations. TDWI says those access practices help organizations enforce boundaries for both users and AI systems, including service identities or constrained delegation models.
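To make the idea of one access framework covering both users and AI systems more concrete, here is a minimal Python sketch of role-based checks enforced at the data layer. Everything in it is an assumption made for illustration: the `Principal` type, the dataset names, the roles, and in particular the reading of "constrained delegation" as "an agent acting for a user may read only what both the agent and the user may read." The report does not prescribe this design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """A human user or an AI agent's service identity (hypothetical model)."""
    name: str
    roles: frozenset
    is_service: bool = False


# Hypothetical policy: dataset name -> roles allowed to read it.
READ_POLICY = {
    "contracts": frozenset({"legal", "finance"}),
    "support_chats": frozenset({"support"}),
}


def can_read(principal: Principal, dataset: str) -> bool:
    """Deny by default; allow only if the principal holds a permitted role."""
    allowed = READ_POLICY.get(dataset, frozenset())
    return bool(principal.roles & allowed)


def can_read_on_behalf(agent: Principal, user: Principal, dataset: str) -> bool:
    """Assumed delegation rule: both the agent and the user must be allowed."""
    return can_read(agent, dataset) and can_read(user, dataset)


def fetch(principal: Principal, dataset: str, store: dict):
    """Enforce the policy at the point of data access, not in the caller."""
    if not can_read(principal, dataset):
        raise PermissionError(f"{principal.name} may not read {dataset}")
    return store[dataset]


if __name__ == "__main__":
    analyst = Principal("ana", frozenset({"finance"}))
    agent = Principal("contract-bot", frozenset({"legal", "support"}), True)
    print(can_read_on_behalf(agent, analyst, "contracts"))
    print(can_read_on_behalf(agent, analyst, "support_chats"))
```

Under this assumed rule, an agent cannot widen a user's reach: the hypothetical `contract-bot` can read support chats on its own, but not when acting for an analyst who lacks the support role.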

Architecture Differences Include Unified Platforms and Vector Stores
The report says high-impact organizations are far more likely to have a unified data platform, at 66%, compared with 24% of low-impact organizations. They are also more likely to use domain-level semantic models, open table formats, knowledge graphs, vector data stores, model registries and model risk management frameworks.

Figure: Architecture Capability *(source: TDWI).*

TDWI reports that 60% of high-impact organizations use domain-level semantic models, compared with 17% of low-impact organizations. Forty percent of high-impact organizations reported vector data stores, compared with 17% of low-impact organizations. Thirty-eight percent of high-impact organizations reported knowledge graphs, compared with 10% of low-impact organizations.
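Putting the high-impact and low-impact figures quoted in this article side by side makes the gaps easier to compare. The short Python sketch below does only that arithmetic. The capability labels are shorthand chosen here, and the percentages are the ones cited above from TDWI's governance and architecture comparisons.

```python
# (high-impact %, low-impact %) as quoted in this article from the TDWI report.
CAPABILITIES = {
    "unified data platform": (66, 24),
    "domain-level semantic models": (60, 17),
    "vector data stores": (40, 17),
    "knowledge graphs": (38, 10),
    "policy enforcement at the data layer": (63, 28),
    "integrated identity and RBAC": (40, 7),
}


def gaps_by_size(capabilities):
    """Return (label, percentage-point gap) pairs, largest gap first."""
    gaps = [(label, high - low) for label, (high, low) in capabilities.items()]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for label, gap in gaps_by_size(CAPABILITIES):
        print(f"{label}: {gap} percentage points")
```

By this measure the widest gaps among the cited capabilities are domain-level semantic models (43 points) and unified data platforms (42 points), while vector data stores show the narrowest (23 points).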

Those architecture findings matter because the report says production-grade AI needs more than one-off data pipelines. It needs reusable context, governed retrieval mechanisms and operational controls. The report introduces the TDWI Data Foundation Blueprint for AI as a layered capability stack for ingesting, standardizing, governing, contextualizing, retrieving and operationalizing enterprise data.
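One way to picture a "governed retrieval mechanism" is a vector search that applies access rules before ranking, so an AI system can only be grounded in content its caller is entitled to see. The Python sketch below is a toy version under stated assumptions: the three-number vectors stand in for real embeddings, and the document IDs, roles and `retrieve` function are hypothetical and are not described in the report.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy corpus: "vec" is a stand-in for an embedding, "roles" for an access policy.
DOCS = [
    {"id": "policy-001", "vec": [0.9, 0.1, 0.0], "roles": {"hr", "legal"}},
    {"id": "contract-17", "vec": [0.8, 0.2, 0.1], "roles": {"legal"}},
    {"id": "chat-442", "vec": [0.1, 0.9, 0.2], "roles": {"support"}},
]


def retrieve(query_vec, caller_roles, k=2):
    """Filter by access first, then rank the visible documents by similarity."""
    visible = [d for d in DOCS if d["roles"] & caller_roles]
    ranked = sorted(
        visible, key=lambda d: cosine(query_vec, d["vec"]), reverse=True
    )
    return [d["id"] for d in ranked[:k]]


if __name__ == "__main__":
    print(retrieve([1.0, 0.0, 0.0], {"legal"}))
    print(retrieve([1.0, 0.0, 0.0], {"support"}))
```

Filtering before ranking is the point of the sketch: a document the caller cannot see never enters the candidate set, no matter how similar it is to the query.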

Blueprint Maps the Stack for AI-Ready Data
The blueprint groups capabilities into foundation layers, trust and meaning layers, AI activation layers, cross-cutting governance and operating model components. Foundation layers include data acquisition and integration, data standardization and normalization, and storage and access. Trust and meaning layers include data quality and governance, semantic and business context, and context engineering. AI activation layers include data serving and retrieval and consumption.

Figure: The Blueprint for the Data Foundation for AI *(source: TDWI).*

TDWI says the layers are not strictly linear. Governance, observability, interoperability, metadata and policy enforcement span multiple layers, while organizations may implement the capabilities through centralized platforms, federated data products or modular domain architectures.

The report also includes industry perspectives from ZoomInfo and Snowflake executives. Rowan Bailey, senior product director at ZoomInfo, describes AI-ready data for agentic systems as "a navigable landscape that LLMs or AI can interpret and understand." He also says, "If a user comes in with a prompt or a question, you can treat that as a compass indicating the intended direction of travel," while "an AI-ready data foundation is the map."

Jim Lebonitte, GTM platform and architecture AFE leader at Snowflake, says in the report, "The key thing I tell people right now is that your architecture has to be flexible and interoperable." He adds, "Getting your data into a lakehouse architecture with a single-copy-of-data approach is super important."

TDWI is also offering a related on-demand webinar, "Building the Data Foundation for AI -- Insights from TDWI's Q2 Blueprint Report." The webinar page says Halper presents key findings from the report and examines the characteristics of organizations translating AI initiatives into measurable outcomes.

The research firm frames the shift as a new phase of enterprise AI adoption. It states that organizations achieving broader business impact are building unified and interoperable environments, embedding governance into architectural workflows, investing in semantic and contextual layers, and creating retrieval mechanisms that ground AI systems in enterprise knowledge.

The report was based on primary research that began in February 2026 and combined a structured survey with executive focus groups and subject matter expert interviews. TDWI says 167 survey responses met its quality guidelines and were included in the final analysis. The research and writing were sponsored by Alteryx, Snowflake and ZoomInfo.

TDWI Research is a subsidiary of the parent company of Virtualization & Cloud Review.