Enterprises have moved quickly to adopt RAG to ground LLMs in proprietary data. In practice, however, many organizations are discovering that retrieval is no longer a feature bolted onto model inference; it has become a foundational system dependency.

Once AI systems are deployed to support decision-making, automate workflows or operate semi-autonomously, failures in retrieval propagate directly into business risk. Stale context, ungoverned access paths and poorly evaluated retrieval pipelines do not merely degrade answer quality; they undermine trust, compliance and operational reliability.

This article reframes retrieval as infrastructure rather than application logic. It introduces a system-level model for designing retrieval platforms that support freshness, governance and evaluation as first-class architectural concerns. The goal is to help enterprise architects, AI platform leaders, and data infrastructure teams reason about retrieval systems with the same rigor historically applied to compute, networking and storage.

Retrieval as infrastructure: a reference architecture illustrating how freshness, governance, and evaluation function as first-class system planes rather than embedded application logic. Conceptual diagram created by the author.

Why RAG breaks down at enterprise scale

Early RAG implementations were designed for narrow use cases: document search, internal Q&A and copilots operating within tightly scoped domains. These designs assumed relatively static corpora, predictable access patterns and human-in-the-loop oversight. Those assumptions no longer hold.

Modern enterprise AI systems increasingly rely on:

- Continuously changing data sources
- Multi-step reasoning across domains
- Agent-driven workflows that retrieve context autonomously
- Regulatory and audit requirements tied to data usage

In these environments, retrieval failures compound quickly. A single outdated index or mis-scoped access policy can cascade across multiple downstream decisions.
Treating retrieval as a lightweight enhancement to inference logic obscures its growing role as a systemic risk surface.

Retrieval freshness is a systems problem, not a tuning problem

Freshness failures rarely originate in embedding models. They originate in the surrounding system.

Most enterprise retrieval stacks struggle to answer basic operational questions:

- How quickly do source changes propagate into indexes?
- Which consumers are still querying outda …