Most AI governance does not actually constrain agent behavior at runtime. It helps satisfy a compliance review or the feeling of having done something. It tells the model “don’t do X” and hopes the model complies on the next prompt. That is performative: the agent performs as if governed, with no actual guarantee it will obey. Stanford’s 2026 AI Index reports that the gap between AI capability and governance preparedness is widening, not closing, even as frameworks proliferate. Governance you can actually count on is deterministic at the action level: scored, logged, and auditable per call. Anything probabilistic is mostly theater in any system where one bad call damages the deployer’s credibility.
From my perspective, the gap is structural, not a bad-faith failure. Frameworks were designed for documentation. Agents live in the runtime layer, and that layer is the missing piece: without it, the rest runs on wishful thinking.
To be precise, this argument is about deploying agentic AI in high-risk or regulated workflows where mistakes carry real consequences: money movement, access control, legal decisions, healthcare workflows, ICT operations under DORA. Outside that scope (low-stakes consumer assistants, internal-only experimentation, content tasks where errors get reviewed before they ship), the calculus is different and the governance bar is lower. Runtime governance is the missing piece for deploying agentic AI at scale in high-risk or regulated environments. Where the impact is bounded, probabilistic methods can be sufficient.
The gap is measured
The Stanford HAI 2026 AI Index Chapter 3 documents a widening gap between AI capability and governance preparedness [1]. Every frontier model developer reports results on capability benchmarks like MMLU and SWE-bench. Reporting on responsible-AI benchmarks remains spotty. The Foundation Model Transparency Index dropped from 58 in 2024 to 40 in 2025. The AI Incident Database recorded 362 documented incidents in 2025, up from 233 in 2024. The same chapter adds a sharper finding: improving one responsible-AI dimension (such as safety) can degrade another (such as accuracy). A model that gets safer can also get less accurate. The two used to compound. Now they trade.
Grant Thornton’s 2026 AI Impact Survey lands the compliance-relevant version of the same pattern [2]. Seventy-eight percent of business executives lack strong confidence they could pass an independent AI governance audit within 90 days. Most organizations can answer with principles. Few can answer with proof.
McKinsey’s playbook for technology leaders puts a number on the operational reality: 80 percent of surveyed organizations have encountered risky behavior from AI agents, and only one-third report governance maturity at level three or higher [3]. The other two-thirds are running at the documented-policy-without-runtime-evidence layer. Documentation and runtime are not the same artifact. The gap between them is where supervisory exams find their evidence.
Why this happens, structural causes
Frameworks are designed to pass compliance review. Compliance review works on paper rather than in code. Writing lengthy system instructions is cheap. Enforcing the policy at every tool call is expensive engineering work. The path of least resistance produces governance that documents intent and stops there.
The mechanics underneath are worth naming. Most current AI governance approaches rely on three categories of control: system prompts (“do not do X”), fine-tuning, and RLHF. All three are probabilistic. They tilt the distribution of model outputs toward the desired behavior. They do not and cannot guarantee it. The literature on prompt injection, jailbreaks, and adversarial fine-tuning has been clear since 2023 that sufficiently motivated input can bypass any of them [4]. In a low-stakes environment, that probabilistic floor is acceptable. In a regulated workflow where one mis-routed credit decision triggers a 24-hour incident notification, and the head of ICT risk personally carries up to €1 million of liability, probabilistic safety is no safety. Murphy’s law swings hardest where the fine is personal.
The second structural cause is cycle mismatch. The audit cycle is annual. The decision cycle is per call. A framework reviewed once a year cannot supervise an agent making thousands of decisions an hour. The audit produces a snapshot. The agent produces a stream.
The third structural cause is the compliance economy itself. Compliance asks for certifications. Compliance does not ask for tool call logs. Vendors optimize for what compliance asks for. Until the compliance question shifts from “show me your certificate” to “show me your last week of tool call logs with policy evaluation,” the supply side reasonably keeps producing certificates.
Adjacent systems already solved this. API gateways enforce authentication and rate limiting at the request boundary. Database transactions provide ACID guarantees per operation. Zero-trust architectures verify identity per request, not per session. Agent systems are the first class of software being deployed into high-risk environments without an equivalent execution boundary.
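To make the execution boundary concrete, here is a minimal Python sketch of a gate that wraps each tool the way an API gateway wraps an endpoint. It is an illustration under stated assumptions, not a reference implementation: the `Policy` shape, the `gated` decorator, the tool names, the version string, and the amount limit are all hypothetical.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable


class PolicyDenied(Exception):
    """Raised when the gate refuses an action before it executes."""


@dataclass(frozen=True)
class Policy:
    # Hypothetical policy shape: an ID, a version, and explicit limits.
    policy_id: str
    version: str
    allowed_tools: frozenset
    max_amount_eur: float


def gated(policy: Policy, tool_name: str) -> Callable:
    """Wrap a tool so every call is checked before the tool body runs."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def wrapper(**kwargs: Any) -> Any:
            label = f"{policy.policy_id}@{policy.version}"
            # Check 1: the tool must be on the explicit allowlist.
            if tool_name not in policy.allowed_tools:
                raise PolicyDenied(f"{tool_name} is not allowed by {label}")
            # Check 2: any monetary amount must stay within the policy limit.
            if kwargs.get("amount_eur", 0.0) > policy.max_amount_eur:
                raise PolicyDenied(f"amount exceeds limit in {label}")
            return tool(**kwargs)
        return wrapper
    return decorator


# Illustrative policy; the ID, version, and limits are made up.
POLICY = Policy("P-17", "1.0.0", frozenset({"send_payment"}), 500.0)


@gated(POLICY, "send_payment")
def send_payment(recipient: str, amount_eur: float) -> str:
    return f"sent {amount_eur:.2f} EUR to {recipient}"


@gated(POLICY, "delete_user")
def delete_user(user_id: str) -> str:
    return f"deleted {user_id}"
```

Calling `send_payment(recipient="alice", amount_eur=100.0)` runs the tool, while `delete_user(user_id="u1")` raises `PolicyDenied` before any side effect occurs. The wrapper accepts keyword arguments only, so the gate can inspect arguments by name; that is a design choice of this sketch, not a requirement of the pattern.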
None of this is bad faith. It is incentive-aligned production of paper governance in a market where regulation is ahead of the technology.
The missing piece, Runtime Governance
These failures have familiar shapes. Prompt injection turns a benign instruction into unauthorized tool use. Multi-step plans accumulate side effects no individual step would have triggered on its own. Model updates produce silent policy drift between deployment and audit. All of these failures share an architectural property: an action executed without a verifiable policy check at the boundary.
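The multi-step case is the easiest of these to show in code. The sketch below assumes a hypothetical `TrajectoryGate` with a per-step limit and a cumulative limit for one plan; the names and numbers are illustrative. Each step passes the per-step check on its own, and the cumulative check is what stops the plan.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrajectoryGate:
    """Hypothetical gate that tracks cumulative spend across one agent plan."""
    per_step_limit_eur: float
    trajectory_limit_eur: float
    spent_eur: float = 0.0
    decisions: List[str] = field(default_factory=list)

    def authorize(self, amount_eur: float) -> bool:
        # A step must pass both the per-step check and the cumulative check.
        within_step = amount_eur <= self.per_step_limit_eur
        within_total = self.spent_eur + amount_eur <= self.trajectory_limit_eur
        allowed = within_step and within_total
        if allowed:
            # Only allowed steps count toward the running total.
            self.spent_eur += amount_eur
        self.decisions.append("allow" if allowed else "deny")
        return allowed


gate = TrajectoryGate(per_step_limit_eur=100.0, trajectory_limit_eur=250.0)
results = [gate.authorize(90.0) for _ in range(4)]
print(results)         # [True, True, False, False]
print(gate.spent_eur)  # 180.0
```

A gate that only looked at individual steps would have allowed all four payments, for 360 EUR in total.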
The runtime governance category is forming around a layered architecture, and the most recent published reference is Oracle’s “From Model Safety to Runtime Governance” framework [5]. Oracle maps four layers cleanly:
- L4, Governance, Risk, and Compliance Oversight: risk appetite, policy lifecycle, sign-off workflows
- L3, Trust, Access, and Supply Chain Control: tool eligibility, model registry, identity binding
- L2, Runtime Safety and Policy Enforcement: action-by-action gating, scoring, deterministic policy enforcement
- L1, Risk Signals and Provenance: audit trail, structured decision records, drift signals
The pattern is as sharp as it sounds. An organization that stops at L4 documentation and L3 access controls is governing the planning surface but not the doing surface. The Oracle author puts it directly: the unit of governance has shifted from model response to governed action trajectory. The question is no longer “was the answer safe.” It is “is the next action authorized under policy, identity, and budget.”
Multiple voices are converging on the same observation. Microsoft’s Agent Governance Toolkit shipped in April 2026 as a stateless gating layer at L2. The World Economic Forum’s board playbook on agentic AI lands the boardroom version of the argument: accountability is non-transferable, and a board cannot delegate operational-resilience accountability to a vendor’s policy DSL [6]. AI Verify Foundation in Singapore is one of the few governance bodies actually testing runtime behavior under its 11 governance principles. The OWASP Top 10 for Agentic Applications 2026 maps the runtime threat categories the L2 layer must defend against [7]. In a 2025 European Law Blog piece, Lloyd Jones names the regulatory gap “Agentic Tool Sovereignty” and observes that AI Office guidance does not yet specifically address agent runtime behavior [8]. If “agentic tool sovereignty” is the regulatory question, the runtime layer is part of the engineering answer.
What this category-formation does not say loudly enough is the determinism point. L2 only does its job if its policy evaluation is replayable and inspectable. Determinism here means explicit, versioned policy evaluation at the action boundary, not philosophical certainty about model output. A probabilistic L2 that “usually” gates the function call cannot be replayed under audit and cannot be debugged in incident response. The architectural failure is the same as a probabilistic L4 that “usually” documents the policy: neither survives the moment when an engineer or a regulator asks “show me what happened on Tuesday at 14:32.” The auditable artifact at L2 is a log entry that says “score 0.83 against policy P-17, threshold 0.8, allowed, executed at 14:32:07.” Either the gate ran with policy P-17 active or it did not. The decision is replayable. The policy version is recorded. The disposition is recorded. That is what an engineer reconstructs in incident response, what an auditor reads, what a regulator audits.
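A minimal Python sketch of such a record and its replay follows. Everything beyond the fields named in the example log entry is an assumption: the scoring rule, the action fields, the policy version string, and the date are hypothetical, and the sketch assumes a higher score is safer, so an action is allowed when its score is at or above the threshold. Scoring uses integer penalty points so the same input yields the same score on every run.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    policy_id: str
    version: str
    threshold: float


@dataclass(frozen=True)
class DecisionRecord:
    policy_id: str
    policy_version: str
    action_sha256: str
    score: float
    threshold: float
    disposition: str
    decided_at: str


def score_action(action: dict) -> float:
    """Hypothetical pure scoring rule: integer penalty points, no randomness."""
    penalty = 0
    if action.get("amount_eur", 0) > 100:
        penalty += 10
    if action.get("new_recipient", False):
        penalty += 7
    return (100 - penalty) / 100


def evaluate(action: dict, policy: Policy, decided_at: datetime) -> DecisionRecord:
    # Canonical JSON so the same action always hashes to the same value.
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    score = score_action(action)
    return DecisionRecord(
        policy_id=policy.policy_id,
        policy_version=policy.version,
        action_sha256=hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        score=score,
        threshold=policy.threshold,
        disposition="allowed" if score >= policy.threshold else "denied",
        decided_at=decided_at.isoformat(),
    )


def replay_matches(record: DecisionRecord, action: dict, policy: Policy) -> bool:
    """Re-run the gate on the logged inputs and compare with the logged outcome."""
    again = evaluate(action, policy, datetime.fromisoformat(record.decided_at))
    return again == record


policy = Policy("P-17", "2026-05-01", 0.8)  # version string is illustrative
action = {"tool": "send_payment", "amount_eur": 120, "new_recipient": True}
record = evaluate(action, policy, datetime(2026, 5, 5, 14, 32, 7, tzinfo=timezone.utc))
print(record.score, record.disposition)        # 0.83 allowed
print(replay_matches(record, action, policy))  # True
```

Replaying against a different policy version or a tampered action produces a record that no longer matches. This mismatch is the property an auditor relies on.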
Probabilistic alignment methods remain essential. RLHF, fine-tuning, system prompts, constitutional AI all reduce baseline risk and improve the distribution of model behavior. The gap appears at execution time, where distributional improvement does not translate into enforceable guarantees for individual actions. Runtime governance does not replace probabilistic alignment. It composes with it, providing the per-call enforceable layer that distribution-shifting alone cannot provide.
This is the layer the rest of governance binds at. NIST AI RMF, ISO 42001, EU AI Act Annex III, DORA Article 17 all describe what should happen. The runtime layer is where “should” becomes “did.” Without it, the rest runs on best wishes.
Performative AI governance, real fines
Enforcement is the part of this conversation that does not wait for consensus. DORA has applied since January 2025. The EU AI Act is partially in force, with the major gate (high-risk obligations under Annex III) falling on 2 August 2026 and covering credit scoring, KYC, and AML, among other Annex III categories. NIS-2 incident reporting and supply chain obligations are already running. None of these regulations accepts a policy document as evidence of control.
The hyperscalers are reading the same regulatory pages: Microsoft AGT, Oracle’s runtime governance framework, Google Gemini Enterprise, AWS Bedrock guardrails. Each one is shipping a runtime layer because users are starting to require one. The question for the next 12 to 24 months is which runtime layer becomes the compliance default. The institutions choosing a default in 2026 set the compliance language for the supervisory cycle that follows.
fivedrisk
While building DotOS, my personal AI systems lab, I ran into the problem that I could not leave my AI agents running on their own on the internet with just probabilistic governance. So I started developing fivedrisk for my own projects. Seeing how it operates and what the rest of the AI governance stack looks like convinced me to publish it as a standalone project, in case the architecture is useful to others working on the same problem. Fivedrisk is an Apache-2.0 open-source implementation of an L2 runtime gate with deterministic scoring along five dimensions and an open versioned schema. The repo at github.com/theDoc001/fivedrisk shows what the architecture looks like in code. The DotOS lab itself is at langoni.me/projects/dotos.
Closing
The category is forming, the regulators are scoping, and the runtime layer is where governance becomes enforceable rather than performative. For deploying agentic AI in high-risk or regulated environments, that layer is the missing piece.
Is “usually, my LLM follows its instructions” enough for your systems?
References
[1] Stanford Institute for Human-Centered AI, 2026 AI Index Report, Chapter 3, “Responsible AI.” https://hai.stanford.edu/ai-index/2026-ai-index-report/responsible-ai
[2] Grant Thornton LLP, 2026 AI Impact Survey. https://www.grantthornton.com/services/advisory-services/artificial-intelligence/2026-ai-impact-survey
[3] McKinsey and Company, “Deploying agentic AI with safety and security: A playbook for technology leaders,” 2025. https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights/deploying-agentic-ai-with-safety-and-security-a-playbook-for-technology-leaders
[4] Greshake et al., “Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection,” AISec ’23. https://arxiv.org/abs/2302.12173. See also Perez and Ribeiro, “Ignore Previous Prompt: Attack Techniques For Language Models,” 2022. https://arxiv.org/abs/2211.09527.
[5] Pusukuri, K., “From Model Safety to Runtime Governance,” Oracle AI and Data Science Blog, 23 April 2026. https://blogs.oracle.com/ai-and-datascience/runtime-governance-enterprise-agentic-ai
[6] World Economic Forum, “Here’s a playbook for boards on how to govern agentic AI,” 28 April 2026. https://www.weforum.org/stories/2026/04/board-playbook-governing-agentic-ai/
[7] OWASP Gen AI Security Project, “OWASP Top 10 for Agentic Applications 2026.” https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
[8] Jones, L., “Agentic Tool Sovereignty,” European Law Blog, 2025. https://www.europeanlawblog.eu/pub/dq249o3c/release/1