On May 7, 2026, Microsoft published research that most enterprise security teams should have stopped and read carefully. The post, titled "When prompts become shells," disclosed two critical Remote Code Execution vulnerabilities in Microsoft's Semantic Kernel AI agent framework. Both were exploitable through prompt injection. Neither required the attacker to have any special access to the environment. The prompt itself was the weapon.
The disclosure came with CVE assignments, patches, and technical detail. What it also came with, whether Microsoft intended it this way or not, was a warning about the broader architecture of how AI agents are being built and deployed across the enterprise right now.
What Microsoft disclosed
The two vulnerabilities affected Semantic Kernel, Microsoft's widely used open-source SDK for building AI agents across both Python and .NET environments.
The first, CVE-2026-26030, affected the Python package for Semantic Kernel's In-Memory Vector Store. The framework used Python's eval() function to process filter parameters without sanitizing AI-controlled inputs first. An attacker could craft a prompt injection payload that escaped the filter template string, traversed Python's class hierarchy to locate a module loader, and used it to call os.system(). The framework's own AST validator, designed to block dangerous function names, missed the bypass entirely because it did not account for bracket notation or identifiers like __name__ and BuiltinImporter. The result was arbitrary code execution through what looked, from the outside, like a routine search query.
The second, CVE-2026-25592, affected Semantic Kernel's .NET SDK. A function called DownloadFileAsync, intended to transfer files from an isolated Azure Container Apps sandbox to the host, had been accidentally marked with the [KernelFunction] attribute. That marking advertised the function to the AI model as a callable tool. The localFilePath parameter that controlled where files were written on the host was entirely AI-controlled, with no path validation or directory restriction in place. The attack chain was three steps: inject a prompt to generate a malicious script inside the sandbox container, use DownloadFileAsync to write that script to the Windows Startup folder on the host, and wait for the next user sign-in to execute it. Full host compromise through an unsanitized parameter that the agent framework itself put on the table.
Both vulnerabilities were patched. Semantic Kernel Python versions prior to 1.39.4 and Semantic Kernel .NET versions older than 1.71.0 are affected.
- CVE-2026-26030: Prompt injection via unsanitized eval() in Semantic Kernel Python, bypassing AST validation through class hierarchy traversal
- CVE-2026-25592: Sandbox escape via accidental [KernelFunction] exposure, allowing AI-controlled arbitrary file write to host
- Both exploitable through prompt injection. No privileged access required.
- Patch: Python semantic-kernel 1.39.4 or later; Semantic Kernel .NET 1.71.0 or later
Why this is bigger than two CVEs
These are fixable vulnerabilities. Both have been patched. But treating this disclosure as a routine patch cycle notification misses the point.
The attack vector for both CVEs was prompt injection. That means an attacker who can influence what text an AI agent processes can potentially influence what that agent does in your environment. The attacker does not need credentials. They do not need network access. They need a path to inject content into the agent's input, which in many enterprise deployments includes emails, documents, web content, user-submitted forms, or data pulled from connected SaaS applications.
As WindowsForum reported when covering the disclosure, an attacker who can inject malicious prompts into an AI agent may be able to execute arbitrary code on the underlying system without any additional foothold.
This is what makes prompt injection fundamentally different from traditional injection attacks. In a SQL injection scenario, you know where the input fields are and you can sanitize them. In an AI agent scenario, the attack surface is any content the agent reads. The Agentic Ecosystem Security Gap: 2026 CISO Report, which surveyed 500 U.S. CISOs, found that 1 in 3 enterprises experienced suspicious AI agent activity in 2025, the first year of serious enterprise AI deployment.
That number will only grow as agent frameworks expand the inputs they process. That surface is enormous, it is often dynamic, and it is frequently outside the direct control of the security team.
The CVE-2026-25592 attack chain is instructive here. The agent was running in what was supposed to be an isolated sandbox. The sandbox escape was not through a container exploit or a kernel vulnerability. It was through a function that the framework itself was exposed to the model, with an AI-controlled parameter and no validation.
AI agent frameworks are being built, extended, and deployed faster than their security properties are being audited. Functions get exposed. Parameters go unsanitized. Validators get bypassed. And underneath all of it, the agent is processing inputs that the organization did not write and cannot fully predict.
How detection and enforcement change the story
When Vorlon establishes behavioral baselines for every entity in your environment, that includes AI agents. An agent that suddenly starts writing files to locations it has never touched, calling functions it has never invoked, or making outbound requests to destinations outside its normal operational pattern, those are detectable signals. Not after a post-incident review. In real time, as the behavior unfolds.
The Semantic Kernel sandbox escape took three steps. The first was generating a malicious script. The second was writing it to the Startup folder. Either of those steps, measured against a baseline of normal agent behavior, would look anomalous. An agent framework that has never written to the Windows Startup directory doing so for the first time is a signal worth catching before step three.
Vorlon Guardian adds the enforcement layer on top of detection. At the protocol level, Guardian can apply Read-Only Enforcement to limit what an agent can write, regardless of what the model has been instructed to do. In the case of CVE-2026-25592, an agent operating under Read-Only enforcement would not have been able to write the malicious script to the host filesystem in the first place. The attack chain breaks at step one.
This is what separates enforcement from alerting. An alert after a sandbox escape has already happened is useful for the post-mortem. Enforcement at the protocol level means the action never completes.
What security teams should do now
Patch immediately. If you are running Semantic Kernel Python below version 1.39.4 or Semantic Kernel .NET below version 1.71.0, update now. Both CVEs are actively documented and the attack chains are publicly known.
Audit what your agent frameworks are exposing to the model. The CVE-2026-25592 vulnerability existed because a function was accidentally marked as callable by the AI. Review which functions in your agent implementation carry the equivalent of a [KernelFunction] attribute or are otherwise advertised to the model. Anything with AI-controlled parameters that affect the host filesystem, network connections, or downstream integrations deserves scrutiny.
Treat prompt injection as a first-class threat. Most application security programs have mature processes for testing SQL injection, XSS, and buffer overflows. Prompt injection testing is still catching up. Include it in your threat modeling for any AI agent deployment. Map every content source the agent reads as a potential injection vector.
Apply least-privilege to agent permissions. Agents should only have write access to the specific locations and resources they genuinely require. A scoped permission model does not make prompt injection impossible, but it significantly limits what a successful injection can actually accomplish.
Enforce sandbox boundaries with controls that do not rely on the model. If an agent is running in an isolated environment, the controls that enforce that isolation should not be bypassable by influencing the model's instructions. Validate paths. Restrict file operations. Do not expose functions with AI-controlled parameters to the host without independent validation at the framework level.
The Microsoft disclosure was detailed, responsible, and technically thorough. The patches are available. What organizations need to take from it, beyond the immediate fix, is a clearer picture of how AI agent frameworks extend the attack surface and why the security controls around them need to operate at a different layer than the model itself.
The agent cannot secure itself. That job belongs to the infrastructure around it.



