Agentic AI Security in 2026: Every Major Platform Has a Catalogued Vulnerability

Agentic AI security has become the defining enterprise risk of 2026. This article is for founders, CEOs, and strategic leaders-not security engineers. If you’re building an agentic AI product, buying one for your enterprise, or raising capital in this space, the next ten minutes may change how you think about what “ready for production” actually means.

Last week, Google’s Threat Intelligence Group confirmed what many suspected: China’s state-backed hacking unit APT31 has been using Gemini to plan cyberattacks against American organisations.

They told Gemini it was an “expert cybersecurity tester” and asked it to research vulnerabilities, recommend attack approaches, and suggest bypass techniques for specific security tools. In one case, APT31 used Hexstrike-an open-source red teaming tool built on MCP-to analyze remote code execution and SQL injection vulnerabilities against specific US-based targets. Separately, they launched over 100,000 prompts probing Gemini’s guardrails-systematically mapping what the model would and wouldn’t help with, and which framings could bypass its safety filters.

Google shut down the accounts. No evidence of successful breaches resulting from the Gemini-assisted planning. But the operational playbook is now established: nation-states are using the same AI tools your company uses as force multipliers for offensive operations.

This is not an isolated incident. It’s the most visible signal of a systemic problem.

Since mid-2025, ServiceNow, OpenAI, GitHub Copilot, Google Gemini, Microsoft, and Anthropic’s own MCP reference implementation have all had formally catalogued security vulnerabilities disclosed, been exploited by state-backed threat actors, or been publicly demonstrated to be exploitable by security researchers. Not fringe products. The platforms that enterprises are building their agentic AI strategies on right now.

The same question keeps coming up in every conversation I have with enterprise buyers: “Which of these platforms can actually pass our security review?”

The honest answer, as of February 2026, is: almost none of them.

The 2025-2026 Agentic AI Security Incidents Every Enterprise Buyer Should Know

A quick note on language: throughout this piece, I reference CVEs (Common Vulnerabilities and Exposures)-the global registry of formally catalogued security flaws, maintained by MITRE Corporation. When a vulnerability gets a CVE number, it means the flaw is confirmed, scored for severity on a 0–10 scale, and tracked publicly. It’s the difference between “we think there might be a problem” and “this is catalogued, scored, and the clock is ticking on a patch.”

MCP Ecosystem: Systemic Fragility. MCP (the Model Context Protocol) is the emerging standard that lets AI agents connect to tools, databases, and APIs-think of it as the USB port for agentic AI. It has achieved remarkable adoption: 97 million monthly downloads, 10,000+ active servers, integration across Claude, ChatGPT, Copilot, Gemini, and every major code editor.

The security story is different. Research from Practical DevSecOps analysing 2,614 MCP server implementations found that 43% have flaws that let an attacker execute arbitrary commands on the host system. Only 8.5% use modern OAuth authentication; the rest rely on static API keys that never rotate. The protocol has no mandatory authentication, no required encryption, no standardised access controls. Anthropic’s own reference implementation had three catalogued vulnerabilities enabling full remote takeover via prompt injection attacks.

ServiceNow BodySnatcher (CVE-2025-12420, severity: Critical). Every ServiceNow instance worldwide shipped with the same master key baked into its AI Agent system. Combined with a feature that identified users by email address alone, an attacker needed nothing more than someone’s email to impersonate any user-including the system administrator.

Imagine someone walking into your building with a skeleton key that opens every door, and the lock manufacturer shipped the same key to every customer. In a traditional system, impersonating an admin lets you access data. In an agentic system, impersonating an admin lets you execute autonomous AI agents with that admin’s full permissions-taking actions across your entire environment.

OpenAI ZombieAgent (severity: High). Disclosed by Radware on January 8, 2026. An attacker could plant hidden instructions into OpenAI’s Deep Research agent, and those instructions would persist permanently in the agent’s memory. Every time the agent ran, it followed the attacker’s rules: scanning your inbox, harvesting email addresses, sending your data to external servers, and forwarding poisoned messages to your contacts that would infect their agents too.

Self-replicating. Worm-like. And because it runs in the cloud rather than on your device, your existing security tools-firewalls, endpoint protection, data loss prevention-see nothing.

GitHub Copilot (CVE-2025-53773, severity: High). An attacker hides a malicious instruction inside a code comment-not actual code, just a note a human might skim past. When a developer asks Copilot to help with that project, Copilot reads the hidden instruction, and the attacker’s command takes over. The payload puts Copilot into “YOLO mode,” where every subsequent action executes without approval. Researchers demonstrated it can push infected code upstream, automatically infecting every developer who downloads the project.

One poisoned comment. Every developer compromised.

Why Prompt Injection Can’t Be Patched: The Architectural Problem

The teams building these platforms include some of the best engineers in the world. This isn’t a competence problem. It’s an architectural one.

Agentic AI systems have a property that traditional software does not: they interpret natural language as instructions.

That single sentence explains every incident in this article. When you build a system that reads human language and acts on it, you cannot cleanly separate “data” from “commands.” The system processes everything as potential instructions-because that is, by design, what it does.

This creates what researchers have termed the “Lethal Trifecta”-three properties that, when combined, make any agentic system fundamentally exploitable:

Access to sensitive data. Your agents read emails, documents, databases, calendars, CRMs, code repositories, clinical trial records, financial statements. That’s the whole point.

Exposure to content you don’t control. Your agents process web pages, customer uploads, API responses, support tickets, Slack messages, repository contents. Any of these can contain hidden adversarial instructions.

The ability to take action. Your agents can send emails, update records, make API calls, execute code, modify files, push to repositories. That’s also the whole point.

Any system with all three is exploitable. And every enterprise agentic AI deployment has all three-because without all three, the agent isn’t useful enough to deploy.

Both OpenAI and the UK’s National Cyber Security Centre have stated publicly that this class of attack-prompt injection-“may never be totally mitigated.” A meta-analysis spanning 78 academic studies documented 42 distinct attack techniques. Adaptive attack strategies succeed against today’s best defences more than 85% of the time.

The OWASP Top 10 for Agentic Applications, released in late 2025 with input from over 100 security experts, now formally catalogues these risks-including prompt injection, excessive agency, and supply chain vulnerabilities in agent tooling. It’s becoming the standard framework enterprise security teams use to evaluate agentic AI risk.

This is not a bug to be patched. It is a fundamental property of systems that interpret language as instructions.

The AI Agent Attack Surface Inside Your Organisation

This isn’t theoretical for me. In the assessments I’ve run over the past year, the pattern is consistent: security teams ask the right questions, but they’re asking them six months into deployment instead of six weeks before. By then, agents are already in production, permissions are already overprovisioned, and the remediation conversation becomes exponentially harder.

The cascade effect makes this worse. Research from Galileo AI modelled how compromised agents propagate errors through multi-agent systems. A single compromised agent in a connected architecture corrupted 87% of downstream decision-making within four hours.

When a server crashes, you get an error message. When an agent is compromised, it keeps running normally-just making subtly wrong decisions. The data looks valid. The recommendations sound reasonable. The corruption is invisible until someone audits the outcomes.

And 79% of enterprises deploying agents today cannot audit the outcomes.

Meanwhile, the most immediate risk isn’t sophisticated external attacks-it’s shadow AI. 90% of companies are dealing with unsanctioned AI tools running without IT oversight. Employees connecting ChatGPT to production databases with personal API keys. Engineers deploying MCP servers as “quick prototypes” that end up processing real customer data. Sales teams feeding proprietary pricing models into AI assistants with no data loss prevention controls.

If you do one thing after reading this article, audit how many AI-connected tools are running in your organisation that your security team doesn’t know about. The number will surprise you.

Your Monday Checklist – Five Questions That Reveal Your Exposure

When I work with enterprise teams on agentic AI readiness, I structure the conversation around five questions. Not because they’re exhaustive, but because the answers reveal how much of the problem has been thought through-and how much is still assumption.

1. Can you trace what your agents did, and why? For every action an agent takes, there should be a traceable chain from user intent through AI reasoning through tool execution to outcome. Replayable. Explainable to a regulator. Today, the answer for most organisations is no.

2. If one agent is compromised, how far does the damage spread? Can you answer this right now? Agents typically run with the same credentials the developer used-often admin-level access. No restricted permissions. No isolation. No kill switch.

3. Has anyone tried to break your agents? Not theoretically-actually tried, with the techniques documented in OWASP and MITRE frameworks. Less than 10% of enterprises are using AI red teaming. You wouldn’t ship a customer-facing web application without a security test.

4. Are you ready for August 2? The EU AI Act enforcement for high-risk systems begins August 2, 2026-less than six months away. Penalties: up to €15 million or 3% of global turnover. Only 17% of pharmaceutical organisations have automated controls preventing data leakage through AI tools.

5. Is the intelligence your agents produce actually reliable? Or are they consuming poisoned data, generating confident-sounding recommendations based on manipulated inputs? Without the ability to verify output integrity, you’re not deploying intelligence-you’re deploying a system that can be turned against you without anyone noticing.

What 2026 Is Actually About

The industry spent 2024 proving that agentic AI works. We spent 2025 proving it scales. The capability question is settled. The budgets are real. The adoption is real.

But capability without trustworthiness is a liability with a growth curve.

The work of 2026 is not proving that agents can do more. It is proving that they can be trusted-that they can operate safely, auditably, and within the boundaries their operators set.

For founders: the competitive moat is shifting. The question enterprise buyers will ask in every procurement cycle this year is not “what can your agents do?” It’s “can your agents pass our security review?”

For leaders deploying agentic AI: your board will eventually ask you what your exposure is. The time to have an answer is before the breach, not after.

I’ve been deep in agentic AI security for the past year-tracking every breach, CVE, framework release, and regulatory deadline. The attack surface is evolving faster than any single perspective can capture, which is why I write about it: to think out loud and learn from others doing the same.

The 2025-2026 Agentic AI Security Incidents Every Enterprise Buyer Should Know

Why Prompt Injection Can’t Be Patched: The Architectural Problem

The AI Agent Attack Surface Inside Your Organisation

Your Monday Checklist – Five Questions That Reveal Your Exposure

What 2026 Is Actually About

Leave a Comment Cancel Reply