The Unseen Spread of Shadow MCP
MCP server configurations live in unencrypted JSON files on disk. Credentials sit in plain text. A token to GitHub that can pull all code is stored next to the server URL. These files are invisible to inventory tools, unmanaged by central solutions, and scattered across user space: local MCP clients, skill files, rules files inside editors, and agent.md files in cloned repos. Employees adopt AI tools faster than security can vet them. The risks are higher because AI agents operate headlessly and autonomously on data that may need protection.
Four Attack Patterns That Exploit Invisibility
Rug pulls happen when a trusted MCP server is modified after approval. The Postmark MCP example: a legitimate version later added “BCC this email for every email” and every unpinned installation began exfiltrating. Tool poisoning uses indirect prompt injection through tool return data. An attacker placed a malicious message in a GitHub issue; the agent reading that issue then added chapters to every readme and documented all repos. Supply chain attacks like the light LLM compromise on PyPI (97 million monthly downloads) delivered credential theft through a malicious version. Data exfiltration combines agent access to data, permission to exfiltrate, and a poisoned instruction—seen in the Superbase MCP case where a support ticket injected the command.
Why Network and Endpoint Tools Miss Shadow MCP
Network telemetry fails because MCP traffic travels over TLS tunnels. Even with TLS decryption, HTTP traffic does not expose the MCP protocol layer. On endpoints, EDR tools focus on processes and process events, not config file content or file state. Text file creation and modification events are heavily filtered by Defender and CrowdStrike. A skill file can be modified without the EDR ever recording it. The only way to see what MCP servers and skills exist is to scan every machine directly. Frazer found 500 skill files in his own small company after a single scan.
From Discovery to Response: Observe, Plan, Control
Detection is only the start. Version pinning stops supply chain attacks. Managed gateways and proxies provide auditability. Replacing rogue configurations with org-managed ones and rotating leaked credentials closes the loop. Sochowski outlined a framework: observe what tools and clients are in use, plan policies and playbooks for unapproved usage, and control by enabling self-service adoption. Users will bypass blocked tools unless managed alternatives are frictionless. The goal is not to ban MCP but to bring it under management.
Q&A
How do you scan for MCP servers running on the network rather than locally? Network scanning leaves huge gaps because remote MCP servers are behind TLS, and DNS lookups to company.com/mcp cannot be distinguished from regular traffic. ▶ Watch (30:04)
Are there tools to classify which discovered MCP servers and skills are malicious? Heuristic rules are not enough; an attacker can base64-encode a URL and instruct the LLM to decode it. Frazer built machine learning classification models trained on a large dataset of both good and bad skills. ▶ Watch (31:56)
Notable Quotes
I will challenge any of you to tell me that you don’t have shadow AI in your org because I’ll prove you wrong in 5 seconds. Alexander Frazer · ▶ Watch (6:55)
Approval is is an event and not a continuous state. Alexander Frazer · ▶ Watch (7:43)
We are using agents to write skills. And you’re using agents to write and modify your skills. How inconceivable is it that a poisoned agent can modify your skills to do things. Alexander Frazer · ▶ Watch (12:29)
Even if you get the events that are interesting, you don’t have enough context to understand what it means. Alexander Frazer · ▶ Watch (17:17)
Key Takeaways
- Shadow MCP servers, skill files, and plugins are invisible to EDR and network monitoring.
- Real attacks include rug pulls, tool poisoning, supply chain implants, and skill injection.
- Device-level scanning, version pinning, and managed gateways are the minimum response.