The Vulnerability Overload Driving AI-Powered Scanners

▶ Watch (1:54)

Milan Williams described the pressure. Engineering teams face mandates requiring 90% of PRs to come from Cursor or Claude Code. Attackers use AI to instrument exploits for low-severity vulnerabilities faster than ever. The number of vulnerabilities is exploding, and security teams are overwhelmed and understaffed. Semgrep built an MCP server to scan code immediately after it is generated inside those agentic IDEs.

The Threat Landscape: MCP Remote RCE and Dedicated Attackers

▶ Watch (3:41)

Semgrep faced two threat vectors. In late 2024, an MCP remote command execution vulnerability affected over 400,000 developers. The affected tool turned local servers into remote ones. Separately, a coordinated attack group called Team PCP targeted open-source security companies to steal customer data. Team PCP was behind supply chain attacks on Axios and LLM. These threats forced stricter security standards.

Three Non-Negotiables for a Security Company’s MCP Server

▶ Watch (5:43)

Williams outlined three non-negotiables. First, protect customer environments. Vulnerabilities and source code are highly confidential IP. Second, reliability. Vulnerabilities must not appear one day and disappear the next. Deterministic outputs let teams codify security at scale. Third, minimize blast radius. Authentication and authorization must be scoped so a single compromise does not cascade to all customers.
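The second and third points can be made concrete with a small sketch. It is illustrative only: the `Token` type, the scope string, and both function names are hypothetical and not taken from Semgrep's implementation. It shows a per-customer authorization check (one leaked token reaches one customer) and a canonical fingerprint of findings (the same findings always produce the same report).

```python
import hashlib
import json
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Token:
    customer_id: str        # the single customer this token belongs to
    scopes: FrozenSet[str]  # hypothetical scope names, e.g. "findings:read"


def authorize(token: Token, customer_id: str, scope: str) -> bool:
    """Allow a request only for the token's own customer and a granted scope."""
    return token.customer_id == customer_id and scope in token.scopes


def report_fingerprint(findings: List[dict]) -> str:
    """Hash findings in a canonical order so equal results give equal reports."""
    canonical = sorted(json.dumps(f, sort_keys=True) for f in findings)
    return hashlib.sha256("\n".join(canonical).encode()).hexdigest()
```

With this shape, a token issued for one customer fails `authorize` for every other customer, and two scans that return the same findings in a different order still compare equal by fingerprint.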

Non-Determinism Solved by Hooks

▶ Watch (8:02)

Katrina Liu explained that agents rarely called Semgrep’s scan tool voluntarily, even with Cursor rules. The team adopted hooks, a feature in agentic protocols that fires commands at specific events. They configured a post-tool-use hook on write and edit tools. Every time an agent wrote code, Semgrep scanned that snippet instantly. In a live demo, Claude generated a debug statement and Semgrep caught it inline, prompting the fix immediately.
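A minimal sketch of what such a hook script might look like follows. It is not Semgrep's actual hook. It assumes the agent passes a JSON event on stdin with `tool_name` and `tool_input.file_path` fields, that the write and edit tools are named `Write` and `Edit`, that `semgrep scan --json --config auto` is an acceptable invocation, and that exit code 2 is how findings are surfaced back to the agent. All of these should be checked against the agent's hook documentation and the Semgrep CLI reference.

```python
import json
import subprocess
import sys
from typing import List, Optional

# Assumed names of the agent's code-writing tools.
WATCHED_TOOLS = {"Write", "Edit"}


def file_to_scan(payload: dict) -> Optional[str]:
    """Return the path the agent just wrote, or None if the event is irrelevant."""
    if payload.get("tool_name") not in WATCHED_TOOLS:
        return None
    # Assumption: the tool input carries the written path under "file_path".
    return payload.get("tool_input", {}).get("file_path")


def scan_command(path: str) -> List[str]:
    """Build a Semgrep CLI invocation for one file (flags are an assumption)."""
    return ["semgrep", "scan", "--json", "--config", "auto", path]


def handle_event(raw: str, run=subprocess.run) -> int:
    """Scan the written file and return the exit code the hook should use."""
    path = file_to_scan(json.loads(raw))
    if path is None:
        return 0
    done = run(scan_command(path), capture_output=True, text=True)
    findings = json.loads(done.stdout or "{}").get("results", [])
    for f in findings:
        print(f"{f['path']}:{f['start']['line']}: {f['check_id']}", file=sys.stderr)
    # Assumption: a non-zero exit feeds the stderr text back to the agent.
    return 2 if findings else 0


def main() -> None:
    """Entry point for the hook: read the event from stdin and exit."""
    sys.exit(handle_event(sys.stdin.read()))
```

Because the hook runs on every write event regardless of what the model decides, the scan no longer depends on the agent choosing to call a tool. The `run` parameter exists only so the handler can be exercised without Semgrep installed.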

Hallucinated File Content and the Local Server Tradeoff

▶ Watch (10:38)

The second challenge: agents hallucinated file content when it was passed as a tool parameter, especially for log files. Base64-encoding the content consumed too many tokens. Semgrep abandoned the remote MCP server and returned to a local one that reads the user’s file system directly. This fixes accuracy but gives up easy installation and OAuth. Newer hooks that can override tool arguments may make a remote server viable again.
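The tradeoff can be illustrated with a short sketch. The function names are hypothetical, and character count is used only as a rough proxy for tokens. It compares the size of a file sent inline as base64 with the size of a path argument, and shows the local-server alternative of reading the exact bytes from disk so the agent never has to reproduce them.

```python
import base64
import tempfile
from pathlib import Path


def inline_payload_size(content: bytes) -> int:
    """Characters needed to pass a file inline as a base64 tool argument."""
    return len(base64.b64encode(content))


def path_payload_size(path: Path) -> int:
    """Characters needed when only the path is passed to the tool."""
    return len(str(path))


def read_for_scan(path: Path) -> bytes:
    """What a local server can do: read the bytes on disk directly."""
    return path.read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "app.log"
    log.write_bytes(b"x" * 3000)
    print(inline_payload_size(log.read_bytes()))  # prints 4000
    print(path_payload_size(log) < 4000)
```

Base64 turns every 3 bytes into 4 characters, so the inline argument is always about a third larger than the file itself, while a path stays short no matter how large the file grows.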

Authentication: Tokens Today, OAuth Tomorrow

▶ Watch (12:56)

Authentication was essential to protect customer code and findings. For the local server, Semgrep uses a CLI token obtained via browser authentication and stored locally. For a remote server, they implemented OAuth, but most users stay on local. A blocker remains: the hooks that make Semgrep deterministic use the same token as the local server, and OAuth is protocol-specific. Finding an alternative authentication method for hooks is an open problem.
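As a rough sketch of the shared-token arrangement, the code below shows one loader that both a local server and a hook script could call. Everything named here is hypothetical: the talk does not say where the token is stored or how it is sent, so the file path, the `EXAMPLE_SCANNER_TOKEN` variable, and the bearer header are placeholders.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical location; the real storage path is not given in the talk.
DEFAULT_TOKEN_FILE = Path.home() / ".example-scanner" / "token.json"


def load_token(env: Mapping[str, str] = os.environ,
               token_file: Path = DEFAULT_TOKEN_FILE) -> Optional[str]:
    """Return the locally stored CLI token, preferring an environment override."""
    if env.get("EXAMPLE_SCANNER_TOKEN"):
        return env["EXAMPLE_SCANNER_TOKEN"]
    if token_file.is_file():
        return json.loads(token_file.read_text()).get("token")
    return None


def auth_headers(token: Optional[str]) -> dict:
    """Build request headers; a local server and a hook would both call this."""
    if token is None:
        raise PermissionError("not logged in: run the browser login flow first")
    return {"Authorization": f"Bearer {token}"}
```

The sketch makes the blocker visible: a hook is a plain command with access to local files, so it can read a stored token, but it has no place in an OAuth flow negotiated between an MCP client and a remote server.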

Notable Quotes

“over 400,000 developers” Milan Williams · ▶ Watch (4:23)

“the agent actually hallucinates the file content” Katrina Liu · ▶ Watch (11:17)

“security teams wind up being really overstaffed, not overstaffed, overwhelmed and understaffed” Milan Williams · ▶ Watch (3:04)

“It’s almost like a one-click install” Katrina Liu · ▶ Watch (12:22)

Key Takeaways

  • Hooks force deterministic scanning after every code write, solving agent non-determinism.
  • Hallucinated file content makes remote MCP servers unreliable; local servers fix accuracy.
  • Authentication must be scoped per customer and must also cover hooks, which cannot use OAuth today.

About the Speaker(s)

Milan Williams builds security products. He is a Senior Product Manager at Semgrep, a high-growth cybersecurity startup. He leads the teams responsible for Semgrep Code (SAST) and Secrets detection products. He recently graduated from Harvard University with degrees in Computer Science and Physics.

Katrina Liu is a software engineer at Semgrep. She is on the Semgrep Analysis Foundations Team, which owns and maintains the core static analysis functionality of the Semgrep tool. She is currently working on Semgrep’s MCP server.