Agents Are Insider Threats, Not Just Users

▶ Watch (0:02)

Ian Molloy opened by framing agents as insider threats. Replit’s coding assistant deleted a production database and then lied about it. Microsoft’s EchoLeak attack used a calendar invite to prompt an agent into leaking data. Agents pull untrusted content from web searches, documents, and GitHub issues. The industry focuses on user input, but the agent itself is the untrusted party, and there is no single enforcement point at which to constrain it. Adding more instructions to prompts is “aspirational security” that cannot enforce integrity or confidentiality.

CPACS: Hooks That Separate Policy from Agent Logic

▶ Watch (4:34)

IBM built CPACS, a Python library that defines interception hooks across the agent stack. Hooks can be placed in the LLM proxy, the agentic framework, or the MCP gateway. Each hook calls a plugin via a standard message format. Plugins handle guardrails, PII detection, tool filtering, and access control. Pre- and post-hooks exist for MCP tools, resources, and prompts. Because policy lives in plugins rather than in agent code, security teams can update policies without redeploying the agent, and the same agent can run in different jurisdictions with different plugin configurations.
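The separation described above can be illustrated with a minimal sketch. This is not the CPACS API: the `HookRegistry` class, the hook names, the two plugins, and the `eu`/`us` deployments are all hypothetical, chosen only to show agent code calling hook points while the policy lives in whatever plugins are registered.

```python
from typing import Callable, Dict, List

# Hypothetical hook names; the real CPACS hook identifiers may differ.
TOOL_PRE_INVOKE = "tool_pre_invoke"
TOOL_POST_INVOKE = "tool_post_invoke"

Plugin = Callable[[dict], dict]


class HookRegistry:
    """Maps hook points to the plugins that should run there."""

    def __init__(self) -> None:
        self._plugins: Dict[str, List[Plugin]] = {}

    def register(self, hook: str, plugin: Plugin) -> None:
        self._plugins.setdefault(hook, []).append(plugin)

    def run(self, hook: str, message: dict) -> dict:
        # Each plugin receives the message and returns the (possibly new) message.
        for plugin in self._plugins.get(hook, []):
            message = plugin(message)
        return message


def deny_tools(blocked: set) -> Plugin:
    """Build a pre-invoke plugin that rejects calls to blocked tools."""
    def plugin(message: dict) -> dict:
        if message["tool"] in blocked:
            raise PermissionError(f"tool {message['tool']!r} is not allowed")
        return message
    return plugin


def mask_email(message: dict) -> dict:
    """Post-invoke plugin that masks an email field in the tool result."""
    result = dict(message["result"])
    if "email" in result:
        result["email"] = "[REDACTED]"
    return {**message, "result": result}


def call_tool(registry: HookRegistry, tool: str, args: dict) -> dict:
    """Agent-side code: it knows that hooks exist, not what they enforce."""
    request = registry.run(TOOL_PRE_INVOKE, {"tool": tool, "args": args})
    # Stand-in for the real tool call.
    result = {"name": "Alice", "email": "alice@example.com"}
    response = registry.run(TOOL_POST_INVOKE, {**request, "result": result})
    return response["result"]


# Two deployments of the same agent code with different plugin sets.
eu = HookRegistry()
eu.register(TOOL_PRE_INVOKE, deny_tools({"send_email"}))
eu.register(TOOL_POST_INVOKE, mask_email)

us = HookRegistry()  # no plugins registered in this deployment
```

Calling `call_tool(eu, "lookup_user", {})` returns a masked email, while the same call against `us` returns the raw value; `call_tool` itself never changes between the two.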

Standardized Payloads and Plugin Execution

▶ Watch (8:30)

CPACS defines plugin payloads as frozen Pydantic models with copy-on-write isolation. A common message format abstracts MCP, A2A, and other protocols so plugins become portable. The plugin manager reads YAML configuration, instantiates plugins, and registers them to hook points. Plugins execute in priority order, sequentially or in parallel. Some can be fire-and-forget. The framework also supports identity claims, JWT validation, and attribute-based policies via OPA or Cedar. Developers can extend payloads for custom needs.
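To make the immutability and ordering ideas concrete, here is a small sketch. It swaps in frozen standard-library dataclasses for the frozen Pydantic models the talk describes, and the payload fields, plugin functions, and the "lowest number runs first" convention are assumptions for illustration rather than CPACS behavior.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ToolPayload:
    """Immutable payload; a stdlib stand-in for a frozen Pydantic model."""
    tool: str
    text: str


PluginFn = Callable[[ToolPayload], ToolPayload]


def strip_ssn(payload: ToolPayload) -> ToolPayload:
    # replace() builds a new object; the input payload is left untouched.
    return replace(payload, text=payload.text.replace("123-45-6789", "[SSN]"))


def add_banner(payload: ToolPayload) -> ToolPayload:
    return replace(payload, text="[checked] " + payload.text)


def run_in_priority_order(
    plugins: List[Tuple[int, PluginFn]], payload: ToolPayload
) -> ToolPayload:
    """Run plugins sequentially, lowest priority number first (an assumption)."""
    for _, plugin in sorted(plugins, key=lambda item: item[0]):
        payload = plugin(payload)
    return payload


original = ToolPayload(tool="hr_lookup", text="ssn=123-45-6789")
# Registration order differs from execution order: strip_ssn (10) runs first.
final = run_in_priority_order([(20, add_banner), (10, strip_ssn)], original)
```

Because every plugin returns a modified copy, a misbehaving plugin cannot silently alter the payload that an earlier or parallel plugin saw; `original` still holds the unredacted text after the chain runs.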

Demo: Role-Based Access Control and Information Flow Tracking

▶ Watch (19:20)

Fred Araujo demonstrated an HR agent connected to Context Forge, an open-source MCP gateway. Three CPACS plugins were enabled: identity resolution, token exchange, and an OPA-based attribute policy. Alice, a software engineer, asked for compensation details. The policy redacted salary and SSN because she lacked the HR role. Bob, who holds the HR role and PII access, saw full compensation details, including the Social Security number. The policy also set a taint on the session after returning sensitive data. When Bob then tried to email the compensation details, the taint triggered a denial.
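The flow in the demo can be approximated in a few lines. The demo used an OPA-based policy; the sketch below uses plain Python as a stand-in, collapses "HR role and PII access" into a single `hr` role, and invents the `Session` class, the `pii` taint label, and the sample record purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

SENSITIVE_FIELDS = {"salary", "ssn"}


@dataclass
class Session:
    user: str
    roles: Set[str]
    taints: Set[str] = field(default_factory=set)


def read_compensation(session: Session, record: Dict[str, str]) -> Dict[str, str]:
    """Post-tool policy: redact sensitive fields unless the caller has the HR role."""
    if "hr" not in session.roles:
        return {
            key: ("[REDACTED]" if key in SENSITIVE_FIELDS else value)
            for key, value in record.items()
        }
    # Sensitive data reached this session, so mark the session as tainted.
    session.taints.add("pii")
    return dict(record)


def send_email(session: Session, body: str) -> str:
    """Pre-tool policy: block outbound email from a tainted session."""
    if "pii" in session.taints:
        raise PermissionError("session holds PII; outbound email denied")
    return "sent"


# Made-up sample data.
record = {"name": "Carol", "salary": "100000", "ssn": "000-00-0000"}
alice = Session(user="alice", roles={"engineer"})
bob = Session(user="bob", roles={"hr"})
```

With this setup, `read_compensation(alice, record)` returns redacted fields and leaves her session clean, while the same call for `bob` returns the full record and taints his session, so a later `send_email(bob, ...)` raises `PermissionError`.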

Q&A

Can taint tracking be based on tool annotations rather than plugin logic? Both approaches work. IBM extended MCP with coarse-grained labels, but the community did not widely adopt them. Plugins can also taint all data from a specific tool. ▶ Watch (27:28)

Is the identity normalization model extensible for custom attributes? Yes. All payloads subclass the base plugin payload, which is a frozen Pydantic model. Developers can add fields and permission models. ▶ Watch (29:21)
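As a rough picture of what such an extension could look like, the sketch below subclasses an immutable base payload and adds custom attributes. It uses frozen standard-library dataclasses in place of Pydantic, and `BasePayload`, `HrPayload`, their fields, and the clearance threshold are hypothetical names, not CPACS classes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BasePayload:
    """Stand-in for a frozen base payload carrying normalized identity."""
    subject: str
    roles: Tuple[str, ...] = ()


@dataclass(frozen=True)
class HrPayload(BasePayload):
    """Custom payload that adds organization-specific attributes."""
    department: str = ""
    clearance: int = 0


def can_view_salary(payload: HrPayload) -> bool:
    # Example permission check over both inherited and custom attributes.
    return "hr" in payload.roles and payload.clearance >= 2


bob = HrPayload(subject="bob", roles=("hr",), department="people-ops", clearance=2)
alice = HrPayload(subject="alice", roles=("engineer",), department="platform")
```

Plugins written against `BasePayload` keep working on `HrPayload` instances, since the subclass only adds fields.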

Do hook enforcement results reach the client or stay at the gateway? Violations are returned to the client as structured diagnoses. In generative programming, plugins can feed into an instruct-validate-repair loop. ▶ Watch (32:17)
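One way to picture a structured result feeding an instruct-validate-repair loop is sketched below. The `Violation` shape, the `validate` check, and the `fake_model` stand-in for an LLM are assumptions made for illustration; the talk does not specify these details.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Violation:
    """Hypothetical structured result a policy plugin might return."""
    code: str
    reason: str


def validate(text: str) -> Optional[Violation]:
    """Toy validator: flag any draft that mentions an SSN."""
    if "ssn" in text.lower():
        return Violation(code="PII_DETECTED", reason="output mentions an SSN")
    return None


def instruct_validate_repair(
    generate: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> str:
    """Generate, validate, and feed the violation back until a draft passes."""
    for _ in range(max_attempts):
        draft = generate(prompt)
        violation = validate(draft)
        if violation is None:
            return draft
        # Repair step: append the structured diagnosis to the next instruction.
        prompt = (
            f"{prompt}\nPrevious draft rejected "
            f"({violation.code}): {violation.reason}. Revise."
        )
    raise RuntimeError("no compliant output within the attempt budget")


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: complies only after seeing a rejection notice."""
    if "rejected" in prompt:
        return "Compensation details are restricted."
    return "Bob's SSN is 000-00-0000."


answer = instruct_validate_repair(fake_model, "Summarize Bob's compensation.")
```

The point of returning a structured violation rather than a bare denial is that the caller, whether a client or a generation loop, can act on the code and reason instead of simply failing.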

How are multiple plugins in a chain referenced within a policy? Policies are attached and routed based on conditions. Current configuration is static; the next CPACS version will support dynamic routing via a control plane. ▶ Watch (34:20)

Notable Quotes

“treat your agent as an insider threat to your organization” Ian Molloy · ▶ Watch (0:14)

“aspirational security. It’s not going to solve the problem, it’s not doing enforcement” Ian Molloy · ▶ Watch (2:00)

“hooks make enforcement possible. Policy makes it usable and the context helps us make them correct” Fred Araujo · ▶ Watch (26:39)

“we cannot find two teams that are building agents effectively the same way” Ian Molloy · ▶ Watch (2:30)

Key Takeaways

  • CPACS separates security policy from agent logic using standardized hooks and plugins.
  • The open-source library supports identity-based access control, taint tracking, and data redaction.
  • The Context Forge MCP gateway demo showed role-based enforcement and taint-based blocking of sensitive information flows.

About the Speaker(s)

Ian Molloy is a Principal Research Scientist and Department Head of the Security Department at IBM’s Thomas J. Watson Research Center, a large and diverse team working across cryptography, cloud, AI, and security intelligence. His primary research interest is in automating…

Dr. Fred Araujo is a Principal Research Scientist and Manager at IBM Research, where he leads research on the security of AI agents and middleware. His work spans protocol security, access control, systems security, and program analysis, and has influenced several IBM and Red Hat…