MCP Apps as Interactive Views for AI Agents

▶ Watch (1:07)

Anton compared MCP Apps to HDMI over USB: same protocol shape, but with extra capability. In a demo, a PDF viewer tool returned an interactive form. The user filled it in, stamped it as a draft, and added a signature, and the tool modified the actual PDF file. The tool result includes a view UUID that identifies the view in later communication. The app can update the model context with screenshots and text; these updates consume tokens only when the next turn is taken. The view sits on top of tools and connects to the server.

Using Tool Calls for Authenticated Data Access

▶ Watch (7:31)

Apps need data beyond the initial tool call. For unauthenticated data, apps fetch from declared domains with CSP restrictions. For authenticated data, Anton recommends using tool calls through the server, reusing the host’s auth infrastructure. Tools can be marked app-visible only, so the model never sees them. Patterns include polling, parallel calls, chunking, and returning binary blobs. The view UUID keys server-side state. Partial streaming of tool inputs allows progressive rendering.
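The chunking pattern can be sketched as follows. This is a minimal illustration rather than the PDF server's actual code: the names `DOCUMENTS`, `open_view`, `read_chunk`, and `fetch_all` are hypothetical, the store is a plain in-memory dict, and the MCP transport is left out so that only the view-UUID keying and the offset-based loop are shown.

```python
import base64
import uuid

# Hypothetical in-memory store: view UUID -> bytes of the document that view shows.
DOCUMENTS: dict[str, bytes] = {}


def open_view(data: bytes) -> str:
    """Register a document and return the view UUID a tool result would carry."""
    view_id = str(uuid.uuid4())
    DOCUMENTS[view_id] = data
    return view_id


def read_chunk(view_id: str, offset: int, max_bytes: int = 64 * 1024) -> dict:
    """Body of a hypothetical app-only tool: return one base64-encoded slice."""
    data = DOCUMENTS[view_id]
    chunk = data[offset:offset + max_bytes]
    next_offset = offset + len(chunk)
    return {
        "data": base64.b64encode(chunk).decode("ascii"),
        "next_offset": next_offset,
        "done": next_offset >= len(data),
    }


def fetch_all(view_id: str, max_bytes: int) -> bytes:
    """What the app side would do: call the tool repeatedly until it reports done."""
    parts = []
    offset = 0
    while True:
        result = read_chunk(view_id, offset, max_bytes)
        parts.append(base64.b64decode(result["data"]))
        offset = result["next_offset"]
        if result["done"]:
            return b"".join(parts)
```

In a real server, `read_chunk` would be registered as a tool and marked app-visible only, so the model never sees it and the host's existing auth covers every call.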

Managing Interaction and State Persistence

▶ Watch (9:35)

Interactions happen in the same UI without reloading. An upcoming spec change will allow view-local tools, declared by the view itself. Currently, the PDF server emulates this with a command queue on the server, using the view UUID. The app polls the queue and executes commands. State persistence is DIY: store state server-side keyed by view UUID, or use local storage for best-effort. The spec plans to add core state persistence so reloaded chats restore app state exactly.
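The server-side command queue described above might look roughly like this. It is a sketch under assumptions: the class name `CommandQueue`, its methods, and the command dictionaries are hypothetical and are not taken from the PDF server's source.

```python
from collections import defaultdict, deque


class CommandQueue:
    """Per-view queues of pending commands, keyed by view UUID (hypothetical)."""

    def __init__(self) -> None:
        self._queues: dict[str, deque] = defaultdict(deque)

    def enqueue(self, view_id: str, command: dict) -> None:
        # Called from a model-visible tool, e.g. "go to page 3".
        self._queues[view_id].append(command)

    def poll(self, view_id: str) -> list[dict]:
        # Called from an app-only tool; drains the queue in FIFO order.
        queue = self._queues.get(view_id)
        if not queue:
            return []
        commands = list(queue)
        queue.clear()
        return commands


queue = CommandQueue()
queue.enqueue("view-1", {"op": "goto_page", "page": 3})
queue.enqueue("view-1", {"op": "highlight", "text": "draft"})
pending = queue.poll("view-1")  # both commands, oldest first
```

Because each queue is keyed by view UUID, two views open in the same chat do not receive each other's commands.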

Latency and Streaming with Partial Inputs

▶ Watch (18:03)

If an MCP app is AI-powered, tool call arguments can be 10,000+ tokens, creating latency. Olivier introduced partial input streaming: the app can start processing before the full tool call arrives. The host chunks the JSON and sends partial inputs. The app can show a progress bar or progressively render. Examples include streaming CSV for spreadsheets, HTML for generative UI, and SVG for diagrams. Claude’s “show me” feature uses one tool that streams HTML progressively.
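As a rough illustration of progressive rendering for the CSV case, the sketch below emits only rows that are already complete. It assumes that each partial update carries the full text received so far and that no field contains an embedded newline; the class name `ProgressiveCsv` is hypothetical.

```python
import csv
import io


class ProgressiveCsv:
    """Turn successive partial CSV strings into newly completed rows."""

    def __init__(self) -> None:
        self._rows_seen = 0  # rows already handed to the renderer

    def update(self, partial_text: str) -> list[list[str]]:
        # Text after the last newline may be a half-written row, so ignore it.
        # Assumption: fields never contain embedded newlines.
        cut = partial_text.rfind("\n")
        if cut == -1:
            return []
        rows = list(csv.reader(io.StringIO(partial_text[:cut + 1])))
        new_rows = rows[self._rows_seen:]
        self._rows_seen = len(rows)
        return new_rows


parser = ProgressiveCsv()
first = parser.update("name,qty\napple,")            # header row only
second = parser.update("name,qty\napple,3\npear,")   # the completed "apple" row
```

The same idea applies to HTML or SVG: render what is known to be complete and hold back the trailing fragment until more input arrives.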

Q&A

How does the UI cost fewer tokens than text? The HTML is pre-built by the developer, so the model doesn’t need to generate tokens to present the JSON data; the app renders instantly. ▶ Watch (21:32)

How can the model access the data behind a chart for further inference? Have the app push the data to the model by updating the model context, or expose tools that let the model pull extra data. ▶ Watch (23:11)

Is there a way for the LLM to push data into the app without polling? Upcoming spec changes will allow the host to talk directly to the view, eliminating the need for polling. ▶ Watch (24:25)

What are the best practices for using screenshots vs. text to give the model context? Use screenshots for visual context, debounced and at a reasonable resolution; also provide a tool for the model to request a screenshot. ▶ Watch (25:26)
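Debouncing here means waiting until the view has stopped changing before sending a screenshot. A minimal sketch, with a hypothetical `Debouncer` class and the clock passed in explicitly so the logic is easy to follow:

```python
class Debouncer:
    """Report 'send now' only after a quiet period with no further changes."""

    def __init__(self, quiet_seconds: float) -> None:
        self.quiet_seconds = quiet_seconds
        self._last_change: float | None = None  # None means nothing pending

    def mark_changed(self, now: float) -> None:
        # Every change restarts the quiet period.
        self._last_change = now

    def should_send(self, now: float) -> bool:
        if self._last_change is None:
            return False
        if now - self._last_change < self.quiet_seconds:
            return False
        self._last_change = None  # send once per burst of changes
        return True
```

An app would call `mark_changed` on each UI change and check `should_send` on a timer, capturing a screenshot only when it returns true.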

For partial streaming of tool inputs, who is responsible for chunking? The host is responsible, and the feature is optional. ▶ Watch (28:15)

Notable Quotes

“MCP apps, like the view part of the MCP app, are the tip of the iceberg of your agentic workflow.” Anton Pidkuiko · ▶ Watch (5:09)

“If your MCP app really AI powered, sometimes it can get a lot of input from LLM as arguments to your tool call. Sometimes it can be 10,000 of token and easily becomes latency bottleneck.” Olivier Chafik · ▶ Watch (18:10)

“We don’t use any fancy frameworks. It’s simply one tool that gets HTML and we display this HTML progressively.” Olivier Chafik · ▶ Watch (20:10)

“The idea is to allow former chats to be reloaded and the apps to look exactly the same as how you left them last time you opened the chat.” Anton Pidkuiko · ▶ Watch (12:57)

Key Takeaways

  • MCP Apps reduce token usage by replacing model-generated text with pre-built interactive HTML views.
  • Use tool calls for authenticated data access and mark tools app-visible only to keep them from the model.
  • Partial input streaming lets apps start rendering before the full tool call arrives, improving perceived latency.

About the Speakers

Anton Pidkuiko works on MCP (Model Context Protocol) tooling and integrations at Anthropic, with a focus on MCP Apps, connectors, and interactive UI surfaces.

Olivier Chafik is co-author of the MCP Apps extension. He joined the MCP team at Anthropic in 2025, after working at Google (mostly AdSense) for 13 years. In recent years, he contributed to OSS projects such as OpenSCAD & llama.cpp, and is particularly excited by tool calling as the next interoperability…