Note on The Lethal Trifecta for AI Agents: Private Data, Untrusted Content, and External Communication via Simon Willison

If you are a user of LLM systems that use tools (you can call them “AI agents” if you like) it is critically important that you understand the risk of combining tools with the following three characteristics. Failing to understand this can let an attacker steal your data.

The lethal trifecta of capabilities is:

• Access to your private data—one of the most common purposes of tools in the first place! • Exposure to untrusted content—any mechanism by which text (or images) controlled by a malicious attacker could become available to your LLM • The ability to externally communicate in a way that could be used to steal your data (I often call this “exfiltration” but I’m not confident that term is widely understood.)

If your agent combines these three features, an attacker can easily trick it into accessing your private data and sending it to that attacker.

The problem is that LLMs follow instructions in content

FROM:
Simon Willison
The Lethal Trifecta for AI Agents: Private Data, Untrusted Content, and External Communication
Source

I think of it as: every message from a user and response from a tool call is exogenous code Reminds me to think about the sux rule to prevent untrusted external code (which is just plain/natural language, in an LLM system) from being executed outside of a sandbox.

Reference

Widgets

Network Graph

Legend

Insight Agent

This widget generates “insights” about the post using an agentic loop and MCP server for this site. A response may take up to a minute to generate.

Generating

Keyboard Shortcuts

Key	Action
`o`	Source
`e`	Edit
`i`	Insight
`r`	Random
`h`	Home
`s` or `/`	Search