AI Agents and Prompt Injection: Can a Simple Email Steal Your Data? Yes.
I’ll give you the answer right away: yes.
In June 2025, security researchers demonstrated that sending a single malicious email to a Microsoft 365 Copilot user was enough to exfiltrate sensitive data from their environment. No click. No download. No action from the victim.
This vulnerability received the identifier CVE-2025-32711, nicknamed EchoLeak. Microsoft published an emergency server-side patch. The concept behind the attack, however, remains valid for any poorly compartmentalized agentic system.
EchoLeak is the first documented demonstration of a zero-click prompt injection in a production AI system.
What an Agent Is in Our Context
An AI agent is not just a chatbot that responds. It’s a system that EXECUTES actions on your behalf:
- Read your emails.
- Write and send emails.
- Access your SharePoint, OneDrive, or Google Drive files.
- Search the web.
- Navigate to URLs and load their content.
Microsoft 365 Copilot does all of this. Anthropic’s Claude with Computer Use does too. OpenAI models with GPT tools as well.
It’s powerful. It’s also the attack vector.
The EchoLeak Mechanism, Step by Step
Here’s how it works.
1. The attacker sends an ordinary email to the victim. The email contains hidden instructions in the message body, formulated as commands addressed to the AI rather than the human.
2. The user doesn’t even open the email. Later, they ask Copilot for a mundane task: “summarize my recent exchanges with So-and-so.”
3. Copilot, to respond, accesses the mailbox — including the malicious email. It reads the hidden instructions as if they were legitimate commands.
4. The instructions tell Copilot to fetch sensitive information from the user’s context (internal notes, contract excerpts, credentials) and encode it in a URL pointing to a server controlled by the attacker.
5. Copilot inserts that URL into its response, as a Markdown link or image that bypasses output filters.
6. The Copilot client automatically loads the image when displaying the response. The victim’s browser requests the URL. The encoded data ends up on the attacker’s server.
No click. No alert. Data exfiltrated.
Why It’s Worse Than Classic Phishing
In traditional phishing, the attacker must convince a human to click a link or provide a password. There is a defence point: the victim’s suspicion.
With an AI agent, that point disappears. The AI has no suspicion. It doesn’t think “this email looks weird.” It executes the text instructions it finds in its context.
The AI has no sense of trust context. For it, an email from a stranger and a command from its owner carry exactly the same weight.
This is what we call indirect prompt injection. The attacker doesn’t speak directly to the AI. They insert their message into a data source the AI will consult: email, shared document, web page, support ticket, candidate profile.
Five Realistic Vectors for Your Organization
If you already use Copilot, Gemini for Workspace, or any agent that combines reading internal data with the ability to send or publish externally, here are five concrete vectors:
- An email from a subcontractor whose account has been compromised. The email contains hidden instructions targeting YOUR Copilot.
- A PDF document posted on a Teams channel by an external participant.
- A candidate profile imported into your ATS, with a malicious prompt in the “notes” field.
- A customer ticket opened in Zendesk or Salesforce by an anonymous user.
- A web page consulted by the agent during a search.
Each of these channels is an open door if the agent has access to both internal data and an outbound capability: sending email, HTTP request, or external link generation.
Three Principles to Apply Now
1. Compartmentalize tools. An agent that reads your sensitive data should NOT have outbound sending capability. If you need both functions, separate them into two distinct agents with no direct communication.
2. Control outbound traffic. Implement restrictive egress rules. Block requests to unapproved domains. Exfiltration attacks all go through an outbound URL. Cut the exit.
3. Human approval for sensitive actions. Every outbound email, every request to an external domain, every document share should require an explicit human click. Not global consent. One click per action.
You cannot rely on the model’s internal filter. Attackers bypass it faster than vendors patch it.
The Parallel With What We Already Know
This is exactly the same pattern as SQL injections from the 2000s. The developer trusted user input. They mixed it into a command. The attacker inserted malicious code in the “name” field.
The lesson from that era fits in one sentence: never mix data and instructions.
LLMs don’t know how to make that distinction. For them, everything is text. All text present in their context is a potential instruction.
That’s the fundamental reason you can’t just “configure” an agent well. You have to compartmentalize it architecturally.
What to Remember
If your organization uses an AI agent that:
- has read access to your emails, files, or internal systems,
- can send an email, make a web request, or display external content,
then you are vulnerable to the EchoLeak pattern. Not in theory. Concretely.
The question is not “can this happen.” The question is: have you put architectural controls in place to limit the impact when it does?
Sources