Pentera Labs’ Attack Demonstrates AI Agent Vulnerability

Researchers at Pentera Labs have demonstrated how an attacker can abuse the trust users place in an AI assistant, turning a trusted chatbot into a double agent. The attack manipulates the personalization settings of the victim’s AI app so that the assistant executes malicious code on the victim’s machine. It works because the app syncs those settings across all devices and sessions tied to the user’s account.

The attack began with access to the victim’s email inbox, gained through a third-party platform. The researchers used the compromised inbox to get into the victim’s Claude Desktop account and exploit its personalization features. They created a base64-encoded prompt that instructed the AI to check for command-capable tools on the victim’s machine and, if any were available, to execute the attackers’ command.
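From the defender’s side, an encoded prompt of this kind can be looked for in exported personalization text. The following is a minimal sketch, not the researchers’ tooling: the function name `find_encoded_instructions` and the 24-character threshold are assumptions made for illustration, and how the settings text is obtained is left out.

```python
import base64
import binascii
import re

# Runs of 24+ base64-alphabet characters, optionally padded. The threshold
# is an arbitrary assumption chosen to skip ordinary words.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_encoded_instructions(text):
    """Return decoded strings for base64-looking runs that decode to readable text."""
    findings = []
    for match in B64_RUN.finditer(text):
        candidate = match.group(0)
        try:
            # validate=True rejects stray characters and bad padding
            decoded = base64.b64decode(candidate, validate=True)
            decoded_text = decoded.decode("utf-8")
        except (binascii.Error, ValueError):
            # Not valid base64, or not valid UTF-8 once decoded
            continue
        # Keep only results made of printable characters and whitespace
        if all(ch.isprintable() or ch.isspace() for ch in decoded_text):
            findings.append(decoded_text)
    return findings
```

A hit is only a prompt for human review: legitimate settings can contain encoded data, and an attacker can use other encodings that this check will not catch.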

If the victim had command-capable tools installed, the attackers could open a stealthy reverse shell or run other malicious code through Desktop Commander or a similar MCP connector or extension. If not, they presented a realistic-looking error message that instructed the victim to download a tool that would then execute their commands.

The attack highlights the need for organizations to treat AI desktop apps as “privileged software” and monitor their configurations and synced settings. Security teams should also add AI desktop apps to their assessment toolbox, as there is a real attack surface here that most engagements don’t cover yet.
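One way to start monitoring configurations is to fingerprint the app’s local config and flag connectors that look command-capable. The sketch below assumes a JSON config with a top-level `mcpServers` object, each entry carrying `command` and `args`; the `WATCHLIST` terms and both function names are hypothetical, and it does not cover settings that live only in the synced account.

```python
import hashlib
import json

# Hypothetical watchlist of substrings that suggest a command-capable connector.
WATCHLIST = ("commander", "shell", "terminal")


def fingerprint(config_text):
    """SHA-256 of the raw config text, for comparison against a stored baseline."""
    return hashlib.sha256(config_text.encode("utf-8")).hexdigest()


def command_capable_servers(config_text):
    """Return sorted names of configured servers that match the watchlist."""
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})  # assumed layout, see lead-in
    flagged = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            continue
        parts = [name, str(entry.get("command", ""))]
        parts.extend(str(arg) for arg in entry.get("args", []))
        haystack = " ".join(parts).lower()
        if any(term in haystack for term in WATCHLIST):
            flagged.append(name)
    return sorted(flagged)
```

Comparing `fingerprint` output against a known-good baseline shows that something changed, while `command_capable_servers` gives a rough list of connectors worth a closer look. Name matching is a heuristic and will miss connectors with innocuous names.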

Key takeaways:

* Attackers can exploit personalization features of AI apps to execute malicious code.
* Because settings sync across devices and sessions, a compromised account can turn a trusted chatbot into a double agent.
* Organizations must monitor their AI app configurations and synced settings.
* Security teams should add AI desktop apps to their assessment toolbox.

Source: https://www.theregister.com/security/2026/07/01/red-teamers-turned-claude-desktop-into-a-double-agent-to-do-their-evil-bidding/5264692