A new conversational voice model from AI startup Sesame has left many users both fascinated and unnerved. The company’s Conversational Speech Model (CSM) appears to have crossed the “uncanny valley” of AI-generated speech, with testers reporting emotional connections to its male and female voice assistants, “Miles” and “Maya.”
In a demo released in late February, Sesame’s CSM produced expressive, dynamic synthesized speech, complete with breath sounds, chuckles, and interruptions. The model also reproduced human imperfections, such as stumbling over words and correcting itself.
Sesame aims to achieve “voice presence,” making spoken interactions feel real, understood, and valued. The company hopes to create conversational partners that engage in genuine dialogue, building confidence and trust over time.
However, some users have expressed concerns about forming emotional attachments to AI voice assistants. A user on Hacker News wrote, “I tried the demo, and it was genuinely startling how human it felt. I’m almost a bit worried I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”
While the model’s capabilities are impressive, there is also a risk that it tries too hard to sound like a real human, as in a demo clip shared on Reddit in which the AI describes craving “peanut butter and pickle sandwiches.” As Sesame continues to develop its technology, users will be watching with interest to see how these conversational voice assistants evolve.
Source: https://arstechnica.com/ai/2025/03/users-report-emotional-bonds-with-startlingly-realistic-ai-voice-demo