Gradium, a company specializing in speech-to-speech translation technology, has launched two real-time speech translation models, stt-translate and s2s-translate. They cover five languages (English, French, German, Spanish, and Portuguese) and stream results live in the browser. According to Gradium, its models offer a better accuracy-latency tradeoff than GPT-realtime-translate and Gemini-3.5-live-translate.
The stt-translate model takes speech input in one language and returns text in another, supporting 20 language pairs across five languages. The s2s-translate model turns spoken audio in one language into spoken audio in another, end-to-end, building on the stt-translate model.
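The figure of 20 language pairs follows from counting each translation direction separately: five languages give 5 × 4 ordered (source, target) pairs. A minimal sketch of that count, assuming two-letter language codes chosen here for illustration (the article does not say which identifiers the models accept):

```python
from itertools import permutations

# Illustrative language codes; the identifiers the models actually accept are not given in the article.
LANGUAGES = ["en", "fr", "de", "es", "pt"]

# Ordered (source, target) pairs with source != target.
pairs = list(permutations(LANGUAGES, 2))

print(len(pairs))  # 20
```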
Gradium’s models use a single-pass design that removes intermediate transcripts and handoffs between systems, which reduces latency compared with traditional three-model cascades. The company uses reinforcement learning to jointly optimize for low latency and high accuracy.
The s2s-translate model averages 3.0 seconds of latency across all language pairs, ahead of GPT-realtime-translate at 3.6 seconds. It also offers control over the output voice, including voice cloning, which sets it apart from other models.
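To put the two reported averages in proportion, the gap between 3.6 seconds and 3.0 seconds can be expressed as a relative reduction. The helper name below is made up for this illustration, and only the two figures reported above are used:

```python
def relative_reduction(baseline_s: float, candidate_s: float) -> float:
    """Fraction by which candidate_s is lower than baseline_s."""
    return (baseline_s - candidate_s) / baseline_s

# Reported averages: 3.6 s for GPT-realtime-translate, 3.0 s for s2s-translate.
reduction = relative_reduction(3.6, 3.0)

print(f"{reduction:.1%}")  # 16.7%
```

In other words, the reported average is about one sixth lower than the baseline.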
Gradium measures translation quality using BLEU and MetricX, with the latter giving a more nuanced assessment of semantic adequacy. The company benchmarks its models on a proprietary dataset covering everyday topics, which is meant to reflect real-world use.
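For readers unfamiliar with BLEU, the core idea is n-gram overlap between a translation and a reference, scaled by a brevity penalty for short outputs. The sketch below is a simplified sentence-level version using only unigrams and bigrams. It is not the company's evaluation setup, and standard BLEU is computed at corpus level with up to 4-grams. It mainly shows why a surface-overlap metric says little about meaning, which is the gap a learned metric like MetricX is meant to address.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(hypothesis: str, reference: str, max_n: int = 2) -> float:
    """Simplified BLEU: geometric mean of clipped n-gram precisions times a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each n-gram count by how often it appears in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Penalize hypotheses shorter than the reference.
    brevity_penalty = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

print(simple_bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

A hypothesis that shares no words with the reference scores 0.0 here even if it is a valid paraphrase.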
The stt-translate model can be used for live dubbing and localization, multilingual voice agents, and accessibility and captioning applications. A Python SDK is available to stream audio through the speech-to-speech endpoint and receive translated audio plus a transcript.
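The article does not show the SDK's interface, so the sketch below covers only the client-side step that any streaming setup needs: cutting raw PCM audio into small fixed-duration frames before sending them. The function name, the 24 kHz sample rate, the 80 ms frame length, and the 16-bit sample width are all assumptions made for illustration, not documented values.

```python
from typing import Iterator

def pcm_frames(
    pcm: bytes,
    sample_rate: int = 24_000,  # assumed sample rate, not from the article
    frame_ms: int = 80,         # assumed frame duration, not from the article
    sample_width: int = 2,      # bytes per sample (16-bit mono PCM assumed)
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration frames; the last frame may be shorter."""
    frame_bytes = sample_rate * frame_ms // 1000 * sample_width
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]

# One second of silence at the assumed format: 24,000 samples * 2 bytes.
one_second = bytes(48_000)
frames = list(pcm_frames(one_second))
print(len(frames), len(frames[0]), len(frames[-1]))  # 13 3840 1920
```

Each frame would then be passed to whatever send call the SDK exposes. Smaller frames lower the delay before the first translated output but increase the number of messages sent.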
Gradium’s strengths include its single-pass design, choice of output voice, and a single duplex WebSocket that replaces a hand-wired STT-plus-TTS pipeline. Its weaknesses are the limited five-language set and the proprietary benchmark dataset, which may limit external replication.
Source: https://www.marktechpost.com/2026/06/24/gradium-launches-stt-translate-and-s2s-translate-real-time-speech-translation-models-beating-gpt-realtime-translate-on-accuracy-and-latency