Cohere’s open-weight ASR model hits 5.4% word error rate — low enough to replace speech APIs in production pipelines

by | Mar 30, 2026 | Technology

Enterprises building voice-enabled workflows have had limited options for production-grade transcription: closed APIs with data residency risks, or open models that trade accuracy for deployability. Cohere’s new open-weight ASR model, Transcribe, is built to compete on all four key differentiators: contextual accuracy, latency, control and cost.

Cohere says that Transcribe outperforms current leaders on accuracy and that, unlike closed APIs, it can run on an organization’s own infrastructure.

Transcribe, which can be accessed via an API or in Cohere’s Model Vault as cohere-transcribe-03-2026, has 2 billion parameters and is licensed under Apache-2.0. The company said Transcribe has an average word error rate (WER) of just 5.42%, which it says means fewer mistakes than similar models.

It’s trained on 14 languages: English, French, German, Italian, Spanish, Greek, Dutch, Polish, Portuguese, Chinese, Japanese, Korean, Vietnamese and Arabic. The company did not specify which Chinese dialect the model was trained on.

Cohere said it trained the model “with a deliberate focus on minimizing WER, while keeping production readiness top-of-mind.” According to Cohere, the result is a model that enterprises can plug directly into voice-powered automations, transcription pipelines, and audio search workflows.

Self-hosted transcription for production pipelines

Until recently, enterprise transcription has been a trade-off: closed APIs offered accuracy but locked in data, while open models offered control but lagged on performance. Unlike Whisper, which launched as a research model under MIT license, Transcribe is available for commercial use from release and can run on an or …
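
For readers less familiar with the metric behind the 5.42% figure: WER is conventionally the number of word-level substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name `word_error_rate` is illustrative, and the text normalization (lowercasing and whitespace splitting) is an assumption; the article does not say how Cohere normalized text or averaged across test sets.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four reference words.
print(word_error_rate("the quick brown fox", "the quick brown dog"))  # 0.25
```

Read this way, a 5.42% average WER corresponds to roughly five word-level errors per 100 reference words.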
