Businesses looking to use AI models to transcribe audio, specifically human speech, from executives, employees, and customers, may be wary of the idea of an AI program listening to and recording sensitive information.
However, the Israeli audio AI startup aiOla has a new model that addresses this very concern. Built atop OpenAI’s industry-standard open source model Whisper, the new Whisper-NER from aiOla is itself fully open source and available now on Hugging Face and GitHub for enterprises, organizations, and individuals to take, use, adapt, modify and deploy.
It integrates automatic speech recognition (ASR) with named entity recognition (NER). This innovation aims to enhance privacy by automatically identifying and masking sensitive information such as names, phone numbers, and addresses during the transcription process.
A demo model is also available for users to try on Hugging Face, allowing them to record snippets of speech and have the model mask specific words they type in from the resulting transcript. In my brief test, the model successfully masked the word “VentureBeat,” which is both a proper noun and jargon.
Whisper-NER addresses a significant challenge in the transcription of spoken content: ensuring privacy and compliance with data protection regulations. The model processes audio files and simultaneously applies NER to tag or mask specific types of sensitive information directly within the transcription pipeline. Unlike traditional multi-step systems, which leave data exposed during intermediary processing stages, Whisper-NER eliminates the need for separate ASR and NER tools, reducing vulnerability to breaches.
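As a rough illustration of the masking step, the Python sketch below takes a transcript in which entities have already been tagged inline and replaces selected entity types with placeholders. The tag format (`<person>...</person>`), the label names, and the `mask_tagged_entities` helper are all assumptions made for this example; they are not taken from aiOla's code or documentation, and the real model's output format may differ.

```python
import re

# Hypothetical inline tag format, assumed for illustration only:
# "<person>Jane Doe</person>". This is NOT Whisper-NER's documented output.
TAG_PATTERN = re.compile(
    r"<(?P<label>[a-z_]+)>(?P<text>.*?)</(?P=label)>", re.DOTALL
)


def mask_tagged_entities(transcript, labels_to_mask=None):
    """Replace tagged entities with a placeholder such as [PERSON].

    If labels_to_mask is None, every tagged entity is masked. Entities
    whose label is not selected keep their text but lose the tags.
    """

    def _replace(match):
        label = match.group("label")
        if labels_to_mask is None or label in labels_to_mask:
            return f"[{label.upper()}]"
        return match.group("text")

    return TAG_PATTERN.sub(_replace, transcript)


# Made-up sample transcript using the assumed tag format.
example = (
    "Call <person>Jane Doe</person> at <phone>555-0100</phone> "
    "about <company>VentureBeat</company>."
)
masked = mask_tagged_entities(example, labels_to_mask={"person", "phone"})
print(masked)
```

In this sketch, masking is a post-processing pass over tagged text, which is only meant to show what "tag or mask" looks like in a transcript. The article's point is that Whisper-NER handles recognition and tagging in a single model rather than in separate stages.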
“We designed this as an open-source tool to advance privacy in AI,” said Gill Hetz, Vice President of Research at aiOla, in a recent video call interview with VentureBeat. “It helps users mask sensitive data without needing additional software steps.”
Previously, aiOla was noted for releasing Whisper variants that could accurately and reliably recognize industry-specific jargon and transcribe it, as well as a much faster speech-to-text and speech recognition model.
Fully Open Source for Community and Commercial Use
Whisper-NER is fully open source and available under the MIT License, allowing users to adopt, modify, and deploy it freely, incl …