Enhancing an ASR Model with Speech Datasets for Testing


Are you looking to improve the accuracy of your Automatic Speech Recognition (ASR) model? In this article, we discuss how our client, working in a fast-paced sales environment, depended on accurate speech transcription for task efficiency and a shorter sales cycle. Discover how specialized speech datasets for testing helped them overcome this challenge and exceed their expectations.


The customer

Our client is a Fortune Global 500 company based in California that builds cloud-based CRM software. They design, develop, and market software applications for sales and client-focused teams, along with analytics and app development tools, in more than 20 fully supported languages and 120+ markets.


The context

The company has integrated an Automatic Speech Recognition (ASR) model that uses Speech-To-Text (STT) to transcribe speech automatically in six languages. In their fast-paced sales environment, end users speak directly into an app on a mobile device. The STT output needs to be accurate enough for a Natural Language Understanding (NLU) component to suggest follow-ups or add notes, which helps improve task efficiency and shorten the sales cycle. Because jargon changes frequently and users speak with a wide variety of accents, the client needs to test and assess accuracy quickly so that they can train and deploy ASR updates without delay.


The solution

To address these challenges, our client turned to specialized speech datasets for testing provided by Defined.ai. Within six weeks, Defined.ai collected, validated, and transcribed 120 hours of spontaneous dialogue across six languages. With this dataset, the client could thoroughly test and verify the accuracy of their existing ASR models. The client was also able to identify entities that had previously gone unrecognized, which pointed to concrete improvements for their speech recognition system. By augmenting their training dataset with these tailored speech datasets, the client anticipated potential annual cost efficiencies of $500k+ and a reduction of up to 10% in the sales lifecycle.
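For readers who want to see what testing ASR accuracy against a transcribed dataset can look like, here is a minimal sketch in Python. It computes word error rate (WER), a commonly used ASR accuracy metric, for each language. The source does not say which metric or tooling the client used, so the choice of WER, the function names (edit_distance, wer_by_language), and the sample sentences are all illustrative assumptions rather than the client's actual setup.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def wer_by_language(samples):
    """Return {language: WER} for (language, reference, hypothesis) triples.

    WER per language = total word edits / total reference words.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for language, reference, hypothesis in samples:
        ref_tokens = reference.lower().split()
        hyp_tokens = hypothesis.lower().split()
        errors[language] += edit_distance(ref_tokens, hyp_tokens)
        words[language] += len(ref_tokens)
    return {lang: errors[lang] / words[lang] for lang in words if words[lang] > 0}


# Hypothetical examples: human transcript (reference) vs. ASR output (hypothesis).
SAMPLES = [
    ("en", "schedule a follow up call with the client",
           "schedule a follow up call with a client"),
    ("pt", "enviar a proposta amanhã", "enviar proposta amanhã"),
]

if __name__ == "__main__":
    for language, wer in wer_by_language(SAMPLES).items():
        print(f"{language}: WER = {wer:.3f}")
```

In practice, the reference column would come from the human-validated transcriptions and the hypothesis column from the ASR output on the same audio. Reporting WER per language makes it easier to spot where accents or new jargon are hurting accuracy most. Real evaluations usually also normalize punctuation and numerals before scoring, which this sketch leaves out.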

Are you looking to reach the full potential of your speech recognition system? Explore the range of speech solutions available on our Marketplace today to enhance your business performance.