Mastering Linguistic Diversity: Sourcing French Speech Training Data for Multiple Dialects

14.04.2023

Developing speech-enabled technologies for the French market is challenging, considering the variety of dialects and accents across the region. In this article, we’ll discuss how our client, a Fortune 500 tech company, overcame this challenge by sourcing French Speech Training Data from 1,000 unique speakers representing different genders, ages, and regional dialects, resulting in a more inclusive and versatile voice assistant.

The customer

Our client is a Fortune 500 tech company renowned for developing speech-enabled technologies.

The context

When a Fortune 500 tech company wanted to create speech-enabled technologies for the French market, it knew it would be challenging. There are as many as 28 different dialects or accents in the region, so for a voice assistant to be truly inclusive, it needs to recognize every last one of them. To help the company meet its business goals, we sourced comprehensive training data – specifically, 600 hours of speech from 1,000 unique speakers representing a cross-section of genders, ages, and regional dialects.

The solution

Step 1 – Text collection

To obtain 600 hours of French speech training data, we first created an extensive library of unique prompts (a prompt being, for example, a spoken question to which an AI model will respond) covering the client’s specified domains, that is, the contexts in which the company operates, such as banking or healthcare. To do so, we engaged our French-speaking crowd, who produced written examples of how they might express intents, such as disputing a charge or returning a package, in 15 words or fewer. We collected and validated those variants and used them as the basis for the web-crawled text from which we built our prompt library, resulting in more than 500,000 prompts.
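As a rough illustration of the kind of check that validating crowd-written variants can involve, here is a minimal Python sketch. It is not the pipeline used in this project: the function name `filter_prompts`, the whitespace-based word count, and the case-insensitive duplicate check are all assumptions made for the example. Only the 15-word limit comes from the description above.

```python
def filter_prompts(candidates, max_words=15):
    """Keep unique, non-empty prompts of at most `max_words` words.

    Hypothetical helper: words are counted by splitting on whitespace,
    and duplicates are detected case-insensitively.
    """
    seen = set()
    kept = []
    for text in candidates:
        # Collapse runs of whitespace so word counting is consistent.
        normalized = " ".join(text.split())
        if not normalized:
            continue  # skip empty submissions
        if len(normalized.split(" ")) > max_words:
            continue  # skip prompts over the word limit
        key = normalized.casefold()
        if key in seen:
            continue  # skip duplicates that differ only in case or spacing
        seen.add(key)
        kept.append(normalized)
    return kept


submissions = [
    "Je veux contester un prélèvement",
    "je veux  contester un prélèvement",  # duplicate after normalization
    "Comment retourner un colis ?",
]
valid_prompts = filter_prompts(submissions)
```

In this example, `valid_prompts` keeps the first and third submissions and drops the second as a duplicate. A real validation step would also involve human review, which this sketch does not attempt to model.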

Step 2 – Speech recording

We gave those 500,000+ prompts to a representative cross-section of 1,000 French speakers from our crowd, who read and recorded them via Horatio, a mobile app developed in-house and available across Android, iOS, and Windows devices. Another group of 2,000 crowd members then validated every second of the 600 hours of audio the first group produced.
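To put those figures in perspective, the short Python sketch below works out two averages from the numbers quoted in this article. The function name `recording_budget` is hypothetical, and the per-prompt figure assumes each prompt was recorded exactly once, which the project description does not state. Since the library held more than 500,000 prompts, that figure is at best an upper bound.

```python
def recording_budget(total_hours, speakers, prompts):
    """Return (average minutes per speaker, average seconds per prompt)."""
    minutes_per_speaker = total_hours * 60 / speakers
    # Assumption: every prompt is recorded exactly once.
    seconds_per_prompt = total_hours * 3600 / prompts
    return minutes_per_speaker, seconds_per_prompt


minutes, seconds = recording_budget(total_hours=600, speakers=1_000, prompts=500_000)
print(f"{minutes:.0f} minutes per speaker, {seconds:.2f} seconds per prompt")
```

Under those assumptions, each speaker contributed about 36 minutes of audio on average, and each recording lasted a little over 4 seconds, which is consistent with prompts capped at 15 words.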

We worked closely with our client to define its needs; establish workflows; gather and structure data; and ensure the quality of the results through expertly crafted qualification tests and automated analysis. This partnership provided our client with comprehensive speech training data in French, which enabled it to bring a better solution to the marketplace faster, and at a far lower cost than would have been possible had the data been handled in-house.

Are you looking to develop a speech-enabled technology tailored to a diverse linguistic market? Explore our range of solutions, including Automatic Speech Recognition and Text-to-Speech, to build speech solutions that are attuned to your customers. You can also contact us to create a solution that meets your needs!