Lost in Translation: How Artificial Intelligence is Breaking the Language Barrier

21 Sep 2020

Artificial intelligence in translation

Human interaction with machines has taken a great leap forward in recent years, largely driven by artificial intelligence (AI). From smart homes to self-driving cars, AI has become a seamless part of our daily lives. And AI, when applied to translating text or speech from one language to another, is helping to break down one of the biggest barriers between humans globally: language.

Voice interactions play a key role in many of these technological advances, most notably in language translation. Here, AI enables instant translation across a number of mediums: text, voice, images and even street signs. And now, AI can translate large volumes of text or speech at once.

The technology works by recognizing individual words, then leveraging similarities in how various languages express the relationships between those words.

This translation capability then gets packaged up into a smartphone app, such as Google Translate, or integrated into a website such as Facebook, for quick translation of text on the fly.

The applications for translation technology are far-reaching: it improves understanding between international companies and their customers, and it facilitates the exchange of research and information. With AI translators easily available, traveling to another country where you don’t speak the language will become even easier.

Examples of language translation tools

Many of the major technology firms have developed their own AI-driven language translation tools. Google Translate is probably the best-known, but Microsoft’s Translator app is a significant competitor.

Facebook’s translation tool leverages convolutional neural networks (CNNs), which, according to Facebook, can handle contextual aspects of language better than the more traditional recurrent neural networks (RNNs).

In addition, many global projects now focus on improving and expanding AI-driven language translation, such as the Masakhane project, which builds translation models for African languages.

Language translation techniques: how artificial translation works

Online language translation started back in the 1990s with AltaVista’s Babel Fish, which was powered by translation technology from Systran. These early web translation tools could handle short pieces of text by applying hand-written linguistic rules rather than learning from data.

But now, we have to deal with huge volumes of data at a fast pace, which calls for a different approach. Today’s AI language translation tools leverage a deep learning technique called neural machine translation (NMT).

Based on artificial neural nets, this approach translates whole sentences rather than just individual words, making it faster and more accurate. With NMT, AI is able to learn from translations that have already been completed, picking up on word use, sentence structure and intent based on context. This technique is generally more effective than earlier approaches: it can need less memory than older statistical systems, and because each word is translated in the context of the whole sentence, it gives better context and accuracy across large volumes of speech or text.
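To see why translating whole sentences matters, here is a toy sketch of the word-by-word approach that NMT improves on. It is not an NMT system, and the tiny Spanish-to-English glossary and the function name are made up for illustration.

```python
# Toy Spanish-to-English glossary (hypothetical, for illustration only).
GLOSSARY = {
    "la": "the",
    "casa": "house",
    "blanca": "white",
    "es": "is",
    "grande": "big",
}


def translate_word_by_word(sentence: str) -> str:
    """Replace each word with its glossary entry, ignoring all context."""
    words = sentence.lower().split()
    # Unknown words are passed through unchanged.
    return " ".join(GLOSSARY.get(word, word) for word in words)


print(translate_word_by_word("la casa blanca es grande"))
# the house white is big
```

Every word is translated correctly, yet the result is not natural English, because Spanish places the adjective after the noun. A model that reads the full sentence can learn to produce "the white house is big" instead.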

Benefits of artificial intelligence in language translation

Use of artificial intelligence in language translation comes with many benefits. One is the ability to deliver instant results across a wide range of languages, for example with Google Translate or Facebook’s inbuilt translation feature.

These tools are integrated into the websites we use daily. They provide added convenience and a more streamlined experience when interacting with international products, services and people.

What’s more, language translation tools are usually free of charge and easily accessible to anyone with a computer or smartphone and an internet connection. Many tools are now also available offline, opening up new possibilities for traveling or doing business in areas where internet connections are less reliable.

AI-driven language translation technology is advancing rapidly, constantly being improved in terms of speed and accuracy. But can it rival human translators?

Key challenges of AI-driven language translation

Accuracy has long been one of the biggest concerns in language translation, and the world of AI is no different. In fact, it’s even more critical here. Deep learning, for all its perceived glamor, still has certain limitations.

Researchers from Google spoke candidly about some of these limitations in an interview with Wired Magazine. In particular, they pointed out that simply scaling up the neural net and adding more data doesn’t necessarily mean it can replicate human abilities.

In the same article, NYU professor Gary Marcus described deep learning as “greedy, brittle, opaque, and shallow.” The ‘greed’ part of his statement refers to how neural nets demand enormous sets of training data.

Sourcing, gathering and cleaning all that data is a significant challenge in itself, but it’s necessary. Better quality data leads to better quality translation models, essential to deliver accurate results for the end user.
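As a rough idea of what cleaning can involve, the sketch below filters a list of sentence pairs. The function name, the specific checks and the length-ratio threshold of 3.0 are assumptions chosen for illustration, not a description of any particular vendor's pipeline.

```python
def clean_pairs(pairs, max_length_ratio=3.0):
    """Return sentence pairs with obvious noise removed.

    Drops pairs with an empty side, pairs whose word counts differ by
    more than max_length_ratio (an assumed threshold), and duplicates.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        # Collapse runs of whitespace and trim both sides.
        source = " ".join(source.split())
        target = " ".join(target.split())
        if not source or not target:
            continue
        src_len = len(source.split())
        tgt_len = len(target.split())
        # Very uneven lengths often signal a misaligned pair.
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            continue
        # Treat pairs that differ only by letter case as duplicates.
        key = (source.lower(), target.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned


raw = [
    ("Hello  world", "Hola mundo"),
    ("hello world", "hola mundo"),
    ("", "vacío"),
    ("Yes", "Sí, por supuesto que sí, sin ninguna duda"),
]
print(clean_pairs(raw))
# [('Hello world', 'Hola mundo')]
```

Real pipelines typically go further, for example with language identification and human review, but even simple filters like these remove a good share of noisy pairs.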

Another issue with AI-driven translation is the still unavoidable need to augment machine translation, at least to some extent, with human input. This has only recently begun to change, and a human touch is still often necessary to achieve sufficient accuracy.

Translation of data-poor languages, such as Yoruba and Malayalam, presents yet another challenge – locating and gathering enough training data to satisfy a hungry neural net.

Towards better quality translation

Despite the knotty problems it still faces, the quality of AI-driven language translation has skyrocketed in recent years. But there are still ways for it to do better. Improving the quality of training data is one of the most crucial factors here, for example by providing precisely annotated datasets.

DefinedCrowd’s proprietary translation workflows, combined with human translators, can source, translate and validate machine translation datasets of bilingual pairs. For language translation projects, you can use these datasets to train, improve and validate baseline models, increasing accuracy in the languages of your choice. What’s more, with the wide range of languages in our datasets, you can expand your translation reach by training with new language pairs.
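Validating a model against human reference translations usually relies on an automatic score. The sketch below computes clipped unigram precision, a simplified building block of metrics such as BLEU. It is an illustrative assumption rather than the validation method used in the workflows above, and the function name is hypothetical.

```python
from collections import Counter


def unigram_precision(candidate: str, reference: str) -> float:
    """Share of candidate words that also appear in the reference.

    Each reference word can only be matched as many times as it occurs
    in the reference (clipping), so repeating a word does not help.
    """
    candidate_words = candidate.lower().split()
    if not candidate_words:
        return 0.0
    reference_counts = Counter(reference.lower().split())
    candidate_counts = Counter(candidate_words)
    matched = sum(
        min(count, reference_counts[word])
        for word, count in candidate_counts.items()
    )
    return matched / len(candidate_words)


print(unigram_precision("the big house", "the white house is big"))
# 1.0
print(unigram_precision("a large home", "the white house is big"))
# 0.0
```

A score like this ignores word order, which is one reason practical metrics also compare longer word sequences and why human validation remains valuable.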

Using artificial intelligence in language translation has awesome potential, as long as you put in only the best quality raw ingredients.

Ready to improve your own machine translation engines? Explore our translation workflows here.