
How to Use Machine Learning for Language Translation Apps

04.03.2020 Daria Mickiewicz

We were challenged to create two applications: one that translates any sign when you point the phone camera at it, and one that performs speech-to-speech translation.

Imagine being able to speak another language without having to learn it. A VoIP translation app makes this possible. In a nutshell, you hold a conversation as usual, and the app translates what you say into the other person’s language in near real time. When the other person replies, their words are translated back into your language.

So, how does a VoIP translation app work? Get ready to dive in.

How Does It Work?

The first app is a crowdsourcing translation application that covers more than 20 languages. It goes beyond basic functionality by supporting three input methods: keyboard, voice and camera. Users can adjust settings and save every searched word or phrase in a history panel to return to later.

The app runs on iOS, Mac, Windows, Windows Phone and Android.

Application features:

  • Image translation. A user can take a picture with the phone camera and translate it directly into any supported language. The app recognizes the text in the picture and translates it automatically. It’s perfect for street signs, books, menus and anything else readable. Translation is continuous, and results appear within seconds.
  • Web page translation. This service translates the content of a web page into a different language. Enter the full URL of the page in the text box, choose the language you want to read it in, and click Translate.
  • Offline support. The app has a dozen downloadable dictionaries, and the developers are working to add more in the near future. This is very useful when you travel to a country where a data connection is a luxury.

The app also includes real-time chat translation with four modes:

  • Pocket translator. If you find yourself face-to-face with someone who speaks a different language, you can use the app to translate both languages as they are spoken, so both parties know what’s going on.
  • Earpiece language translator. Combining speech recognition, machine translation and wearable technology, this mode lets users speak different languages and still clearly understand each other. Simply put, when one person speaks, the other hears it in their own language through their earphones.
  • Two phones. This mode lets two Bluetooth-enabled devices exchange messages, which are translated on the fly.
  • Conference. This mode extends the Two Phones mode and enables many users to exchange messages in different languages.

The second application is a phone conversation translator that interprets the caller’s speech into the respondent’s native language. When you call someone through the VoIP translator application, the respondent hears only the translated speech. The application needs to be installed only on the caller’s phone, not on both devices. You can call both mobile and landline numbers.
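
One plausible way to picture such a call is as a chain of three stages: the caller’s speech is recognized as text, the text is translated, and the translation is synthesized as speech for the respondent. The Python sketch below illustrates that chain under stated assumptions. The class `SpeechTranslator`, its `relay` method and the three stand-in stage functions are hypothetical names used for illustration only, not the app’s actual components, and the tiny phrasebook exists only so the example runs without a real speech or translation service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslator:
    """Chains speech recognition, text translation and speech synthesis."""
    recognize: Callable[[bytes, str], str]       # (audio, language) -> text
    translate: Callable[[str, str, str], str]    # (text, source, target) -> text
    synthesize: Callable[[str, str], bytes]      # (text, language) -> audio

    def relay(self, audio: bytes, source: str, target: str) -> bytes:
        """Turn audio in the source language into audio in the target language."""
        text = self.recognize(audio, source)
        translated = self.translate(text, source, target)
        return self.synthesize(translated, target)


# Hypothetical stand-in stages: "audio" is just UTF-8 text here.
def fake_recognize(audio: bytes, language: str) -> str:
    return audio.decode("utf-8")


PHRASEBOOK = {("en", "es"): {"hello": "hola", "thank you": "gracias"}}


def fake_translate(text: str, source: str, target: str) -> str:
    # Fall back to the original text when no translation is known.
    return PHRASEBOOK.get((source, target), {}).get(text.lower(), text)


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


translator = SpeechTranslator(fake_recognize, fake_translate, fake_synthesize)
```

With these stand-ins, `translator.relay(b"Hello", "en", "es")` returns `b"hola"`. Keeping each stage behind a plain callable means a real recognizer, NMT model or voice synthesizer could be swapped in without touching the relay logic.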

In the video below, you can see how the text-to-text translation app works.

In addition, the application can be installed on Amazon and Google smart speakers. Say “Hey Google, translate through ‘program name’”, and the speaker will translate a phrase or word for you.

Under the Hood of Neural Machine Translation

To develop the apps, we used Go, JavaScript, Python, C/C++, Lua and a dozen other technologies.

To recognize the text that is then translated, we used the open-source Tesseract OCR engine. It worked well almost out of the box, so we spent little time on development. It did have trouble recognizing certain letters, however, so we had to train Tesseract to read those glyphs properly.
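
As a rough illustration of the recognition step, the sketch below calls the Tesseract command-line tool from Python. It assumes the `tesseract` binary (version 4 or later, where the page segmentation flag is spelled `--psm`) is installed and on the PATH. The function names `build_tesseract_command` and `recognize_text` are hypothetical helpers, not part of Tesseract or of the app’s code. A custom-trained model would be selected through the same `-l` option by passing the name of its traineddata file.

```python
import subprocess
from typing import List


def build_tesseract_command(image_path: str, lang: str = "eng", psm: int = 3) -> List[str]:
    """Build a Tesseract CLI call that prints the recognized text to stdout."""
    # "stdout" as the output base tells Tesseract to print instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def recognize_text(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract on an image file and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.strip()
```

For example, `recognize_text("sign.png", lang="deu")` would return the German text found in a hypothetical `sign.png`, ready to be passed to the translation step.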

Real-Time Text-to-Text Translation Using NMT

Neural machine translation, or NMT for short, uses neural network models to learn a statistical model for machine translation.

The key benefit of the approach is that a single system can be trained directly on source and target text, without the pipeline of specialized systems used in statistical machine translation.

For speech translation, we used OpenNMT, a full-featured, open-source (MIT) neural machine translation system. For deep learning and scientific computing, OpenNMT relies on the free Torch toolkit through the Lua language. Torch lets you use the GPU to accelerate neural network training. An extension system lets you implement additional functionality on top of OpenNMT.

We developed translation models by training a neural network on a reference set of translations. Two files are passed to the system for training: one with sentences in the source language and a second with high-quality translations of those sentences into the target language.
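
The two training files are parallel line by line: sentence N in the source file corresponds to translation N in the target file. The sketch below shows one way such a pair of files might be checked and cleaned before training. The helper names `pair_sentences` and `load_parallel_corpus`, as well as the 50-token length limit, are assumptions made for illustration and are not part of OpenNMT.

```python
from typing import Iterable, List, Tuple


def pair_sentences(source_lines: Iterable[str], target_lines: Iterable[str],
                   max_tokens: int = 50) -> List[Tuple[str, str]]:
    """Align two line-parallel corpora and drop unusable sentence pairs."""
    source = [line.strip() for line in source_lines]
    target = [line.strip() for line in target_lines]
    if len(source) != len(target):
        raise ValueError("source and target must have the same number of lines")
    pairs = []
    for src, tgt in zip(source, target):
        if not src or not tgt:
            continue  # skip pairs where either side is empty
        if len(src.split()) > max_tokens or len(tgt.split()) > max_tokens:
            continue  # skip overly long sentences
        pairs.append((src, tgt))
    return pairs


def load_parallel_corpus(source_path: str, target_path: str) -> List[Tuple[str, str]]:
    """Read the two training files and return cleaned (source, target) pairs."""
    with open(source_path, encoding="utf-8") as src, \
         open(target_path, encoding="utf-8") as tgt:
        return pair_sentences(src, tgt)
```

Failing fast on a line-count mismatch matters because a single missing line shifts every later pair, so the model would be trained on mismatched translations.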


Figure 1: (a) Schematic view of neural machine translation. The red source words are first mapped to word vectors and then fed into a recurrent neural network (RNN). Upon seeing the <eos> symbol, the final time step initializes a target blue RNN. At each target time step, attention is applied over the source RNN and combined with the current hidden state to produce a prediction p(w_t | w_{1:t−1}, x) of the next word. This prediction is then fed back into the target RNN. (b) Live demo of the OpenNMT system.
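
The attention step in the caption can be made concrete with a small numeric sketch. The code below scores each source hidden state against the current target hidden state, turns the scores into weights with a softmax, and averages the source states with those weights into a context vector. Dot-product scoring is an assumption chosen for simplicity, and the function names and toy vectors are illustrative only; this is not OpenNMT’s implementation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state: Sequence[float],
           encoder_states: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return attention weights over source positions and the context vector."""
    # Dot-product score between the target state and every source state.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    # The context vector is the weighted average of the source states.
    dim = len(decoder_state)
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Toy example: three source positions with two-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], encoder_states)
```

In this toy example the first and third source positions receive equal, larger weights because they align best with the target state. In a full model, the context vector is combined with the target hidden state to predict the next word.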

We also used OpenNMT’s optical text recognition system, which can recognize complex mathematical formulas and convert them into LaTeX.

Ready to Get Started?

Whether you are established in your industry or a newcomer, ML can help you gain traction by expanding what your business is able to do.

Planning to develop a VoIP translation app? We at VironIT, a software development company, are here to help you launch high-quality apps that take your business to the next level.
