In an increasingly connected world, communication between individuals from different cultures and languages has become a necessity. International exchanges, whether economic, educational, or social, require effective solutions to overcome language barriers.

Simultaneous translation technology in video conferencing represents a significant advancement in this area, allowing for nearly unimpeded interaction between people, regardless of their native language.

Transcription vs. Translation

It’s essential to distinguish between two concepts often used in the context of online communication tools: transcription and translation.

Transcription, when used during video conferences, is the automatic generation of on-screen text that captures what the speaker says in the language being spoken (for example, an English speaker’s words appear as English subtitles).

Translation, on the other hand, either converts the speaker’s statements orally into a different language or displays subtitles that are a written translation of those statements.
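
The distinction can be made concrete with a small sketch. The `Caption` type and its field names are hypothetical, introduced here only for illustration and not taken from any platform: a caption counts as a transcription when it is displayed in the language that was spoken, and as a translation otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caption:
    """One subtitle line shown on screen (hypothetical illustration type)."""
    text: str
    spoken_language: str   # language the speaker actually used, e.g. "en"
    display_language: str  # language the subtitle is written in

    @property
    def kind(self) -> str:
        # Same language in and out: transcription. Different: translation.
        if self.spoken_language == self.display_language:
            return "transcription"
        return "translation"


english_subtitle = Caption("Good morning", spoken_language="en", display_language="en")
french_subtitle = Caption("Bonjour", spoken_language="en", display_language="fr")

print(english_subtitle.kind)
print(french_subtitle.kind)
```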

The technology behind simultaneous translation

Simultaneous translation in video conferencing relies on artificial intelligence (AI) and natural language processing (NLP). These technologies enable the recognition, translation, and reproduction of speech in real time with increasingly high accuracy. Automatic translation systems have reached a level of precision that, in some contexts, is close to that of human translators. This advancement is crucial for the reliability of multilingual exchanges in video conferencing.
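
The three stages named above (recognition, translation, reproduction) can be pictured as a simple pipeline. The following is only a toy sketch under stated assumptions: the function names, the `AudioChunk` type, and the tiny phrase table are all hypothetical stand-ins for real speech-recognition and machine-translation models, and no platform's actual implementation is implied.

```python
from dataclasses import dataclass


@dataclass
class AudioChunk:
    """A short slice of speech; `spoken_text` stands in for real audio samples."""
    spoken_text: str
    language: str


# Toy phrase table standing in for a machine-translation model (made-up data).
PHRASES = {
    ("en", "fr"): {"hello everyone": "bonjour à tous", "thank you": "merci"},
}


def recognize(chunk: AudioChunk) -> str:
    """Stage 1, recognition: audio to text in the speaker's language (stubbed)."""
    return chunk.spoken_text.lower().strip()


def translate(text: str, source: str, target: str) -> str:
    """Stage 2, translation: look the phrase up, falling back to the original text."""
    return PHRASES.get((source, target), {}).get(text, text)


def render_subtitle(text: str) -> str:
    """Stage 3, reproduction: a subtitle line here; a speech synthesizer could sit here instead."""
    return text[:1].upper() + text[1:]


def simultaneous_translation(chunk: AudioChunk, target: str) -> str:
    """Run one chunk of speech through all three stages in order."""
    transcript = recognize(chunk)
    translated = translate(transcript, chunk.language, target)
    return render_subtitle(translated)


print(simultaneous_translation(AudioChunk("Hello everyone", "en"), "fr"))
```

In a real system each stage would run continuously on a stream of short audio chunks, which is where the translation delay discussed later in this article comes from.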

Evaluating instantaneous translation features across leading video conferencing platforms

| | Zoom | Microsoft Teams | Google Meet | Fourth platform (not named) |
| --- | --- | --- | --- | --- |
| Languages supported | Around thirty languages available. Some languages cannot be translated from languages other than English. | Around 30 languages available. | Around ten languages are available, some of which are still in the test phase. | Over 100 languages supported, but mainly English. |
| Translation times | Near instantaneous | Slight latency may occur | Fast | Near instantaneous |
| Notes | No possibility of adding acronyms, company names, technical terms, etc. | The technology used improves the accuracy of captions by exploiting meeting subjects, invitations, participants’ names, attachments and their recent emails. | Quality has improved considerably since its launch in 2022. What sets this tool apart is the integration of its powerful speech recognition and translation technologies, inherited from Google Translate. | |
| Accessibility | To access the real-time translation function in Zoom, you will need a paid Zoom licence. | The meeting organiser must have a Teams Premium licence to allow participants to use live translated subtitles. | Only available to Google Workspace Enterprise account holders. | The real-time translation (RTT) service requires a licence. |

Specific features, such as translation times and translation quality, may vary depending on software updates.

Please note: As with all translation tools currently available, the quality of the result is directly linked to the quality of the input. If the speech is not sufficiently intelligible, the tool will be unable to produce an adequate translation, or may even produce an incorrect one.
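
One way a tool might handle this dependency on input quality is to skip translation when the recognized speech is too uncertain. The sketch below is purely illustrative: the `RecognizedSegment` type, the confidence score, and the 0.6 threshold are assumptions made for this example, not a description of how any of the platforms above behave.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RecognizedSegment:
    """Text produced by a hypothetical speech recognizer, with its confidence."""
    text: str
    confidence: float  # 0.0 (pure guess) to 1.0 (certain)


MIN_CONFIDENCE = 0.6  # assumed threshold, chosen only for illustration


def caption_for(segment: RecognizedSegment, translate: Callable[[str], str]) -> str:
    """Translate only when the speech was recognized clearly enough."""
    if segment.confidence < MIN_CONFIDENCE:
        # Unintelligible input: flag it instead of showing a likely wrong translation.
        return "[unclear audio]"
    return translate(segment.text)


# `str.upper` stands in for a real translation function in this usage example.
clear = caption_for(RecognizedSegment("hello", 0.9), str.upper)
muffled = caption_for(RecognizedSegment("hxllo", 0.3), str.upper)
print(clear)
print(muffled)
```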

Challenges and future perspectives

Despite its advancements, simultaneous translation in video conferencing faces challenges, particularly in handling cultural nuances and idiomatic expressions. The continuous improvement of AI algorithms and training on diverse data sets are crucial for overcoming these hurdles.

The future of simultaneous translation in video conferencing looks promising, with ongoing research aimed at enhancing accuracy and reducing translation time lag. Moreover, expanding these technologies to less common languages would open up further possibilities for communication and interaction worldwide.

Speech-to-speech: A revolution in the translation industry

Thanks to advancements in AI, video conferencing tools can now offer real-time translation and interpretation, making mutual understanding far easier during international exchanges. Currently, these translations are mostly limited to the display of subtitles.

Third-party companies (such as the promising Vidvoice) are in the experimental phase of developing direct speech-to-speech translation functionalities.

It is reasonable to expect that, in the near future, this technology will be integrated into all video conferencing platforms, radically transforming our way of communicating on a global scale.

