The State of Machine Translation 2020
Independent multi-domain evaluation
of commercial Machine Translation engines
 

The State of Machine Translation 2021 now available

Get report

Report Focus


Executive Summary

The report presents an independent benchmark of commercial machine translation engines across 16 industry sectors, using domain-specific data from TAUS Data Cloud.

1. The scores are dead, long live the scores! New semantic similarity scores (e.g. BERTScore) address the main weakness of syntactic similarity scores (e.g. BLEU): handling alternative translations and synonyms.

2. Each of the 15 MT engines is best at something. 7 of them are enough to get the best quality for all 15 industries and 14 language pairs. For any given industry, one to four engines are enough to achieve the best quality.

3. The highest MT quality was observed in Computer Software, Legal Services, and Telecommunications. Software Strings and Documentation, Support Content, and Policies, Processes and Procedures are the content types most accessible to machine translation.

4. As many would expect, content related to Professional and Business Services, as well as Instructions for Use and Sales & Marketing Content in other industries, is the hardest for MT to get right.

5. Language coverage has spiked: +2,000 language pairs since June 2020, with many low-resource languages added.

6. The MT landscape continues to evolve: 11 more vendors have started offering pre-trained MT engines since June 2019.

MT systems evaluated

We have evaluated general-purpose Cloud Machine Translation systems with pre-trained translation models, available via API:

  • Alibaba (General and eCommerce models)
  • Amazon Translate
  • Baidu Translate
  • DeepL
  • Google Advanced Translation
  • GTCom YeeCloud
  • IBM Watson
  • Microsoft Text Translator
  • ModernMT Realtime
  • PROMT
  • SDL BeGlobal
  • SYSTRAN PNMT
  • Tencent Cloud TMT
  • Yandex Translate

 

Semantic similarity scores

This time we have decided to use BERTScore, which has shown good results on recent WMT benchmarks. Its main advantage over syntactic similarity scores (such as BLEU, TER or hLEPOR) is its tolerance of alternative translations and synonyms.
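To illustrate why alternative translations and synonyms are a problem for overlap-based scoring, here is a minimal sketch in Python. The `unigram_f1` function is a hypothetical toy stand-in for syntactic scores, not BLEU, TER or hLEPOR themselves, and the example sentences are invented for illustration. A semantic score such as BERTScore compares contextual embeddings instead of surface tokens, so it would be expected to penalize the paraphrase far less.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Toy token-overlap F1 (assumption: a simplified proxy for syntactic scores)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented example sentences: the second candidate is a valid paraphrase.
reference = "the device must be switched off before cleaning"
exact = "the device must be switched off before cleaning"
synonym = "the unit has to be turned off prior to cleaning"

print(unigram_f1(exact, reference))    # identical wording gives 1.0
print(unigram_f1(synonym, reference))  # the paraphrase shares only 4 tokens and scores far lower
```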

We select the best engines based on statistical significance, taking confidence intervals into account. In many cases, there is more than one winner per category!
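The report page does not spell out the exact selection procedure, so the following is only a sketch of one plausible approach. It assumes per-segment scores for each engine, a percentile bootstrap for the confidence interval of the mean, and a rule that treats every engine whose interval overlaps the top engine's interval as a winner. The engine names and score values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean score."""
    rng = random.Random(seed)
    n = len(scores)
    means = sorted(mean(rng.choices(scores, k=n)) for _ in range(n_resamples))
    lower = means[int(n_resamples * alpha / 2)]
    upper = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper


def select_winners(engine_scores):
    """Return engines whose interval overlaps the interval of the top-scoring engine."""
    intervals = {name: bootstrap_ci(s) for name, s in engine_scores.items()}
    top = max(engine_scores, key=lambda name: mean(engine_scores[name]))
    top_lower = intervals[top][0]
    return sorted(name for name, (_, upper) in intervals.items() if upper >= top_lower)


# Hypothetical per-segment scores: two engines are close, one is clearly behind.
spread = [0.02 * ((i % 5) - 2) for i in range(50)]
engine_scores = {
    "engine_a": [0.900 + d for d in spread],
    "engine_b": [0.895 + d for d in spread],
    "engine_c": [0.700 + d for d in spread],
}

winners = select_winners(engine_scores)
print(winners)  # engine_a and engine_b are not separable, so both count as winners
```

Under this rule, a category ends up with several winners whenever the gap between the leading engines is smaller than the uncertainty of the measurement.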

 


Industry sector analysis

We have analyzed MT performance for the following Industry Sectors:

  • Automotive Manufacturing
  • Computer Hardware
  • Computer Software
  • Consumer Electronics
  • Energy, Water and Utilities
  • Financials
  • Healthcare
  • Industrial Electronics
  • Industrial Manufacturing
  • Legal Services
  • Leisure, Tourism, and Arts
  • Medical Equipment and Supplies
  • Pharmaceuticals and Biotechnology
  • Stores and Retail Distribution
  • Telecommunications

Content type analysis

We have also analyzed which MT engines perform best for different Content Types:

  • Financial Documentation
  • Instructions for Use
  • News Announcements, Reports and Research
  • Patents
  • Policies, Process and Procedures
  • Sales and Marketing Materials
  • Software Strings and Documentation
  • Support Content

 


Get the full report to learn which MT engines work best for your language, industry sector, and content type
