This report presents an independent benchmark of commercial machine translation (MT) engines across 16 industry sectors, using domain-specific data from TAUS Data Cloud.
MT systems evaluated
We evaluated general-purpose cloud machine translation systems that offer pre-trained translation models via API:
This time we decided to use BERTScore, which has shown good results in the most recent WMT benchmarks. Its main advantage over string-based similarity scores (such as BLEU, TER or hLEPOR) is its tolerance of alternative translations and synonyms.
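To illustrate why an embedding-based score tolerates synonyms, here is a minimal sketch of the greedy token matching at the core of BERTScore. It is a simplification under stated assumptions: the three-dimensional vectors in `TOY_EMBEDDINGS` are made up for illustration, whereas the real metric uses contextual embeddings from a pre-trained model and optional refinements not shown here. The names `cosine`, `greedy_match_f1` and `score` are hypothetical helpers, not part of any library.

```python
import math

# Hand-made toy vectors (an assumption for illustration only).
# "car" and "automobile" are deliberately placed close together.
TOY_EMBEDDINGS = {
    "the": (1.0, 0.0, 0.0),
    "car": (0.0, 1.0, 0.1),
    "automobile": (0.0, 1.0, 0.2),
    "stopped": (0.0, 0.1, 1.0),
    "banana": (0.7, -0.7, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_f1(cand_vecs, ref_vecs):
    """F1 of greedy matching: each token pairs with its most similar counterpart."""
    # Precision: how well each candidate token is covered by the reference.
    precision = sum(
        max(cosine(c, r) for r in ref_vecs) for c in cand_vecs
    ) / len(cand_vecs)
    # Recall: how well each reference token is covered by the candidate.
    recall = sum(
        max(cosine(r, c) for c in cand_vecs) for r in ref_vecs
    ) / len(ref_vecs)
    return 2 * precision * recall / (precision + recall)


def score(candidate, reference):
    """Score two whitespace-tokenized sentences using the toy embeddings."""
    cand_vecs = [TOY_EMBEDDINGS[w] for w in candidate.split()]
    ref_vecs = [TOY_EMBEDDINGS[w] for w in reference.split()]
    return greedy_match_f1(cand_vecs, ref_vecs)


reference = "the car stopped"
f_exact = score("the car stopped", reference)
f_synonym = score("the automobile stopped", reference)
f_unrelated = score("the banana stopped", reference)
print(f_exact, f_synonym, f_unrelated)
```

Both non-identical candidates share exactly two of three words with the reference, so a pure word-overlap count cannot tell them apart; the embedding-based score ranks the synonym well above the unrelated word.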
We select the best engines on the basis of statistical significance, taking confidence intervals into account. In many cases, there is more than one winner per category!
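One way such a selection could work is sketched below. The report does not describe its exact procedure, so everything here is an assumption: confidence intervals are estimated with a percentile bootstrap over per-segment scores, and an engine counts as a winner when its interval overlaps the interval of the engine with the highest mean. The engine names and scores are hypothetical placeholders, not benchmark results.

```python
import random
import statistics


def bootstrap_ci(scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean score."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(scores)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def select_winners(engine_scores):
    """Return every engine whose interval overlaps the top engine's interval."""
    intervals = {name: bootstrap_ci(s) for name, s in engine_scores.items()}
    best = max(engine_scores, key=lambda name: statistics.fmean(engine_scores[name]))
    best_lower = intervals[best][0]
    # An engine ties with the best one if its upper bound reaches the
    # best engine's lower bound (assumed overlap rule, not the report's).
    return sorted(
        name for name, (_, upper) in intervals.items() if upper >= best_lower
    )


# Hypothetical per-segment scores for three made-up engines.
engine_scores = {
    "engine_a": [0.88, 0.92, 0.90, 0.91, 0.89] * 4,
    "engine_b": [0.87, 0.93, 0.90, 0.90, 0.89] * 4,
    "engine_c": [0.50, 0.55, 0.52, 0.48, 0.51] * 4,
}
winners = select_winners(engine_scores)
print(winners)
```

Under this rule two engines with nearly identical means both come out as winners, which is consistent with a category having more than one best engine. Interval overlap is a conservative heuristic; a paired significance test on the same segments would be a stricter alternative.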
Industry sector analysis
We have analyzed MT performance for the following Industry Sectors:
We have also analyzed which MT engine performs best for different Content Types: