Tackling the challenges of Asian machine translation
In today’s globalized economy, more and more Asian companies are spreading their wings across the globe. At a rapid pace, these companies are moving away from centralized Asia-based engineering sites towards a distributed model with regional branches. And although geographical boundaries seem to be fading away, one important barrier still needs to be overcome: language. To improve corporate communication between different geographical business units, many of Yamagata’s customers are now turning to machine translation.
Consider the following real-life example from a Yamagata customer in the automotive sector. That company’s product improvement center in Japan was tasked with processing reports of technical issues from local dealers. The reports were written in English. The Japanese team had to rely on manual translations of these reports, which an internal employee produced during the half day allocated to the task. This working method resulted not only in delayed translations, but also in delayed registration of technical claims and in many frustrated local dealers.
It’s a situation many global companies can relate to: vast amounts of content, often user-generated and coming from different geographical sites, need to be translated as quickly and cost-efficiently as possible so that the business does not suffer.
In recent years, machine translation has played an increasingly important role in managing this global content flood and in helping corporate communication teams tackle their global communication needs.
What is machine translation?
Machine translation (MT) is automated translation by means of smart computer software. The technique can either be integrated into a semi-automated translation process, where pre-translated text is then presented to a human translator, or it can be applied as a fully automated technique where any intervention from a human translator in the process is omitted.
In the latter case, raw machine translation output is used as the translated document. Here, machine translation is particularly useful for gaining access to information that was previously not translated at all, e.g. database content, chats or emails between employees. The translation quality is not perfect in this scenario, but machine translation does make it possible to process massive amounts of content in a cost-effective way.
Tackling Asian machine translation challenges
Can machine translation streamline corporate communications for Asian languages? We certainly think so. However, Asian languages have a number of characteristics that pose additional challenges for machine translation:
- Lack of word boundaries: Statistical machine translation works by means of phrase tables: source/target language pairs of words or phrases linked with their translation probabilities. However, in Chinese or Japanese for example, words are not separated by spaces, so word boundaries are not always clear. This makes it difficult to build phrase tables correctly. One possible solution is to prepare the language data, for example by means of word parsers that create artificial word boundaries.
- Elliptical language: In casual Japanese, a lot of words can be left out. This is called “ellipsis”. Pronouns and subjects are often omitted, and the actual meaning of the sentence has to be inferred from the context. This creates a degree of ambiguity that can impact machine translation performance and cause translations that do not fit the context. A large, domain-specific translation corpus that contains only relevant data can solve this problem to a certain degree.
- Different word order: Asian and European languages usually have a different word order. This makes it difficult to determine matching phrase tables during machine translation engine training. In this case ‘bigger is better’ is the motto. By optimizing the corpus and by adding more phrases, you can increase the chance that phrases are linked correctly. Ideally, a corpus should contain over 400,000 segments when dealing with machine translation between European and Asian languages.
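As an illustration of the data preparation mentioned under the first point, here is a minimal Python sketch of one simple way to create artificial word boundaries: greedy forward maximum matching against a word list. This is not a description of how YMTS works. The function name `segment`, the `max_len` parameter and the tiny `TOY_VOCABULARY` are assumptions made up for this example, and a production system would normally rely on a dedicated segmentation tool with a far larger dictionary.

```python
def segment(text, vocabulary, max_len=4):
    """Split unspaced text into tokens by greedy forward maximum matching.

    At each position the longest substring found in `vocabulary` is taken.
    If nothing matches, a single character is emitted as a fallback.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy word list, for illustration only.
TOY_VOCABULARY = {"機械", "翻訳", "機械翻訳", "品質", "の", "向上"}

if __name__ == "__main__":
    # "Improving the quality of machine translation", written without spaces.
    print(" ".join(segment("機械翻訳の品質向上", TOY_VOCABULARY)))
```

The space-separated output can then be fed to engine training in the same way as text from a language that already marks word boundaries. Greedy matching is only one option and it can pick the wrong split when several readings are possible, which is one reason why a clean, domain-specific word list matters.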
Yamagata Machine Translation System (YMTS)
Machine translation is here to stay. Global online content is booming like never before, and user-generated content and social media in particular are taking an ever bigger piece of the content pie. Human translators alone will no longer be able to cope with this growing demand. And although the technology faces some additional challenges for Asian languages, it remains a strong alternative for global companies that need fast, consistent and cost-effective translations of large amounts of data.
In order to tackle some of the specific challenges of Asian languages, it is critical to build customized machine translation engines that cover a specific domain. At Yamagata, we offer machine translation for Asian and other languages through our online portal, called the Yamagata Machine Translation System (YMTS).
If you are curious about how machine translation could work for your company, then let us know. Our dedicated machine translation specialists can help you set up your own machine translation system.