An Evaluation of Source Factors in Concatenation-based Context-aware Neural Machine Translation

Data: 04.09.2023


Abstract

We explore the use of source factors in contextaware neural machine translation, specifically concatenation-based models, to improve the translation quality of inter-sentential phenomena. Context sentences are typically concatenated to the sentence to be translated, with string-based markers to separate the latter from the former. Although previous studies have measured the impact of prefixes to identify and mark context information, the use of learnable factors has only been marginally explored. In this study, we evaluate the impact of single and multiple source context factors in English-German and Basque-Spanish contextual translation. We show that this type of factors can significantly enhance translation accuracy for phenomena such as gender and register coherence in Basque-Spanish, while also improving BLEU results in some scenarios. These results demonstrate the potential of factor-based context identification as a research path in contextaware machine translation.

BIB_text

@Article {
title = {An Evaluation of Source Factors in Concatenation-based Context-aware Neural Machine Translation},
pages = {399-407},
keywds = {
Computational linguistics; Computer aided language translation
}
abstract = {

We explore the use of source factors in contextaware neural machine translation, specifically concatenation-based models, to improve the translation quality of inter-sentential phenomena. Context sentences are typically concatenated to the sentence to be translated, with string-based markers to separate the latter from the former. Although previous studies have measured the impact of prefixes to identify and mark context information, the use of learnable factors has only been marginally explored. In this study, we evaluate the impact of single and multiple source context factors in English-German and Basque-Spanish contextual translation. We show that this type of factors can significantly enhance translation accuracy for phenomena such as gender and register coherence in Basque-Spanish, while also improving BLEU results in some scenarios. These results demonstrate the potential of factor-based context identification as a research path in contextaware machine translation.


}
isbn = {978-954452092-2},
date = {2023-09-04},
}
Vicomtech

Gipuzkoako Zientzia eta Teknologia Parkea,
Mikeletegi Pasealekua 57,
20009 Donostia / San Sebastián (Espainia)

+(34) 943 309 230

Zorrotzaurreko Erribera 2, Deusto,
48014 Bilbo (Espainia)

close overlay

Jokaeraren araberako publizitateko cookieak beharrezkoak dira eduki hau kargatzeko

Onartu jokaeraren araberako publizitateko cookieak