mintzai-ST: Corpus and Baselines for Basque-Spanish Speech Translation
Authors: Edson Benites
Date: 24.03.2021
Abstract
The lack of resources to train end-to-end Speech Translation models hinders research and development in the field. Although recent efforts have been made to prepare additional corpora suitable for the task, few resources are currently available and for a limited number of language pairs. In this work, we describe mintzai-ST, a parallel speech-text corpus for Basque-Spanish in both translation directions, prepared from the sessions of the Basque Parliament and shared for research purposes. This language pair features challenging phenomena for automated speech translation, such as marked differences in morphology and word order, and the mintzai-ST corpus may thus serve as a valuable resource to measure progress in the field. We also describe and evaluate several ST model variants, including cascaded neural components for speech recognition and machine translation, and end-to-end speech-to-text translation. The evaluation results demonstrate the usefulness of the shared corpus as an additional ST resource and contribute to determining the respective benefits and limitations of current alternative approaches to Speech Translation.
BIB_text
title = {mintzai-ST: Corpus and Baselines for Basque-Spanish Speech Translation},
pages = {190-194},
keywds = {Speech Translation, Basque, Spanish, Corpus},
abstract = {
The lack of resources to train end-to-end Speech Translation models hinders research and development in the field. Although recent efforts have been made to prepare additional corpora suitable for the task, few resources are currently available and for a limited number of language pairs. In this work, we describe mintzai-ST, a parallel speech-text corpus for Basque-Spanish in both translation directions, prepared from the sessions of the Basque Parliament and shared for research purposes. This language pair features challenging phenomena for automated speech translation, such as marked differences in morphology and word order, and the mintzai-ST corpus may thus serve as a valuable resource to measure progress in the field. We also describe and evaluate several ST model variants, including cascaded neural components for speech recognition and machine translation, and end-to-end speech-to-text translation. The evaluation results demonstrate the usefulness of the shared corpus as an additional ST resource and contribute to determining the respective benefits and limitations of current alternative approaches to Speech Translation.
},
date = {2021-03-24},
}