From Subtitles to Parallel Corpora
Authors: Mark Fishel and Panayota Georgakopoulou and Sergio Penkale and Volha V. Petukhova and Matej Rojc and Martin Volk and Andy Way
Date: 28.05.2012
Abstract
We describe the preparation of parallel corpora based on professional-quality subtitles in seven European language pairs. The main focus is the effect of the processing steps on the size and quality of the final corpora.
BIB_text
author = {Mark Fishel and Panayota Georgakopoulou and Sergio Penkale and Volha V. Petukhova and Matej Rojc and Martin Volk and Andy Way},
title = {From Subtitles to Parallel Corpora},
pages = {3--6},
abstract = {
We describe the preparation of parallel corpora based on professional-quality subtitles in seven European language pairs. The main focus is the effect of the processing steps on the size and quality of the final corpora.
},
date = {2012-05-28},
year = {2012},
}