On Checking Robustness on Named Entity Recognition with Pre-trained Transformers Models
Autores: Justina Mandravickaite Egidija Versinskiene
Fecha: 01.01.2023
Baltic Journal of Modern Computing
Abstract
In this paper we are conducting a series of experiments with several state-of-the-art models, based on Transformers architecture, to perform Named Entity Recognition and Classification (NERC) on text of different styles (social networks vs. news) and languages, and with different levels of noise. We are using different publicly-available datasets such as WNUT17, CoNLL2002 and CoNLL2003. Furthermore, we synthetically add extra levels of noise (random capitalization, random character additions/replacements/removals, etc.), to study the impact and the robustness of the models. The Transformer models we compare (mBERT, CANINE, mDeBERTa) use different tokenisation strategies (token-based vs. character-based) which may exhibit different levels of robustness towards certain types of noise. The experiments show that the subword-based models (mBERT and mDeBERTa) tend to achieve higher scores, especially in the presence of clean text. However, when the amount of noise increases, the character-based tokenisation exhibits a smaller performance drop, suggesting that models such as CANINE might be a better candidate to deal with noisy text.
BIB_text
title = {On Checking Robustness on Named Entity Recognition with Pre-trained Transformers Models},
journal = {Baltic Journal of Modern Computing},
pages = {591-606},
volume = {11},
keywds = {
Dutch; English; model robustness; NER; Spanish; Transformers
}
abstract = {
In this paper we are conducting a series of experiments with several state-of-the-art models, based on Transformers architecture, to perform Named Entity Recognition and Classification (NERC) on text of different styles (social networks vs. news) and languages, and with different levels of noise. We are using different publicly-available datasets such as WNUT17, CoNLL2002 and CoNLL2003. Furthermore, we synthetically add extra levels of noise (random capitalization, random character additions/replacements/removals, etc.), to study the impact and the robustness of the models. The Transformer models we compare (mBERT, CANINE, mDeBERTa) use different tokenisation strategies (token-based vs. character-based) which may exhibit different levels of robustness towards certain types of noise. The experiments show that the subword-based models (mBERT and mDeBERTa) tend to achieve higher scores, especially in the presence of clean text. However, when the amount of noise increases, the character-based tokenisation exhibits a smaller performance drop, suggesting that models such as CANINE might be a better candidate to deal with noisy text.
}
doi = {10.22364/bjmc.2023.11.4.05},
date = {2023-01-01},
}