On Checking Robustness on Named Entity Recognition with Pre-trained Transformers Models

Authors: Aitor García Pablos, Justina Mandravickaite, Egidija Versinskiene

Date: 01.01.2023

Baltic Journal of Modern Computing


Abstract

In this paper we conduct a series of experiments with several state-of-the-art models based on the Transformer architecture to perform Named Entity Recognition and Classification (NERC) on text of different styles (social networks vs. news) and languages, and with different levels of noise. We use publicly available datasets such as WNUT17, CoNLL2002 and CoNLL2003. Furthermore, we synthetically add extra levels of noise (random capitalization, random character additions/replacements/removals, etc.) to study its impact on the models and their robustness. The Transformer models we compare (mBERT, CANINE, mDeBERTa) use different tokenisation strategies (subword-based vs. character-based), which may exhibit different levels of robustness towards certain types of noise. The experiments show that the subword-based models (mBERT and mDeBERTa) tend to achieve higher scores, especially on clean text. However, as the amount of noise increases, the character-based tokenisation exhibits a smaller performance drop, suggesting that models such as CANINE might be better candidates for dealing with noisy text.
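
The synthetic noise named in the abstract (random capitalization and random character additions, replacements and removals) could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the per-character `rate` parameter, the ASCII replacement alphabet and the choice to leave whitespace untouched are all assumptions made for the example.

```python
import random
import string


def random_capitalization(text: str, rate: float, rng: random.Random) -> str:
    """Flip the case of each character independently with probability `rate`."""
    return "".join(c.swapcase() if rng.random() < rate else c for c in text)


def random_char_edits(text: str, rate: float, rng: random.Random) -> str:
    """With probability `rate` per character, add, replace, or remove a character.

    Whitespace is never edited (an assumption of this sketch), so the number
    of separators between tokens stays the same.
    """
    out = []
    for c in text:
        if c.isspace() or rng.random() >= rate:
            out.append(c)  # character kept unchanged
            continue
        op = rng.choice(("add", "replace", "remove"))
        if op == "add":
            out.append(c)
            out.append(rng.choice(string.ascii_lowercase))
        elif op == "replace":
            out.append(rng.choice(string.ascii_lowercase))
        # "remove": append nothing, which drops the character
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(13)  # fixed seed so runs are repeatable
    sentence = "Aitor works at Vicomtech in Donostia"
    print(random_capitalization(sentence, 0.2, rng))
    print(random_char_edits(sentence, 0.2, rng))
```

Raising `rate` gives progressively noisier copies of the same sentence. Because removals can delete a single-character token entirely, a real experiment would need to guard against empty tokens so that the entity labels stay aligned with the text.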

BIB_text

@Article {
title = {On Checking Robustness on Named Entity Recognition with Pre-trained Transformers Models},
author = {García Pablos, Aitor and Mandravickaite, Justina and Versinskiene, Egidija},
journal = {Baltic Journal of Modern Computing},
pages = {591-606},
volume = {11},
keywords = {Dutch; English; model robustness; NER; Spanish; Transformers},
abstract = {
In this paper we conduct a series of experiments with several state-of-the-art models based on the Transformer architecture to perform Named Entity Recognition and Classification (NERC) on text of different styles (social networks vs. news) and languages, and with different levels of noise. We use publicly available datasets such as WNUT17, CoNLL2002 and CoNLL2003. Furthermore, we synthetically add extra levels of noise (random capitalization, random character additions/replacements/removals, etc.) to study its impact on the models and their robustness. The Transformer models we compare (mBERT, CANINE, mDeBERTa) use different tokenisation strategies (subword-based vs. character-based), which may exhibit different levels of robustness towards certain types of noise. The experiments show that the subword-based models (mBERT and mDeBERTa) tend to achieve higher scores, especially on clean text. However, as the amount of noise increases, the character-based tokenisation exhibits a smaller performance drop, suggesting that models such as CANINE might be better candidates for dealing with noisy text.
},
doi = {10.22364/bjmc.2023.11.4.05},
date = {2023-01-01},
}