Vicomtech at MEDDOPROF: Automatic Information Extraction and Disambiguation in Clinical Text

Abstract

This paper describes the participation of the Vicomtech NLP team in the MEDDOPROF shared task. The challenge consists of the automatic detection of occupations and employment status in Spanish medical documents, together with their normalization (entity mapping). The competition is split into three tasks: NER, CLASS and NORM. We participated with a multitask joint model based on Transformers that attempts to solve all three tasks at once. However, the NORM task, which consists of disambiguating the detected entities against thousands of possible codes, can be solved more effectively with other approaches. We therefore submitted an additional sequence-to-sequence approach and a semantic-search approach for the NORM task. We achieve F1-scores of 77% for the NER task, 70% for the CLASS task, and 48% for the NORM task.

BIB_text

@Article {
title = {Vicomtech at MEDDOPROF: Automatic Information Extraction and Disambiguation in Clinical Text},
pages = {776-787},
keywds = {Clinical Text, Information Extraction, Automatic Indexing},
abstract = {
This paper describes the participation of the Vicomtech NLP team in the MEDDOPROF shared task. The challenge consists of the automatic detection of occupations and employment status in Spanish medical documents, together with their normalization (entity mapping). The competition is split into three tasks: NER, CLASS and NORM. We participated with a multitask joint model based on Transformers that attempts to solve all three tasks at once. However, the NORM task, which consists of disambiguating the detected entities against thousands of possible codes, can be solved more effectively with other approaches. We therefore submitted an additional sequence-to-sequence approach and a semantic-search approach for the NORM task. We achieve F1-scores of 77% for the NER task, 70% for the CLASS task, and 48% for the NORM task.
},
date = {2021-09-21},
}
Vicomtech

Parque Científico y Tecnológico de Gipuzkoa,
Paseo Mikeletegi 57,
20009 Donostia / San Sebastián (Spain)

+(34) 943 309 230

Zorrotzaurreko Erribera 2, Deusto,
48014 Bilbao (Spain)
