Vicomtech at MEDDOPROF: Automatic Information Extraction and Disambiguation in Clinical Text

Date: 21.09.2021

Abstract

This paper describes the participation of the Vicomtech NLP team in the MEDDOPROF shared task. The challenge consists of the automatic detection of occupations and employment status, as well as their normalization or entity mapping, in medical documents written in Spanish. The competition is split into three tasks: NER, CLASS, and NORM. We participated with a multitask joint model based on Transformers, which attempts to solve all three tasks at once. However, the NORM task, which consists of disambiguating the detected entities against thousands of possible codes, can be solved more effectively using other approaches. For that reason, we submitted an additional sequence-to-sequence approach and a semantic-search approach to deal with the NORM task. We achieve an F1-score of 77% for the NER task, 70% for the CLASS task, and 48% for the NORM task.

BIB_text

@Article {
title = {Vicomtech at MEDDOPROF: Automatic Information Extraction and Disambiguation in Clinical Text},
pages = {776-787},
keywords = {Clinical Text, Information Extraction, Automatic Indexing},
abstract = {},
date = {2021-09-21},
}