Vicomtech at MEDDOCAN: Medical Document Anonymization
Authors: Naiara Perez Miguel, Laura García Sardiña, Manex Serras Saenz
Date: 01.08.2019
Abstract
This paper describes the participation of Vicomtech's team in the MEDDOCAN: Medical Document Anonymization challenge, which consisted of the recognition and classification of protected health information (PHI) in medical documents in Spanish. We tested different state-of-the-art classification algorithms, both deep and shallow, and rich sets of features, obtaining an F1-score of 0.960 in the strictest evaluation. The submitted models and the decoding scripts will be available at https://snlt.vicomtech.org/meddocan2019.
BIB_text
title = {Vicomtech at MEDDOCAN: Medical Document Anonymization},
pages = {696--703},
keywds = {
PHI De-identification, Textual Anonymisation, Machine Learning, Spanish Corpus
},
abstract = {
This paper describes the participation of Vicomtech's team in the MEDDOCAN: Medical Document Anonymization challenge, which consisted of the recognition and classification of protected health information (PHI) in medical documents in Spanish. We tested different state-of-the-art classification algorithms, both deep and shallow, and rich sets of features, obtaining an F1-score of 0.960 in the strictest evaluation. The submitted models and the decoding scripts will be available at https://snlt.vicomtech.org/meddocan2019.
},
date = {2019-08-01},
}