Synthetic Annotated Data for Named Entity Recognition in Computed Tomography Scan Reports

Abstract

It is widely acknowledged that clinical data, in general, is scarce, and this scarcity worsens when focusing on specific domains. Moreover, the challenge escalates when annotated data is required. In this paper, we propose an approach to create synthetic annotated datasets for Named Entity Recognition (NER) tasks in Computed Tomography Reports (CTR) by leveraging large language models (LLMs). We investigate the potential of LLMs to generate meaningful texts in the healthcare domain through a combination of text generation techniques and automatic annotation using LLMs. Additionally, we conducted a series of experiments to demonstrate the efficacy of using synthetic data compared to real data for solving NER tasks.

BIB_text

@Article {
title = {Synthetic Annotated Data for Named Entity Recognition in Computed Tomography Scan Reports},
pages = {69--78},
keywds = {Biomedical NER; data synthesis; text generation},
abstract = {It is widely acknowledged that clinical data, in general, is scarce, and this scarcity worsens when focusing on specific domains. Moreover, the challenge escalates when annotated data is required. In this paper, we propose an approach to create synthetic annotated datasets for Named Entity Recognition (NER) tasks in Computed Tomography Reports (CTR) by leveraging large language models (LLMs). We investigate the potential of LLMs to generate meaningful texts in the healthcare domain through a combination of text generation techniques and automatic annotation using LLMs. Additionally, we conducted a series of experiments to demonstrate the efficacy of using synthetic data compared to real data for solving NER tasks.},
date = {2024-09-24},
}