STEER - Virtual Cohorts from Text-Conditional Medical Image Generation
Context
Can you imagine an artificial intelligence tool capable of generating synthetic computed tomography (CT) scans with specific characteristics, clinical features, or conditions, steered by textual input?
Technological Challenges
- Adapt tools developed for natural images to the medical domain
- Validate the anatomical correctness of the generated images and their intensity values against the physical properties of the tissues (a sketch of one such check follows this list)
- Understand how the textual prompt influences the generated images
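One concrete way to approach the second challenge is an intensity sanity check: CT intensities are expressed in Hounsfield units (HU), so generated slices can be screened against textbook HU ranges for known tissues. The snippet below is a minimal illustrative sketch, not part of the STEER codebase; the HU ranges are standard approximations, and the mask-based check is an assumption about how such a validation could be set up.

```python
import numpy as np

# Approximate Hounsfield unit (HU) ranges for common tissues (textbook values).
TISSUE_HU_RANGES = {
    "air": (-1024, -900),
    "lung": (-900, -500),
    "fat": (-120, -60),
    "water": (-10, 10),
    "liver": (40, 80),
    "bone": (300, 1900),
}

def hu_plausibility(ct_slice: np.ndarray, tissue: str, mask: np.ndarray) -> float:
    """Fraction of masked pixels whose HU value lies inside the expected
    range for the given tissue; 1.0 means fully plausible intensities."""
    lo, hi = TISSUE_HU_RANGES[tissue]
    values = ct_slice[mask]
    return float(np.mean((values >= lo) & (values <= hi)))

# Example: check a (stand-in) synthetic slice against a liver region mask.
slice_hu = np.random.normal(60, 10, size=(512, 512))
liver_mask = np.zeros((512, 512), dtype=bool)
liver_mask[200:300, 200:300] = True
print(f"Liver HU plausibility: {hu_plausibility(slice_hu, 'liver', liver_mask):.2%}")
```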
Methodology
Implement text-conditioned diffusion models for medical imaging by adapting a natural image generation pipeline to grayscale medical image generation, together with a prompt generation module that extracts clinical entities from radiology reports and turns them into training prompts.
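To make the adaptation concrete, the sketch below shows one standard way to fine-tune a Stable Diffusion checkpoint on grayscale slices with the Hugging Face diffusers library: the single CT channel is replicated to the three channels the VAE expects, and the UNet is trained to predict the noise added to the image latents under CLIP text conditioning. The base checkpoint, learning rate, and full-UNet fine-tuning strategy are assumptions for illustration, not the project's published configuration.

```python
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"  # assumed base checkpoint
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def train_step(gray_slices: torch.Tensor, prompts: list[str]) -> float:
    # The SD VAE expects 3-channel input, so grayscale CT slices
    # (B, 1, H, W, scaled to [-1, 1]) are replicated across channels.
    images = gray_slices.repeat(1, 3, 1, 1)
    latents = vae.encode(images).latent_dist.sample() * vae.config.scaling_factor

    # Sample a random timestep per image and add the matching noise.
    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy = scheduler.add_noise(latents, noise, t)

    # Encode the text prompts into CLIP embeddings for conditioning.
    tokens = tokenizer(prompts, padding="max_length",
                       max_length=tokenizer.model_max_length,
                       truncation=True, return_tensors="pt")
    text_emb = text_encoder(tokens.input_ids)[0]

    # Predict the added noise and fit it with an MSE loss.
    pred = unet(noisy, t, encoder_hidden_states=text_emb).sample
    loss = torch.nn.functional.mse_loss(pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```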
Technical Results
The final step of the project consisted of integrating the prompt generation module and the trained Stable Diffusion model into a complete text-guided image generation pipeline, containerized as a web application. From an input radiology report, the prompt generation module extracts the relevant entities and constructs a prompt analogous to those used to train the Stable Diffusion model. Starting from Gaussian noise, the text prompt then guides the trained diffusion model to create a synthetic image. The final image can be evaluated with common validation metrics as well as with the metrics proposed in this work.
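The sketch below illustrates how such a pipeline could be wired together at inference time with diffusers. The entity dictionary, prompt template, and model path are placeholders for illustration; STEER's actual prompt format and checkpoint are not reproduced here.

```python
from diffusers import StableDiffusionPipeline

# Hypothetical entity-extraction output for one report; in STEER this
# comes from the NER-based prompt generation module described above.
entities = {"sex": "female", "age": "63", "hepatopathy": "cirrhosis",
            "tumor_size": "25 mm"}

def build_prompt(entities: dict, include: set) -> str:
    """Assemble a training-style prompt from the selected entities
    (the exact prompt template used in STEER is an assumption here)."""
    parts = [f"{k.replace('_', ' ')}: {v}" for k, v in entities.items() if k in include]
    return "abdominal CT slice, " + ", ".join(parts)

prompt = build_prompt(entities, include={"sex", "age", "hepatopathy"})

# Load the fine-tuned model (placeholder path) and generate from noise.
pipe = StableDiffusionPipeline.from_pretrained("path/to/finetuned-ct-model")
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("synthetic_ct.png")
```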
The web application operates in two modes:
- User Mode: This mode simulates the experience of an end user, such as a radiologist, who uploads a radiology report. A text prompt is generated automatically from the report, and the user controls which entities, such as sex, age, hepatopathy, and tumor size (if available in the report), are included in it. The user then specifies the number of images to generate and adjusts parameters that guide the generation process. The resulting synthetic images can be downloaded individually or as a batch and evaluated using the Fréchet Inception Distance (FID), the Multi-Scale Structural Similarity index (MS-SSIM), and our proposed metrics (a sketch of the two standard metrics follows this list).
- Researcher Mode: This mode is designed for researchers and offers the ability to experiment with different trained models. In addition to uploading a radiology report, the researcher can manually enter a custom text prompt to generate synthetic images. As in User Mode, the generated images can be downloaded and evaluated.
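For reference, FID and MS-SSIM can be computed with off-the-shelf tooling; the snippet below is a minimal sketch using the torchmetrics library (installed as torchmetrics[image]) with random stand-in tensors in place of real and generated CT slices. The project's own proposed metrics are not reproduced here.

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image import MultiScaleStructuralSimilarityIndexMeasure

# Stand-in batches of real and synthetic slices as uint8 RGB tensors.
# Reliable FID estimates need far more samples; this is illustration only.
real = torch.randint(0, 256, (32, 3, 299, 299), dtype=torch.uint8)
fake = torch.randint(0, 256, (32, 3, 299, 299), dtype=torch.uint8)

# FID compares Inception-feature statistics of real vs. synthetic images.
fid = FrechetInceptionDistance(feature=2048)
fid.update(real, real=True)
fid.update(fake, real=False)
print(f"FID: {fid.compute():.2f}")

# MS-SSIM measures multi-scale structural similarity between paired images.
ms_ssim = MultiScaleStructuralSimilarityIndexMeasure(data_range=255.0)
print(f"MS-SSIM: {ms_ssim(fake.float(), real.float()):.3f}")
```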
Scientific Results
- Txurio, M.S. et al. (2023). Diffusion Models for Realistic CT Image Generation. In Innovation in Medicine and Healthcare (KES InMed 2023), Smart Innovation, Systems and Technologies, vol. 357.
- Platas, A. et al. (2024). Synthetic Annotated Data for Named Entity Recognition in Computed Tomography Scan Reports. In Proceedings of SEPLN 2024.
- Martínez-Arias, P. et al. (2024). Text-Conditioned Abdominal CT Slice Generation Using Stable Diffusion. In Proceedings of MICAD 2024.
Applications
Clinical trials (e.g., implants in the medical device sector), biomedical research (avoiding data-sharing constraints), and data augmentation for deep learning model development.
Looking for support for your next project? Contact us; we look forward to helping you.