Representation learning strategies to model pathological speech: Effect of multiple spectral resolutions

Authors: Gabriel Figueiredo, Juan Camilo Vasquez Correa, Juan Rafael Orozco, Elmar Nöth

Date: 01.04.2024

Computer Speech and Language


Abstract

This paper considers a representation learning strategy to model speech signals from patients with Parkinson's disease, with the goal of predicting the presence of the disease, and evaluating the level of degradation of a patient's speech. In particular, we propose a novel fusion strategy that combines wideband and narrowband spectral resolutions using a representation learning strategy based on autoencoders, called the multi-spectral autoencoder. The proposed model is able to classify the speech from Parkinson's disease patients with accuracy up to 97%. The proposed model is also able to assess the dysarthria severity of Parkinson's disease patients with a Spearman correlation up to 0.79. These results outperform those observed in literature where the same problem was addressed with the same corpus.

BIB_text

@Article {
title = {Representation learning strategies to model pathological speech: Effect of multiple spectral resolutions},
journal = {Computer Speech and Language},
pages = {101584},
volume = {85},
keywds = {Dysarthria; Parkinson's disease; Representation learning},
abstract = {This paper considers a representation learning strategy to model speech signals from patients with Parkinson's disease, with the goal of predicting the presence of the disease, and evaluating the level of degradation of a patient's speech. In particular, we propose a novel fusion strategy that combines wideband and narrowband spectral resolutions using a representation learning strategy based on autoencoders, called the multi-spectral autoencoder. The proposed model is able to classify the speech from Parkinson's disease patients with accuracy up to 97\%. The proposed model is also able to assess the dysarthria severity of Parkinson's disease patients with a Spearman correlation up to 0.79. These results outperform those observed in literature where the same problem was addressed with the same corpus.},
doi = {10.1016/j.csl.2023.101584},
date = {2024-04-01},
}
