Improving Acoustic Models for Dysarthric Speech Recognition using Time Delay Neural Networks
DOI: 10.1109/ICELTICS50595.2020.9315506
Date: 2020
Deep learning approaches have recently been widely used to solve pattern recognition problems, especially speech recognition. Deep neural network structures have yielded impressive performance in acoustic models for normal speakers. However, building a speech recognition model for dysarthric speakers remains a challenge. This paper investigates the performance of speech recognition models for dysarthric speakers using time delay deep neural networks. We also explore model performance when dysarthric and normal speech corpora are combined. Finally, deep neural network structures with well-tuned hyperparameters give promising results on Mandarin and English dysarthric speech.
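For readers unfamiliar with the architecture named above, the following is a minimal sketch of a single time delay layer, not the configuration used in the paper. It assumes the common formulation in which each output frame is an affine transform of input frames taken at a fixed set of temporal offsets, followed by a ReLU. The function name `tdnn_layer`, the offsets, and the dimensions are all hypothetical and chosen only for illustration.

```python
def tdnn_layer(frames, weights, bias, offsets):
    """One time delay layer (illustrative sketch, not the paper's model).

    frames:  list of T input frames, each a list of d_in floats
    weights: weights[k][i][j] maps input dim i at offsets[k] to output dim j
    bias:    list of d_out floats
    offsets: temporal context, e.g. (-2, 0, 2) (hypothetical choice)
    """
    num_frames = len(frames)
    # Only keep time steps whose full context lies inside the utterance.
    start = max(0, -min(offsets))
    stop = num_frames - max(0, max(offsets))
    if stop <= start:
        raise ValueError("utterance is shorter than the layer's context")

    d_out = len(bias)
    outputs = []
    for t in range(start, stop):
        acts = list(bias)
        for k, off in enumerate(offsets):
            x = frames[t + off]
            for j in range(d_out):
                acts[j] += sum(x[i] * weights[k][i][j] for i in range(len(x)))
        # ReLU nonlinearity (an assumption; other activations are possible).
        outputs.append([max(0.0, a) for a in acts])
    return outputs


# Toy input: 10 frames of 3-dimensional features, 4 output units.
frames = [[1.0, 1.0, 1.0] for _ in range(10)]
weights = [[[1.0] * 4 for _ in range(3)] for _ in range(3)]
bias = [0.0] * 4
out = tdnn_layer(frames, weights, bias, offsets=(-2, 0, 2))
print(len(out), len(out[0]))
```

Stacking several such layers with progressively wider offsets lets the upper layers cover a long temporal context while each layer keeps a small number of parameters, which is the usual motivation for this design.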