2011 IEEE International Conference on Multimedia and Expo

RAPID SPEAKER ADAPTATION USING MAXIMUM LIKELIHOOD NEURAL REGRESSION

Mohamad Hasan Bahari, Hugo Van Hamme



Abstract

In this paper, a new method called Maximum Likelihood Neural Regression (MLNR) is introduced for Rapid Speaker Adaptation (RSA). MLNR is conceptually simple: it adapts the Gaussian means of a speaker-independent (SI) model to the data of a new speaker by assuming a non-linear mapping from the SI Gaussian means to the adapted Gaussian means. The mapping is obtained by a weighted non-linear regression between the maximum likelihood (ML) estimates of the means and the SI means, using General Regression Neural Networks (GRNN). Evaluation on the Wall Street Journal benchmark shows that the suggested scheme outperforms several conventional approaches.
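
The regression step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes a GRNN in its standard form (a Gaussian-kernel weighted average of training targets), assumes the regression weights are per-Gaussian occupation counts, and uses a hypothetical function name `grnn_adapt_means` and smoothing parameter `sigma`.

```python
import math


def grnn_adapt_means(si_means, ml_means, weights, sigma=1.0):
    """Map each SI mean to an adapted mean with a weighted GRNN.

    si_means : list of SI Gaussian mean vectors (regression inputs)
    ml_means : list of ML mean estimates from adaptation data (targets)
    weights  : per-Gaussian weights (ASSUMPTION: occupation counts;
               0 for Gaussians unseen in the adaptation data)
    sigma    : kernel width (ASSUMPTION: a single hand-set value)
    """
    adapted = []
    for query in si_means:
        numerator = [0.0] * len(ml_means[0])
        denominator = 0.0
        for mu, target, w in zip(si_means, ml_means, weights):
            # Squared Euclidean distance between two SI means.
            dist2 = sum((q - m) ** 2 for q, m in zip(query, mu))
            # Gaussian kernel activation scaled by the regression weight.
            k = w * math.exp(-dist2 / (2.0 * sigma ** 2))
            denominator += k
            numerator = [n + k * t for n, t in zip(numerator, target)]
        if denominator > 0.0:
            adapted.append([n / denominator for n in numerator])
        else:
            # No usable evidence nearby: keep the SI mean unchanged.
            adapted.append(list(query))
    return adapted


# Toy example: the third Gaussian has no adaptation data (weight 0),
# so its adapted mean is interpolated from the observed Gaussians.
si = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
ml = [[0.5, 0.5], [1.5, 0.5], [0.0, 0.0]]
counts = [10.0, 10.0, 0.0]
adapted_means = grnn_adapt_means(si, ml, counts, sigma=1.0)
```

In this sketch a Gaussian with little or no adaptation data still receives an update, because its adapted mean is inferred from nearby Gaussians that were observed. This is what makes a regression of this kind plausible when only a small amount of adaptation data is available.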
