RAPID SPEAKER ADAPTATION USING MAXIMUM LIKELIHOOD NEURAL REGRESSION
Mohamad Hasan Bahari, Hugo Van Hamme
Abstract
In this paper, a new method called Maximum Likelihood Neural Regression (MLNR) is introduced for Rapid Speaker Adaptation (RSA). MLNR, which is conceptually simple, adapts the Gaussian means of a speaker-independent (SI) model to the data of a new speaker by assuming a non-linear mapping from the SI Gaussian means to the adapted Gaussian means. It performs a weighted non-linear regression between the maximum likelihood (ML) estimates of the means and the SI means using a General Regression Neural Network (GRNN). Evaluation on the Wall Street Journal benchmark shows that the suggested scheme outperforms several conventional approaches.
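The regression step described above can be illustrated with a minimal sketch. It treats the GRNN as a Gaussian-kernel weighted average (Nadaraya-Watson form) that maps each SI mean to an adapted mean. The abstract does not specify the weighting or the kernel, so the function name `grnn_adapt_means`, the per-Gaussian `weights` (for example, occupancy counts from the adaptation data), and the kernel width `sigma` are assumptions for illustration, not the authors' exact formulation.

```python
import math


def grnn_adapt_means(si_means, ml_means, weights, sigma=1.0):
    """Sketch of a weighted GRNN regression from SI means to adapted means.

    si_means: list of SI Gaussian mean vectors (regression inputs).
    ml_means: list of ML mean estimates from the new speaker (targets).
    weights:  assumed per-Gaussian reliability, e.g. occupancy counts;
              0 means the Gaussian was not observed in the adaptation data.
    sigma:    assumed Gaussian kernel width (hypothetical tuning parameter).
    """
    dim = len(ml_means[0])
    adapted = []
    for query in si_means:
        numerator = [0.0] * dim
        denominator = 0.0
        for x, y, w in zip(si_means, ml_means, weights):
            # Squared Euclidean distance between two SI means.
            dist2 = sum((q - xi) ** 2 for q, xi in zip(query, x))
            # Kernel response scaled by the reliability of the ML estimate.
            k = w * math.exp(-dist2 / (2.0 * sigma ** 2))
            denominator += k
            for d in range(dim):
                numerator[d] += k * y[d]
        if denominator > 0.0:
            adapted.append([n / denominator for n in numerator])
        else:
            # No usable adaptation evidence nearby: keep the SI mean.
            adapted.append(list(query))
    return adapted


# Toy example: the third Gaussian has no adaptation data (weight 0),
# so its adapted mean is borrowed from its two observed neighbours.
si = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]]
ml = [[0.5, 0.5], [2.5, 0.5], [9.0, 9.0]]
counts = [1.0, 1.0, 0.0]
adapted_means = grnn_adapt_means(si, ml, counts, sigma=1.0)
```

In this sketch, Gaussians with little or no adaptation data take their adapted means from nearby, well-observed Gaussians, which is one plausible reading of how such a scheme can work with very small amounts of speaker data.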