2011 IEEE International Conference on Multimedia and Expo

RATE DISTORTION BOUNDS FOR SPEECH CODING BASED ON A PERCEPTUAL DISTORTION MEASURE (PESQ-MOS)

Ying Yi Li, Jerry Gibson



Abstract

We develop practical rate distortion bounds for speech coding based on composite source models and the PESQ-MOS distortion measure. Specifically, the bounds are formulated using composite source models for speech, the rate distortion function for Gaussian autoregressive sources, the classical reverse water-filling result, and conditional rate distortion theory, along with a recently devised MSE-to-PESQ-MOS mapping. The resulting bounds are shown to lower bound the performance of the standardized AMR, G.729, and G.718 codecs, and the tightness of the bounds indicates how the performance of voice codecs might be improved.
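As an illustration of the classical reverse water-filling result named above, the following minimal sketch computes the rate distortion function of a set of independent Gaussian components under a total mean squared error budget. It is not the authors' method: it omits the composite source model, the autoregressive spectra, and the MSE-to-PESQ-MOS mapping. The function name `reverse_water_filling`, the bisection search, and the use of a total (rather than average) distortion constraint are assumptions made for this example.

```python
import math


def reverse_water_filling(variances, total_distortion):
    """Rate (bits) and per-component distortions for independent Gaussians.

    Assumption: the constraint is on the SUM of component distortions.
    """
    if total_distortion <= 0:
        raise ValueError("total_distortion must be positive")
    # If the budget covers all of the variance, no bits are needed.
    if total_distortion >= sum(variances):
        return 0.0, list(variances)

    # Bisection on the water level theta so that
    # sum(min(theta, variance_i)) equals the distortion budget.
    lo, hi = 0.0, max(variances)
    for _ in range(200):
        theta = 0.5 * (lo + hi)
        if sum(min(theta, v) for v in variances) > total_distortion:
            hi = theta
        else:
            lo = theta
    theta = 0.5 * (lo + hi)

    # Components with variance below the water level get distortion equal
    # to their variance and therefore contribute zero rate.
    distortions = [min(theta, v) for v in variances]
    rate = sum(
        0.5 * math.log2(v / d)
        for v, d in zip(variances, distortions)
        if v > d
    )
    return rate, distortions


if __name__ == "__main__":
    rate, dists = reverse_water_filling([4.0, 1.0], 1.0)
    print(rate, dists)
```

With variances 4 and 1 and a total distortion of 1, the water level is 0.5, both components are coded, and the rate is 0.5*log2(8) + 0.5*log2(2) = 2 bits. Raising the budget to 3 pushes the water level to 2, so the weaker component is dropped and the rate falls to 0.5 bits.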
