Publications
Open Access Issue
Optimizing the Perceptual Quality of Time-Domain Speech Enhancement with Reinforcement Learning
Tsinghua Science and Technology 2022, 27(6): 939-947
Published: 21 June 2022

In neural speech enhancement, a mismatch exists between the training objective, i.e., Mean-Square Error (MSE), and perceptual quality evaluation metrics such as Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI). We propose a novel reinforcement learning algorithm and network architecture that incorporate a non-differentiable perceptual quality evaluation metric into the objective function using a dynamic filter module. Unlike the traditional dynamic filter implementation, which directly generates a convolution kernel, we use a filter generation agent to predict the probability density function of a multivariate Gaussian distribution, from which we sample the convolution kernel. Experimental results show that the proposed reinforcement learning method clearly improves perceptual quality over supervised learning methods trained with the MSE objective function.
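
The idea of sampling a convolution kernel from a Gaussian and training with a non-differentiable reward can be illustrated with a small sketch. This is not the paper's implementation: the diagonal covariance, the REINFORCE score-function estimator with a mean baseline, and the use of negative MSE as a stand-in for a black-box perceptual metric are all assumptions made here for illustration, and every function name below is hypothetical.

```python
import numpy as np

def sample_kernel(mean, log_std, rng):
    """Draw a 1-D convolution kernel from N(mean, diag(exp(log_std)^2))."""
    return mean + np.exp(log_std) * rng.standard_normal(mean.shape)

def log_prob_grads(kernel, mean, log_std):
    """Gradients of the Gaussian log-density w.r.t. mean and log_std."""
    std = np.exp(log_std)
    z = (kernel - mean) / std
    return z / std, z ** 2 - 1.0

def placeholder_reward(enhanced, clean):
    """Stand-in reward (assumption): negative MSE. A real system would call
    a black-box perceptual metric here; no gradient of it is ever needed."""
    return -np.mean((enhanced - clean) ** 2)

def reinforce_step(noisy, clean, mean, log_std, rng, n_samples=16, lr=0.01):
    """One score-function (REINFORCE) update of the Gaussian parameters."""
    kernels = [sample_kernel(mean, log_std, rng) for _ in range(n_samples)]
    rewards = np.array([
        placeholder_reward(np.convolve(noisy, k, mode="same"), clean)
        for k in kernels
    ])
    baseline = rewards.mean()  # simple variance-reduction baseline
    g_mean = np.zeros_like(mean)
    g_log_std = np.zeros_like(log_std)
    for k, r in zip(kernels, rewards):
        gm, gl = log_prob_grads(k, mean, log_std)
        g_mean += (r - baseline) * gm
        g_log_std += (r - baseline) * gl
    # Gradient ascent on the expected reward.
    new_mean = mean + lr * g_mean / n_samples
    new_log_std = log_std + lr * g_log_std / n_samples
    return new_mean, new_log_std, baseline

# Toy usage: a noisy sine wave and a 5-tap kernel initialised to identity.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.3 * rng.standard_normal(t.shape)
mean = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
log_std = np.full(5, -1.0)
for _ in range(20):
    mean, log_std, avg_reward = reinforce_step(noisy, clean, mean, log_std, rng)
```

Because the reward enters the update only as a scalar weight on the log-density gradient, the same loop would work unchanged with any scoring function that returns a number, which is the property that lets a non-differentiable metric serve as the training signal.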
