Paper Title
Improving Speech Quality Using MetricGAN and Kolmogorov-Arnold Networks
Abstract
Speech enhancement (SE) aims to improve the quality and intelligibility of speech recorded under noisy conditions. Conventional SE models are often trained with generic losses that may not align well with human perception. MetricGAN addresses this with a metric-driven approach: a generative adversarial network (GAN) is trained to directly optimize perceptual metrics such as PESQ, with the enhancement model learning to produce speech that reaches a target quality score. MetricGAN and its variants have shown excellent results. Recently, Kolmogorov–Arnold networks (KANs) have outperformed conventional neural networks in many tasks. In this paper, we integrate KANs into the MetricGAN framework by replacing its regular layers with KAN counterparts, and we evaluate the effect on both enhanced speech quality and model size. On the Voicebank–DEMAND dataset, our best MetricGAN-KAN model improves the PESQ score by 0.13 over the baseline MetricGAN while using 79.9% fewer parameters.
Keywords - Speech Enhancement, Kolmogorov-Arnold Networks, Deep Neural Networks, Generative Adversarial Networks.