Paper Title
SEOP: Speech Enhancement System for Punjabi Language
Abstract
This paper presents a Punjabi speech enhancement method that combines Bidirectional Long Short-Term Memory (BLSTM) networks with a Kalman Filter (KF) to improve Punjabi speech quality. We implemented the Speech Enhancement System of Punjabi (SEOP), in which two separate BLSTM networks are trained. One BLSTM learns to map acoustic features to the magnitude of clean speech, while the other learns to map acoustic features to Line Spectrum Frequencies (LSFs). The estimated clean speech is then reconstructed, and the LSFs are converted to Linear Prediction Coefficients (LPCs) for use in the KF. Experiments are carried out on acoustic and tonal features. The acoustic features include LPCs, Gammatone Frequency Cepstral Coefficients (GFCC), Mel-Frequency Cepstral Coefficients (MFCC), and Bark Frequency Cepstral Coefficients (BFCC). Experiments with acoustic and tonal features on noisy speech demonstrate the effectiveness of BLSTM with KF. The feature combinations MFCC+pitch, MFCC+BFCC+pitch, and MFCC+GFCC+BFCC+pitch achieve the best Word Error Rates (WERs) of 22.97%, 22.11%, and 17.40%, respectively.
Keywords - Speech enhancement, Punjabi, Deep learning, Kalman filter, Bidirectional Long Short-Term Memory (BLSTM), Tonal features.
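
As a rough illustration of the last stage described in the abstract, the sketch below runs a Kalman filter over a noisy signal, using LPCs as the autoregressive speech model. It is not the SEOP implementation. The function name `kalman_enhance`, the fixed process and noise variances, and the use of one LPC set for the whole signal are assumptions made for illustration; in the system described, the LPCs would be obtained from the LSFs estimated by the BLSTM.

```python
import numpy as np


def kalman_enhance(noisy, lpc, process_var, noise_var):
    """Kalman-filter a noisy signal under an AR(p) speech model.

    Assumed model (illustrative, not taken from the paper):
        s[k] = sum_i lpc[i-1] * s[k-i] + w[k],   w ~ N(0, process_var)
        y[k] = s[k] + v[k],                      v ~ N(0, noise_var)
    """
    noisy = np.asarray(noisy, dtype=float)
    a = np.asarray(lpc, dtype=float)
    p = a.size

    # State x = [s[k-p+1], ..., s[k]]; F is the companion matrix of the AR model.
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)   # shift older samples up by one slot
    F[-1, :] = a[::-1]           # newest sample is predicted from the LPCs

    H = np.zeros(p)
    H[-1] = 1.0                  # only the newest sample is observed

    Q = np.zeros((p, p))
    Q[-1, -1] = process_var      # excitation drives only the newest sample

    x = np.zeros(p)              # initial state estimate
    P = np.eye(p)                # initial error covariance (arbitrary choice)
    enhanced = np.empty_like(noisy)

    for k, y in enumerate(noisy):
        # Predict step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step with the scalar observation y
        S = H @ P @ H + noise_var        # innovation variance
        K = (P @ H) / S                  # Kalman gain
        x = x + K * (y - H @ x)
        P = P - np.outer(K, H @ P)
        enhanced[k] = x[-1]              # filtered estimate of s[k]

    return enhanced
```

In practice the LPCs and variances would be updated frame by frame rather than held fixed, since speech is only approximately stationary over short windows.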