Detection And Suppression Of Typed Keystrokes In Speech Signals

Posted on: 2018-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: Ullah Rizwan
Full Text: PDF
GTID: 2348330512985624
Subject: Signal and Information Processing
Abstract/Summary:
In recent decades, there has been a significant increase in the use of computers and laptops to capture audio in communication scenarios such as lecture recording, meeting recording, video conferencing, and voice over internet protocol (VoIP) communication. Some users capture audio and lectures on laptops and computers for archival purposes, while others use voice recorders. Because the keyboard is close to the microphone, the recorded speech is severely corrupted by additive keystroke noise, generated mainly by typing on a mechanical keyboard. Because this noise is highly non-stationary and abrupt, suppressing it has been a challenging problem in single-channel speech enhancement. This dissertation focuses on the suppression of typed keystrokes in speech signals.

Two new two-stage algorithms are proposed for the detection and suppression of typed-keystroke transient noise in speech signals: a correlation-based technique, sparse non-negative matrix factorization-correlation (SNMF-CR), and a thresholding-based technique, sparse non-negative matrix factorization-thresholding technique (SNMF-TT). The first stage is the same in both methods and uses sparse non-negative matrix factorization (SNMF). For the second stage, two new techniques have been developed.

In SNMF-CR, the correlation is computed between the estimated clean speech obtained from stage I (the SNMF stage) and the original noisy speech. Noise-corrupted segments of the original noisy signal show a low correlation coefficient with the corresponding noise-suppressed segments of the estimated speech. Based on this, the noisy segments in the original noisy speech are replaced by the corresponding noise-suppressed segments from the estimated clean speech obtained from stage I.

In the thresholding-based technique, the whole spectrogram is divided horizontally into two parts, based on the observation that the energy of keystrokes is distributed more widely along the frequency axis than that of speech. For each spectral vector, the norm of the lower portion is divided by the norm of the upper portion. This ratio is compared with a threshold, and keystrokes are detected on that basis. The detected keystrokes in the original noisy speech are then replaced by the corresponding segments of the estimated clean speech spectrogram obtained from stage I.

Thus, in both proposed algorithms, the speech segments that are not corrupted by keystroke noise are preserved unchanged. In the noisy segments, the keystrokes are suppressed without compromising the quality of the speech. Hence the keystrokes are suppressed efficiently without introducing significant audible distortion, which is the main contribution of this dissertation. The performance of the proposed algorithms is compared with that of the spectral subtraction algorithm and the enhanced OM-LSA algorithm, and the proposed algorithms demonstrate better performance.
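The two second-stage detectors described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation. The function names, the split bin, and the threshold values are hypothetical. The correlation is assumed to be a per-frame Pearson correlation on magnitude-spectrogram columns. The ratio test is assumed to flag a frame when the low-band to high-band norm ratio falls below the threshold. Stage I (SNMF) is not shown; its output is taken as a given `estimate` spectrogram.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in silent frames


def band_ratio_mask(noisy_mag, split_bin, threshold):
    """SNMF-TT-style detection (sketch): flag frames with a low band ratio.

    noisy_mag: magnitude spectrogram, shape (freq_bins, frames).
    split_bin: assumed row index separating the lower and upper portions.
    """
    low = np.linalg.norm(noisy_mag[:split_bin, :], axis=0)
    high = np.linalg.norm(noisy_mag[split_bin:, :], axis=0)
    ratio = low / (high + EPS)
    # Assumption: broadband keystrokes raise the upper-band norm,
    # so the ratio drops below the threshold in corrupted frames.
    return ratio < threshold


def correlation_mask(noisy_mag, estimate_mag, threshold):
    """SNMF-CR-style detection (sketch): flag frames with low correlation.

    Computes a per-frame Pearson correlation between the noisy spectrogram
    and the stage-I estimate; frames below the threshold are flagged.
    """
    n = noisy_mag - noisy_mag.mean(axis=0)
    e = estimate_mag - estimate_mag.mean(axis=0)
    denom = np.linalg.norm(n, axis=0) * np.linalg.norm(e, axis=0) + EPS
    corr = (n * e).sum(axis=0) / denom
    return corr < threshold


def replace_flagged_frames(noisy_mag, estimate_mag, mask):
    """Replace only the flagged frames; all other frames stay untouched."""
    out = noisy_mag.copy()
    out[:, mask] = estimate_mag[:, mask]
    return out
```

Either mask can be passed to `replace_flagged_frames`, which mirrors the idea that uncorrupted speech frames are preserved unchanged while only the detected frames are taken from the stage-I estimate.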
Keywords/Search Tags: Single-channel speech enhancement, short-time Fourier transform, supervised sparse non-negative matrix factorization, correlation, keystroke suppression, thresholding technique