
The Research Of Single Channel Speech Separation Based On Computational Auditory Scene Analysis

Posted on: 2020-12-02
Degree: Master
Type: Thesis
Country: China
Candidate: Q B Feng
Full Text: PDF
GTID: 2428330596986200
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the development of science and technology, the demand for man-machine communication has greatly increased. Speech is the most convenient and direct way to interact, but in practical environments it is often corrupted by many kinds of noise, which limits its application. Humans, in contrast, can effectively pick out the target sound source in a noisy environment with only one ear. Computational auditory scene analysis (CASA) uses computer technology to simulate this process of human perception of speech. A CASA-based speech separation system needs little prior information about the mixed speech and is widely applicable, which has aroused great research interest. This thesis introduces the theory of CASA and, on this basis, proposes improvements to the traditional CASA-based single-channel speech separation system. The main work is as follows:

(1) The final separation result of a CASA speech separation algorithm based on harmonic characteristics depends closely on the accuracy of pitch estimation, but traditional pitch detection algorithms perform poorly under strong noise interference. To solve this problem, a pitch detection algorithm based on a multi-class support vector machine is proposed. The algorithm uses static frame-level features of the speech signal to train the multi-class support vector machine in a supervised manner, and computes several possible pitch values for each speech frame as the pitch candidates of that frame. The candidate pitch states are then processed with the main-body extension method, and suitable candidates are selected and connected to obtain the estimated pitch contour of the speech under test. The experimental results show that this method effectively improves the pitch detection rate in low signal-to-noise ratio environments and provides a better classification basis for the CASA speech separation system.

(2) The traditional CASA separation algorithm based on the autocorrelation function discards too much speech information under noise interference, so an improved algorithm is proposed. Instead of the traditional autocorrelation function, the algorithm uses a frequency-domain feature that is robust to noise as the cue for computing the periodicity information of each time-frequency unit. This periodicity information is matched against the dominant pitch detected by the pitch estimation algorithm proposed in this thesis, and each speech component is then labeled accordingly. The experimental results show that a more accurate dominant pitch improves the performance of the CASA speech separation system. At the same time, the frequency-domain feature extracted in this thesis retains more speech information, so the synthesized target speech has higher intelligibility, which also benefits subsequent speech signal processing.
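The labeling step in (2), where the periodicity of each time-frequency unit is matched against the dominant pitch, can be illustrated with a minimal sketch. This is not the thesis's implementation. The function name `label_units`, the relative tolerance `tol`, and the representation of periodicity as a period in samples are all assumptions made for illustration. The abstract does not specify the frequency-domain feature or the matching rule.

```python
def label_units(unit_periods, pitch_periods, tol=0.05):
    """Build a binary time-frequency mask (hypothetical helper).

    unit_periods[c][m]: estimated period, in samples, of the unit in
        channel c at frame m, or None if no periodicity was found.
    pitch_periods[m]: dominant pitch period of frame m, in samples,
        or None if the frame is treated as unvoiced.
    tol: assumed relative tolerance for calling a unit pitch-consistent.
    """
    mask = []
    for channel in unit_periods:
        row = []
        for m, period in enumerate(channel):
            pitch = pitch_periods[m]
            if period is None or pitch is None:
                # No periodicity estimate or no pitch: leave the unit unlabeled.
                row.append(0)
            else:
                # Label the unit as target when its period matches the pitch.
                row.append(1 if abs(period - pitch) <= tol * pitch else 0)
        mask.append(row)
    return mask


# Two channels, three frames; the last frame has no dominant pitch.
unit_periods = [[80, 81, None], [40, 100, 79]]
pitch_periods = [80, 80, None]
print(label_units(unit_periods, pitch_periods))  # [[1, 1, 0], [0, 0, 0]]
```

In a sketch like this, units labeled 1 would be kept when resynthesizing the target speech, so the accuracy of `pitch_periods` directly affects the mask. This is consistent with the reported result that a more accurate dominant pitch improves separation.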
Keywords/Search Tags: computational auditory scene analysis, multi-class support vector machine, single-channel speech separation, logarithmic frequency-domain characteristics, pitch detection