
Research On Speech Enhancement Algorithm Of Dual Microphone Mobile Phone Based On Machine Learning

Posted on: 2018-07-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L F Zhang
Full Text: PDF
GTID: 1318330518490182
Subject: Physical Electronics
Abstract/Summary:
The mobile phone market is currently the largest consumer market, with the largest number of users, and improving the communication quality of phones has been widely discussed and studied. Because phone conversations take place in all kinds of noisy environments, speech enhancement algorithms must handle many types of noise, suppressing unwanted noise as much as possible while preserving the desired speech as much as possible. In addition, since users hold their phones in very different ways, speech enhancement algorithms should be sufficiently robust to the holding position in real environments.

In recent years, artificial intelligence (AI) has been used more and more widely, and machine learning (e.g., neural networks, NN) is at its core. An NN can improve performance because it can be trained continuously on data and can therefore handle complicated problems flexibly. In this dissertation, NNs are used to enhance the performance of voice activity detection (VAD) and speech enhancement algorithms in mobile phones, e.g., by increasing the flexibility and robustness of the algorithms in real scenarios. The major work and innovations of this dissertation are as follows:

(1) Because existing dual-microphone VAD algorithms use a fixed threshold, they cannot detect voice accurately across various noise environments. In the second chapter, a new dual-microphone VAD method based on an NN is proposed. The method takes both the sub-band power level difference and the inter-microphone cross-correlation as features, and uses the NN to classify speech and noise. No threshold or parameter needs to be adjusted for different types of noise environments, because the method adapts itself to those environments.
Compared with existing methods based on the power level difference (PLD), the proposed method provides higher accuracy in all situations, which makes it well suited to mobile phone applications.

(2) In the third chapter, the proposed NN-based VAD method is combined with the inter-microphone signal power ratio to obtain a new detection algorithm that detects both the speech signal and the noise more accurately. The detection algorithm is then used in mobile phone noise suppression to avoid the performance degradation caused by VAD misjudgments. Experimental results show that, compared with existing methods, the proposed method provides better noise suppression and lower speech distortion, in particular for directional speech interference.

(3) Noise estimation and noise reduction are two very important aspects of all speech enhancement algorithms. In the fourth chapter, both are improved, mainly in the frequency domain. First, the algorithm combines single-microphone and dual-microphone noise estimates to improve the accuracy of the noise estimation. Second, the algorithm uses an improved pitch frequency estimate in the de-noising stage to detect speech and noise components more accurately on a per-bin basis, and then controls the parameters of the Wiener filter to achieve less voice degradation and more noise suppression. Experiments demonstrate that, compared with existing methods, the proposed method effectively preserves the voice and improves the quality of phone communication.

(4) Users hold their mobile phones in very different ways, which causes variations in the relative position between the phone and the user's mouth. These variations seriously affect the performance of all dual-microphone speech enhancement algorithms.
If the position of the phone can be estimated in real time and the parameters of the noise suppression algorithms adjusted accordingly, the performance of the algorithms can be improved greatly. Most existing localization methods need at least three microphones and therefore cannot be used directly in dual-microphone processing. In the fifth chapter, a new NN-based method is proposed to estimate the position of the mobile phone in three-dimensional space using only two microphones. In addition to the time difference of arrival (TDOA), the sub-band spectral power ratios are used as new features. The NN maps these features to the corresponding phone positions.

(5) When the mobile phone is detected to deviate from the standard position, the proposed algorithm directly adjusts the speech enhancement processing and its parameters based on the phone position estimate. In this way, the proposed algorithm avoids the performance degradation caused by movement of the phone. The experimental results show that existing dual-microphone algorithms cannot achieve good performance when the phone is rotating, whereas the new algorithm proposed in this dissertation achieves reasonably good performance and is more stable and practical in real applications.

Finally, the main innovations of the dissertation are summarized, and directions for further research are outlined.
Keywords/Search Tags: Neural network, Mobile phones, Dual-microphones, Speech enhancement, Voice activity detection