
Research On Key Technologies Of Embedded Speech Recognition Front End Processing

Posted on: 2022-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Cui
Full Text: PDF
GTID: 2518306554964639
Subject: Master of Engineering
Abstract/Summary:
Since the start of the 21st century, speech recognition and embedded technology have developed rapidly, and more and more speech recognition systems based on embedded devices have come to play an important role in people's work and daily life. In real environments, however, the speech signal is often accompanied by various kinds of noise, which degrades the performance of the recognition system. It is therefore particularly important for the speech recognition front end to enhance the noisy speech signal and improve its quality. This thesis studies speech enhancement as the key technology in embedded speech recognition front-end processing. It proposes an improved dual-microphone speech enhancement algorithm based on a first-order differential array and gives an FPGA hardware implementation. The specific research contents are as follows.

First, the state of research on microphone array speech enhancement algorithms in China and abroad is analyzed and summarized. Many current microphone array speech enhancement algorithms remain at the experimental simulation stage, and most have high complexity, which makes them hard to implement on resource-limited embedded devices. To meet the speech enhancement needs of embedded devices, this thesis selects FDM-SS, a dual-microphone speech enhancement algorithm of moderate complexity based on a first-order differential array. Theoretical analysis and simulation experiments show that FDM-SS depends on the estimation of silent segments, and that its enhancement performance drops sharply when the silent-segment estimate deviates too far. To solve this problem, an improved algorithm combined with voice activity detection (FDM-SS+VAD) is proposed. It uses voice activity detection to estimate silent segments accurately and thereby improves the performance of the speech enhancement algorithm.

Second, a simulation environment was built in MATLAB. The improved algorithm was simulated and tested over an SNR range of 0 dB to 10 dB, and its performance was compared with that of FDM-SS. The time-domain waveforms were compared first; the results show that the improved algorithm significantly improves the enhancement effect under both cafe noise and white noise. The two algorithms were then compared using the PESQ (perceptual evaluation of speech quality) score. The PESQ score of the improved algorithm is 13.09% higher than that of FDM-SS under cafe noise and 12.48% higher under white noise.

Finally, the improved algorithm is implemented in hardware on an FPGA. To balance hardware resources against performance, a fixed-point/floating-point partitioning scheme is designed for the overall hardware architecture. The hardware design of the preprocessing module, the voice activity detection module, and the frequency-domain speech enhancement module is completed, and the overall design is verified by functional simulation and FPGA board-level verification. The final results show that the PESQ score of the FPGA output is 1.08% lower than that of the MATLAB output, which confirms that the FPGA result is accurate within the error range. In addition, the design needs only 1.92 ms to enhance 1 s of speech sampled at 16 kHz (about 0.2% of the audio's duration), which meets the demand for real-time speech enhancement.
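The role of voice activity detection in the silent-segment estimate can be illustrated with a minimal single-channel sketch. This is not the thesis's FDM-SS+VAD algorithm: the first-order differential array stage is omitted, the VAD is a simple energy threshold, and the function names, frame length, hop size, threshold ratio, and the assumption that the first five frames contain only noise are all hypothetical choices made for illustration.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def energy_vad(frames, threshold_ratio=3.0, n_init=5):
    """Mark a frame as speech when its energy exceeds threshold_ratio times
    the mean energy of the first n_init frames (assumed to be noise only)."""
    energy = np.mean(frames ** 2, axis=1)
    noise_floor = np.mean(energy[:n_init])
    return energy > threshold_ratio * noise_floor


def vad_spectral_subtract(x, frame_len=256, hop=128, floor=0.01):
    """Magnitude spectral subtraction whose noise estimate is averaged
    only over the frames that the VAD marks as silent."""
    frames = frame_signal(x, frame_len, hop)
    is_speech = energy_vad(frames)

    # Hann analysis window; with 50% overlap the windows sum to about 1.
    spectra = np.fft.rfft(frames * np.hanning(frame_len), axis=1)
    mag, phase = np.abs(spectra), np.angle(spectra)

    # Noise magnitude spectrum from silent frames only.
    noise_mag = mag[~is_speech].mean(axis=0)

    # Subtract the noise estimate, keeping a small spectral floor.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)

    # Overlap-add the enhanced frames back into one signal.
    out = np.zeros(len(x))
    for i, frame in enumerate(clean):
        out[i * hop:i * hop + frame_len] += frame
    return out, is_speech


if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    rng = np.random.default_rng(0)
    noisy = 0.05 * rng.standard_normal(fs)
    noisy[6400:9600] += 0.5 * np.sin(2 * np.pi * 440 * t[6400:9600])
    enhanced, speech_flags = vad_spectral_subtract(noisy)
    print("frames marked as speech:", int(speech_flags.sum()))
```

If the VAD wrongly labels speech frames as silent, speech energy leaks into the noise estimate and is subtracted from the output, which is the kind of degradation the abstract attributes to a poor silent-segment estimate.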
Keywords/Search Tags: Embedded device, dual microphone, speech enhancement, voice activity detection, FPGA