Research On Distant Speech Recognition With Joint Enhancement And Adaptive Technology

Posted on: 2020-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Lou
Full Text: PDF
GTID: 2428330596485790
Subject: Information and Communication Engineering
Abstract/Summary:
Distant speech recognition (DSR) is widely used in applications such as smart homes, office environments, humanoid robots, automobiles, and speech translation. However, because far-field environments are complicated by noise and reverberation, robust and convenient recognition of far-field speech remains a challenge. Taking this task as its starting point, this thesis combines the advantages of array signal processing and adaptive techniques to study the recognition of distant speech corrupted by noise and reverberation. The effectiveness of the proposed algorithms is verified by simulation and by real-world experiments. The main work is as follows:

1. First, the IMAGE model, a simulation model for far-field environments, is briefly introduced. Second, the basic theory of distant speech recognition is introduced, including pre-processing, speech feature extraction, the acoustic model, the language model, and the decoding and search algorithm; the basic flow of a distant speech recognition system is then described.

2. The sound source is located using the time difference of arrival (TDOA) algorithm, and the localization result is used to adjust the weights of Delay & Sum Beamforming (DSBF), reducing interference from undesired directions and improving speech quality. On this basis, Minimum Variance Distortionless Response (MVDR) beamforming and Super-directive Beamforming (SDBF) are used to reduce spatially coherent noise, and the Zelinski and McCowan post-filtering methods are used to further reduce residual noise. The simulation configuration is described, and the proposed algorithms are tested under different noise types with a reflection coefficient of 0.6. The results show that beamforming significantly reduces interference in the speech signal and improves the recognition rate of the system, and that post-filtering further improves its robustness.

3. The Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP) methods are used to adjust the acoustic model parameters, yielding a new acoustic model suited to the far-field environment. The performance of the two algorithms and the gradual (asymptotic) behavior of MAP are verified by simulation experiments. Far-field speech is also collected in a real conference room to verify the practicality of the MAP algorithm.

4. To further enhance the robustness of the system, the speech enhancement algorithm is combined with MAP for distant speech recognition. A conventional speech recognition system serves as the baseline against which the single-algorithm and joint-algorithm systems are compared. The results show that the joint algorithm is robust in noisy, reverberant environments: it outperforms the single-algorithm systems, and both outperform the baseline system.
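The TDOA-steered delay-and-sum idea in item 2 can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes integer-sample delays, circular shifts, a synthetic white-noise source, and independent noise per microphone, and the function names (estimate_tdoa, delay_and_sum, demo) are hypothetical. It estimates each channel's delay relative to the first microphone by cross-correlation, aligns the channels, and averages them.

```python
import numpy as np


def estimate_tdoa(ref, sig, max_lag):
    """Return the integer lag (in samples) by which `sig` lags `ref`.

    Uses a brute-force circular cross-correlation over [-max_lag, max_lag].
    """
    lags = np.arange(-max_lag, max_lag + 1)
    # Advancing `sig` by the true lag lines it up with `ref`,
    # which maximises the inner product.
    scores = [np.dot(ref, np.roll(sig, -lag)) for lag in lags]
    return int(lags[int(np.argmax(scores))])


def delay_and_sum(channels, delays):
    """Advance each channel by its estimated delay and average (uniform weights)."""
    aligned = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)


def demo():
    """Synthetic 4-microphone example (all values are assumptions for illustration)."""
    rng = np.random.default_rng(0)
    n = 4000
    source = rng.standard_normal(n)
    true_delays = [0, 3, 7, 12]  # samples, relative to microphone 0
    # Each microphone sees a delayed copy of the source plus independent noise.
    channels = [np.roll(source, d) + 0.5 * rng.standard_normal(n)
                for d in true_delays]

    # Estimate delays relative to the first microphone.
    est = [estimate_tdoa(channels[0], ch, max_lag=20) for ch in channels]
    out = delay_and_sum(channels, est)

    mse_single = float(np.mean((channels[0] - source) ** 2))
    mse_beam = float(np.mean((out - source) ** 2))
    return est, mse_single, mse_beam


if __name__ == "__main__":
    est, mse_single, mse_beam = demo()
    print("estimated delays:", est)
    print("MSE single mic:", mse_single, "MSE beamformed:", mse_beam)
```

Averaging aligned channels leaves the source unchanged while independent noise partially cancels, which is why the beamformed output is closer to the source than any single microphone. A practical system would typically use fractional delays and a more robust delay estimator; both are omitted here for brevity.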
Keywords/Search Tags: Microphone array, Beamforming, Post filter, MLLR, MAP, Distant speech recognition