
Research On Voice Recognition In Ship VDR

Posted on: 2010-09-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W J Zhou
GTID: 1118360302487620
Subject: Navigation, guidance and control
Abstract/Summary:
Robustness is the key problem for voice recognition systems in practical applications. Background noise and stressed pronunciation are the two main factors that degrade the performance of such systems. Because a voice model is difficult to build, recognizing non-specific speakers is even more difficult. Studies of noise have found that a speaker's emotion and pronunciation style change as the noise level increases, which introduces stress. This thesis focuses on voice recognition in the Voyage Data Recorder (VDR) environment and proposes several new approaches to recognition in noise, from both the feature-based and the model-based processing points of view.

From the feature-based processing point of view: First, because the human ear perceives different frequency bands of the speech signal to different degrees, sub-band frequency weighting methods that incorporate the loudness characteristics of the human ear are proposed to reduce the impact of noise on the MFCC features. The principle is that a sub-band that contributes more to identification is given a relatively high weight, while one that contributes less is given a relatively low weight. Second, motivated by the evident nonlinear phenomena in speech production, and following an in-depth study of the nonlinear AM-FM model, a weighting method for improved MFCC feature coefficients is proposed. It uses the information in the amplitude envelope and the instantaneous frequency, and at the same time takes the sub-band frequency characteristics of the cochlea into account to improve system performance. To some extent this addresses the problem that the feature dimensions differ in sensitivity, and system robustness is improved by weighting the improved MFCC feature coefficients with the maximum relative entropy weight vector.

From the model-based processing point of view: A unified background model, GMM-UBM, is established using the adaptive target-model method proposed by Reynolds.
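As an illustration of the kind of target-model adaptation usually associated with Reynolds' GMM-UBM framework, the sketch below adapts only the component means of a background model toward a speaker's frames by MAP estimation. It is a minimal sketch and not the thesis implementation: the abstract does not say which parameters are adapted, so mean-only adaptation, diagonal covariances, the relevance factor of 16, and the names `diag_gauss_logpdf` and `map_adapt_means` are all assumptions made for this example.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Return speaker-adapted means; UBM weights and variances are left unchanged.

    Assumption: mean-only MAP adaptation with a fixed relevance factor.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                       # soft frame count per component
    sums = [[0.0] * dim for _ in range(n_comp)]   # posterior-weighted frame sums

    for x in frames:
        # Posterior probability of each component given the frame,
        # computed in the log domain for numerical stability.
        logp = [math.log(w) + diag_gauss_logpdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)
        post = [math.exp(lp - top) for lp in logp]
        total = sum(post)
        for i in range(n_comp):
            gamma = post[i] / total
            counts[i] += gamma
            for d in range(dim):
                sums[i][d] += gamma * x[d]

    adapted = []
    for i in range(n_comp):
        # Data-dependent mixing coefficient: more data moves the mean further.
        alpha = counts[i] / (counts[i] + relevance)
        if counts[i] > 0.0:
            data_mean = [s / counts[i] for s in sums[i]]
        else:
            data_mean = list(means[i])
        adapted.append([alpha * e + (1.0 - alpha) * m
                        for e, m in zip(data_mean, means[i])])
    return adapted
```

Components that see few speaker frames stay close to the background model, which is why a speaker model can be derived from a comparatively small training sample.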
A dynamic threshold algorithm is proposed in this thesis; it dynamically tracks the configuration to perform voice recognition under open-set conditions. GMM-UBM helps shield background noise, and better system performance is obtained by addressing not only model training speed but also the training of speaker models of the same mixture order from fewer training samples.

Additionally, in the pre-processing stage, a dynamic adaptive-threshold endpoint detection method based on Approximate Entropy is proposed to reduce the impact of ship noise on the recognition system. The results show that endpoint detection based on Approximate Entropy is superior to the entropy-based method.
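For readers unfamiliar with Approximate Entropy, the following is a minimal sketch of the standard definition (template length `m`, tolerance `r`, Chebyshev distance, self-matches counted). It illustrates only the measure itself; the thesis's dynamic adaptive threshold and framing scheme are not described in the abstract and are not reproduced here. The function name `approximate_entropy` and the default values `m=2` and `r=0.2` are assumptions for this example.

```python
import math


def approximate_entropy(signal, m=2, r=0.2):
    """Approximate Entropy of a 1-D sequence (standard definition).

    m is the template length and r the matching tolerance (absolute units).
    """
    n = len(signal)
    if n < m + 2:
        raise ValueError("signal is too short for the chosen template length")

    def phi(dim):
        count = n - dim + 1
        templates = [signal[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r of `a` (Chebyshev distance).
            # The self-match is included, so the count is never zero.
            matches = sum(
                1 for b in templates
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)
```

A regular signal yields a value near zero and an irregular one a larger value. An endpoint detector could compute this quantity per frame and compare it against a threshold, although the thresholding rule used in the thesis would have to be taken from the full text.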
Keywords/Search Tags:Voice Recognition, Approximate Entropy, Sub-band Frequency weighting, MFCC feature weighting, Unified Background model, AM-FM model