
Research On Feature Invariance In Speech Recognition

Posted on: 2018-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: H B Chen
GTID: 2348330518494544
Subject: Control Science and Engineering
Abstract/Summary:
Speech is the most direct and effective way for people to exchange information. With the rapid development of the mobile Internet and intelligent hardware, speech recognition technology is attracting growing attention and has become one of the important interfaces in human-computer interaction. Many problems remain, however, among them the invariant representation of semantic content in speech features. Speech features carry both semantic information and speaker-specific (individualized) information, and for speech recognition the speaker-specific information blurs the boundaries between feature classes. To strengthen the ability of speech features to express semantic content, this thesis does work in the following three areas.

1. A spectral warping algorithm that applies a weighted fusion of the subglottal frequency factor and the formant frequency factor is proposed. Earlier spectral warping algorithms often considered only one source of pronunciation difference, such as the vocal tract or the glottis. Pronunciation is in fact a complex process, and these differences are not independent, so considering only one of them cannot solve the problem of speech feature invariance well.
Therefore, this thesis presents a spectral warping algorithm that combines the formant frequency factor and the subglottal frequency factor through both linear and non-linear weighted fusion. The algorithm accounts for differences in both the vocal tract and the glottis, and the resulting spectrum alignment is smoother, which helps retain as much semantic information as possible during spectrum warping. Experimental results verify the effectiveness of the proposed algorithm.

2. A feature extraction algorithm that combines vocal tract length normalization with spectral tilt compensation is proposed. Speakers change the content of their pronunciation by changing the shape of the vocal tract, so differences between vocal tracts introduce speaker-specific information into the speech. The vocal tract mainly affects the location of the formants, and formant amplitude also varies from person to person. Formant frequency is also an important indicator of timbre. Earlier speech recognition work focused mainly on differences in formant frequency and ignored differences in amplitude, although amplitude also affects speech features. This thesis therefore puts forward a feature extraction algorithm that combines vocal tract length normalization with spectral tilt compensation, addressing differences in both formant location and formant amplitude at the same time. Experimental results verify the effectiveness of the proposed algorithm.

3. Dimensionality reduction of speech features based on a supervised neighborhood preserving embedding (NPE) algorithm is studied. Redundant content in speech features causes feature distributions to overlap between classes and to scatter within a class. This thesis therefore tries to reduce the redundant content through dimensionality reduction, introducing speech distribution information and class constraints into the unsupervised NPE algorithm, but the experimental results are not satisfactory.
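The linear case of the first contribution, blending a formant-based and a subglottal-based warping factor into one factor before warping the frequency axis, might look roughly like the sketch below. The abstract gives no formulas, so everything here is an assumption for illustration: the function names, the blending weight, and the piecewise-linear warping curve (a common choice in vocal tract length normalization, swapped in here and not taken from the thesis).

```python
import numpy as np


def blend_warp_factors(alpha_formant, alpha_subglottal, weight):
    """Linearly blend two warping factors (hypothetical helper).

    weight = 1.0 uses only the formant-based factor,
    weight = 0.0 uses only the subglottal-based factor.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * alpha_formant + (1.0 - weight) * alpha_subglottal


def piecewise_linear_warp(freqs, alpha, f_max, knee_ratio=0.85):
    """Warp a frequency axis with a piecewise-linear curve (assumed form).

    Frequencies below the knee are scaled by alpha; above the knee a second
    linear segment is used so that f_max still maps to f_max.
    """
    freqs = np.asarray(freqs, dtype=float)
    # Move the knee down when alpha > 1 so alpha * knee stays below f_max.
    knee = knee_ratio * f_max * min(1.0, 1.0 / alpha)
    upper_slope = (f_max - alpha * knee) / (f_max - knee)
    return np.where(
        freqs <= knee,
        alpha * freqs,
        alpha * knee + upper_slope * (freqs - knee),
    )


def warp_spectrum(spectrum, freqs, alpha, f_max):
    """Resample a magnitude spectrum at the warped frequencies."""
    warped_freqs = piecewise_linear_warp(freqs, alpha, f_max)
    return np.interp(warped_freqs, freqs, spectrum)


if __name__ == "__main__":
    f_max = 8000.0                      # Nyquist frequency for 16 kHz audio
    freqs = np.linspace(0.0, f_max, 257)
    spectrum = np.exp(-freqs / 2000.0)  # toy spectrum, not real speech

    # Illustrative factor values only; the thesis does not state any.
    alpha = blend_warp_factors(alpha_formant=1.08, alpha_subglottal=0.96, weight=0.6)
    warped = warp_spectrum(spectrum, freqs, alpha, f_max)
    print(alpha, warped.shape)
```

Blending the factors before warping keeps a single monotonic warping curve, so the frequency axis is never folded back on itself. A non-linear fusion would replace `blend_warp_factors` with some other combining rule, which the abstract does not specify.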
Keywords/Search Tags: Speech recognition, Spectrum warping, Formant, Subglottal, Feature transform