Effect Of Phase-locking Response In Auditory Midbrain On Speech Perception And Automatic Initial/final Segmentation In Continuous Mandarin Speech

Posted on: 2020-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: M N Sun
GTID: 2428330623965313
Subject: Detection Technology and Automation
Abstract/Summary:
The phase-locking response is one of the mechanisms by which auditory midbrain neurons encode the spectro-temporal features of periodic signals; this encoding is passed to higher levels of the auditory system for integration and ultimately supports speech perception. It remains unclear whether the phase-locking coding mechanism of the auditory midbrain affects speech perception. The first part of this thesis studied how complex harmonics with different temporal modulation coding characteristics affect speech perception in continuous Mandarin speech. We selected a Mandarin perception corpus and labeled its initials and finals manually. The vowel segments within each sentence were replaced by complex harmonics with one of four phase relationships, while consonant and silence segments were left unchanged, which produced four sets of test stimuli. The original speech served as the control condition. A within-subject design was used: twenty normal-hearing native speakers of Chinese took part, and each listened to all test conditions in quiet. Word and sentence recognition accuracy were recorded to quantify speech intelligibility. The results showed that degrading the temporal fine structure of vowels reduces speech intelligibility. Complex harmonics with identical starting phases evoke a stronger phase-locking response in auditory midbrain neurons, which facilitates speech perception and intelligibility in the central nervous system. Complex harmonics with random starting phases evoke the weakest phase-locking response in auditory midbrain neurons, which impairs speech perception and intelligibility in the central nervous system.

Manual initial/final segmentation has many shortcomings, such as subjectivity, poor reproducibility and long processing time. To improve segmentation efficiency and accuracy and to enable batch processing, the second part of this thesis proposed a new automatic initial/final segmentation algorithm for Mandarin Chinese, consisting of a two-stage support vector machine model and a regular boundary fusion strategy. Log energy and 39-dimensional Mel-frequency cepstral coefficients (MFCCs) extracted from the training speech were used in turn as frame-level features to train the two-stage support vector machine. For each test speech sample, two candidate boundaries, one based on segmental cosine similarity and one based on Euclidean distance, were used to refine the initial boundaries predicted by the trained two-stage support vector machine model. The mean F-measure of the proposed algorithm on the test set was 94.01%, an increase of 12.08% over the initial boundaries. We also analyzed the noise immunity of the proposed algorithm under Gaussian noise and concluded that the algorithm has stronger noise robustness. The algorithm therefore not only overcomes the shortcomings of manual segmentation but also adapts much better to noisy environments, which gives it practical value in speech recognition, synthesis, coding and enhancement applications. There are 49 figures, 6 tables and 66 references in this thesis.
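As an illustration of how a boundary F-measure like the one reported above is commonly computed, the sketch below scores predicted boundary times against reference times. The abstract does not state the thesis's matching rule or tolerance, so the function name `boundary_f_measure`, the 20 ms tolerance and the one-to-one matching scheme are all assumptions made for this example, not the author's procedure.

```python
def boundary_f_measure(reference, predicted, tolerance=0.02):
    """Score predicted boundary times (seconds) against reference times.

    A predicted boundary is a hit if it lies within `tolerance` of a
    reference boundary that has not been matched yet (one-to-one matching).
    The 20 ms default tolerance is an assumption for illustration only.
    Returns (precision, recall, f_measure).
    """
    ref = sorted(reference)
    pred = sorted(predicted)
    hits = 0
    i = j = 0
    # Walk both sorted lists once, pairing each boundary at most one time.
    while i < len(ref) and j < len(pred):
        diff = pred[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # prediction too early for this and every later reference
        else:
            i += 1  # reference too early for this and every later prediction

    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up boundary times, not data from the thesis.
    reference = [0.10, 0.25, 0.40, 0.62]
    predicted = [0.11, 0.26, 0.45, 0.61, 0.80]
    p, r, f = boundary_f_measure(reference, predicted)
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

With these made-up times, three of the five predictions fall within the tolerance of a reference boundary, so precision is 0.6, recall is 0.75, and the F-measure, their harmonic mean, is about 0.667.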
Keywords/Search Tags:phase-locking coding, complex harmonics, speech perception, two-stage support vector machine, regular boundary fusion strategy, noise immunity of algorithm