Morphological Normalization Of EMA-based Data For Articulatory Speech Recognition

Posted on: 2018-12-02, Degree: Master, Type: Thesis
Country: China, Candidate: J S Zhang
GTID: 2348330542481355, Subject: Computer technology
Abstract/Summary:
Minimizing morphological variance of the vocal tract across speakers is a challenge for articulatory analysis and modeling. Reducing this variance would benefit both the analysis of speech production characteristics and speech recognition performance. To reduce morphological differences in speech organs among speakers while retaining each speaker's speech dynamics, our study proposes a method for normalizing the vocal-tract shapes of Mandarin and Japanese speakers that combines a vocal-tract gridline system with Thin-Plate Spline (TPS) interpolation. We apply the properties of TPS in two-dimensional space to normalize vocal-tract shapes, and the palate-wall straightening method serves as a reference. The effect in acoustic space is also studied by comparing several vowel normalization methods. Furthermore, we use Deep Neural Network (DNN) based speech recognition to evaluate articulatory information before and after normalization. Electromagnetic Articulographic (EMA) databases of Mandarin and Japanese vowels from NTT were used. We obtained the normalization template by averaging the palate and tongue shapes of three speakers. Our results show a reduction in variance among subjects. The similar vowel structure of the pre- and post-normalization data indicates that our framework retains speaker-specific characteristics. Results in acoustic space show that the variation is also reduced there and that the vowel spaces remain consistent between articulatory and acoustic space. For recognition of isolated phonemes from articulatory movements, our results show a decrease of 25% in phone error rate. Moreover, for continuous articulatory movement recognition, the phone error rate is reduced by 5.84%.
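As an illustration of the two-dimensional TPS interpolation mentioned in the abstract, the following is a minimal sketch in Python with NumPy. It is not the thesis's implementation: the function names (`tps_kernel`, `fit_tps`, `apply_tps`) and the landmark coordinates are hypothetical, the kernel is the standard U(r) = r^2 log r, and the vocal-tract gridline system and template averaging are not modeled. It only shows how a warp fitted on speaker landmarks and template landmarks could be applied to other points such as EMA sensor positions.

```python
import numpy as np

def tps_kernel(r):
    """Standard 2-D thin-plate spline kernel U(r) = r^2 log r, with U(0) = 0."""
    out = np.zeros_like(r, dtype=float)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def fit_tps(src, dst):
    """Fit a TPS warp that maps landmarks src (n, 2) exactly onto dst (n, 2).

    Returns the non-affine weights (n, 2) and the affine part (3, 2).
    """
    n = src.shape[0]
    dists = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    K = tps_kernel(dists)
    P = np.hstack([np.ones((n, 1)), src])  # rows are [1, x, y]
    # Linear system [[K, P], [P^T, 0]] [w; a] = [dst; 0]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    params = np.linalg.solve(A, b)
    return params[:n], params[n:]

def apply_tps(points, src, weights, affine):
    """Warp arbitrary points (m, 2) with a TPS fitted on landmarks src."""
    dists = np.linalg.norm(points[:, None, :] - src[None, :, :], axis=-1)
    P = np.hstack([np.ones((points.shape[0], 1)), points])
    return tps_kernel(dists) @ weights + P @ affine

# Hypothetical landmarks: one speaker's outline points and a template
# (made-up numbers, for illustration only).
speaker = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
template = np.array([[0.0, 0.1], [1.1, 0.0], [0.0, 0.9], [1.0, 1.2], [0.6, 0.5]])

weights, affine = fit_tps(speaker, template)
# Map a (hypothetical) sensor position into the template space.
sensor_normalized = apply_tps(np.array([[0.3, 0.7]]), speaker, weights, affine)
```

The design choice here is exact interpolation: the fitted warp sends every speaker landmark onto its template counterpart and bends the space between landmarks as smoothly as possible. A regularized variant would trade exactness for smoothness; the abstract does not say which one the thesis uses.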
Keywords/Search Tags:Vocal tract normalization, Articulatory data, Acoustic data, Thin-Plate Spline, DNN, Articulatory recognition