Chinese Dialect Identification Based On Manifold Learning

Posted on: 2015-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: J J Jia
Full Text: PDF
GTID: 2308330479983926
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
A key issue in Chinese dialect identification is the extraction of dialect features, because the quality of feature extraction directly affects system performance. Traditional feature extraction largely inherits the theory and methodology of language identification and ignores characteristics specific to Chinese dialects, such as tonality. In addition, speech data typically lies on a manifold, yet this property has not been exploited in Chinese dialect identification. The single feature adopted by traditional systems is also limited in how well it describes the data structure and how much information it captures. To address these problems, this paper introduces manifold learning algorithms into Chinese dialect identification and improves identification performance in the following respects: low-dimensional visualization of Chinese dialect speech; feature extraction for Chinese dialects using manifold learning algorithms; enhancement of the manifold learning algorithm; and feature fusion. Details are as follows:

1. Demonstrating the existence of a manifold structure in Chinese dialects. This paper analyzes Chinese dialects through low-dimensional visualization. Simulation results show that, compared with linear dimensionality reduction algorithms, manifold learning algorithms better reflect the differences between different parts of Chinese dialect speech in low dimensions, which indirectly demonstrates the manifold structure in dialect speech data.
2. Extracting new Chinese dialect features with manifold learning. Based on observation of the low-dimensional manifold structure and analysis of manifold learning algorithms, the locally linear embedding algorithm is used to extract features of Chinese dialects.
3. Improving the manifold learning algorithm itself. To address the shortcomings of the locally linear embedding algorithm, the Euclidean distance measure is modified to improve the distribution of the sample data set, and the result is combined with a clustering algorithm to extract new Chinese dialect features.
4. Building a Chinese dialect identification system based on manifold learning to verify the validity of the new features. With a Gaussian Mixture Model and a Support Vector Machine as back-end classifiers, simulation results show that the new features effectively improve system performance. In addition, fusing the new features with traditional features further enhances the effectiveness of the features.
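
The abstract names locally linear embedding as the feature extractor but gives no implementation details. The following is a minimal NumPy sketch of the standard (unmodified) locally linear embedding algorithm, not the improved variant developed in the thesis. The function name `lle`, its parameters (`n_neighbors`, `n_components`, `reg`), the synthetic Swiss-roll data standing in for frame-level acoustic features, and the use of simple concatenation for feature fusion are all assumptions made for illustration.

```python
import numpy as np


def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    """Standard locally linear embedding (illustrative sketch).

    X is an (n_samples, n_features) array; returns an
    (n_samples, n_components) low-dimensional embedding.
    """
    n = X.shape[0]

    # Pairwise squared Euclidean distances between all samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Reconstruction weights: each point as an affine combination
    # of its nearest neighbours, with weights summing to one.
    W = np.zeros((n, n))
    ones = np.ones(n_neighbors)
    for i in range(n):
        Z = X[nbrs[i]] - X[i]          # neighbours centred on x_i
        C = Z @ Z.T                    # local Gram matrix, k x k
        # Regularise, since C is singular when k > n_features.
        C += np.eye(n_neighbors) * reg * max(np.trace(C), 1e-12)
        w = np.linalg.solve(C, ones)
        W[i, nbrs[i]] = w / w.sum()

    # Embedding: bottom eigenvectors of M = (I - W)^T (I - W),
    # skipping the first (near-constant) eigenvector.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1]


# Synthetic stand-in for acoustic feature vectors (assumption):
# a 3-D Swiss roll, i.e. a 2-D manifold curled up in 3-D space.
rng = np.random.default_rng(0)
t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, 200)
X = np.column_stack([t * np.cos(t), rng.uniform(0.0, 10.0, 200), t * np.sin(t)])

Y = lle(X, n_neighbors=10, n_components=2)

# Feature fusion by concatenation (assumption): original features
# side by side with the manifold features.
fused = np.hstack([X, Y])
print(X.shape, Y.shape, fused.shape)
```

In a full system, the fused vectors would be passed to a back-end classifier such as a Gaussian Mixture Model or a Support Vector Machine. In practice the two feature groups would likely need scaling before concatenation, since the embedding coordinates are unit-norm eigenvectors and much smaller in magnitude than the raw features.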
Keywords/Search Tags: manifold learning, low-dimensional visualization, feature extraction, feature fusion, Chinese dialect identification