
Research On Initialization Of Non-negative Matrix Factorization And Its Application To Chinese Text Mining

Posted on: 2008-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Di
Full Text: PDF
GTID: 2120360242498665
Subject: Mathematics
Abstract/Summary:
Nonnegative Matrix Factorization (NMF) is a recently developed dimensionality reduction method. It yields interpretable, parts-based representations and has a broad range of applications. This thesis studies the initialization of NMF and its application to Chinese text mining, with emphasis on the following aspects:

1. We give a systematic overview of the NMF method. We review its emergence, development, and analytical properties, and summarize the open problems that remain to be resolved. In particular, we study the existence, uniqueness, convergence, and stationarity of solutions to the NMF problem.

2. We propose three initialization methods. To address the initialization problem of NMF, we propose three initialization methods for use in Chinese text mining: PCA, supervised PCA (SPCA), and fuzzy c-means. We discuss the theory behind each method and explain and justify how it is applied. We also solve the problem of decomposing large matrices in PCA and SPCA. Experimental results on multi-class text classification indicate that, compared with random initialization, all three methods improve classification results and make them more stable. SPCA performs best of the three.

3. We propose a new text feature selection method, the Improved Mutual Information method. We describe the general process of Chinese text mining and analyze the feature selection problem. Experimental results indicate that, compared with traditional methods, the improved method has lower computational complexity and achieves good results.

4. We successfully apply the NMF method to Chinese text mining. We compare NMF with several dimensionality reduction methods commonly used in text mining, namely Random Projection, Concept Indexing, and Latent Semantic Indexing. The experimental results indicate that NMF is computationally efficient, improves text classification results, makes the results more stable, and handles large-scale data easily. For clustering search results in web information retrieval, NMF is more direct and simpler than the traditional Lingo method, because it avoids transforming abstract concepts into prepared concepts. This advantage follows from the nonnegativity and interpretability of NMF.
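The role of initialization in NMF can be illustrated with a minimal sketch. The abstract does not specify the thesis's PCA, SPCA, or fuzzy c-means procedures, so the code below is not the author's method: it uses the standard Lee-Seung multiplicative updates, and the helper `svd_abs_init` is a hypothetical PCA-style stand-in that takes absolute values of the leading factors of an uncentered SVD. The synthetic matrix `V` is likewise an assumption standing in for a term-document matrix.

```python
import numpy as np

def nmf_multiplicative(V, W0, H0, n_iter=200, eps=1e-9):
    """Lee-Seung multiplicative updates for min ||V - W H||_F with W, H >= 0."""
    W, H = W0.copy(), H0.copy()
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def random_init(V, k, rng):
    """Baseline: uniform random nonnegative factors."""
    m, n = V.shape
    return rng.random((m, k)), rng.random((k, n))

def svd_abs_init(V, k, eps=1e-6):
    """Hypothetical PCA-style init (an assumption, not the thesis's method):
    absolute values of the top-k factors of an uncentered SVD."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    root = np.sqrt(s[:k])
    W0 = np.abs(U[:, :k]) * root + eps           # shape (m, k)
    H0 = root[:, None] * np.abs(Vt[:k]) + eps    # shape (k, n)
    return W0, H0

def frob_error(V, W, H):
    """Frobenius norm of the reconstruction residual."""
    return np.linalg.norm(V - W @ H)

rng = np.random.default_rng(0)
k = 5
# Synthetic nonnegative matrix of rank k, standing in for term-document data.
V = rng.random((30, k)) @ rng.random((k, 40))

W0_r, H0_r = random_init(V, k, rng)
W0_s, H0_s = svd_abs_init(V, k)

W_r, H_r = nmf_multiplicative(V, W0_r, H0_r)
W_s, H_s = nmf_multiplicative(V, W0_s, H0_s)

print("random init:  start", frob_error(V, W0_r, H0_r), "end", frob_error(V, W_r, H_r))
print("SVD-abs init: start", frob_error(V, W0_s, H0_s), "end", frob_error(V, W_s, H_s))
```

Both runs use the same update rule and differ only in the starting factors, so any difference in the final error or in run-to-run variation comes from the initialization. This is the effect the experiments in the abstract measure, although on real text data rather than synthetic data.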
Keywords/Search Tags: Nonnegative Matrix Factorization, supervised PCA, fuzzy c-means, Chinese text mining, feature selection method