Information Retrieving Based On Non-Negative Matrix Factorization

Posted on: 2007-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: J X Jiang
Full Text: PDF
GTID: 2178360212965579
Subject: Computer software and theory
Abstract/Summary:
The non-negative matrix factorization (NMF) algorithm was proposed recently and can be used as a method for feature extraction. Compared with other methods, its non-negativity constraint makes the extracted features reflect more localized characteristics of the samples, which corresponds more closely to human cognition and makes the resulting feature vectors easier to interpret. This thesis studies the application of NMF to information retrieval. Using NMF for feature extraction, we design new information retrieval algorithms that perform better than previous ones.

First, we survey the existing NMF algorithms and their modifications, identifying the characteristics of each algorithm and the relationships among them. Next, we compare NMF with other existing feature extraction methods and clarify its advantages as a feature extraction method. Finally, we apply NMF to text processing and log mining, and design effective classification and clustering methods based on it.

For document classification, we propose a method for obtaining local semantic spaces based on NMF, and show, through both theoretical analysis and experiments, that classifying documents in local semantic spaces greatly improves classification precision. For log mining, we design an approach for obtaining typical user session profiles based on Localized NMF. Our experiments demonstrate the effectiveness of this approach and show that Localized NMF, which requires its basis vectors to be as orthogonal as possible, performs better than NMF at obtaining typical user session profiles.
Keywords/Search Tags: Feature extraction, non-negative matrix factorization, document classification, typical user session profile
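
The abstract does not state which NMF update rule the thesis uses. As an illustration of NMF as a feature extractor, here is a minimal sketch that assumes the widely used multiplicative update rules for the squared-error objective. It factors a non-negative matrix V (terms by documents) into a basis matrix W and a coefficient matrix H, so that each column of H is a low-dimensional non-negative feature vector for one document. The names `nmf`, `matmul`, `transpose`, `frobenius_error`, and the data in `V_toy` are invented for this example and are not taken from the thesis.

```python
import random


def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - W H."""
    WH = matmul(W, H)
    return sum(
        (v - wh) ** 2 for row_v, row_wh in zip(V, WH) for v, wh in zip(row_v, row_wh)
    ) ** 0.5


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m).

    Assumption: multiplicative updates for the squared-error objective.
    Every update multiplies by a ratio of non-negative quantities, so W and H
    stay non-negative throughout.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [
            [h * a / (b + eps) for h, a, b in zip(row_h, row_a, row_b)]
            for row_h, row_a, row_b in zip(H, num, den)
        ]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [
            [w * a / (b + eps) for w, a, b in zip(row_w, row_a, row_b)]
            for row_w, row_a, row_b in zip(W, num, den)
        ]
    return W, H


# Invented toy term-document counts: rows are terms, columns are documents.
V_toy = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
]

W, H = nmf(V_toy, rank=2)
print("reconstruction error:", frobenius_error(V_toy, W, H))
print("document features (columns of H):", transpose(H))
```

In this sketch the columns of W play the role of basis vectors over terms, and a classifier or clustering step would operate on the columns of H. The Localized NMF variant mentioned in the abstract adds further constraints on the basis vectors and is not covered here.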