
Non-negative Matrix Factorization In Network Data Dimension Reduction

Posted on: 2014-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: R Q Huang
Full Text: PDF
GTID: 2298330452453667
Subject: Mathematics
Abstract/Summary:
With the rapid development of modern technology, massive amounts of data are collected every day. Network data, as one of the important data storage structures, plays a significant role in the information industry. Extracting information from networks efficiently is therefore both challenging and meaningful.

In this thesis, we choose the journal citation network as the research object. The network has two properties worth studying: aeolotropism (anisotropy) and sparsity. The former yields a stereo structure that makes it possible both to cluster the journals globally and to rank them locally, while the latter limits the complexity of the object.

The thesis starts by studying the features of the journal network. By building a new model for dimension reduction of the network and applying a suitable algorithm to it, we obtain a new low-dimensional representation of the complex network. Classical dimension reduction algorithms generally transform the network data into high-dimensional data and then generate a symmetric metric matrix that can be handled with spectral methods. In this thesis, a new statistical model is proposed that assigns each journal a distribution-style tag, from which a natural low-dimensional representation follows.

The Non-negative Matrix Factorization (NMF) algorithm fits the model setting and processing environment well, so it is an appropriate choice for this task. The original network data are first preconditioned on the basis of prior information about the journal citation network, and the NMF algorithm is then applied to the network dimension reduction task.

The thesis tests the new method on different types of journal networks. The MDS algorithm and some classical statistical methods are introduced as control methods. The numerical results show that our method performs better than the common methods, and that it achieves reasonable results stably with or without strong prior information about the journals.
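As a rough illustration of how NMF can give each journal a low-dimensional, distribution-style tag, the sketch below factorizes a small citation matrix with the standard multiplicative updates for the Frobenius-norm objective. It is not the thesis's model: the toy matrix `A`, the function name `nmf`, the rank of 2, and the row normalization used to form `tags` are all assumptions made for illustration, and no prior information or preconditioning is included.

```python
import numpy as np


def nmf(A, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate A (non-negative) by W @ H with W, H >= 0.

    Uses the standard multiplicative update rules for the
    Frobenius-norm objective ||A - W H||_F^2. Hypothetical helper,
    not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    # Strictly positive random initialization.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # eps in the denominator guards against division by zero.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy directed citation matrix (made-up numbers):
# A[i, j] = number of citations from journal i to journal j.
A = np.array(
    [
        [0, 5, 4, 0, 0, 0],
        [6, 0, 3, 0, 0, 0],
        [4, 5, 0, 0, 1, 0],
        [0, 0, 0, 0, 7, 5],
        [0, 0, 1, 6, 0, 4],
        [0, 0, 0, 5, 6, 0],
    ],
    dtype=float,
)

W, H = nmf(A, rank=2)

# Assumption: normalize each row of W so it sums to 1, giving every
# journal a distribution over the 2 latent groups.
tags = W / W.sum(axis=1, keepdims=True)

# Hard cluster label per journal: the group with the largest weight.
labels = tags.argmax(axis=1)

print(np.round(tags, 3))
print(labels)
```

Each row of `tags` is a two-dimensional representation of one journal. Because the matrix is not symmetrized, the direction of citation is kept in the factorization, which is the property the abstract contrasts with spectral methods on a symmetric metric matrix.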
Keywords/Search Tags: Directed Network, Journal Citation, Strong (Weak) Prior Information, Semi-supervised Learning, Dimension Reduction