In-Network Information Retrieval, Structure Analysis And Their Applications

Posted on: 2015-12-03
Degree: Master
Type: Thesis
Country: China
Candidate: X S Yin
Full Text: PDF
GTID: 2180330467463038
Subject: Computer technology
Abstract/Summary:
In the big data era, complex network analysis is becoming increasingly important. Big data expands along two dimensions: volume and connectivity. The former means that the ever-increasing amount of data makes population-level analysis possible; the latter means that relationships matter more and more, so the real world does not match the single-source, independence assumptions common in mathematics. Networks are ubiquitous in our lives. The analysis of massive numbers of network nodes and links faces challenges, but also brings opportunities. The flourishing of social networks is one indication of the importance of network analysis. Recommendation engines, advertisement targeting, the fusion of social profiles and interest profiles, and content aggregation and recommendation are all embodiments of it.

In-network node and link analysis faces severe challenges. First, the complexity of the network obscures our view, which makes comprehension difficult. Take Taobao as an example: in the "Double 11" promotion, the transaction volume is 35 billion. In this scenario, the implicit "buyer-seller" bipartite graph is huge. With 6 million shops, 1.1 billion items and 200 million users on Taobao, the trilateral "buyer-item-seller" relationship is even more complex. Second, node attributes are unreliable. Because of privacy clauses, social networks provide anonymization services for their users. Moreover, users often fill in fake information or leave their attributes blank. These fake or missing values hinder network analysis. Third, the huge amount of data and the massive scale of networks challenge traditional algorithms. New distributed algorithms and frameworks are needed.

Facing these challenges of complex networks in the era of massive data analysis, we analyze the nodes and links in networks, then implement attribute inference and a distributed framework for sequential machine learning optimization algorithms. Our work is as follows:

1. We implement attribute inference in networks. Our studies include traditional classification methods, random walks in the network, and the co-prediction of links and attributes. We construct a "social-attribute" network and then map the attribute inference task to link prediction. Our empirical experiments show that there is a strong relationship between links and attributes, and that our method improves the performance of both tasks;

2. We present 4 levels at which to speed up machine learning algorithms, namely scale-up, scale-out, breaking the sequential barrier of machine learning, and the programming model. We first demonstrate the speedup of NativeTask for MapReduce, then describe the adaptation between the scale-out framework and the programming model. Finally, we concentrate on breaking the sequential barrier of machine learning algorithms and making them run asynchronous-...
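The idea of mapping attribute inference to link prediction on a "social-attribute" network can be illustrated with a minimal sketch. It is not the thesis's method: the abstract mentions random walks and link-attribute co-prediction, whereas this sketch substitutes the simple common-neighbors heuristic. The function names `build_san` and `score_attributes` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_san(social_edges, attribute_edges):
    """Build an undirected social-attribute network as an adjacency map.

    Users and attributes are both nodes; each node is a (kind, name) tuple.
    """
    adj = defaultdict(set)
    for u, v in social_edges:
        adj[("user", u)].add(("user", v))
        adj[("user", v)].add(("user", u))
    for u, a in attribute_edges:
        adj[("user", u)].add(("attr", a))
        adj[("attr", a)].add(("user", u))
    return adj


def score_attributes(adj, user):
    """Rank attributes the user does not yet have by common-neighbor count.

    Inferring an attribute becomes predicting a missing user-attribute link:
    the score is the number of the user's neighbors that link to the attribute.
    """
    u = ("user", user)
    neighbors = adj.get(u, set())
    scores = {}
    for node in adj:
        # Only score attribute nodes that are not already linked to the user.
        if node[0] != "attr" or node in neighbors:
            continue
        common = len(neighbors & adj[node])
        if common:
            scores[node[1]] = common
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data (hypothetical): alice has no declared city.
social = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"), ("carol", "dave")]
attributes = [("bob", "Beijing"), ("carol", "Beijing"), ("dave", "Shanghai")]
san = build_san(social, attributes)
ranking = score_attributes(san, "alice")
```

Here both of alice's friends link to the attribute node "Beijing", so it is the only candidate with a nonzero score. A random-walk score could replace the common-neighbor count without changing the graph construction.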
Keywords/Search Tags: Complex Network, Attribute Inference, Distributed Algorithm, Probabilistic Graphical Model