Research On Top-K Relevant Search In Heterogeneous Information Network

Posted on:2015-01-18

Degree:Master

Type:Thesis

Country:China

Candidate:S L Bu

Full Text:PDF

GTID:2250330431457203

Subject:Computer software and theory

Abstract/Summary:

The world we live in is interconnected. Most data objects, such as individuals, organizations and groups, are connected and interact with one another, forming huge and sophisticated networks. Such networks can be modeled in general as information networks. Information networks are ubiquitous in the real world and have become an important component of modern information infrastructure. Mining information networks, including specific kinds such as social networks and e-commerce networks, has attracted wide attention from researchers in computer science, biology and social science.

Current research on information networks can be divided into work on homogeneous networks and work on heterogeneous networks, according to the types of nodes and links involved. Nodes in a homogeneous network are all of the same entity type, so its edges all carry the same meaning. Many influential algorithms, such as PageRank and community detection methods, were developed for homogeneous networks. However, most real-world networks are heterogeneous: their nodes and links are of multiple types. For example, the network generated from Renren consists of persons, photos, movies, groups and so on. In addition to friendship between persons, there are relationships of other types, such as person-movie reviewing relationships and person-photo tagging relationships. Heterogeneous information networks are therefore well suited to representing the interactions between different kinds of real-world entities.

There have been many research achievements on heterogeneous information networks. Relevance search is a basic and crucial operation in them, commonly used in recommendation, clustering and anomaly detection. Existing relevance search methods focus on objects in homogeneous information networks. In this thesis, we propose a method to find the top-k objects most relevant to a given object in a heterogeneous network. It is a two-phase process. In the first phase, we compute an initial relevance score by pairwise random walk along given meta-paths, where a meta-path is a meta-level description of path instances in a heterogeneous information network. In the second phase, we take user preference into account to determine the weights with which the meta-paths are combined, modeling this as a multi-objective linear programming problem that is solved with a genetic algorithm. In addition, to ensure efficiency, we use graph partitioning and distributed computing to accelerate the search. Experiments on the IMDB and DBLP datasets show that the method achieves better accuracy and efficiency.
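The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of pairwise random walk along a meta-path, under stated assumptions. It assumes symmetric meta-paths (for example Author-Paper-Venue-Paper-Author), lets the query object and the candidate object each walk half of the path, and scores them by the cosine similarity of the two distributions at the meeting point. The toy network, the names `EDGES`, `walk`, `pairwise_relevance` and `top_k`, and the fixed weights are all hypothetical; in the thesis the weights would come from the user-preference phase, which is not reproduced here.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy DBLP-like network: (source type, target type) -> adjacency lists.
EDGES = {
    ("A", "P"): {"alice": ["p1", "p2"], "bob": ["p2", "p3"], "carol": ["p4"]},
    ("P", "V"): {"p1": ["KDD"], "p2": ["KDD"], "p3": ["VLDB"], "p4": ["VLDB"]},
}


def walk(start, type_path):
    """Distribution reached by a uniform random walk from `start` along `type_path`."""
    dist = {start: 1.0}
    for src_type, dst_type in zip(type_path, type_path[1:]):
        adjacency = EDGES[(src_type, dst_type)]
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbours = adjacency.get(node, [])
            for n in neighbours:
                nxt[n] += prob / len(neighbours)
        dist = dict(nxt)
    return dist


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(p * v.get(key, 0.0) for key, p in u.items())
    norm_u = sqrt(sum(p * p for p in u.values()))
    norm_v = sqrt(sum(p * p for p in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pairwise_relevance(a, b, half_path):
    """Both objects walk the same half of a symmetric meta-path and meet in the middle."""
    return cosine(walk(a, half_path), walk(b, half_path))


def top_k(query, candidates, half_paths, weights, k):
    """Rank candidates by a weighted sum of per-meta-path relevance scores."""
    scores = {}
    for c in candidates:
        if c == query:
            continue
        scores[c] = sum(
            w * pairwise_relevance(query, c, hp) for hp, w in zip(half_paths, weights)
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Half paths of the symmetric meta-paths A-P-A and A-P-V-P-A; weights are assumed.
HALF_PATHS = [["A", "P"], ["A", "P", "V"]]
WEIGHTS = [0.5, 0.5]
ranking = top_k("alice", ["alice", "bob", "carol"], HALF_PATHS, WEIGHTS, k=2)
print(ranking)  # bob ranks first; carol shares no paper or venue with alice and scores 0.0
```

Changing `WEIGHTS` changes how much co-authorship (A-P-A) counts relative to publishing in the same venue (A-P-V-P-A), which is the role the abstract assigns to user preference.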

Keywords/Search Tags:

Heterogeneous information network, relevant search, user preference, graph partitioning, distributed computing

Related items

1	User Interest Model Based On Social Network Research
2	Research On Similarity Search Algorithm Of Heterogeneous Information Network Based On Hierarchical Attention
3	Discovering The Dynamics Of Social Networks And Distributed Search Strategies For Networked Environments
4	Researches Of The Potential User Mining Based On Complex Network
5	Research On Heterogeneous Subnet Evolution Based On Information Spreading And Searching
6	The Research On Complex Conditional Community Search In Heterogeneous Information Network
7	Research On Most Influential Community Search In Heterogeneous Information Network
8	Information Search Algorithm Based On Markov Model In The Mobile Social Network
9	Research On The Influence Of Maximization Based On User's Preference In Social Network
10	Research And Design Of Gene Similarity Search Method Base On Heterogeneous Network