Author Collaboration Prediction In Academic Heterogeneous Information Networks

Posted on: 2017-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: S H Zhang
GTID: 2348330488459951
Subject: Software engineering
Abstract/Summary:
With the explosive growth of research publications in recent years, scholarly big data has become an increasingly active research topic. To reveal the information hidden in scholarly big data, relationships among various academic entities have been analyzed from different perspectives. In this paper, we focus on the problem of predicting collaboration relationships between authors in academic heterogeneous information networks, that is, predicting whether two authors who have never collaborated before will build a collaboration relationship at some point in the future. Knowing an author's future collaborations helps in understanding the author's academic circle, whether the author is a collaborative or an independent researcher, and the mechanism behind the building of collaboration relationships.
Unlike traditional relationship analysis in homogeneous information networks, we use heterogeneous information networks, which are closer to the real world, to solve the co-authorship prediction problem. We propose a new model called MCCP (Meta path and Content information based Collaboration Prediction), which consists of two stages: meta path and content information based feature extraction, and logistic regression based collaboration prediction. First, topological features are extracted from the scholarly heterogeneous information network according to different measures on meta paths. Then, content information is obtained from the temporal dynamics, transitive similarity, and authors' attributes in the network. We combine the topological features and the content information to obtain the meta path and content information based feature space. Finally, a supervised learning algorithm is employed to predict author collaboration.
We present experiments on real information networks, namely the APS and DBLP networks. The results show that our proposed model generates more accurate results than a method that considers only topological features, and that the content information in the networks improves link prediction accuracy. In addition, the significance of each topological feature can be learned from the model, which helps in understanding the mechanism behind the building of collaboration relationships.
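The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy network, the choice of the author-paper-author-paper-author meta path with a plain path count, the Jaccard keyword similarity standing in for content information, and all function names are assumptions made for illustration. The classifier is a small hand-written logistic regression so that the example has no dependencies.

```python
import math
from collections import defaultdict

# Hypothetical toy data (not from the thesis): papers with their authors,
# and a keyword set per author used as stand-in "content information".
PAPER_AUTHORS = {
    "p1": ["alice", "bob"],
    "p2": ["bob", "carol"],
    "p3": ["bob", "carol"],
    "p4": ["dave", "erin"],
}
KEYWORDS = {
    "alice": {"graphs", "mining"},
    "bob": {"graphs", "networks"},
    "carol": {"graphs", "networks"},
    "dave": {"optics"},
    "erin": {"optics", "lasers"},
}

def coauthor_counts(paper_authors):
    """Count A-P-A path instances: shared papers per ordered author pair."""
    counts = defaultdict(lambda: defaultdict(int))
    for authors in paper_authors.values():
        for a in authors:
            for b in authors:
                if a != b:
                    counts[a][b] += 1
    return counts

def apapa_path_count(counts, a, b):
    """Count A-P-A-P-A path instances between a and b (via a middle author)."""
    return sum(n * counts.get(mid, {}).get(b, 0)
               for mid, n in counts.get(a, {}).items())

def jaccard(s, t):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0

def features(counts, a, b):
    """Feature vector: [topological path count, content similarity]."""
    return [float(apapa_path_count(counts, a, b)),
            jaccard(KEYWORDS[a], KEYWORDS[b])]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Estimated probability that the pair with features x will collaborate."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# Hypothetical labelled pairs: 1 = the two authors later collaborated.
TRAIN_X = [[0.0, 0.0], [0.0, 0.2], [1.0, 0.1], [2.0, 0.5], [3.0, 0.6]]
TRAIN_Y = [0, 0, 0, 1, 1]

if __name__ == "__main__":
    counts = coauthor_counts(PAPER_AUTHORS)
    w, b = train_logistic(TRAIN_X, TRAIN_Y)
    for pair in [("alice", "carol"), ("alice", "dave")]:
        x = features(counts, *pair)
        print(pair, x, round(predict_proba(w, b, x), 3))
    print("learned weights:", w)
```

In this sketch the learned weights play the role the abstract attributes to the model: a larger weight suggests that the corresponding feature contributes more to the predicted collaboration. A real experiment would use many meta paths and measures, features computed on an earlier time window, and labels taken from a later one.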
Keywords/Search Tags: Heterogeneous Information Network, Collaboration Prediction, Meta Path, Content Information