Combination of Feature Engineering and Ranking Models for Friends Recommendation in Social Networks

Posted on: 2016-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: S Feng
GTID: 2298330467992836
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of the Internet industry, we have entered the Web 2.0 era. In recent years, social networking service (SNS) platforms of many kinds have grown rapidly both at home and abroad, and SNS has quickly become a new focus of attention. SNS not only offers Internet users in-depth, multi-faceted information and services, but also brings people closer together, so that the real world increasingly permeates the virtual network. This has greatly promoted the development of the Internet and integrated virtual networks ever more deeply into people's real lives. However, the growing number of users and the vast amount of information they generate have caused information overload: it is difficult for users of a social network to find potential friends, topics of interest and relevant information among so many users and so much content. Recommending suitable friends to users of a social network is therefore very meaningful, and it has also become a source of great commercial value for the Internet in the Web 2.0 era. The main work of this thesis is as follows:

First, this thesis reviews and summarizes the concepts of relationship recommendation systems for social networks, focusing on the classic relationship recommendation algorithms based on content matching, common interests and the social graph. We then compare the application scenarios of these three kinds of algorithms and describe their evaluation metrics.

Second, inspired by the KDD Cup 2012 competition and the Facebook recommendation competition, this thesis proposes a new algorithm named "Combination of Feature Engineering and Ranking Models for Friends Recommendation in Social Networks" and gives the overall framework of the model. Apart from the data acquisition module, which uses the Sina Weibo API, the model is divided into three modules: the candidate set construction module, the feature extraction module and the ranking algorithm module.
The candidate set construction module uses a local random walk algorithm to compute the similarity between the target user and every other user, ranks those users by similarity from high to low, and takes the top N users as the candidate set. In this module we first used the LRW Friend algorithm; because its results were poor, we proposed the Biased-LRW algorithm to suit the characteristics of Sina Weibo.

Each candidate in the candidate set is then paired with the target user to form a "User-candidate" pair, and the feature extraction module extracts the features of each pair from the users' attributes, social relationships and text information. For the text information, we used the LDA topic model to cluster users and proposed a new model, "the word vector-based semantic similarity matrix used in the user clustering algorithm".

In the ranking algorithm module, we used a combination of tree models such as Random Forest and GBDT as the ranking algorithm. This module is divided into two parts, model training and model prediction. We built the training set by labeling the existing "fans list", "followers list" and "bi followers list" against one another, and used it to train the ranking model. The model prediction part receives the feature vector of a "User-candidate" pair and outputs a score that represents the degree of interest matching between the user and the candidate: the higher the score, the higher the probability that the user will follow the candidate in the future. We used internal cross-validation to tune the parameters of the model, so that the trained model generalizes well, which both improves its performance and avoids over-fitting.
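The candidate set construction step can be illustrated with a minimal sketch. It assumes an undirected friendship graph and the local random walk (LRW) similarity as it is commonly defined in the link-prediction literature, s_xy(t) = q_x π_xy(t) + q_y π_yx(t) with q_x = k_x / 2|E|. The abstract does not specify the LRW Friend or Biased-LRW variants, so neither is reproduced here. The function names, the toy graph and the choice to skip users who are already neighbours are all hypothetical.

```python
from collections import defaultdict


def walk_distribution(adj, start, steps):
    """Distribution of a random walker that starts at `start` after `steps` steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbours = adj[node]
            if not neighbours:
                # Assumption: a walker on an isolated node stays where it is.
                nxt[node] += prob
                continue
            share = prob / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        dist = dict(nxt)
    return dist


def lrw_candidates(adj, user, steps=3, top_n=2):
    """Rank non-neighbours of `user` by LRW similarity and keep the top N."""
    n_edges = sum(len(nbs) for nbs in adj.values()) / 2
    # q[x] = degree(x) / (2 * |E|), the usual initial-configuration weight.
    q = {node: len(nbs) / (2 * n_edges) for node, nbs in adj.items()}
    from_user = walk_distribution(adj, user, steps)
    scores = {}
    for other in adj:
        # Assumption: the user and existing neighbours are not candidates.
        if other == user or other in adj[user]:
            continue
        to_user = walk_distribution(adj, other, steps).get(user, 0.0)
        scores[other] = q[user] * from_user.get(other, 0.0) + q[other] * to_user
    # Sort by score from high to low; break ties by node name for determinism.
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return ranked[:top_n], scores


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, d-e.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
candidates, scores = lrw_candidates(adj, "a", steps=2, top_n=2)
print(candidates, scores)
```

The sketch recomputes one walk per candidate, which is fine for a toy graph. On a network the size of Sina Weibo the walks would more plausibly be run as sparse matrix-vector products, and a follow graph is directed, so this is only a starting point.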
Keywords/Search Tags: Social Network, Friends Recommendation, Random Walk, Feature Engineering, Ranking Model