
A Research On The Similarity Of DBLP Website Users Based On Bayesian Network

Posted on: 2020-12-22  Degree: Master  Type: Thesis
Country: China  Candidate: J M Ye  Full Text: PDF
GTID: 2417330575490829  Subject: Statistics
Abstract/Summary:
With the rapid development of mobile devices and mobile services, people's daily lives have become closely tied to various forms of online social networks. At every moment, people gather and generate large amounts of data containing abundant information about user behavior, so studying this huge data source is both meaningful and necessary. However, mining such user data involves a great deal of uncertain knowledge. Take recommendation algorithms in social networks as an example: given users whose past information is known, we hope to predict their future choices more accurately by combining that information with the behavior of groups of similar users. In fact, how to clearly represent and measure the uncertain knowledge of user similarity has always been a major challenge in research on commodity recommendation and on the evolution of user relationships in social networks.

Based on this, this paper proposes a method for discovering and measuring user similarity based on the Bayesian network, an important probabilistic graphical model. By combining topological structure with probabilistic reasoning, Bayesian networks have great advantages in expressing and discovering uncertain knowledge. Research on social user groups based on Bayesian network models, at home and abroad, is at a relatively early stage, especially for large-scale data. In research on social user similarity based on Bayesian networks, Xu (2015) [39] used the DBLP (Database Systems and Logic Programming) data set to construct a Bayesian network from large-scale data on Hadoop. However, that work only gives inferential predictions for the user similarity measurement of the DBLP website; the theoretical properties of the model, the attributes of the data itself, and extended applications to other data are not explained in detail. This paper gives a more detailed theoretical explanation of the Bayesian network model. After updating the data set, we first study the DBLP data set itself, and after a qualitative judgment of the cooperative relationships between users, we measure similarity based on the Bayesian network. Because Hadoop places high demands on hardware, this paper implements the model-building algorithm in Python in order to verify the efficiency and convergence of constructing the user similarity Bayesian network.

(1) DBLP data set processing. The data format is analyzed and converted, and mining with the FP-Growth algorithm shows that collaboration among DBLP website users is correlated to some degree, and that collaboration patterns differ between users with different paper output.

(2) Establishment of a user similarity Bayesian network in social networks. This paper presents the structure of a user similarity Bayesian network (USBN) model. In a USBN, a directed acyclic graph represents the conditional dependencies among user nodes, and a conditional probability table is computed for each node to describe those dependencies quantitatively. The graph structure of the USBN models the user relationships in a social network and reflects the actual strength of the relationships among users. The indirect similarity of users is then obtained through probabilistic inference on the USBN.

(3) Simulation experiment based on the user similarity Bayesian network. Using the processed DBLP data set, the Python implementation verifies the accuracy of the USBN model. The model is stable and runs efficiently on data of a certain scale. A direction for improvement, the weighted user Bayesian network, is put forward; it aims to reduce the number of network nodes and simplify the network structure so as to improve operating efficiency.

DBLP is an important English-language paper database in the computer field, so it is necessary to study the database itself. Considering that few studies at home and abroad analyze the DBLP data set, this paper processes and analyzes the data, which is expected to provide a reference for research in this direction.
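
To make the idea of conditional probability tables and indirect similarity more concrete, the following is a minimal Python sketch. It is not the thesis's USBN algorithm. It assumes a toy set of papers with hypothetical author names, estimates P(child on a paper | parent on or not on that paper) by simple counting, and assumes a three-node chain source -> middle -> target in which the target is conditionally independent of the source given the middle user. The function names `conditional_probability` and `indirect_similarity` are illustrative only.

```python
# Toy co-authorship data: each paper is the set of its authors.
# All names are hypothetical and chosen only for illustration.
papers = [
    {"ann", "bob"},
    {"ann", "bob"},
    {"ann", "bob", "cat"},
    {"bob", "cat"},
    {"bob", "cat"},
    {"cat", "dan"},
    {"ann"},
    {"dan"},
]


def conditional_probability(papers, child, parent, parent_present=True):
    """Estimate P(child is an author | parent is / is not an author).

    This is one entry of a two-row conditional probability table,
    obtained by counting papers (an assumed estimator, not the thesis's).
    """
    matching = [p for p in papers if (parent in p) == parent_present]
    if not matching:
        return 0.0
    return sum(child in p for p in matching) / len(matching)


def indirect_similarity(papers, source, middle, target):
    """P(target | source) for the assumed chain source -> middle -> target.

    Marginalizes over the middle user, assuming the target is
    conditionally independent of the source given the middle user.
    """
    p_mid = conditional_probability(papers, middle, source, True)
    p_target_given_mid = conditional_probability(papers, target, middle, True)
    p_target_given_not_mid = conditional_probability(papers, target, middle, False)
    return p_target_given_mid * p_mid + p_target_given_not_mid * (1.0 - p_mid)


if __name__ == "__main__":
    print("P(bob | ann) =", conditional_probability(papers, "bob", "ann"))
    print("indirect ann -> bob -> cat =",
          indirect_similarity(papers, "ann", "bob", "cat"))
```

In this sketch the direct edge probabilities play the role of direct similarity, and the marginalization step stands in for inference over the network. A real network over many users would need structure learning and a general inference procedure, which this example does not attempt.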
Keywords/Search Tags:Social Network, User Similarity, Bayesian Network, DBLP Data Set, USBN models