
The Research Of Microblog User Influence Algorithm Based On Hadoop

Posted on: 2018-11-08; Degree: Master; Type: Thesis
Country: China; Candidate: Z H Huang; Full Text: PDF
GTID: 2348330536970879; Subject: Electronic and communication engineering
Abstract/Summary:
Microblogging has become a media phenomenon that gives people a more timely and deeper understanding of current political, economic, sports, and entertainment news. What is said on microblogs strongly guides people's thinking and behavior, and this effect is most obvious for large V users. Large V users are high-impact users whose discussions and topics are highly influential in society; these users can even be said to lead and create the current hot topics. Microblog user influence is therefore a direction worth researching in depth. This paper analyzes the shortcomings of the traditional PageRank algorithm for measuring microblog influence and puts forward a new influence algorithm.

The rapid development of the Internet has brought the world into the era of big data, and "big data" is now discussed in every area of industry. To study user influence, this paper draws its research data from the massive user data of Sina microblog and uses the convenient and efficient Hadoop distributed computing platform for data processing and algorithm implementation. The paper first describes the Hadoop platform and its related technologies in detail, including HDFS, MapReduce, and HBase. It then describes the background and principle of the PageRank algorithm as currently applied to assessing microblog user influence.

Next, the paper analyzes the characteristics of microblog users and finds that PageRank considers only the number of followers, which is a serious flaw for assessing microblog user influence and makes it difficult to rank users accurately. Because PageRank divides a user's influence value by the number of accounts that user follows when distributing it, it ignores interaction behavior between users such as forwarding, commenting, and liking. The paper therefore proposes the WB-UR algorithm, an improvement on PageRank that integrates the four main behavioral factors (following, forwarding, liking, and commenting) into the weights used to distribute user influence. The Sqoop tool is used to import the data into HBase so that the data needed by the algorithm can be provided efficiently. Finally, the paper implements both PageRank and WB-UR on the Hadoop platform it has built. An analysis of the experimental results of the two algorithms verifies that the WB-UR ranking is more in line with the actual situation than the PageRank ranking. The optimized WB-UR algorithm evaluates user influence more comprehensively and more reliably.
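The difference between dividing influence evenly and distributing it by behavior weight can be illustrated with a small in-memory sketch. This is not the WB-UR algorithm itself: the abstract does not give its formula, so the `BEHAVIOUR_WEIGHTS` values, the `edge_weight` and `rank` helpers, and the linear combination of the four behavior counts are all assumptions made for illustration only. The sketch runs on a single machine rather than on MapReduce.

```python
from collections import defaultdict

# Hypothetical weights: the abstract names the four factors but not their values.
BEHAVIOUR_WEIGHTS = {"follow": 0.4, "forward": 0.3, "comment": 0.2, "like": 0.1}


def edge_weight(counts):
    """Combine behaviour counts into one weight (assumed linear combination)."""
    return sum(w * counts.get(name, 0) for name, w in BEHAVIOUR_WEIGHTS.items())


def rank(edges, damping=0.85, iterations=50, weighted=True):
    """Iterative PageRank over a follower graph.

    edges maps (follower, followee) -> dict of behaviour counts.
    weighted=False gives plain PageRank (every followee gets an equal share);
    weighted=True splits a follower's score in proportion to edge_weight.
    """
    users = sorted({u for pair in edges for u in pair})
    n = len(users)
    out = defaultdict(dict)
    for (src, dst), counts in edges.items():
        out[src][dst] = edge_weight(counts) if weighted else 1.0

    scores = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        nxt = {u: (1.0 - damping) / n for u in users}
        for src in users:
            total = sum(out[src].values())
            if total == 0:
                # User follows nobody: spread the score evenly over all users.
                for u in users:
                    nxt[u] += damping * scores[src] / n
                continue
            for dst, w in out[src].items():
                nxt[dst] += damping * scores[src] * w / total
        scores = nxt
    return scores


# Toy graph: "a" and "d" both follow "b" and "c", but "a" interacts far more with "b".
edges = {
    ("a", "b"): {"follow": 1, "forward": 5, "comment": 3, "like": 10},
    ("a", "c"): {"follow": 1},
    ("d", "b"): {"follow": 1},
    ("d", "c"): {"follow": 1},
}
plain = rank(edges, weighted=False)
behaviour = rank(edges, weighted=True)
```

In this toy graph, plain PageRank cannot separate "b" from "c" because both have the same followers, while the behavior-weighted variant ranks "b" higher because of the extra forwards, comments, and likes it receives from "a".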
Keywords/Search Tags: Sina microblog, large V user, Hadoop platform, PageRank algorithm, WB-UR algorithm