
Excavation Of Public Figure From Public Opinion

Posted on: 2017-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhang
Full Text: PDF
GTID: 2348330533966501
Subject: Engineering
Abstract/Summary:
Microblog data contains a large amount of valuable information. By mining public opinion data to identify public figures, we can find the bloggers who have large influence and thereby support further data analysis. To handle data at this scale, the work runs on the Hadoop platform, with HBase as the database and MapReduce jobs applied step by step. The microblog data collected by the crawler is organized into three tables: a blogger information table, a blogger relationship table, and a microblog information table. Useful fields are extracted from these tables to build an overall influence model.

Many blogger influence models in the industry are based on the PageRank algorithm. In this thesis, after discussing and analyzing various microblog influence factors, the author models fans influence, forward influence, and user effectiveness, and then unifies these factors into one comprehensive model: the microblog blogger influence model. In the fans influence model, the improvement is to let a blogger distribute influence to the bloggers he or she follows in proportion to comments, which improves the degree of differentiation and reflects the differences among fans. In the forward influence model, the improvement is to let a blogger's influence increase as the forwarded multiplicity increases, so that bloggers whose microblogs have profound significance gain a higher rank. In the user effectiveness model, the improvement is to derive microblog text quality from the number of times a microblog is forwarded and the number of times it is commended, so that both depth and breadth are taken into account. Finally, the author integrates several indicators (forward rate, comment rate, verification status, and fans index) to establish the blogger influence model.

The author first applied these models to microblog data; the results showed the characteristics of each model and showed that the blogger influence model is more comprehensive. The models were then applied to a much larger set of microblog data drawn from many hot topics, and a comparison of the results showed that the blogger influence model is more accurate.
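
The fans influence idea described above, where a blogger passes influence to the bloggers he or she follows in proportion to comments, can be illustrated with a small comment-weighted PageRank. The abstract does not give the thesis's actual formulas, so this is only a minimal sketch under stated assumptions: the function name `fans_influence`, the input layout, the damping factor of 0.85, the uniform fallback when a follower has left no comments, and the uniform handling of bloggers who follow nobody are all hypothetical choices, and the code runs in memory rather than as MapReduce jobs over HBase.

```python
def fans_influence(follows, comments, damping=0.85, iterations=50):
    """Comment-weighted PageRank sketch (hypothetical, not the thesis's exact model).

    follows:  dict mapping a follower to the list of bloggers it follows.
    comments: dict mapping (follower, followee) to a comment count.
    """
    # Collect every blogger that appears as a follower or a followee.
    nodes = set(follows)
    for followees in follows.values():
        nodes.update(followees)
    nodes = sorted(nodes)
    n = len(nodes)
    score = {u: 1.0 / n for u in nodes}

    # Precompute the share of influence each follower passes to each followee.
    shares = {}
    for follower, followees in follows.items():
        if not followees:
            continue
        weights = [comments.get((follower, f), 0) for f in followees]
        total = sum(weights)
        if total == 0:
            # Assumption: with no comments at all, fall back to an even split.
            weights = [1] * len(followees)
            total = len(followees)
        shares[follower] = [(f, w / total) for f, w in zip(followees, weights)]

    for _ in range(iterations):
        # Assumption: bloggers who follow nobody spread their score evenly.
        dangling = sum(score[u] for u in nodes if u not in shares)
        base = (1.0 - damping) / n + damping * dangling / n
        new_score = {u: base for u in nodes}
        for follower, out in shares.items():
            for followee, share in out:
                new_score[followee] += damping * score[follower] * share
        score = new_score
    return score


# Toy data: "a" follows "b" and "c" but comments far more on "c".
example_follows = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
example_comments = {("a", "b"): 1, ("a", "c"): 9}
for blogger, value in sorted(fans_influence(example_follows, example_comments).items()):
    print(blogger, round(value, 4))
```

In this toy graph, weighting by comments shifts influence from "b" toward "c" compared with an even split, which is the kind of differentiation among fans the abstract describes.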
Keywords/Search Tags: influence model, PageRank algorithm, Hadoop platform, HBase database