
Research And Realization Of The Maximization Of The Micro-Blog Network Influence Based On Hadoop

Posted on: 2017-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Jing
Full Text: PDF
GTID: 2348330488986679
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, social media platforms such as Facebook, Twitter, and microblogging services have become an indispensable part of people's lives. Among them, Sina Weibo has become an important channel for communication and information dissemination. In recent years, various lines of research on Sina Weibo have emerged, and influence maximization has become one of the most active topics. The influence maximization problem is to find the K nodes in a network whose combined influence range is the largest. It has great potential value in areas such as public opinion monitoring and commercial advertising.

Research on influence maximization algorithms for social networks is relatively mature, and the traditional algorithms apply generally to many kinds of social networks. Their drawback is that they are not tailored to a specific social network such as the micro-blog network, which results in lower accuracy and higher time complexity. To address these problems, this thesis proposes HBM, a Hadoop-based algorithm for influence maximization in the micro-blog network. The algorithm takes the characteristics of the micro-blog network into account and redefines the influence between users and each user's activation threshold in order to calculate the potential influence (PI) value of each micro-blog user. In the heuristic phase, the user with the maximum PI value is selected as a seed node to activate other users, and the PI values of the users affected during each activation process are then updated. In the greedy phase, the user with the largest increment in influence range is selected as the seed node each time. The algorithm is designed on the Hadoop distributed computing framework, so it can use the data processing capacity of that platform to handle the large volume of data in the microblogging network.

Finally, a system that uses this algorithm to compute influence maximization on microblogs was designed and implemented, putting the idea into practice. A series of experiments compared the algorithm with the traditional greedy algorithm in order to verify its advantages under Hadoop distributed computing; the original data used in the experiments are real data of Sina Weibo users. The experiments found that, under certain parameter settings, the algorithm achieved its best results, performed much better than the greedy algorithm, and required less computation time. The proposed algorithm therefore has advantages over the traditional greedy algorithm in both influence range and time complexity in the micro-blog network.
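
The two-phase seed selection described in the abstract (a first phase driven by PI values, followed by a greedy phase driven by influence-range increments) can be illustrated with a minimal single-machine sketch. This is not the thesis's HBM implementation: the abstract does not give the PI formula, the influence weights, the threshold definition, or the Hadoop job layout, so everything below is an assumption made for illustration. In particular, `potential_influence` is a hypothetical stand-in (the sum of a user's influence weights on users who are still inactive), `spread` uses a simple deterministic threshold diffusion, and `heuristic_count` is an invented parameter that decides how many seeds come from the first phase.

```python
from typing import Dict, List, Set, Tuple

# graph[u][v] = assumed influence weight of user u on user v
Graph = Dict[str, Dict[str, float]]


def spread(graph: Graph, thresholds: Dict[str, float], seeds: Set[str]) -> Set[str]:
    """Deterministic threshold diffusion (an assumed model): a user becomes
    active once the summed influence of its active in-neighbours reaches
    its activation threshold. Users without a threshold default to 1.0."""
    active = set(seeds)
    pressure: Dict[str, float] = {}
    frontier = list(seeds)
    while frontier:
        u = frontier.pop()  # each active user is processed exactly once
        for v, w in graph.get(u, {}).items():
            if v in active:
                continue
            pressure[v] = pressure.get(v, 0.0) + w
            if pressure[v] >= thresholds.get(v, 1.0):
                active.add(v)
                frontier.append(v)
    return active


def potential_influence(graph: Graph, u: str, active: Set[str]) -> float:
    """Hypothetical PI value: total influence of u on still-inactive users.
    Because it depends on `active`, it is effectively updated after every
    activation round."""
    return sum(w for v, w in graph.get(u, {}).items() if v not in active)


def select_seeds(
    graph: Graph, thresholds: Dict[str, float], k: int, heuristic_count: int
) -> Tuple[List[str], Set[str]]:
    """Pick up to k seeds: the first `heuristic_count` by maximum PI value,
    the rest by the largest increase in the activated set."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    seeds: List[str] = []
    active: Set[str] = set()

    # Heuristic phase: choose the inactive user with the highest PI value.
    for _ in range(min(heuristic_count, k)):
        candidates = sorted(nodes - active)  # sorted for deterministic ties
        if not candidates:
            break
        best = max(candidates, key=lambda u: potential_influence(graph, u, active))
        seeds.append(best)
        active = spread(graph, thresholds, set(seeds))

    # Greedy phase: choose the user whose addition activates the most users.
    while len(seeds) < k:
        candidates = sorted(nodes - active)
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda u: len(spread(graph, thresholds, set(seeds) | {u})),
        )
        seeds.append(best)
        active = spread(graph, thresholds, set(seeds))

    return seeds, active
```

Calling `select_seeds(graph, thresholds, k, heuristic_count)` returns the chosen seed list together with the set of users activated by it. The greedy phase re-runs the diffusion once per candidate, which is the expensive step on a large network; the thesis addresses data volume by building on Hadoop, whereas this sketch runs in a single process.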
Keywords/Search Tags:Hadoop, influence maximization, micro-blog network, heuristic algorithm, greedy algorithm