
Research On Influence Maximization Algorithm Based On Term Frequency And Node Similarity

Posted on: 2017-08-31; Degree: Master; Type: Thesis
Country: China; Candidate: Q Z Hu; Full Text: PDF
GTID: 2348330512456395; Subject: Computer software and theory
Abstract/Summary:
With the rapid development of network technologies, relationships between people are becoming more diverse, and a variety of social networks have formed. The relationships between nodes in these networks are increasingly complex and carry a great deal of valuable information. Influence in social networks has many real-world applications and a significant effect on people's lives and work. For example, one may use influential nodes to publicize merchandise, or identify the most influential nodes to conduct public opinion analysis. Studying influence in social networks therefore has both theoretical and practical significance.

The influence maximization problem is to compute the influence ability of the nodes in a network and then find the top K nodes by influence ability, called the seed set. Many influence maximization algorithms have been proposed, such as the greedy algorithm, PageRank and IRIE. Because the greedy algorithm is computationally too expensive, it is not suited to large networks, so many approximation algorithms have been proposed as improvements. However, because these algorithms are based on node degree and edge density, they cannot compute influence ability accurately, which limits the influence of the selected seed nodes. To solve this problem, this thesis proposes an influence maximization algorithm based on term frequency and node similarity (IMFS), which computes influence ability more accurately so that the seed set can influence more nodes in the social network.

To compute the influence ability of nodes more accurately, this thesis combines node similarity and term frequency, since the scope of information propagation depends not only on the similarity between nodes but also heavily on the content of the information. We use term frequency to distinguish different pieces of information and node similarity to measure how similar nodes are. The thesis also proposes three propagation modes and computes influence ability by combining them, taking into account propagation to non-neighbor nodes as well as to neighbor nodes. In addition, it proposes a node preprocessing algorithm based on node attributes to eliminate useless nodes and reduce the impact of unnecessary nodes.

Finally, we implement the algorithms, test them on three real datasets under the linear threshold model and the independent cascade model, and compare them with conventional influence maximization algorithms. The experimental results show that IMFS outperforms the other heuristic algorithms, with its seed set achieving a larger influence scope. The running speed of the algorithm also improves, indicating that the proposed algorithm is an efficient solution to the influence maximization problem that can be applied to large networks.
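
For orientation, the conventional greedy baseline mentioned in the abstract can be sketched as follows. This is not the IMFS algorithm, whose details the abstract does not give. It is a minimal illustration of greedy seed selection under the independent cascade model. The adjacency-dict graph format, the uniform propagation probability p, the number of Monte Carlo runs, and all function names are assumptions made for this example.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """Run one independent cascade and return the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # A newly active node gets one chance to activate each
                # still-inactive out-neighbor, with probability p (assumed uniform).
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def estimate_spread(graph, seeds, p=0.1, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, p, rng)) for _ in range(runs))
    return total / runs

def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick up to k seeds, each time adding the node with the best estimated spread."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        candidates = sorted(nodes - set(chosen))
        best = max(candidates,
                   key=lambda n: estimate_spread(graph, chosen + [n], p, runs, seed))
        chosen.append(best)
    return chosen

# Hypothetical toy network: directed edges stored as adjacency lists.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": [], "e": ["f"], "f": []}
seed_set = greedy_seeds(toy_graph, k=2, p=1.0, runs=5)
```

Each greedy step re-estimates the spread for every remaining candidate, so the cost grows with the number of nodes, the seed set size, and the number of simulation runs. This is why the abstract describes the greedy approach as unsuited to large networks.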
Keywords/Search Tags: Influence Maximization, Node Similarity, Term Frequency, EM Algorithm