Font Size: a A A

Research On Efficient Processing Techniques Of Influence Maximization In Large-scale Social Networks

Posted on:2014-09-22Degree:DoctorType:Dissertation
Country:ChinaCandidate:X D LiuFull Text:PDF
GTID:1228330422974001Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In recent years, with the in-depth research of Internet and Web2.0techniques, socialnetwork, serving as an important medium for communication, knowledge sharing andinformation spreading, has been widely used for bridging the human world. Influencemaximization, as one of the key issues in the filed of social network analysis, has beenextensivelyappliedtomanycrucialscenarios,suchasviralmarketing,advertisementpub-lishing, public sentiment warning, water quality and epidemic monitoring, which showssubstantial research and application importance.Currently, many researchers in academia and industry work on the influence max-imization problem, and propose lots of greedy algorithms and heuristics, which fairlyimprove the efficiency of influence maximization algorithms. Nevertheless, modern so-cial networks are mostly large-scale, highly complicated and essentially dynamic, whichpose serious challenges to the high efficient processing of influential user identification.Most existing algorithms suffer the following problems:First, existing algorithms only focus on designing algorithms with low computationcomplexity, while ignoring the parallelism of influence spread computation. They al-so take no advantage of existing heterogeneous parallel computing frameworks, such asCPU+GPU, for acceleration. In fact, the influence spread computation of many nodesin social network can be performed in parallel to overlap the running time and thus theefficiency can be dramatically improved. Second, they take no consideration of the nodedistribution characteristics in social network. Existing works mainly focus on comput-ing the exact influence spread, leading to low computational efficiency and limiting theirapplication to real-world social networks. Third, previous studies overlook the dynamiccharacteristicsexhibitedduringtheevolutionofreal-worldsocialnetworks. Mostofthemareproposedtodealwithstaticsocialnetwork. While,asamatteroffact,real-worldsocialnetworks keep evolving over time. When confronting dynamic social networks, existingworks will suffer from computing from scratch.To well address the above challenges, this thesis systematically investigates somekeyissuesofinfluencemaximizationinlarge-scalesocialnetworks,especiallyonthehighefficient processing techniques. The research mainly focuses on the following aspects. For parallel computation of influence spread, this thesis in-depth investigates thedependency relationship among nodes in social networks. To improve the parallelismand reduce the complexity of existing algorithms, we propose a bottom-up traversal al-gorithm with inherent parallelism. On the one hand, BUTA can greatly reduce the timecomplexity through DAG conversion and bottom-up traversal. On the other hand, BUTAisdesignedwithsufficientparallelismandcanbemappedtomodernheterogeneousparal-lel computing frameworks. For this reason, we map BUTA to GPU to exploit the parallelprocessing capability of GPU, thus further reducing the execution time. To best fit BUTAwith the GPU architecture, we further develop an adaptive K-Level combination methodto maximize the parallelism and reorganize the influence graph to minimize the potentialdivergence.Inthisthesis, wealsoexploitMonte-Carloestimationtosignificantlyimprovetheef-ficiencyatonlynegligiblecostofprecision. Wefirstanalyzethenodedistributioncharac-teristics in social network. To address the key bottleneck of influence spread computationfor nodes with large degree, we design a power-law exponent supervised Monte-Carlo es-timation method, named ESMCE. ESMCE exploits the power-law exponent of the socialnetwork to guide the initial sampling. Then, based on the disparity of estimation errorand precision requirement, ESMCE utilizes the grey forecasting method to forecast thenumber of child nodes needed in further iteration. Multiple iterative steps run until theprecision requirement is finally achieved.To deal with the influence maximization problem in dynamic social networks, weinvestigate the dynamic characteristics of social networks and observe from real-worldtraces that the evolution of social network follows the preferential attachment rule and theinfluential nodes are mainly selected from high-degree nodes. Such observations shedlight on the design of IncInf, an incremental approach that can efficiently locate the top-Kinfluential individuals based on previous information instead of calculation from scratch.In particular, IncInf quantitatively analyzes the influence spread changes of nodes by lo-calizing the impact of topology evolution to only local regions, and a pruning strategy isproposed to effectively narrow the search space into nodes experiencing major increasesor with high degrees.None of previous content distribution methods comprehensively exploits the valu-able social information of social network, such as social relationship and user geograph-ic information. In this thesis, we propose SCORE, a social-aware content distribution method based on influence maximization problem. SCORE fast locates the contents thatare potential to trigger large cascades. Then SCORE leverages the social relationship andgeographicinformation, andselectstargetserversbyK-MEANSclusteringalgorithmandweighted spherical mean calculation. Through this, SCORE effectively pushes selectedcontent to geo-located servers before potential users actually request the content so as toreduce user latency, improve quality of experience, and alleviate network traffic pressure.To summarize, our works present solutions to several essential issues of influencemaximization which are key requirements in social networks. Comprehensive experi-ments demonstrate that proposed algorithms can properly achieve their design goals.
Keywords/Search Tags:Large-scale Social Network, Influence Maximization, CPU+GPUHeterogeneous Parallel Computing, Monte-Carlo Sampling Estimation, DynamicSocial Network, Content Distribution Method
PDF Full Text Request
Related items