
The Edge Weight Computation With MapReduce For Extracting Weighted Graphs

Posted on: 2018-03-28
Degree: Master
Type: Thesis
Country: China
Candidate: J P Wang
Full Text: PDF
GTID: 2348330536956338
Subject: Software engineering
Abstract/Summary:
Data has penetrated many industries with the rise of Internet technology and the development of big data. On the one hand, the amount of data generated by applications and social networks keeps increasing; on the other hand, the data is becoming more diverse and complex. As data volumes and data types grow, data mining and analysis can uncover more hidden patterns and more business information, and thereby create added value. Many data mining methods exist today, for example correlation analysis, collaborative filtering, cluster analysis, regression analysis, and deviation analysis. However, many data mining methods are based on a graph structure: they build a graph and then apply graph algorithms to it, so that the associations in the data are fully expressed and valuable information can be obtained. For data mining based on weighted graph theory, the weighted graph structure of massive data is very important. It is therefore necessary to extract weighted graphs from large-scale data automatically.

The construction of a weighted graph includes: (1) vertex determination; (2) feature extraction; (3) edge weight computation. Of these, edge weight computation is both computationally intensive and I/O intensive. On massive data, edge weight computation on a single machine is time-consuming, or even fails to complete when necessary resources such as memory are exhausted. To overcome the resource limits of a single machine, this thesis studies weighted graph construction schemes and proposes the classification, implementation, and evaluation of edge weight computation algorithms with the MapReduce parallel programming model.

First, this thesis proposes parallel weighting algorithms based on the popular distributed parallel programming model MapReduce and describes how to implement them in the MapReduce framework, so as to realize the automatic construction of weighted graphs. Second, the accuracy of the weights in a weighted graph affects the results of data mining. Building on existing work on measuring edge weight accuracy, we propose comprehensive assessments of edge weight accuracy in terms of the number of edges, strength distribution, community structure, Hop-plot, and effective diameter. Finally, a performance study on a social network data set evaluates these algorithms in terms of execution time, memory consumption, and disk usage. The experimental results show that the most efficient parallel and distributed edge weight computation algorithm can be identified for automatically constructing a weighted graph from a given massive data set.
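
The abstract does not spell out the thesis's algorithms, so the following is only a minimal sketch of how edge weight computation could be expressed as map and reduce steps. It assumes each vertex carries a set of features and uses Jaccard similarity as the edge weight; both are illustrative assumptions. The function names (`map_features`, `reduce_feature`, `shuffle`, `edge_weights`) are hypothetical, and the shuffle is simulated in memory rather than run on a MapReduce cluster.

```python
from collections import defaultdict
from itertools import combinations


def map_features(vertex, features):
    """Map step: emit one (feature, vertex) pair per feature of the vertex."""
    for feature in features:
        yield feature, vertex


def reduce_feature(feature, vertices):
    """Reduce step: each pair of vertices sharing this feature gets a count of 1."""
    for u, v in combinations(sorted(set(vertices)), 2):
        yield (u, v), 1


def shuffle(pairs):
    """Group values by key; stands in for the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def edge_weights(vertex_features):
    """Return {(u, v): Jaccard similarity} for vertices sharing a feature."""
    # Job 1: build an inverted index (feature -> vertices), then emit
    # every co-occurring vertex pair once per shared feature.
    mapped = (
        kv
        for vertex, features in vertex_features.items()
        for kv in map_features(vertex, set(features))
    )
    pair_counts = (
        kv
        for feature, vertices in shuffle(mapped).items()
        for kv in reduce_feature(feature, vertices)
    )
    # Job 2: sum the counts per pair (size of the intersection) and
    # divide by the size of the union (assumed Jaccard weighting).
    weights = {}
    for (u, v), ones in shuffle(pair_counts).items():
        shared = sum(ones)
        union = len(set(vertex_features[u])) + len(set(vertex_features[v])) - shared
        weights[(u, v)] = shared / union
    return weights


if __name__ == "__main__":
    sample = {"a": {"x", "y", "z"}, "b": {"y", "z"}, "c": {"z", "w"}}
    for edge, weight in sorted(edge_weights(sample).items()):
        print(edge, round(weight, 3))
```

Grouping by feature first means only vertex pairs that share at least one feature are ever compared, which avoids enumerating all vertex pairs. The trade-off is that a very common feature produces a large reduce group, which is one reason such a job can become I/O intensive.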
Keywords/Search Tags:Extract weighted graph, Edge weight computation, Similarity measurement, MapReduce, Data analysis