
Research On Breadth-first And Depth-first Sampling Strategy For Social Network Data

Posted on: 2019-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Guo
GTID: 2417330545963019
Subject: Statistics
Abstract/Summary:
The rapid development of information technology has laid the foundation for data of every kind, and massive data now touches all aspects of life. In terms of data generation, social networks are an important source of big data today. Social networks have also long been a global mode of communication, with monthly active users ranging from several hundred million to more than a billion. The data they generate shows the "4V" characteristics of big data: large volume (Volume), many types (Variety), low value density (Value), and high speed (Velocity). In addition, social network data has complex-network characteristics such as the small-world property, the scale-free property, and community structure. How to analyze such data effectively, whether traditional sampling methods still apply, and whether they can yield a sample network good enough for accurate statistical inference about the original network are questions that urgently need answers.

Against this background, this thesis draws sample networks from social network data by breadth-first sampling and depth-first sampling, and compares how well the two strategies estimate the original network. On the one hand, three network models are constructed for simulation experiments; on the other hand, real data from the Douban social network are analyzed empirically. The two strategies are compared on topological indicators such as degree distribution, average degree, and clustering coefficient. Combining the simulation experiments and the empirical analysis, the following conclusions are drawn:

1. Both breadth-first sampling and depth-first sampling acquire samples well. The samples obtained are biased, but they show a certain degree of asymptotic behavior.
2. Breadth-first sampling is fast but uses a large amount of memory. Depth-first sampling is slower but uses less memory.
3. The two strategies perform differently on different network types. Depth-first sampling suits random networks, while breadth-first sampling is generally better suited to WS small-world networks and BA scale-free networks. On the real social network data, breadth-first sampling generally has the advantage at low sampling rates, while depth-first sampling is more worth considering at higher sampling rates.
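The comparison described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the graph generator, the function names (`random_graph`, `traversal_sample`, `induced_subgraph`, `average_degree`, `average_clustering`), the graph size, and the sample size are all assumptions chosen for illustration. The sketch draws a breadth-first and a depth-first sample from a small random graph and compares the average degree and clustering coefficient of each induced sample network with those of the original.

```python
import random
from collections import deque


def random_graph(n, p, seed=0):
    """Random graph (each edge present with probability p) as an adjacency dict.
    Illustrative stand-in for the network models mentioned in the abstract."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def traversal_sample(adj, start, size, breadth_first=True):
    """Collect up to `size` nodes reachable from `start`.
    A FIFO queue gives breadth-first order, a LIFO stack gives depth-first order.
    Fewer than `size` nodes are returned if the component is exhausted."""
    frontier = deque([start])
    visited = set()
    order = []
    while frontier and len(order) < size:
        node = frontier.popleft() if breadth_first else frontier.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        for neighbor in sorted(adj[node]):  # sorted for reproducible order
            if neighbor not in visited:
                frontier.append(neighbor)
    return order


def induced_subgraph(adj, nodes):
    """Sample network: sampled nodes plus the original edges among them."""
    keep = set(nodes)
    return {v: adj[v] & keep for v in keep}


def average_degree(adj):
    return sum(len(nbs) for nbs in adj.values()) / len(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbs in adj.values():
        k = len(nbs)
        if k < 2:
            continue
        # Each connected neighbor pair is counted twice, hence the division by 2.
        links = sum(len(adj[u] & nbs) for u in nbs) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical parameters: 200 nodes, edge probability 0.05, 20% sampling rate.
graph = random_graph(200, 0.05, seed=1)
for name, bfs in (("breadth-first", True), ("depth-first", False)):
    nodes = traversal_sample(graph, start=0, size=40, breadth_first=bfs)
    sample = induced_subgraph(graph, nodes)
    print(name, round(average_degree(sample), 3), round(average_clustering(sample), 3))
print("original", round(average_degree(graph), 3), round(average_clustering(graph), 3))
```

Repeating the run over several sampling rates and start nodes, and over other network models, would be the natural way to reproduce the kind of comparison the abstract reports. A single run like this one only shows the mechanics.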
Keywords/Search Tags: Big Data, Social Network, Breadth-first Sampling, Depth-first Sampling