
The Analysis And Research Of Large-Scale Telecom Data Based On Complex Network

Posted on: 2011-10-16  Degree: Master  Type: Thesis
Country: China  Candidate: S Q Yang  Full Text: PDF
GTID: 2178360308462342  Subject: Computer Science and Technology
Abstract/Summary:
Recent studies in network science, especially in complex networks and graph mining, have spurred significant interest in human behavior. Many analytical methods have been applied in fields such as physics, biology, politics, economics, the World Wide Web, engineering and social life. Over the past decade, by abstracting data into network (graph) structures and employing methods from data mining, machine learning, pattern classification, information retrieval and statistical inference, researchers have revealed the patterns underlying complex data and thus provided unprecedented insight into the individuals under study. At the same time, the continued exponential growth in both the volume and the complexity of information poses a new challenge for analysts, researchers and intelligence providers. In response, a new class of techniques and computing platforms focused on scalability and parallelism, such as the MapReduce model, has been emerging.

In this thesis, we first present a comprehensive multidimensional study of massive telecom data. Whereas previous analyses were mainly topological, our work reports on multiple aspects of the dataset, such as age, gender, call times, duration and base station. We aim to describe the hidden behavior patterns in people's daily interactions.

In addition, temporal patterns in social structure are an important research area. Uncovering these evolving patterns poses a series of challenging problems in network evolution. We present a fundamentally different framework for uncovering the intricate properties of evolving networks. The framework first traces the timelines of networks. Then, based on smooth segments extracted from the timeline, a graph approximation algorithm is applied to capture the frequent characteristics of the network and reduce the noise in interactions. By exploiting the relationships among multiple attributes, a novel community detection algorithm is proposed for detailed analysis of the approximate graphs. To track these dynamic communities, we also introduce a community correlation and evaluation method.

Moreover, to process terabytes or even petabytes of data, we propose a unified distributed method for solving several critical graph mining problems: graph transformation, sub-graph partition, maximal clique enumeration, connected component finding and community detection. To move the scientific prototype toward practice in response to recent advances in cloud computing, we also describe a prototype of our applied distributed system, DisTec, for knowledge discovery and data mining in telecommunications. DisTec is implemented in layers on top of a cluster environment and applied to a real-world large-scale telecom dataset. Experiments demonstrate that the system performs well on such cloud-scale data.
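
As an illustration of how one of the listed graph mining problems, connected component finding, can be phrased in a MapReduce style, the following is a minimal single-machine sketch. It uses the generic minimum-label propagation scheme and is not taken from the thesis or from DisTec; the function names `map_phase`, `reduce_phase` and `connected_components` are hypothetical, and a real deployment would run the two phases as distributed jobs rather than as Python functions.

```python
from collections import defaultdict


def map_phase(edges, labels):
    """Map step: for every edge, each endpoint sends its current label to the other."""
    for u, v in edges:
        yield u, labels[v]
        yield v, labels[u]


def reduce_phase(pairs, labels):
    """Reduce step: each node keeps the smallest label among its own and those received."""
    grouped = defaultdict(list)
    for node, label in pairs:
        grouped[node].append(label)
    new_labels = dict(labels)
    for node, candidates in grouped.items():
        new_labels[node] = min(labels[node], min(candidates))
    return new_labels


def connected_components(nodes, edges):
    """Repeat map/reduce rounds until no label changes (assumed illustrative driver)."""
    labels = {n: n for n in nodes}  # every node starts in its own component
    while True:
        new_labels = reduce_phase(map_phase(edges, labels), labels)
        if new_labels == labels:
            return labels
        labels = new_labels


if __name__ == "__main__":
    # Toy call graph: subscribers 1-2-3 are linked, 4-5 are linked, 6 is isolated.
    print(connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)]))
```

Nodes that end with the same label belong to the same component. The number of rounds grows with the diameter of the largest component, which is one reason distributed systems often use more elaborate schemes for very large graphs.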
Keywords/Search Tags: Graph Mining, Multidimensional Analysis, Evolution Analysis, Community Detection, Cloud Computing