The Mining And Analysis Of The Aberration Of Mass Telecommunications Data

Posted on: 2014-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: F D Liao
GTID: 2248330398470743
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of scientific research, telecommunications and information technology, telecom service traffic has grown significantly. Fierce competition among telecom service providers has in turn led them to pay closer attention to network and service quality in order to stay competitive. An important part of ensuring service quality is extracting anomalous but potentially useful information from large volumes of data, which is the main task of aberrant data mining.

This thesis studies the relevant data mining and parallel computing techniques, with the aim of improving network and communication quality by mining aberrant information from large volumes of telecom traffic. Based on the characteristics of abnormal users, it proposes an aberration analysis algorithm that combines outlier detection with the clustering coefficient. The outlier detection algorithm is a capacity-based variant of LOF (Local Outlier Factor) in which the k-nearest-neighbour search is replaced by the SimHash algorithm. In addition, drawing on the characteristics of ping-pong switching, the thesis proposes predicting ping-pong switching with a multi-label classification algorithm. Building on multi-label classification over a random-walk graph, this algorithm combines the law of total probability with stochastic processes. To let the improved algorithms cope with large volumes of traffic, all of them are implemented on the MapReduce framework. The design also reduces time complexity by trading space for time, and thereby achieves parallel computation.
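
The idea of replacing an exact k-nearest-neighbour search with SimHash can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the feature strings, the 64-bit fingerprint width, the MD5-based token hash, the banding scheme and the function names `simhash`, `hamming` and `candidate_neighbours` are all assumptions made for illustration. Records whose fingerprints share at least one band are treated as candidate neighbours, so distances only need to be computed within small buckets rather than across all pairs.

```python
import hashlib
from collections import defaultdict

BITS = 64  # fingerprint width (assumed; matches the 8 hash bytes used below)


def simhash(features):
    """Return a 64-bit SimHash fingerprint for an iterable of string features."""
    counts = [0] * BITS
    for feat in features:
        # Hash each feature to 64 bits (MD5 is used only as a stable hash here).
        digest = hashlib.md5(feat.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def candidate_neighbours(fingerprints, bands=4):
    """Map each record id to the ids sharing at least one fingerprint band.

    `fingerprints` is a dict {record_id: fingerprint}. Splitting the
    fingerprint into bands is one common way to bucket similar records;
    it is an assumption of this sketch, not something the abstract states.
    """
    width = BITS // bands
    mask = (1 << width) - 1
    buckets = defaultdict(set)
    for rid, fp in fingerprints.items():
        for b in range(bands):
            buckets[(b, (fp >> (b * width)) & mask)].add(rid)
    candidates = defaultdict(set)
    for ids in buckets.values():
        for rid in ids:
            candidates[rid] |= ids - {rid}
    return candidates
```

In a LOF-style pipeline, the candidates returned for each record could then be ranked by `hamming` distance to pick approximate neighbours. Because each record's bucket keys can be computed independently, the bucketing step maps naturally onto a MapReduce job, although how the thesis actually partitions the work is not stated in the abstract.
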
Extensive experimental results show that the aberration analysis algorithm and the parallel ping-pong switching prediction algorithm are both relatively accurate and effective.

Finally, the thesis proposes a prototype design for an aberration analysis system that combines Hive and MapReduce to perform data preprocessing, ETL and statistics according to the logic of each service. Running the different data mining algorithms in parallel produces meaningful analysis results, which can be presented in the user interface. The thesis applies several machine learning algorithms to a concrete setting, overcoming the shortcomings of manual detection, namely low efficiency and susceptibility to subjective factors. Extensive experiments show that this aberration analysis system has an advantage over the traditional one.
Keywords/Search Tags: outlier mining, parallel computing, multi-label classification, ETL