Massive Network Traffic Data Analysis And Key Algorithms Based On Cloud Computing

Posted on: 2015-06-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M J Guo
Full Text: PDF
GTID: 1228330467963636
Subject: Signal and Information Processing
Abstract/Summary:
With the development of ADSL network technologies, upgrades to mobile networks, and the spread of smartphones, the number of network users has grown enormously, and the Internet has become an indispensable part of daily life. Networks are large and complex, so a thorough understanding of either the emerging mobile network or the constantly updated ADSL network is still a long way off. Network traffic monitoring is the key to network traffic analysis. This paper combines traffic information obtained through network traffic monitoring with cloud computing and data mining technologies to analyze and mine the characteristics of traffic and users in depth. It builds network models that serve as a reference for network design and optimization, and it clusters users' online and preference behaviors to support better recommendations. The major contributions of the paper are as follows:
(1) Cloud computing technology based on Hadoop is introduced into massive network traffic analysis. Key data mining algorithms, the Hadoop cloud computing platform, and massive network traffic analysis are combined to construct a massive network traffic analysis system on the Hadoop platform. The system handles distributed storage and data mining of massive network traffic data, and experiments show that it performs well in both efficiency and accuracy. The key algorithms play the following roles: a classification algorithm classifies the massive traffic data; a clustering algorithm obtains user behavior preferences; and a recommendation algorithm makes recommendations based on those preferences. The system contains two subsystems: a mobile network website classification system and a mobile network website browsing recommendation system.
In the website classification system, MapReduce-based data mining classification algorithms classify websites from massive mobile network traffic and reveal the website preferences of mobile network users. In the recommendation system, MapReduce-based recommendation algorithms recommend specific websites to users according to their preferences. The ISAKMMR clustering algorithm is a distributed MapReduce clustering algorithm that can process massive user service traffic data and obtain user behavior preferences. The classification, clustering, and recommendation algorithms complement one another: together they classify and cluster massive mobile network traffic and make recommendations based on user preference. This paper also gives a brief introduction to the cloud computing platform. In addition, system performance tests using real traffic data collected from live domestic networks show that the system is highly efficient.
(2) Mobile network website classification system. The system architecture is discussed in detail. The classification models and key classification algorithms are compared through experiments. Based on URL classification results for live mobile network traffic data, the traffic characteristics of various websites are compared along the time dimension, which reflects users' website preferences. This paper collects mobile network traffic data from two provinces over a period of three years using traffic monitoring equipment (Traffic Monitor System, TMS) of 10 Gigabytes deployed on live network backbones. For different classification requirements, three classification models (All Model, 1&Other Model, and 1&1 Model) are compared.
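The abstract does not define the three classification models. One plausible reading, offered here only as an assumption, is that the All Model is a single multi-class task, the 1&Other Model is one-versus-rest, and the 1&1 Model is one-versus-one. Under that assumption, the training sets each model would need can be sketched as follows. The function names, the "other" label, and the sample URLs are all hypothetical.

```python
from itertools import combinations


def all_model(samples):
    """One multi-class task over every category (assumed reading of "All Model")."""
    return {"all": list(samples)}


def one_and_other_model(samples):
    """One binary task per category: that category versus everything else."""
    categories = sorted({label for _, label in samples})
    return {
        c: [(url, label if label == c else "other") for url, label in samples]
        for c in categories
    }


def one_and_one_model(samples):
    """One binary task per pair of categories, keeping only samples from the pair."""
    categories = sorted({label for _, label in samples})
    return {
        (a, b): [(url, label) for url, label in samples if label in (a, b)]
        for a, b in combinations(categories, 2)
    }


# Hypothetical labeled URLs, for illustration only.
samples = [
    ("news.example/a", "news"),
    ("video.example/b", "video"),
    ("shop.example/c", "shopping"),
    ("news.example/d", "news"),
]
```

With k categories this yields 1, k, and k(k-1)/2 training tasks respectively, which is one reason such models would be compared against different classification requirements.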
Based on URL classification results for live mobile network traffic data, the classification accuracy of the system's key algorithms (the Naive Bayes algorithm and the LDA algorithm, both based on MapReduce) is analyzed, and the application scenarios of the two algorithms are explored. The classification results verify that the system achieves high efficiency and classification accuracy, and they are used to derive traffic characteristic models of various websites along the time dimension as well as App website traffic characteristics.
(3) Based on ADSL network traffic from a certain province in China, we analyze the online/offline behavior and service usage behavior of ADSL users. A model of ADSL users' online/offline behavior is built using the non-homogeneous Poisson process (NHPP) model, and its derivation is shown. The MapReduce-based ISAKMMR clustering algorithm is proposed for massive ADSL service data and yields a user service preference model. Through experiments and derivation, we find that the online/offline behavior of ADSL users follows a non-homogeneous Poisson process. According to the definition and properties of the NHPP, we obtain the probabilities of user login and logout and build an online/offline state transition model for ADSL users. Experimental results show that this model can predict users' online/offline behavior. By introducing simulated annealing, MapReduce clustering features, and sparse vectors, we design the MapReduce-based ISAKMMR algorithm, which can cluster massive, high-dimensional, sparse traffic data quickly and accurately. The algorithm is used to cluster ADSL user service behavior data from the live network and obtain a user service preference behavior model. Experimental results demonstrate the algorithm's validity and high efficiency.
(4) To suit distributed computation over massive network traffic, this paper improves association rule and collaborative filtering algorithms. A mobile network website browsing recommendation system based on Hadoop is developed.
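To make the NHPP model in contribution (3) concrete: for a non-homogeneous Poisson process with intensity λ(t), the number of events in an interval (a, b] is Poisson distributed with mean equal to the integral of λ(t) over that interval. The sketch below illustrates this property and simulates event times by thinning. The abstract does not give the fitted intensity, so the daily-cycle rate `login_rate` and all function names are hypothetical, and this is not the dissertation's derivation.

```python
import math
import random


def login_rate(t_hours):
    """Hypothetical login intensity (logins per hour) with a 24-hour cycle."""
    return 2.0 + 1.5 * math.sin(2.0 * math.pi * t_hours / 24.0)


def mean_count(rate, start, end, steps=10_000):
    """Approximate the integral of rate over [start, end] with the midpoint rule."""
    width = (end - start) / steps
    return sum(rate(start + (i + 0.5) * width) for i in range(steps)) * width


def count_probability(rate, start, end, k):
    """P(exactly k events in (start, end]): Poisson with the integrated mean."""
    m = mean_count(rate, start, end)
    return math.exp(-m) * m ** k / math.factorial(k)


def simulate_nhpp(rate, horizon, rate_max, rng):
    """Draw event times up to horizon by thinning a homogeneous process.

    rate_max must be an upper bound on rate(t) over the whole horizon.
    """
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate_max)  # candidate from the rate_max process
        if t > horizon:
            return events
        if rng.random() < rate(t) / rate_max:  # keep with probability rate/rate_max
            events.append(t)


# One simulated day of logins; 3.5 bounds login_rate from above (2.0 + 1.5).
logins = simulate_nhpp(login_rate, 24.0, 3.5, random.Random(0))
```

A logout process could be modeled the same way with its own intensity; how the two are combined into a state transition model is specific to the dissertation and is not reproduced here.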
Using this recommendation system, websites are recommended by analyzing user website preference data. Experiments verify the system's high efficiency and the diversity of its application scenarios. This paper studies recommendation algorithms for mobile network website browsing and develops a recommendation system based on Hadoop. The three modules of the system architecture (the data pre-processing module, the recommendation algorithm module, and the system upper-layer module) are described in detail. The distributed implementations of the three key recommendation algorithms in the recommendation algorithm module (the Apriori algorithm, the MRUCF algorithm, and the MRICF algorithm, all based on MapReduce) are studied in depth. Through a series of experiments on mobile network user preference data collected from the live network, the recommendation scenarios suited to each of the three key algorithms are identified, and the efficiency and validity of the recommendation system are verified.
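The abstract does not describe the internals of MRUCF or MRICF. As a rough illustration of how collaborative filtering can be split into map and reduce steps, the sketch below uses a simple item co-occurrence scheme in plain Python. It assumes an item-based approach, is not the dissertation's algorithm, and uses hypothetical function names and browsing histories.

```python
from collections import defaultdict
from itertools import permutations


def map_user(visited_sites):
    """Map step: emit ((site_a, site_b), 1) for each ordered pair one user visited."""
    for a, b in permutations(sorted(set(visited_sites)), 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each (site_a, site_b) key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def recommend(user_sites, cooccurrence, top_n=3):
    """Rank unseen sites by how often they co-occur with the user's own sites."""
    seen = set(user_sites)
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in seen and b not in seen:
            scores[b] += count
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [site for site, _ in ranked[:top_n]]


# Hypothetical per-user browsing histories, for illustration only.
history = {
    "u1": ["news", "video", "shop"],
    "u2": ["news", "video"],
    "u3": ["video", "music"],
}
emitted = (pair for sites in history.values() for pair in map_user(sites))
cooccurrence = reduce_counts(emitted)
```

On a real Hadoop cluster the map and reduce functions would run as separate distributed tasks over partitioned data; here they run in one process only to show the data flow.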
Keywords/Search Tags: Mobile Network, Data Mining, Hadoop Cloud Computing, Network Traffic Monitoring