
Network Flow Classification Study Based On Model Clustering And Feature Selection Strategy

Posted on: 2015-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: P X Mao
GTID: 2298330422483760
Subject: Computer technology
Abstract/Summary:
In recent years, the Internet, as a leading form of information technology, has become one of the most dynamic and fastest-developing technologies in science and technology worldwide. As the Internet has grown, it has come to carry more and more emerging web applications. Traditional classification methods based on port identification and payload analysis can no longer meet the need to identify these applications. An effective method for traffic clustering and classification is therefore needed to identify all kinds of network traffic on the Internet quickly and accurately. Such a method has great practical significance for network management and planning, network fault detection, network quality of service, and network security research.

Current network traffic classification still has some problems, and I carried out this research against that background. The main work of this thesis is as follows:

For traffic classification based on unsupervised machine learning, this thesis proposes a clustering algorithm based on a quick solution of the Gaussian mixture model (GMM) to study the classification of network traffic, and it achieves a better clustering effect. The results show that it is better suited to traffic clustering than other algorithms. MATLAB simulations indicate that the method achieves high clustering precision and that, once the initial cluster centers have been chosen, the EM algorithm solves the GMM with more accurate estimates and converges faster.

For traffic classification based on supervised machine learning, this thesis puts forward a Two-Phase Filter flow feature selection algorithm based on CFS+PCA. First, I used the CFS algorithm to remove redundant and irrelevant attributes; I then combined it with principal component analysis (PCA) to reduce the dimensionality of the data set, yielding a better feature subset that was used for network traffic classification. Experiments show that the feature subset selected by the Two-Phase Filter algorithm reduces redundancy and dimensionality as far as possible while retaining enough flow information to maintain good classification performance.
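The two-phase idea (CFS filtering followed by PCA) can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the standard correlation-based feature selection merit, k*r_cf / sqrt(k + k(k-1)*r_ff), computed here with absolute Pearson correlations against a numeric class label, a greedy forward search, and a PCA step that only centers the data and extracts the leading component by power iteration. The toy flow features and all function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)

def cfs_merit(cols, subset, target):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    r_cf = sum(abs(pearson(cols[j], target)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(cols[a], cols[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def cfs_select(cols, target):
    """Phase 1: greedy forward search; stop when no feature raises the merit."""
    selected, best = [], 0.0
    while len(selected) < len(cols):
        scored = [(cfs_merit(cols, selected + [j], target), j)
                  for j in range(len(cols)) if j not in selected]
        merit, j = max(scored)
        if merit <= best:
            break
        selected.append(j)
        best = merit
    return selected

def leading_component(cols, iters=200):
    """Phase 2: leading PCA direction of the centered columns (power iteration)."""
    n, d = len(cols[0]), len(cols)
    centered = []
    for c in cols:
        mean = sum(c) / n
        centered.append([v - mean for v in c])
    # Sample covariance matrix of the selected features.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))  # assumes nonzero variance
        v = [x / norm for x in w]
    # Project every flow onto the leading direction.
    scores = [sum(v[i] * centered[i][r] for i in range(d)) for r in range(n)]
    return v, scores

# Hypothetical toy data: 8 flows, 4 features stored column-wise, 2 classes.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = [
    [2, 0, 2, 0, 4, 3, 4, 3],  # f0: informative
    [4, 1, 4, 0, 8, 6, 8, 5],  # f1: near-duplicate of f0 (redundant)
    [5, 1, 4, 2, 4, 2, 5, 1],  # f2: uncorrelated with the class (irrelevant)
    [0, 2, 0, 2, 2, 4, 2, 4],  # f3: informative, weakly correlated with f0
]
selected = cfs_select(features, target)
direction, scores = leading_component([features[j] for j in selected])
print(selected)  # [0, 3] for this toy data
```

On this toy data the first phase keeps the two complementary features and drops both the near-duplicate and the irrelevant one; the second phase then compresses the kept features into a single component. Practical CFS implementations usually work on discretized features with symmetrical uncertainty rather than Pearson correlation, so the correlation measure used here is a simplification.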
Keywords/Search Tags: GMM, feature selection, traffic clustering, traffic classification