Research On Partitioning Method For Min-Max Modular Support Vector Machine And Its Application

Posted on: 2013-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: X M Jie
Full Text: PDF
GTID: 2218330371957555
Subject: Computer application technology
Abstract/Summary:
With the progress of information technology, large-scale data sets are now produced in many real-world applications. A traditional single classifier cannot classify such massive data accurately, whereas ensemble learning offers an effective way to process it. This thesis focuses on an ensemble learning method based on emergence theory: the Min-Max Modular Support Vector Machine (M3-SVM). M3-SVM consists of two parts: partitioning the data, and integrating the outputs of the base classifiers with the MIN and MAX rules.
The partition strategy has a great influence on the efficiency of the M3 network, so it is important to find an effective, low-complexity partitioning method that yields relatively balanced subsets. Several methods have been proposed for the M3 network, such as random partitioning, hyperplane partitioning, and spectral clustering. Each of these has at least one of the following disadvantages: it ignores the distribution of the original data set, or it is too complicated and incurs a large time cost. This thesis therefore presents a new data partitioning method, bisecting K-means, which has low complexity and an effective mechanism for avoiding local optima, and validates it on real-world data sets. However, the criterion function of bisecting K-means considers only the compactness within a cluster and not the differences between clusters. To address this, bisecting K-means based on the equalization function (BEK) is proposed. BEK can generally reach a globally optimal solution with low time complexity and, more importantly, produces relatively balanced partitions, which are very important when M3-SVM deals with massive imbalanced data.
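The bisecting K-means partitioning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names are hypothetical, the cluster chosen for splitting is the one with the largest sum of squared errors (SSE), and each split keeps the best of several random 2-means trials, which is one common way to reduce the risk of a poor local optimum. The equalization function used by BEK is not defined in this abstract, so it is not modelled here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    c = mean(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points, rng, iters=20):
    """Split points into two groups with plain 2-means (Lloyd iterations)."""
    c0, c1 = rng.sample(points, 2)  # two initial centers drawn from the data
    left, right = [], []
    for _ in range(iters):
        new_left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        new_right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not new_left or not new_right:
            break  # degenerate step: keep the last valid split
        if new_left == left:
            break  # assignments unchanged: converged
        left, right = new_left, new_right
        c0, c1 = mean(left), mean(right)
    return left, right


def bisecting_kmeans(points, k, trials=5, seed=0):
    """Partition points into at most k clusters by repeated bisection."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        splittable = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not splittable:
            break
        # Assumption: split the cluster with the largest SSE.
        idx = max(splittable, key=lambda i: sse(clusters[i]))
        target = clusters.pop(idx)
        best = None
        for _ in range(trials):
            left, right = two_means(target, rng)
            if not left or not right:
                continue  # this trial produced no usable split
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, left, right)
        if best is None:
            clusters.append(target)  # could not split; stop early
            break
        clusters.extend([best[1], best[2]])
    return clusters
```

In an M3-SVM setting, a routine like this would presumably be applied to the training samples of each class separately, so that every resulting subset can be paired with subsets of the other class.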
Experimental results on real-world data sets show that this partitioning method effectively improves the classification performance of M3-SVM without increasing its time cost.
The data collected for intrusion detection are large and imbalanced. To further verify the classification performance of M3-SVM with BEK on imbalanced data sets, experiments are carried out on the real intrusion detection data set KDDCUP 99. The results show that the partitioning method introduced in this thesis improves the detection performance of M3-SVM for intrusion detection.
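The MIN and MAX integration step of the M3 network can be illustrated with a short sketch. It assumes the usual min-max modular formulation for a two-class problem: one base classifier is trained for every pair of a positive subset and a negative subset, the outputs of the classifiers sharing a positive subset are combined with MIN, and the MIN results are then combined with MAX. All names below are hypothetical, and a simple nearest-centroid scorer stands in for the SVM base classifier so that the example stays self-contained.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def centroid_scorer(pos, neg):
    """Stand-in base classifier (not an SVM): the score is positive when
    x lies closer to the positive centroid than to the negative one."""
    cp, cn = mean(pos), mean(neg)
    return lambda x: sq_dist(x, cn) - sq_dist(x, cp)


def min_max_combine(scores):
    """scores[i][j] is the output of the module trained on positive
    subset i and negative subset j. MIN over j, then MAX over i."""
    return max(min(row) for row in scores)


def m3_score(x, pos_subsets, neg_subsets):
    """Combined M3 score for x; a positive value means the positive class."""
    scores = [
        [centroid_scorer(p, n)(x) for n in neg_subsets]
        for p in pos_subsets
    ]
    return min_max_combine(scores)
```

The MIN step lets a sample count as positive for a given positive subset only if every module trained on that subset agrees, while the MAX step accepts the sample if any positive subset claims it.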
Keywords/Search Tags: Min-Max Modular Network, Partitioning Method of Training Sets, Bisecting K-means, Equalization Function, Intrusion Detection