Outlier Mining Algorithm Research And Application

Posted on: 2014-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Meng
Full Text: PDF
GTID: 2268330401454999
Subject: Computer application technology
Abstract/Summary:
Abnormal data are inconsistent with most of the data or deviate from the normal data, and may represent a deviation or the beginning of a new model. Identifying abnormal data is often more valuable than identifying normal data. Outlier mining is an important branch of data mining and has been widely used in fault diagnosis, intrusion detection, fraud detection, novel text mining and image processing. Researchers have proposed a number of outlier mining algorithms that can discover abnormal data in datasets efficiently. In the complex environments of practical applications, however, these algorithms have several shortcomings: long computing time, excessive manual intervention, and parameters that are difficult to choose. This thesis studies these algorithms and proposes some improvements. The main innovations of the dissertation are outlined as follows:

1. The traditional LOF algorithm must recalculate the local outlier factors of all of the data whenever outlier mining is repeated in a dynamic incremental database environment. We propose an outlier mining algorithm based on clustering and rapid calculation. First, using the DBSCAN algorithm, the algorithm calculates the local outlier factors only for the data in abnormal clusters. In addition, we propose an improved clustering algorithm that avoids rerunning DBSCAN clustering when new data arrive. We then identify the abnormal data in the clusters. Finally, we recalculate the local outlier factor only for the new and original abnormal data and for the data in abnormal clusters whose local outlier factor has changed. Experimental results show that this algorithm performs better than the LOF and IncLOF algorithms, both in running time and in the accuracy of mining abnormal data.

2. Clustering is a common outlier mining approach that has been applied to intrusion detection, and k-means, a classical partitioning algorithm, has been widely used for this purpose. To address the problems that k-means requires the number of clusters to be predefined and is sensitive to the initial centers, an algorithm that determines the number of clusters automatically is proposed. First, sampling and the max-min distance algorithm are executed repeatedly to produce a series of preferred cluster centers and cluster numbers, which serve as the initial population of a differential evolution algorithm. Then, in each generation, all individuals automatically adjust their cluster centers and cluster numbers based on the best individual. The global optimization ability of differential evolution and the local search ability of k-means are combined to find the best cluster centers and cluster number. On the basis of this algorithm, the thesis proposes an intrusion detection method. Simulation experiments on network connection records from the KDD CUP 1999 dataset show that the algorithm performs efficiently in intrusion detection.

3. Transformer fault diagnosis is a practical application of outlier mining. The key to transformer fault diagnosis is to discover the abnormal data in the transformers' dissolved gas and to identify the type of the abnormal data. The support vector machine, a classification method, has been used in transformer fault diagnosis, but it is very sensitive to its parameters. We therefore use the firefly algorithm to optimize the parameters of the support vector machine and establish a binary tree model of support vector machines for transformer fault diagnosis. Experimental results show that this method performs better than the IEC three-ratio method and neural network algorithms.
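
For readers unfamiliar with the local outlier factor that the first contribution builds on, the following is a minimal Python sketch of the standard LOF definition (k-distance, reachability distance, local reachability density). It is not the incremental, cluster-based variant proposed in the thesis. The function names, the brute-force neighbour search, the sample points and the choice k = 2 are all illustrative assumptions.

```python
import math


def k_nearest(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs.

    Simplification (assumption): ties at the k-th distance are cut off at
    exactly k neighbours, whereas the strict definition keeps all tied points.
    """
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return dists[:k]


def local_outlier_factors(points, k):
    """Compute the LOF score of every point by brute force.

    Scores near 1 mean the point is about as dense as its neighbours;
    scores well above 1 mark candidate outliers.
    Duplicate points that make a reachability sum zero are not handled.
    """
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance of each point: distance to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(p, o) = max(k_dist(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach_sum = sum(max(k_dist[j], d) for d, j in neighbours[i])
        lrd.append(len(neighbours[i]) / reach_sum)

    # LOF: mean ratio of the neighbours' densities to the point's own density.
    lof = []
    for i in range(n):
        ratios = [lrd[j] / lrd[i] for _, j in neighbours[i]]
        lof.append(sum(ratios) / len(ratios))
    return lof


# Hypothetical data: four corners of a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = local_outlier_factors(points, k=2)
for p, s in zip(points, scores):
    print(p, round(s, 3))
```

In this toy dataset the four square corners score close to 1, while the distant point scores well above 1. Every point's score depends on its neighbours' densities, so a naive rerun after each insertion touches the whole dataset. That is the cost the thesis aims to reduce by restricting recalculation to abnormal clusters.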
Keywords/Search Tags: outlier mining, dynamic incremental database, local outlier factor, intrusion detection, clustering number, transformer fault diagnosis, support vector machine