Research On Outlier Mining Algorithm

Posted on: 2017-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Wang
Full Text: PDF
GTID: 2348330485452682
Subject: Computer technology
Abstract/Summary:
Outliers are objects that do not conform to the general pattern of normal data or that deviate from it. The significant information hidden behind outliers is easily overlooked, so outlier mining has become an active research topic. However, most outlier mining research is based on static data sets and relies mainly on the LOF algorithm, whereas in practical applications the data set under study keeps changing. This work therefore addresses outlier mining on incrementally growing data sets. OPTICS is a widely used density-based clustering method. It differs from other density-based clustering algorithms in that it does not use the two parameters ε and MinPts as a global measure to identify clusters; instead, it builds an augmented graph that represents the density-based structure of the data. While the reachability graph is being formed, expansion always proceeds toward regions of higher data density, eventually producing a sequence that can be visualized. During neighborhood expansion, every neighborhood query has to scan the entire table. To reduce this cost, an adjacency table is introduced to store the points in each neighborhood: once a core object's neighborhood has been traversed, the adjacency table it creates can serve later queries. In addition, the seed queue is given an NM pointer that always points to the minimum point, which optimizes the queue update strategy. For dynamically growing databases, IncLOF, an improvement on LOF, overcomes the high time cost of recalculating the local outlier factor of all data during secondary mining, and it works well. However, when a large number of objects are inserted into the database at the same time, the time efficiency of the algorithm decreases rapidly. This thesis proposes a new outlier mining algorithm that uses the improved OPTICS algorithm to cluster the original and newly added data and then uses the IncLOF algorithm to calculate the LOF. Experimental results show that, compared with the traditional IncLOF algorithm in a dynamically growing database environment, the proposed algorithm improves both time efficiency and the accuracy of outlier detection.
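
To make the local outlier factor concrete, the following is a minimal sketch of the standard static LOF computation, not the thesis's improved OPTICS or IncLOF implementation. It precomputes a neighbor table once and reuses it, which is loosely in the spirit of the adjacency table described above. The function names `knn_table` and `lof_scores` are hypothetical. The sketch assumes distinct points and exactly k neighbors per point (ties are broken by index rather than enlarging the neighborhood).

```python
import math

def knn_table(points, k):
    """Adjacency table: for each point, its k nearest neighbours as (distance, index)."""
    table = []
    for i, p in enumerate(points):
        # Brute-force scan; assumption: the data set is small enough for O(n^2).
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        table.append(dists[:k])
    return table

def lof_scores(points, k):
    """Return the local outlier factor of every point (simplified: exactly k neighbours)."""
    table = knn_table(points, k)
    # k-distance of a point: distance to its k-th nearest neighbour.
    k_dist = [nbrs[-1][0] for nbrs in table]
    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for nbrs in table:
        reach = [max(k_dist[j], d) for d, j in nbrs]
        lrd.append(len(nbrs) / sum(reach))  # assumes no duplicate points (sum > 0)
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in nbrs) / (len(nbrs) * lrd[i])
        for i, nbrs in enumerate(table)
    ]

if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 2))
```

A score close to 1 means a point is about as dense as its neighbors, while a score well above 1 marks a likely outlier. In this sketch every insertion would require rebuilding the whole table, which is the recomputation cost that an incremental method such as IncLOF aims to avoid.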
Keywords/Search Tags: outlier mining, clustering, local outlier factor, OPTICS algorithm, IncLOF algorithm