Research On Outlier Data Mining In High Dimensional Space

Posted on:2011-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Wu

Full Text:PDF

GTID:2178330332967865

Subject:Computer application technology

Abstract/Summary:

Outlier detection has become an important research direction in data mining. It is widely applied in financial fraud detection, network intrusion detection, disease prevention and control, disaster and weather forecasting, and many other areas. Outlier detection in large-scale, low-dimensional data has been studied in depth and has produced many results. Outlier detection in large, high-dimensional data, however, still faces many problems and challenges that require in-depth, systematic study. Building on existing algorithms, this thesis presents an outlier mining method for large, high-dimensional data that combines a genetic algorithm with a simulated annealing algorithm. The thesis introduces the concepts of data mining and outlier mining, compares and analyzes existing outlier detection algorithms, discusses several important high-dimensional outlier detection algorithms, and points out their drawbacks. On this basis, a new outlier detection method for high-dimensional space, based on the genetic algorithm and the simulated annealing algorithm, is proposed. In this method, each dimension of the high-dimensional data is divided into grid cells. To keep adjacent data points from being separated by cell boundaries, two grid partitioning schemes are used, and the results of both are stored in the same grid count tree. The data points in each cell are then encoded, and the sparsity coefficient of each cell is calculated. To reduce computational complexity, a genetic algorithm is used to find the top-n grid cells with the smallest sparsity coefficients, and the points they contain, in the high-dimensional space. To prevent premature convergence, a simulated annealing algorithm is introduced. Experiments show that the method is effective.
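
The grid and sparsity-coefficient steps described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formula, so the code assumes the sparsity coefficient commonly used in grid-based high-dimensional outlier detection, S(D) = (n(D) - N f^k) / sqrt(N f^k (1 - f^k)) with f = 1/phi, where phi is the number of equi-depth ranges per dimension, k the number of dimensions in the cell, N the number of points, and n(D) the count in the cell. All function names are hypothetical. The sketch uses a single partitioning and a plain counter in place of the thesis's two partitionings and grid count tree, and it scores every occupied cell exhaustively, which the genetic search is meant to avoid. `sa_accept` shows the standard Metropolis acceptance rule, one plausible way simulated annealing could counter premature convergence; how the thesis actually combines the two algorithms is not stated here.

```python
import math
import random
from collections import Counter


def equi_depth_cells(values, phi):
    """Assign each value to one of phi equi-depth ranges, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cells = [0] * len(values)
    for rank, i in enumerate(order):
        cells[i] = rank * phi // len(values)
    return cells


def sparsity_coefficient(count, n_points, phi, k):
    """Assumed definition: (n(D) - N f^k) / sqrt(N f^k (1 - f^k)), f = 1/phi."""
    fk = (1.0 / phi) ** k
    return (count - n_points * fk) / math.sqrt(n_points * fk * (1.0 - fk))


def sparsest_cells(data, dims, phi, top_n):
    """Score every occupied cell over the chosen dimensions; return the top_n
    (coefficient, cell) pairs with the smallest (most negative) coefficients."""
    n = len(data)
    per_dim = {d: equi_depth_cells([row[d] for row in data], phi) for d in dims}
    counts = Counter(tuple(per_dim[d][i] for d in dims) for i in range(n))
    scored = [
        (sparsity_coefficient(c, n, phi, len(dims)), cell)
        for cell, c in counts.items()
    ]
    scored.sort()
    return scored[:top_n]


def sa_accept(candidate, current, temperature, rng=random):
    """Metropolis rule for minimisation: always accept an improvement,
    accept a worse candidate with probability exp(-delta / temperature)."""
    if candidate <= current:
        return True
    return rng.random() < math.exp(-(candidate - current) / temperature)
```

In a genetic search, each individual would encode one choice of dimensions and ranges, `sparsity_coefficient` would serve as the fitness to minimise, and `sa_accept` would decide whether an offspring replaces its parent while the temperature is gradually lowered.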

Keywords/Search Tags:

data mining, outlier, sparsity coefficient, grid count tree, genetic algorithm, simulated annealing algorithm

Related items

1	The Application Study Of Genetic Algorithm And Simulated Annealing Algorithm In Phylogenetic Tree Construction
2	The Research Of Router Nodes Placement Problem Based On Simulated Annealing--Genetic Algorithm In Wireless Mesh Networks
3	Research Of Grid Task Scheduling Based On Genetic Simulated Annealing Algorithm
4	Grid Computing Environment, Research And Implementation, Based On Adaptive Genetic Simulated Annealing Algorithm Sgsa Task Scheduling
5	Research On Ellipsometric Measurement Of Thin Film Based On Adaptive Genetic Simulated Annealing Algorithm
6	Design And Realization Of Task Scheduling Algorithm In Grid Computing
7	Path Planning In A Static Environment Based On Genetic Simulated Annealing Algorithm
8	Research, Genetic Algorithm-based Clustering Method
9	Research On Cluster Analysis Based On Optimized Genetic Algorithm
10	Simulated Annealing Based On Ant Colony Algorithm For Solving The Grid Of Task Scheduling Problem