Research On Local Outlier Detection Algorithm

Posted on: 2017-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: F Ma
Full Text: PDF
GTID: 2308330485459012
Subject: Software engineering
Abstract/Summary:
As an important branch of data mining, outlier detection has received increasing attention from researchers both at home and abroad. It is widely used in fields such as credit card fraud detection, detection of ecological system disorders, and network intrusion detection. Outlier detection aims to find, according to specific rules, the data objects in a target dataset that deviate markedly from the other objects or exhibit abnormal behavior.

Many classical outlier detection algorithms have emerged over years of development. However, with the rapid growth in the number of data records and dimensions, existing outlier detection algorithms show shortcomings in execution efficiency and detection accuracy. Based on a study of the characteristics of the dataset itself, and building on traditional outlier detection algorithms, this paper designs an algorithm that can effectively detect local outliers in large-scale datasets. The main work of this paper includes the following aspects:

(1) The research background and significance of outlier mining are analyzed, and the current state of research at home and abroad is surveyed.

(2) Several classical outlier mining algorithms are analyzed in detail, including statistical-distribution-based, depth-based, distance-based, density-based, and clustering-based algorithms.

(3) Building on a study of fixed grids, this paper designs a novel variable grid division algorithm and combines it with the LOF (local outlier factor) algorithm to detect outliers in large datasets. For the key parameters of the proposed algorithm, this paper gives the corresponding formulas and discusses their effectiveness.

(4) Experiments on artificial datasets and UCI datasets show that the proposed algorithm outperforms the OMAGT algorithm and the LOF algorithm in grid division quality, execution efficiency, and detection accuracy.
Keywords/Search Tags: outlier detection, local outlier factor, variable grid division, large scale datasets
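
The abstract does not give the details of the variable grid division, so the following Python sketch only illustrates the general idea of combining a grid with LOF. It assumes a plain fixed grid as a stand-in: points in sparsely populated cells are treated as outlier candidates, and the standard LOF score is then computed for those candidates only. The names `sparse_cell_candidates`, `make_lof`, `cell_size`, and `min_count` are hypothetical and do not come from the thesis, and the LOF here is simplified (exactly k neighbours, ties broken by index, no duplicate points).

```python
import math
from collections import defaultdict
from functools import lru_cache


def sparse_cell_candidates(points, cell_size, min_count):
    """Indices of points lying in grid cells with fewer than min_count points.

    Assumption: a fixed, axis-aligned grid with cubic cells of side cell_size.
    This is a stand-in, not the variable grid division of the thesis.
    """
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(math.floor(c / cell_size) for c in p)
        cells[key].append(idx)
    return sorted(i for members in cells.values()
                  if len(members) < min_count for i in members)


def make_lof(points, k):
    """Return a function lof(i) giving the local outlier factor of points[i].

    Simplified LOF: each point has exactly k neighbours and the data is
    assumed to contain no duplicate points (otherwise a density is infinite).
    """
    @lru_cache(maxsize=None)
    def knn(i):
        # (distance, index) pairs of the k nearest neighbours of point i
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(len(points)) if j != i)
        return tuple(dists[:k])

    def k_distance(i):
        return knn(i)[-1][0]

    @lru_cache(maxsize=None)
    def lrd(i):
        # local reachability density: inverse of the mean reachability distance
        reach = sum(max(k_distance(j), d) for d, j in knn(i))
        return len(knn(i)) / reach

    def lof(i):
        # mean ratio of the neighbours' densities to the point's own density
        return sum(lrd(j) for _, j in knn(i)) / (len(knn(i)) * lrd(i))

    return lof


# A 3x3 cluster of points plus one isolated point far from the cluster.
points = [(float(x), float(y)) for x in range(3) for y in range(3)]
points.append((10.0, 10.0))

candidates = sparse_cell_candidates(points, cell_size=5.0, min_count=3)
lof = make_lof(points, k=3)
for i in candidates:
    print(i, points[i], round(lof(i), 2))
```

In this sketch the grid only decides which points are scored; the neighbour search still runs over the whole dataset. An LOF value well above 1 marks a point whose local density is much lower than that of its neighbours, while values near 1 indicate a point inside a cluster.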