
The Outlier Detection Algorithm Based On Cluster Outlier Factor And Unique Closest Neighbor Set

Posted on: 2020-08-28, Degree: Master, Type: Thesis
Country: China, Candidate: J Y Qiu
GTID: 2428330599460280, Subject: Computer Science and Technology
Abstract/Summary:
Outlier detection is an important research field within data mining and an important way to uncover the potential value of data. It finds the points in a large data set that differ from most of the data, and these outliers often carry more valuable information. Because of this significance, research in the field is very active. This dissertation studies in depth the inefficiency and low coupling problems of cluster-based outlier detection algorithms. The main research content covers the following two aspects.

Firstly, this dissertation analyzes in detail the algorithm of clustering by fast search and find of density peaks. To solve the parameter problem and the decision fraud phenomenon of that algorithm, it proposes an improved algorithm, the outlier detection algorithm based on cluster outlier factor. Mutual neighbors and a mutual neighbor search algorithm are introduced to solve the parameter problem. Mutual density is used to describe the closeness between a data point and its surroundings, which reduces decision fraud. The cluster outlier factor is proposed to measure the degree to which a cluster is an outlier. The algorithm uses the outlier factor to find both single outliers and outlier clusters.

Secondly, the algorithm called unique neighborhood set parameter independent density-based clustering with outlier detection, which is based on the unique closest neighbor, suffers from a fake neighbor phenomenon and a cluster merge problem. To solve them, an improved outlier detection algorithm based on the unique closest neighbor set, called IPIDC, is proposed. The cardinality of the closest neighbor set is used to detect single outliers. The cluster outlier factor is used to detect cluster outliers. The concept of a transmission domain is introduced to solve the cluster merge problem. The algorithm can detect both single outliers and cluster outliers.

Lastly, the proposed algorithms are validated on UCI simulation datasets and real data, and compared with the algorithm of clustering by fast search and find of density peaks and with the algorithm called unique neighborhood set parameter independent density-based clustering with outlier detection. The results verify that the proposed algorithms perform well in clustering and outlier detection.
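The abstract does not give the thesis's definitions, so the following is only a generic sketch of the mutual-neighbor idea it mentions: two points are treated as mutual neighbors when each appears in the other's k-nearest-neighbor list, and a point with no mutual neighbors is a candidate single outlier. The function names, the fixed value of k and the toy data are all assumptions made for illustration. The thesis states that its own mutual neighbor search addresses the parameter problem, which this sketch does not attempt.

```python
from math import dist


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return set(others[:k])


def mutual_neighbors(points, k):
    """For each point, the neighbors j such that j is in i's k-NN and i is in j's k-NN."""
    knn = [k_nearest(points, i, k) for i in range(len(points))]
    return [{j for j in knn[i] if i in knn[j]} for i in range(len(points))]


# Toy data (assumed): four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
mutual = mutual_neighbors(points, k=2)

# Points with an empty mutual-neighbor set are flagged as candidate outliers.
isolated = [i for i, m in enumerate(mutual) if not m]
print(isolated)
```

In this toy data the distant point picks two square corners as its nearest neighbors, but neither corner picks it back, so its mutual-neighbor set is empty. That asymmetry is what makes mutual neighborhoods useful for separating isolated points from dense regions.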
Keywords/Search Tags:data mining, outliers, mutual density, ? density, cluster outliers