Outlier detection is an important field in data mining, widely used in applications such as credit card fraud detection and network intrusion detection. This paper first reviews general outlier detection algorithms, including statistical, distance-based, density-based, and deviation-based methods, together with outlier detection algorithms for high-dimensional data and data streams. It then reviews common clustering algorithms in data mining, including partition-based, hierarchical, density-based, grid-based, model-based, and fuzzy clustering algorithms. Drawing on hierarchical clustering and the similarity principle, the paper defines a similarity function for high-dimensional data sets and the concept of class density, and proposes a new outlier detection algorithm based on these concepts. The algorithm is simple in principle, straightforward to implement, and shown to be valid by experiment. Its drawback is that the running time grows as the dimensionality of the data increases. To address this problem, a second outlier detection algorithm is proposed that combines non-negative matrix factorization (NMF) with the similarity-metric outlier detection method: NMF first reduces the dimensionality of the data, and similarity-metric outlier detection is then applied to the reduced data. Experimental results show that this combined algorithm is more time-efficient on high-dimensional data.
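
The two-stage idea summarized above (reduce the dimension with NMF, then score points by similarity) can be illustrated with a minimal sketch. It is not the paper's implementation: the similarity function and class density defined in the paper are not reproduced here, so cosine similarity and a mean-similarity score are used as stand-ins. The multiplicative-update NMF, the function names (`nmf`, `cosine`, `outlier_scores`), the rank, the iteration count, and the toy data are all assumptions made for illustration.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Approximate nonnegative V (n x d) by W (n x rank) times H (rank x d).

    Uses multiplicative updates for the squared-error objective; this is an
    assumed choice of NMF solver, not necessarily the one used in the paper.
    """
    rng = random.Random(seed)
    n, d = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(d)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(d)]
             for k in range(rank)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


def cosine(u, v, eps=1e-12):
    """Cosine similarity; a stand-in for the paper's similarity function."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def outlier_scores(points):
    """Score each point as 1 minus its mean similarity to all other points."""
    n = len(points)
    return [
        1.0 - sum(cosine(points[i], points[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


# Hypothetical toy data: five similar records and one that differs (index 5).
data = [
    [1.0, 1.1, 0.1, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [1.1, 0.9, 0.1, 0.2],
    [1.0, 1.0, 0.1, 0.1],
    [1.2, 1.0, 0.2, 0.1],
    [0.1, 0.1, 5.0, 5.2],
]
W, H = nmf(data, rank=2)      # stage 1: each row of W is a reduced record
scores = outlier_scores(W)    # stage 2: similarity-based scoring on W
suspect = max(range(len(scores)), key=scores.__getitem__)
```

Scoring the rows of `W` instead of the original records is what reduces the cost of the pairwise similarity step: each comparison touches `rank` values rather than all original dimensions. Cosine similarity is chosen here because it ignores the scale ambiguity inherent in an NMF factorization.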