
Outlier Detection Based On Markov Random Walk And Its Application

Posted on: 2022-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: T T Xi
Full Text: PDF
GTID: 2518306521494984
Subject: Computer technology
Abstract/Summary:
Neighborhood-based algorithms are an important means of outlier detection. However, with the explosive growth of data volume and dimensionality, they are difficult to apply directly to high-dimensional data, and unreasonable parameter selection degrades their performance. To address these issues, this thesis studies neighborhood-based outlier detection in depth, starting from reducing the influence of parameters on detection, and proposes an outlier detection algorithm suitable for high-dimensional data.

(1) A feature extraction algorithm for outlier detection, FEOD, is proposed. First, an optimal information entropy threshold is obtained through an iterative process and used to delete redundant features, giving a preliminary screening of the data matrix. Second, the intra-class and inter-class distances are introduced and redefined as the weights of a linear projection, which extracts low-dimensional features with good discriminative ability and improves the efficiency of outlier detection.

(2) Building on this study, a two-stage outlier detection algorithm based on Markov random walk, DLS, is proposed. The algorithm first applies a uniform sampling strategy to generate a series of triangulation graphs and introduces removal rules to obtain the topology of the nodes, so that the transition probability matrix is defined by node connectivity, which reduces the computational complexity and running time of the algorithm. Second, the weighted voting principle is used to redefine the restart vector, and the average deviation of the stationary distribution vector across the different graphs is used as the outlier score, which effectively improves the accuracy of the algorithm.

(3) Based on these results, a celestial spectral outlier detection system based on feature extraction is designed and implemented using Qt and PyCharm as development tools, and is described in detail in terms of requirements analysis, system architecture, and software functions. Analysis of the operating results shows that the system provides an effective way to discover knowledge from the spectra of special and unknown celestial bodies.

Experiments on synthetic data sets and UCI data sets demonstrate the effectiveness of the FEOD and DLS algorithms, which achieve higher detection efficiency and accuracy than traditional algorithms. In addition, applying them to the celestial spectral outlier detection system provides an effective way to search for valuable celestial bodies.
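To make the random-walk idea in (2) concrete, the following is a minimal generic sketch of outlier scoring by a Markov random walk with restart. It is not the DLS algorithm: as simplifying assumptions, it builds a single directed k-nearest-neighbour graph instead of sampled triangulation graphs, uses a uniform restart vector instead of a weighted-voting one, and scores points on one graph only. The function names, `k`, and the restart probability `c` are hypothetical choices made for illustration.

```python
import numpy as np

def knn_transition_matrix(X, k):
    """Row-stochastic matrix: each point steps uniformly to one of its k nearest neighbours."""
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    P = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    P[rows, neighbours.ravel()] = 1.0 / k     # transition defined by connectivity only
    return P

def random_walk_with_restart(P, restart, c=0.15, tol=1e-10, max_iter=1000):
    """Iterate pi = (1 - c) * P^T pi + c * restart until the L1 change is below tol."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        new_pi = (1.0 - c) * (P.T @ pi) + c * restart
        if np.abs(new_pi - pi).sum() < tol:
            return new_pi
        pi = new_pi
    return pi

def outlier_scores(X, k=3, c=0.15):
    """Points that the walk rarely visits get scores close to 1."""
    n = X.shape[0]
    P = knn_transition_matrix(X, k)
    restart = np.full(n, 1.0 / n)             # assumption: uniform restart vector
    pi = random_walk_with_restart(P, restart, c=c)
    return 1.0 - pi / pi.max()

# Usage: a 4 x 4 grid of inliers plus one distant point (index 16).
grid = np.array([[x, y] for x in range(4) for y in range(4)], dtype=float)
X = np.vstack([grid, [[20.0, 20.0]]])
scores = outlier_scores(X)
print(int(np.argmax(scores)))  # → 16
```

In this toy example no grid point lists the distant point among its nearest neighbours, so the walk reaches it only through restarts and it receives the smallest stationary probability, hence the highest score.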
Keywords/Search Tags: Outlier detection, Markov, Triangulation, Information entropy, Feature selection