
Outlier Detection Based On Dirichlet Process Mixture Model And Persistent Homology

Posted on: 2019-10-05 | Degree: Master | Type: Thesis
Country: China | Candidate: L L Liu | Full Text: PDF
GTID: 2428330545955148 | Subject: Probability theory and mathematical statistics
Abstract/Summary:
The history of anomaly detection is relatively long and can be traced back to Bernoulli's commentary of 1777. Anomaly detection has been widely applied in economics, society, and networks, and has become an important area of data mining. In the financial field, anomalies are not necessarily data to be ignored or deleted; on the contrary, they may indicate a new pattern. Existing algorithms can be divided into three categories: unsupervised, supervised, and semi-supervised detection methods. The main criterion for this classification is whether the dataset is labeled.

This thesis first summarizes related research on anomaly detection and classifies the existing anomaly detection algorithms. It then conducts experiments on simulated data, uses visualization tools to show how accurately each method identifies anomalies, and ranks and visually compares the accuracy, misclassification rate, and other metrics of each method using PR curves.

A dataset may contain multiple patterns (modes). With reference to the method of Song (2009), this thesis proposes a new combined method: the Dirichlet process mixture model proposed by Ferguson (1973) is used to identify each pattern, and the one-class SVM (OCSVM) proposed by Plats (2003) is then applied to the data within each pattern to mine anomalies. The Dirichlet process is implemented using the method of Neal (2000).

In addition, the persistent homology proposed by Carlsson (2005) is applied to outlier mining. This method identifies the topological features of the dataset itself and is not affected by the patterns of the data. Based on the barcode visualization of persistent homology described by Carlsson (2005), two distances are defined on the barcode, and the anomalous points of the dataset are then mined using these distances. Persistent homology is computed with Javaplex, developed by Adams (2014). Compared with traditional methods on ROC curves, the two proposed methods perform stably under different data patterns and overcome the dependence on data patterns.
Keywords/Search Tags: Anomaly detection, Dirichlet process, OCSVM, Persistent homology
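
The combined method described in the abstract (identify each pattern with a Dirichlet process mixture, then run OCSVM within each pattern) might look roughly like the Python sketch below. It is only an illustration under several assumptions and is not the thesis's implementation: it uses scikit-learn, it fits a truncated Dirichlet process mixture by variational inference (`BayesianGaussianMixture`) in place of the sampler of Neal (2000), and the function name `dpmm_ocsvm_outliers`, the parameters `max_components`, `nu`, and `min_cluster_size`, and the simulated two-cluster data are all hypothetical.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture
from sklearn.svm import OneClassSVM


def dpmm_ocsvm_outliers(X, max_components=10, nu=0.05, min_cluster_size=10, seed=0):
    """Return a boolean mask marking the points flagged as outliers."""
    # Step 1: assign each point to a pattern (mixture component).
    # Assumption: a truncated Dirichlet process mixture fitted by variational
    # inference stands in for the sampler of Neal (2000) used in the thesis.
    dpmm = BayesianGaussianMixture(
        n_components=max_components,
        weight_concentration_prior_type="dirichlet_process",
        max_iter=500,
        random_state=seed,
    )
    labels = dpmm.fit_predict(X)

    # Step 2: fit a separate one-class SVM inside each pattern.
    is_outlier = np.zeros(len(X), dtype=bool)
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        if len(idx) < min_cluster_size:
            # Assumption: components with very few points are flagged outright.
            is_outlier[idx] = True
            continue
        ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
        pred = ocsvm.fit_predict(X[idx])  # -1 marks an outlier, +1 an inlier
        is_outlier[idx] = pred == -1
    return is_outlier


# Simulated data with two well-separated patterns (hypothetical example).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[6.0, 6.0], scale=0.5, size=(200, 2)),
])
mask = dpmm_ocsvm_outliers(X)
print(mask.sum(), "of", len(X), "points flagged as outliers")
```

Fitting OCSVM per pattern, rather than once on the whole dataset, is what lets the boundary adapt to each cluster's own scale; `nu` roughly bounds the fraction of points flagged within each pattern.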