
Research On Mixed Attribute Outlier Detection Methods Based On Neighborhood Rough Sets

Posted on: 2019-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z Yuan
Full Text: PDF
GTID: 2428330545482776
Subject: Computational Mathematics
Abstract/Summary:
Outlier detection is one of the most important research directions in data mining. Its purpose is to find objects in a data set whose behavior differs markedly from that of the other objects. It has significant applications in intrusion detection, credit card fraud detection, detection of interesting sensor events, medical diagnosis, law enforcement, and earth science. However, traditional methods based on geometric distance cannot effectively handle data sets with categorical (or nominal) attributes, and the classical rough set method cannot effectively handle data sets with numeric attributes. To solve these problems, this thesis takes the neighborhood rough set as a unified framework, drawing on its rough representation and computational capabilities, and studies outlier detection methods for mixed (also called hybrid or heterogeneous) attribute data based on the neighborhood rough set. The main innovations are as follows:
(1) The basic algorithm for computing single-attribute neighborhoods is improved using the ideas of ordering, binary search, and neighbor search, which reduces the actual computation time compared with the traditional unordered comparison algorithm.
(2) To handle hybrid attribute data sets effectively, a new type of neighborhood information system is constructed by studying and optimizing heterogeneous distance metrics and adaptive neighborhood radii based on the standard deviation.
(3) Building on this work, three new outlier detection methods are proposed within the neighborhood rough set framework, based respectively on the neighborhood value difference metric, sequence, and neighborhood information entropy.
Both the theoretical models and experiments on UCI data sets show that the three new algorithms can effectively handle categorical, numeric, and mixed attribute data, and that they have good adaptability and effectiveness.
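The first innovation can be illustrated with a minimal sketch. It is not the thesis's algorithm, only one plausible reading of "ordering plus binary search": sort the values of a single numeric attribute once, then locate each object's neighborhood boundaries by binary search instead of comparing every pair of objects. The function name `neighborhoods_1d` and the fixed `radius` parameter are assumptions made for illustration; the adaptive, standard-deviation-based radius mentioned in the abstract is not reproduced here.

```python
from bisect import bisect_left, bisect_right


def neighborhoods_1d(values, radius):
    """Return, for each object i, the sorted indices j whose value lies
    in the closed interval [values[i] - radius, values[i] + radius].

    Hypothetical helper: sorts once, then uses two binary searches per
    object rather than an all-pairs comparison.
    """
    # Indices of the objects ordered by attribute value.
    order = sorted(range(len(values)), key=values.__getitem__)
    sorted_vals = [values[i] for i in order]

    result = []
    for v in values:
        # Leftmost position with value >= v - radius.
        lo = bisect_left(sorted_vals, v - radius)
        # One past the rightmost position with value <= v + radius.
        hi = bisect_right(sorted_vals, v + radius)
        result.append(sorted(order[lo:hi]))
    return result


if __name__ == "__main__":
    attribute = [1.0, 5.0, 1.2, 9.0, 5.3]
    for i, nbrs in enumerate(neighborhoods_1d(attribute, 0.5)):
        print(i, nbrs)
```

In this sketch, an object whose neighborhood contains only itself (such as the value 9.0 in the sample data) is an obvious outlier candidate on that attribute. The boundary searches cost O(log n) per object after an O(n log n) sort, although listing the members of a large neighborhood still takes time proportional to its size.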
Keywords/Search Tags: Neighborhood rough sets, Outlier detection, Mixed attribute, Data mining