Study Of Locally Linear Embedding To Outlier Detection In High Dimensional Space

Posted on: 2018-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Liu
GTID: 2348330542991469
Subject: Applied Mathematics
Abstract/Summary:
With the development and popularization of computers, more and more data are being produced, and their dimensionality keeps growing. Mining useful information hidden in high-dimensional space has become a hot topic worldwide. Outlier detection is an important part of data mining that aims to find data whose behavior is inconsistent with that of most of the data, and it has been applied in many fields, such as credit card fraud detection, network intrusion detection, medical treatment, and public security monitoring. Detecting outliers in high-dimensional space is therefore of both theoretical and practical significance. Because high-dimensional data are sparsely distributed, traditional outlier detection methods are inefficient and ineffective in high-dimensional space. Outlier detection based on nonlinear dimensionality reduction addresses this by incorporating dimensionality reduction into outlier detection. Given the nonlinear structure of real datasets, this dissertation incorporates locally linear embedding (LLE) into outlier detection and proposes a local neighborhood-preserving locally linear embedding and an outlier detection method based on locally linear embedding. The main work of the dissertation is as follows.
Firstly, this dissertation reviews the research status of data mining, outlier detection, and dimensionality reduction, covering typical outlier detection methods and several classical dimensionality reduction methods, with emphasis on the advantages and disadvantages of locally linear embedding. It demonstrates why traditional outlier detection methods cannot accurately detect outliers in high-dimensional space and shows that detecting outliers is feasible after mapping the data to a low-dimensional space.
Secondly, to address the sensitivity of LLE to noise, Laplacian eigenmaps are incorporated into LLE, and a new dimensionality reduction method, called local neighborhood-preserving locally linear embedding, is proposed. Theoretical analysis shows that the proposed method preserves the intrinsic structure of high-dimensional data and is robust to noise. Experiments on real datasets verify the effectiveness of the proposed method against three classical outlier detection methods.
Finally, because outliers are always located in low-density areas, a new rough set model is introduced to characterize outliers, and an outlier detection method based on locally linear embedding is proposed. Using the newly constructed rough set model, the dataset is divided into a dense area and a sparse area. A local neighbor graph of the dataset and a local linear neighbor graph of the positive domain are constructed. To separate outliers from inliers effectively, a new weight is added to the local neighbor graph. The minimum spanning tree-inspired k-nearest neighbor method is adopted to detect outliers in the low-dimensional space. Experiments on eight datasets verify the effectiveness of the proposed method against four classical outlier detection methods.
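The general idea of embedding the data first and scoring outliers in the low-dimensional space can be illustrated with a minimal sketch. It uses the standard LLE algorithm and a plain mean k-nearest-neighbor distance as the outlier score. It is not the dissertation's local neighborhood-preserving variant, its rough set model, or its minimum spanning tree-inspired k-nearest neighbor method. The function names `lle` and `knn_outlier_scores`, the parameters, and the regularization constant are assumptions made for illustration.

```python
import numpy as np


def lle(X, n_neighbors=8, n_components=2, reg=1e-3):
    """Standard LLE (illustrative only): returns an (n, n_components) embedding."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances; exclude each point from its own neighbors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Reconstruction weights: each point as an affine combination of its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                 # neighbors centered on x_i
        G = Z @ Z.T                           # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()           # weights sum to one

    # Embedding: bottom eigenvectors of (I - W)^T (I - W), skipping the constant one.
    I_W = np.eye(n) - W
    _, vecs = np.linalg.eigh(I_W.T @ I_W)     # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]


def knn_outlier_scores(Y, k=5):
    """Mean distance to the k nearest neighbors; larger means more isolated."""
    Y = np.asarray(Y, dtype=float)
    d = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1))
    d.sort(axis=1)                            # column 0 is the zero self-distance
    return d[:, 1:k + 1].mean(axis=1)
```

A typical use would be `scores = knn_outlier_scores(lle(X), k=5)`, then flagging the points with the largest scores. How well this separates outliers depends on the neighborhood size and on the noise sensitivity of plain LLE, which is the weakness the dissertation sets out to address.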
Keywords/Search Tags: Locally linear embedding, outlier detection, dimensionality reduction, Laplacian eigenmaps