Research And Application On Outlier Detection Algorithm For High-dimensional Data Stream

Posted on: 2018-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: L P Yu
GTID: 2348330542965257
Subject: Management Science and Engineering
Abstract/Summary:
The aim of outlier detection is to identify outliers in data sets quickly, and it is widely used in fields such as financial data analysis and network security evaluation. High-dimensional data streams, typified by Internet of Things data, are massive, heterogeneous, and noisy, and often require real-time processing. These properties render traditional data analysis algorithms ineffective, leading to high time complexity and other limitations, so outlier detection on high-dimensional data streams faces many challenges. In this thesis, we propose an efficient outlier detection algorithm and an improved trend analysis algorithm for high-dimensional data streams. In both experimental analysis and a practical analysis of data from an elevator wireless sensor network, the two algorithms show high reliability. The main contents of the research are as follows:

(1) Based on the characteristics of high-dimensional data streams and the high time complexity of the existing angle-based detection algorithm, we put forward the HDSOD algorithm. Using information entropy theory, we retain the valuable attributes of the high-dimensional data stream, thereby reducing its dimensionality. We then use grid partition theory to build a grid of the best data and a grid of the recent data, which together form a small-scale data stream on which the outlier factor of the latest data point is calculated, and we establish a real-time update mechanism to maintain detection accuracy.

(2) An improved trend analysis algorithm for data streams is proposed. Depending on the application demand and on how strongly the data stream changes, either the total least-squares method or the exponential regression method is chosen for trend analysis, which improves its accuracy. Combined with confidence interval theory for outlier detection in data streams, the algorithm provides fault warnings and decision support for monitoring activities.

(3) To apply the anomaly detection algorithms in practice, we test them on the data stream from elevator sensors. The simulation results show that the algorithms provide effective real-time outlier warnings and more accurate trend analysis. The outlier detection algorithms proposed in this thesis are better suited to the high-dimensional data streams produced in the Internet of Things.
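The angle-based idea behind item (1) can be illustrated with a minimal sketch. It is not the HDSOD algorithm itself, whose details the abstract does not give. It is a simplified, unweighted angle-variance score, and the function name `angle_variance` and the sample points are assumptions made for illustration. A point inside a cluster sees its neighbours in many directions, so the angles vary widely. A point far from the cluster sees them all in roughly the same direction, so the variance is small.

```python
import itertools
import math


def angle_variance(point, others):
    """Variance of cos(angle) over all pairs of difference vectors
    from `point` to the points in `others`.

    Simplified, unweighted variant of an angle-based outlier factor
    (hypothetical helper, not taken from the thesis). A LOW value
    suggests that `point` is an outlier relative to `others`.
    """
    # Difference vectors from the query point to every other point.
    diffs = [[o_i - p_i for o_i, p_i in zip(o, point)] for o in others]

    cosines = []
    for a, b in itertools.combinations(diffs, 2):
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        if norm_a == 0 or norm_b == 0:
            continue  # skip points identical to the query point
        dot = sum(x * y for x, y in zip(a, b))
        cosines.append(dot / (norm_a * norm_b))

    if len(cosines) < 2:
        return 0.0
    mean = sum(cosines) / len(cosines)
    return sum((c - mean) ** 2 for c in cosines) / len(cosines)


# Illustrative data: a small cluster around the origin.
cluster = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0),
           (0.0, 1.0), (1.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]

inlier_score = angle_variance((0.0, 0.0), cluster)     # point inside the cluster
outlier_score = angle_variance((10.0, 10.0), cluster)  # point far from the cluster
print(inlier_score > outlier_score)
```

Scoring one point against n others costs O(n²) pair comparisons. That cost is consistent with the abstract's motivation for shrinking the reference set to a small-scale data stream before computing the outlier factor.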
Keywords/Search Tags: High-dimensional data stream, Outlier detection, Trend analysis, Angle variance, Grid partition