
The Research Of Real Time Anomaly Detection Of Massive Log Stream Based On GPR Model

Posted on: 2017-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z A Guo
Full Text: PDF
GTID: 2348330482486926
Subject: Computer application technology
Abstract/Summary:
With the rapid development of computer technology and the spread of informatization, the volume of logs generated by large Internet companies is growing explosively. Detecting abnormal user behavior and system faults promptly by analyzing these logs plays an important role in improving user satisfaction and system stability. Traditional log anomaly detection processes logs only after they have been stored, but with massive log volumes this approach runs into bottlenecks such as high storage consumption and poor timeliness, so new architectures and algorithms for log anomaly detection are urgently needed. This thesis therefore studies real-time anomaly detection for massive log streams from two angles: detection algorithms and real-time computing.

First, log stream anomaly detection generally relies on rule matching, which is inefficient. This thesis therefore studies numerical representations of text logs and proposes using information content to characterize a log. Because computing information content directly is expensive, it is estimated indirectly through its relationship with lossless compression. To meet the particular requirements of log stream compression, the thesis presents LSCA, a lossless compression algorithm for log streams built on a sequence compression algorithm. Once the text logs have been converted to numeric form, a log stream anomaly detection algorithm based on a Gaussian process regression (GPR) forecasting model is proposed. The value predicted by the model is compared with the value actually received, and an anomaly is reported when the difference falls outside the allowed deviation range.

Second, the GPR-based forecasting model detects isolated anomalies effectively but is less efficient at local anomaly detection. To address this, the thesis introduces sampling and proposes LSUS, a sampling algorithm suited to log streams, and combines it with GPR to form a new model, LSUS_GPR, which is then applied to global anomaly detection. Experimental results show that the new model significantly reduces both computational complexity and the false positive rate, which greatly improves detection efficiency.

Third, using the JStorm stream computing framework, the thesis designs and implements LRADS, a real-time log stream anomaly detection system based on the GPR forecasting model. LRADS is described in terms of its overall design and its performance optimization. The overall design covers the core log collection and real-time computation modules; for performance optimization, offline and online scheduling optimization methods are presented. Finally, the system evaluation shows that LRADS is stable and efficient and is of practical value in production environments.
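
The two algorithmic ideas in the abstract, estimating information content through compression and flagging values that deviate from a GPR forecast, can be illustrated with a small sketch. This is not the thesis's implementation: zlib stands in for LSCA, the LSUS sampling step is omitted, and the function names (`info_estimate`, `gpr_predict`, `is_anomaly`), the squared-exponential kernel, the noise level, and the 3-sigma threshold are all assumptions chosen for illustration.

```python
import math
import zlib


def info_estimate(lines):
    """Approximate a log window's information content as compressed bytes per raw byte.

    zlib is only a stand-in here; the thesis uses its own compressor (LSCA).
    """
    raw = "\n".join(lines).encode("utf-8")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)


def rbf(a, b, length_scale=2.0):
    """Squared-exponential kernel between two scalar time indices (assumed kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def solve_lower(low, b):
    """Solve low @ x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(low[i][k] * x[k] for k in range(i))) / low[i][i])
    return x


def solve_upper_t(low, b):
    """Solve low.T @ x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / low[i][i]
    return x


def gpr_predict(ts, ys, t_new, noise=0.5):
    """GP posterior mean and standard deviation at t_new.

    The history is standardized first, so `noise` is in standardized units.
    """
    mean_y = sum(ys) / len(ys)
    scale = max(math.sqrt(sum((y - mean_y) ** 2 for y in ys) / len(ys)), 1e-9)
    z = [(y - mean_y) / scale for y in ys]
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(ts)]
         for i, a in enumerate(ts)]
    low = cholesky(k)
    alpha = solve_upper_t(low, solve_lower(low, z))
    k_star = [rbf(t, t_new) for t in ts]
    mean_z = sum(a * b for a, b in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    var_z = rbf(t_new, t_new) + noise - sum(x * x for x in v)
    return mean_y + scale * mean_z, scale * math.sqrt(max(var_z, 1e-12))


def is_anomaly(history, observed, threshold=3.0):
    """Flag `observed` when it lies more than `threshold` predictive std devs from the forecast."""
    ts = list(range(len(history)))
    mean, std = gpr_predict(ts, history, len(history))
    return abs(observed - mean) > threshold * std
```

In use, each incoming window of log lines would be reduced to one number with `info_estimate`, appended to a bounded history, and checked with `is_anomaly` before the history is updated. The exact O(n^3) solve is only practical for short histories, which is one reason a production system would need sampling or another approximation.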
Keywords/Search Tags: log stream, anomaly detection, Gaussian Process Regression, JStorm