Research On Real-Time Identification Methods Of Continuous Events In Time Series Flows

Posted on: 2023-01-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J L Wang
GTID: 1520307316464454
Subject: Big data statistics and intelligent computing
Abstract/Summary:
Identifying continuous events in a time series flow is the basis of time series applications. Time series flow data is massive, drifting, and sequential, and no specific rule signals that an event is about to occur. As a result, most existing event monitoring systems and approaches for time series flow are post-triggered: they need all the information about a continuous event before any calculation can start, so they cannot meet the timeliness and accuracy requirements of identification. With the development of computer science and artificial intelligence, how to combine information methods with advanced statistics and computer science to identify continuous events in time series in real time has become a hot topic in time series research. To address these problems, this thesis proposes a real-time event identification approach that covers the whole process of time series flow, including continuous event acquisition, adaptive clustering, feature classification, model construction, and similarity matching. The main research contents of this thesis are as follows:

(ⅰ) This dissertation first proposes a method, based on statistical analysis, for locating and obtaining accurate continuous events from massive historical time series flow data. Specifically, the method filters out storage nodes in a sensor network that may store the same event, and proposes a weak repeating path constraint strategy to improve query efficiency and accuracy. On this basis, the time series flow data stored by the sensors can be preprocessed. The historical time series flow data is denoised with a low-pass filtering method to remove background noise and interference signals, which improves the accuracy of locating the trigger point of a time series flow event. After that, to improve the acquisition efficiency of time series flow events, a continuous event triggering method based on time-domain analysis (STA/LTA, the ratio of a short-term average to a long-term average) is proposed. Dynamic long and short time windows are set, and the ratio of the average values of the time series flow data within the short and long windows is calculated. If the ratio is greater than the threshold (THr), the window is considered to contain an event trigger point. Finally, the AIC (Akaike information criterion) is applied within the time window to find the exact trigger point and ending point of a time series flow event, so that the event can be acquired accurately. Experimental results show that the proposed method is superior to existing methods in terms of sensor query efficiency, sensor query accuracy, event query efficiency, and event query accuracy.

(ⅱ) After time series flow events are obtained from massive time series flow data, they are clustered so that their labels are assigned adaptively. Existing clustering methods usually operate directly on events with drift characteristics in the historical time series flow dataset, which leads to poor clustering efficiency and accuracy and to serious noise interference. This thesis therefore proposes a continuous event clustering method based on a dynamic matrix. Specifically, the method first defines neighbor evaluation criteria for time series flow events to measure the doubly constrained event RDS (Representative and Diversifying Sequences). Then a back-difference method is proposed to calculate and measure the nearest neighbor score, which finds the candidate set of RDS efficiently and improves the efficiency of RDS selection. After that, a filtering strategy over the optimal combinations of RDS is proposed to obtain the RDS efficiently. Finally, to address the low clustering efficiency and poor accuracy for time series flow events, a distance matrix between the historical time series flow events and the RDS is established adaptively, and a dynamic clustering method based on K-means is proposed to partition the types of time series flow events adaptively. Experimental results show that the proposed clustering method has great advantages over existing methods in terms of clustering accuracy, clustering reliability, and clustering efficiency.

(ⅲ) After labels are obtained with the proposed clustering method, the labeled time series flow events are used to train an event classifier that classifies subsequent time series flow events efficiently. This thesis proposes a Gram matrix-based classification method for continuous events of time series flow. Specifically, the time series flow events are first denoised by wavelet thresholding, and then a Gram matrix is introduced to realize the full attribute transformation from time series flow events to a time-domain image matrix. After that, the image matrix is fed to a CNN model as input for classification. In the convolution layer, the convolution kernel is optimized based on the Toeplitz matrix, and the convolution operation is replaced by a matrix product to improve computational efficiency. Finally, a T-CNN (Triplet-CNN) classification model is built: the Triplet network model is introduced into the fully connected layer to compare the differences between input matrices of the same type and input matrices of different types, and a degree-of-difference function is used to optimize the loss function of the neural network. Experimental results show that the proposed method is superior to existing methods in terms of classification accuracy, precision, recall, and F1 score.

(ⅳ) To address the identification lag of classifier-based methods, this thesis proposes an adaptive dual-model construction method for continuous events of time series flow, based on the well-clustered time series flow data, to support hierarchical identification of time series flow events. Specifically, using a data feature extraction method based on moving regression, the historical time series flow data is first normalized and then divided into blocks by a dynamic grid mechanism. Based on the idea of the weight support domain, the feature points of time series flow events are calculated within the grid. After that, a dual model of the continuous events of time series flow is constructed adaptively according to whether the data features are linear or nonlinear. For linear features, a linear regression event model is fitted to the extracted feature point data, the event identification domain of the regression equation is established, and an adaptive updating strategy based on region compression is proposed. For nonlinear features, or when the order of the linear model exceeds the threshold setting, a nonlinear identification model construction method based on the B-spline curve is proposed. The method uses an improved genetic algorithm that quickly screens the control vertices of the B-spline curve and calculates the probabilities of the corresponding selection, crossover, and mutation operators to obtain the nonlinear identification model of time series flow events. Finally, the idea of decentralized storage is introduced, and a blockchain-based storage method for time series flow events is proposed so that the event information cannot be tampered with and can be shared credibly. Experimental results show that the proposed method has great advantages over existing methods in terms of data scale, model error rate, event identification model construction efficiency, and identification accuracy.

(ⅴ) Because time series flow data exhibits drift and randomness, this thesis proposes a real-time identification method for continuous events in time series flow with drift features. The method builds on the established standard event identification model and performs real-time similarity matching between time series flow events and that model. Specifically, a dynamic time warping similarity matching model is first constructed based on the drift characteristics of the time series flow data, which realizes real-time similarity matching between the time series flow events and the standard event identification model. After that, a dynamic sliding window mechanism is introduced to obtain the monitoring domain of the event occurrence, an adaptive dynamic threshold for similarity matching is set to find the recognition ratio of similar matches, and an incremental similarity matching strategy is proposed to reduce the computational complexity of re-matching. Finally, an event-level real-time identification method is proposed based on the distribution of the identification proportion. This thesis optimizes a matching strategy based on piecewise aggregate approximation to reduce the size of the real-time data, proposes a real-time identification method for the initial domain of time series flow events to quickly determine the occurrence probability of an event, and obtains the complete information of the event through a real-time identification method for the event termination domain, thereby realizing hierarchical real-time identification of time series flow events. Experimental results show that the proposed method has great advantages over existing methods in terms of event identification efficiency and accuracy.
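
The STA/LTA trigger described in part (ⅰ) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `sta_lta_ratio` and `first_trigger`, the use of fixed rather than dynamic window lengths, the use of mean absolute amplitude as the "average value", and the choice to end both windows at the same sample are all assumptions made for illustration. The AIC refinement of the trigger point is not shown.

```python
import numpy as np


def sta_lta_ratio(x, n_sta, n_lta):
    """Ratio of short-window to long-window mean absolute amplitude.

    Assumption: both windows end at the same sample and have fixed
    lengths (the thesis describes dynamic windows). Samples before the
    long window is full keep a ratio of 0, so they can never trigger.
    """
    if not 0 < n_sta < n_lta:
        raise ValueError("require 0 < n_sta < n_lta")
    amp = np.abs(np.asarray(x, dtype=float))
    # Prefix sums let each window mean be computed in O(1).
    csum = np.concatenate(([0.0], np.cumsum(amp)))
    ratio = np.zeros(len(amp))
    for end in range(n_lta, len(amp) + 1):
        sta = (csum[end] - csum[end - n_sta]) / n_sta
        lta = (csum[end] - csum[end - n_lta]) / n_lta
        if lta > 0:
            ratio[end - 1] = sta / lta
    return ratio


def first_trigger(x, n_sta, n_lta, thr):
    """Index of the first sample whose STA/LTA ratio exceeds thr, or None."""
    hits = np.flatnonzero(sta_lta_ratio(x, n_sta, n_lta) > thr)
    return int(hits[0]) if hits.size else None
```

Here `thr` plays the role of the threshold THr in the abstract. The returned index is only a coarse trigger; in the described method the AIC criterion would then be applied inside the triggered window to locate the exact start and end of the event.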
Keywords/Search Tags: Time series flow events, Statistical analysis, Dynamic matrix clustering, T-CNN classification, Double model construction, Real-time identification