
The Research Of Dimension Reduction And Similarity Measure In Multivariate Time Series Classification

Posted on: 2018-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Wei
Full Text: PDF
GTID: 2310330515497948
Subject: Computer application technology
Abstract/Summary:
Multivariate time series (MTS) are a common and important type of data used in many real-world domains. Describing an object with a multivariate time series reflects its characteristics more comprehensively and completely, and analyzing these sequences gives people a reliable way to understand the object and discover its inherent laws. Data mining of multivariate time series is therefore receiving more and more attention. However, mining multivariate time series is difficult because of their characteristics: temporality, multiple variables, and correlation among variables. When the number of input variables grows beyond a certain point, the prediction accuracy of a model drops sharply, a phenomenon known as the "curse of dimensionality". Specifically, an MTS contains many irrelevant and redundant variables. If all variables are used directly without any treatment, computational efficiency is very low and the prediction results of the model suffer. Data mining tasks also depend on computing the similarity between samples, and the choice of similarity measure directly affects the accuracy and feasibility of the mining task. Measuring similarity between multivariate time series has become increasingly difficult, yet there is relatively little research on improved methods. This thesis therefore focuses on dimension reduction and similarity measurement for multivariate time series classification.

(1) This thesis summarizes existing research on the classification of multivariate time series. In particular, the core problems of similarity measurement, data reduction, and classification are analyzed in detail. The shortcomings of existing methods are identified, which motivates this work.

(2) To reduce the dimension of multivariate time series, this thesis proposes a criterion, based on the intra-class and inter-class scatter of the training data, that measures the contribution of each variable to class separability, and ranks the variables by this criterion. Redundant variables are then eliminated according to the mutual information between input variables, and finally the best subset of variables is selected. The method selects the core variables that contribute most to classification and eliminates redundant ones, which reduces the dimension of the multivariate time series and improves the efficiency and performance of classification.

(3) To address the similarity measurement problem for multivariate time series, this thesis improves the existing shapelet method. Based on the dissimilarity among shapelets, a fast method for searching for multiple shapelets is proposed. A distance threshold is set according to the overall distribution of distances between subsequences, and is used to filter similar subsequences out of the candidate set. Class separability is then used as the evaluation criterion for the remaining subsequences, and the best-performing shapelets are selected. The experimental results show that the proposed method greatly reduces the time needed to search for shapelets while maintaining high classification accuracy. The method is extended to multivariate time series, and multiple classifiers are used to improve classification accuracy.
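The variable-selection procedure in (2) can be illustrated with a minimal sketch. The abstract does not give the exact separability criterion, the mutual information estimator, or any threshold, so everything below is an assumption made for illustration: a Fisher-style ratio of inter-class to intra-class scatter, a histogram-based mutual information estimate, and the hypothetical names `separability_scores`, `mutual_information`, `select_variables`, and `mi_threshold`. It should not be read as the thesis's actual method.

```python
import numpy as np

def separability_scores(X, y):
    """Score each variable by inter-class / intra-class scatter (assumed criterion).

    X: array of shape (n_samples, n_variables, n_timesteps)
    y: array of shape (n_samples,) with class labels
    """
    overall_mean = X.mean(axis=0)                  # (n_variables, n_timesteps)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        # Inter-class scatter: class mean vs. overall mean, weighted by class size
        between += len(Xc) * ((class_mean - overall_mean) ** 2).sum(axis=1)
        # Intra-class scatter: samples vs. their own class mean
        within += ((Xc - class_mean) ** 2).sum(axis=(0, 2))
    return between / (within + 1e-12)

def mutual_information(a, b, bins=8):
    """Histogram-based mutual information estimate in nats (assumed estimator)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)            # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)            # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def select_variables(X, y, mi_threshold=0.5, bins=8):
    """Rank variables by separability, then drop those redundant with kept ones."""
    scores = separability_scores(X, y)
    kept = []
    for v in np.argsort(scores)[::-1]:             # best-scoring variable first
        flat_v = X[:, v, :].ravel()
        redundant = any(
            mutual_information(flat_v, X[:, k, :].ravel(), bins) >= mi_threshold
            for k in kept
        )
        if not redundant:
            kept.append(int(v))
    return kept, scores

# Synthetic demo: variable 0 separates the classes, variable 1 is a scaled
# copy of variable 0 (redundant), variable 2 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 3, 20))
X[:, 0, :] += 3.0 * y[:, None]
X[:, 1, :] = 2.0 * X[:, 0, :]
kept, scores = select_variables(X, y)
print(kept, np.round(scores, 3))
```

In this sketch only one of the two duplicated variables survives the redundancy filter. The noise variable is kept because the filter only removes redundancy; selecting the final "best subset" would need a further step, such as truncating the ranked list by validation accuracy, which the abstract does not specify.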
Keywords/Search Tags:multivariate time series, dimension reduction, shapelet, similarity measure, class separability