Font Size: a A A

Research Of The Algorithm Of Similarity Query And Classification With Shapelets Based On DTW Distance

Posted on:2017-02-12Degree:MasterType:Thesis
Country:ChinaCandidate:H W SunFull Text:PDF
GTID:2308330503957658Subject:Software engineering
Abstract/Summary:PDF Full Text Request
With the continuous development of time series mining technologies, how to balance efficiency and accuracy of the algorithm has become the focus of attention of researchers. Due to the high dimensional and complex characteristics of time series, it is often difficult to analysis and process time series efficiently and accurately. Therefore, without losing critical information, it has theoretical significance and value to take time series dimension reduction, reduce computational complexity which shows great dependence on data dimensions in the algorithms, and obtain a balance between efficiency and accuracy.This paper analyzed the characteristics of time series mining algorithms, focusing on the key technologies of similarity searching and classification, analyzed and compared the classical algorithms and existing problems, proposed a lower bound algorithm of dynamic time warping based on sliding window fragment and a shapelet transformation classification algorithm based on efficient series matching, and then prove the feasibility and effectiveness of the improved algorithm from both theoretical and practical data. The main work is as follows:(1) Analysis of the underlying time series algorithms. The reseach gave a summary and analysis to representation and distance metric methods in time series mining, citing the corresponding classical algorithms to describe in detail and compare. It also described and analyzed the purpose, theory, practical applications and other related content of time series similarity search and classification algorithm.(2) Proposed a lower bound algorithm of dynamic time warping based on sliding window fragment. The lower bound algorithm plays a very important role in increasing the time series similarity searching efficiency and reducing redundant computation, existing lower bound algorithm based on piecewise aggregate approximation fragment notation, which can calculate similarity degree at low cost, but when the variation of time series’ amplitude turns larger, it cannot fit the time series tightly. Aiming at the problem, the sliding window fragment notation is introduced into lower bound algorithm, and a lower bound algorithm of dynamic time warping based on sliding window fragment is proposed, to build upper and lower boundary curve with higher degree of fitting, filter the time series and screen out the time series with low similarity. The algorithm is proved to be effective to simplify the calculation process of time series’ similarity by the results, to reduces the time complexity and improve the efficiency of similarity searching, and when the variation of time series’ amplitude turns larger, it has higher tightness and pruning power to calculate similarity degree.(3) Proposed a shapelet transformation classification algorithm based on efficient series matching. Existing shapelet classifying processes are relatively inefficient and slow due to the large amount of necessary complex distance computations. The paper therefore introduces piecewise aggregate approximation(PAA) representation and an efficient subsequence matching algorithm for shapelet classification algorithms to propose shapelet transformation classification algorithm based on efficient series matching. The proposed algorithm took the PAA representation for appropriate dimension reduction, and then used a subsequence matching algorithm to simplify the data classification process. The efficient subsequence matching algorithm can be combined with the shapelet classification algorithm; the new algorithm can ensure relatively high classification accuracy, effectively simplifies the algorithm calculation process, and improves classification efficiency.
Keywords/Search Tags:time series, similarity searching, classfication, dynamic time wrapping, sliding window fragment notation, efficient subsequence matching
PDF Full Text Request
Related items