Research On Fast Query Technology Of Similar Subsequences In Time Series Data

Posted on:2020-10-10

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Wang

Full Text:PDF

GTID:2428330578469614

Subject:Engineering

Abstract/Summary:

Finding subsequences with similar trends in a sequence data set is a key technology in sequence data mining, with important applications in fields such as finance, healthcare, meteorology, and network security. Subsequence queries generally use Dynamic Time Warping (DTW) as the similarity measure. However, DTW has high time complexity, which makes online querying difficult when the query subsequences are long. Time series representation methods can effectively reduce query time by reducing the dimensionality of the sequence. This thesis therefore combines time series representation with a similarity measure to solve the problem of fast querying of similar subsequences in time series data. The specific research contents are as follows:

(1) An algorithm, MONEX (Modify ONline EXploration of time series), is proposed for fast querying of long subsequences. First, all subsequences of a given length in the data set are grouped, and the representative subsequences are marked. Second, during the query, the query sequence is divided into short sequences of a specified length, and the DTW algorithm is used to determine the candidate sets of subsequences similar to these short sequences. Finally, the candidate sets are spliced together to obtain the query result sequences. Extensive experiments on real data sets show that MONEX is nearly 10 times more efficient than the most advanced existing algorithm.

(2) The grouping process (that is, the time series representation process) uses Euclidean Distance (ED) to measure the similarity between subsequences and groups them according to the results. This thesis proves that a triangle inequality holds between ED and DTW, so applying DTW after sequence representation preserves the accuracy of the query.

(3) To meet query requirements under different similarity levels, rules for dividing the query level are proposed. Extensive experiments show that when the similarity threshold is 0.2 and the sequence segmentation length is 20, the algorithm runs efficiently with high accuracy.
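The abstract does not give implementation details, so the following is only a minimal Python sketch of the two building blocks it names: the classic DTW distance, and an ED-based grouping of fixed-length subsequences with one representative per group. The greedy grouping rule, the absolute-difference local cost, and all function names (`dtw_distance`, `group_subsequences`, `candidate_offsets`) are assumptions made for illustration. This is not the MONEX algorithm itself, and the splicing step is omitted.

```python
import math


def dtw_distance(a, b):
    """Classic DTW in O(len(a) * len(b)) time, with |x - y| as local cost."""
    m = len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_subsequences(series, length, threshold):
    """Greedy grouping (an assumption): each sliding window joins the first
    group whose representative is within `threshold` in ED, otherwise it
    becomes the representative of a new group."""
    groups = []  # list of (representative, [start offsets])
    for start in range(len(series) - length + 1):
        window = series[start:start + length]
        for rep, members in groups:
            if euclidean(rep, window) <= threshold:
                members.append(start)
                break
        else:
            groups.append((window, [start]))
    return groups


def candidate_offsets(query_piece, groups, dtw_threshold):
    """Compare a short query piece against representatives only (by DTW)
    and return the start offsets of all members of matching groups."""
    hits = []
    for rep, members in groups:
        if dtw_distance(query_piece, rep) <= dtw_threshold:
            hits.extend(members)
    return sorted(hits)


series = [0, 1, 2, 3, 0, 1, 2, 3]
groups = group_subsequences(series, length=4, threshold=0.5)
matches = candidate_offsets([0, 1, 2, 3], groups, dtw_threshold=0.0)
```

Comparing the query piece against group representatives rather than every window is what reduces the number of DTW evaluations in this sketch. The thresholds and window length used here are toy values, not the settings reported in the thesis.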

Keywords/Search Tags:

Sequence Data Query, Dynamic Time Warping, Subsequence, Time Series

Related items

1	Research On Similarity Measurement Method Of Time Series Data Based On Dynamic Time Warping
2	Implementation Of Dynamic Time Warping Algorithm Acceleration System Based On SoPC Platform
3	Time Series Similarity Search Based On Adaptive Cost Dynamic Time Warping Distance
4	Research Of Similarity Search Based On Dynamic Time Warping In Time Series Data Mining
5	The Research Of Similarity Search Based On Dynamic Time Warping In Time Series
6	Improving efficiency and effectiveness of dynamic time warping in large time series databases
7	A Number Of Issues In The Time Series Data Mining Research
8	Design And Application Of Prediction Model Based On Time Series Analysis Technique
9	Time Series Similarity Search Under Piecewise Dynamic Time Warping
10	Research On Series Data Similar Search Technology