Research On Uncertain Time Series Similarity Matching

Posted on:2013-01-27

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Zuo

Full Text:PDF

GTID:2218330371455858

Subject:Computer software and theory

Abstract/Summary:

A time series is a sequence of records arranged in chronological order. Similarity matching is one of the underlying operations for time series clustering, outlier detection, and pattern discovery. Currently, the study of time series similarity matching focuses mainly on deterministic data. With the development of the Internet of Things and of privacy protection technology, uncertain time series will appear in large numbers, and time series similarity matching faces new challenges. For uncertain time series, the distance between two sequences is itself uncertain, so similarity matching methods designed for deterministic time series cannot be applied directly.

To solve the problem of uncertain time series similarity matching, we establish a data model to describe uncertain time series. Under this model, the data point at each time slot is represented by a set of sample observations. Each sample has the same probability of occurrence (that is, the samples are uniformly distributed), and the different time points of a series are independent of one another. In this model, the true distance between two uncertain time series consists of a large number of possible distances, each with a certain probability. On the basis of this model, two algorithms are proposed for uncertain time series similarity matching: a-PRQ (mean method) and k-PRQ (cluster method).

(1) a-PRQ: Depending on whether the query sequence and the time series stored in the database are deterministic, uncertain time series similarity queries are divided into three types. Then, for each type, the mean (averaging) method extracts a deterministic sequence from each uncertain sequence to represent the original, and the query is answered with deterministic time series similarity matching.
(2) k-PRQ: This algorithm reduces computational complexity mainly through two-step pruning. 1) Clustering reduces the sample size: after clustering, distances are computed with each cluster as a unit, which greatly reduces the amount of computation. 2) For a given threshold ε, upper and lower bounds are precomputed, from which upper and lower bounds on the probability of the distance are obtained; these probability bounds filter out unnecessary calculations and further reduce the computational complexity.

Experiments show that both uncertain time series similarity matching algorithms perform well in terms of efficiency and accuracy.
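
The data model and the two ideas above can be illustrated with a small Python sketch. It is not the thesis's implementation: the function names (`mean_series`, `match_probability`, `distance_bounds`, `quick_decision`) are hypothetical, Euclidean distance is an assumption (the abstract does not name a distance measure), and the bound-based filter is a simplified stand-in for the k-PRQ pruning step that works on raw samples instead of clusters. An uncertain series is represented as a list of time slots, each holding equally likely samples; a deterministic series is the special case of one sample per slot.

```python
import itertools
import math


def mean_series(series):
    """Mean method: collapse each slot's samples to their average."""
    return [sum(slot) / len(slot) for slot in series]


def euclidean(a, b):
    """Euclidean distance between two deterministic series of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def slot_squared_diffs(q, s):
    """For each time slot, the squared difference of every sample pair."""
    return [[(x - y) ** 2 for x in q_slot for y in s_slot]
            for q_slot, s_slot in zip(q, s)]


def match_probability(q, s, eps):
    """Exact Pr[distance(q, s) <= eps] by enumerating all possible worlds.

    Samples are uniform and slots are independent, so every combination
    of sample pairs is equally likely. The cost is exponential in the
    series length, which is why pruning matters in practice.
    """
    worlds = hits = 0
    for combo in itertools.product(*slot_squared_diffs(q, s)):
        worlds += 1
        if sum(combo) <= eps * eps:
            hits += 1
    return hits / worlds


def distance_bounds(q, s):
    """Smallest and largest distance over all possible worlds."""
    diffs = slot_squared_diffs(q, s)
    lower = math.sqrt(sum(min(d) for d in diffs))
    upper = math.sqrt(sum(max(d) for d in diffs))
    return lower, upper


def quick_decision(q, s, eps):
    """Return 1.0 or 0.0 when the bounds settle the query, else None."""
    lower, upper = distance_bounds(q, s)
    if upper <= eps:
        return 1.0   # every possible distance is within eps
    if lower > eps:
        return 0.0   # no possible distance is within eps
    return None      # undecided: fall back to match_probability


# Uncertain query with two samples per slot; deterministic stored series.
q = [[0, 2], [0, 2]]
s = [[0], [0]]
mean_distance = euclidean(mean_series(q), mean_series(s))  # sqrt(2)
probability = match_probability(q, s, 2)                   # 0.75
```

In this toy case the mean method reports a single distance of about 1.41, which falls within ε = 2, while the full enumeration shows that only three of the four equally likely worlds do. `quick_decision(q, s, 3)` returns 1.0 without any enumeration, because even the largest possible distance (about 2.83) is below the threshold.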

Keywords/Search Tags:

uncertain time series, uncertain data model, similarity matching, probabilistic range query, time series distance

Related items

1	Research Of Key Issues On Similarity Matching For Uncertain Time Series
2	Research On Data Mining And Forecasting Methods Over Time Series Data With Complex Structure
3	Study On Similarity Query Over Time Series Data
4	Research And Analysis Of Uncertain Time Series Clustering Algorithm
5	Research On Dimensionality Reduction And Storage Methods Of Uncertain Time Series
6	Research Of Dimensionality Reduction And Similarity Matching For Uncertain Time Series
7	Representing And Real-time Probabilistic Query Processing To Uncertain Data With Complex Correlation
8	Research On The Similarity-Based Time Series Data Mining
9	Time Series Similarity, Aggregate Top-k Query Algorithms And Applications
10	A Classification Approach For Uncertain Time Series