Research Of Key Issues On Similarity Matching For Uncertain Time Series

Posted on: 2014-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: H H Wu
GTID: 2248330395480918
Subject: Computer application technology
Abstract/Summary:
Time series are an important class of problems in data mining, because a large share of the data collected in the real world is time-related. A time series is a set of values ordered chronologically. Time series similarity is a foundation of time series mining: it provides the necessary technical support for other mining tasks and is frequently used as a subroutine within them. Similarity matching for time series is already quite mature. However, with the development of information technology and the continued growth of real-world application demands, a special kind of data has gradually emerged, namely uncertain data, in applications such as wireless sensor networks, radio frequency identification (RFID) networks, moving-object tracking, weather radar networks, and privacy protection. When uncertain data is arranged in chronological order, it forms uncertain time series data.
Data collected by sensors in the real world is often uncertain, whereas today's popular time series similarity matching methods are built on deterministic data and do not consider uncertainty. Existing similarity matching methods are therefore not suitable for these areas. Research on uncertain time series is just getting started, and similarity matching over uncertain time series still has no effective solution. Through the efforts of many scholars, some excellent similarity search methods have been proposed, each in the context of a specific application, but none of them is recognized as an efficient matching method.
To address this problem, we pre-process the uncertain time series along two dimensions, horizontal and vertical, that is, the time dimension and the probability dimension.
First, given an uncertain time series, we compress it with the Haar wavelet transform. On this basis, we process the resulting uncertain time series vertically and propose two methods for electing a representative, the maximum probability method and the mean method, each of which selects a certain (deterministic) time series. After pre-processing, we perform dimensionality reduction and indexing on the generated certain time series. Depending on whether the query sequence and the time series in the database are certain or uncertain, we propose a similarity matching algorithm for each combination.
Finally, experiments are conducted in two ways: electing representatives directly, and electing representatives after wavelet compression. The experiments show that both processing algorithms are feasible, and recall and precision are reported as results. A comparison of query efficiency across different data volumes shows that the latter approach is more efficient.
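The two pre-processing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that an uncertain time series is stored as a list of timestamps, each holding `(value, probability)` pairs, and that the Haar transform is applied to a series whose length is a power of two. All function names are hypothetical, and the sketch elects the representative first and compresses second, because the abstract does not say how the wavelet transform is applied to uncertain values directly.

```python
import math


def max_probability_representative(samples):
    """Return the value with the highest probability at one timestamp."""
    return max(samples, key=lambda vp: vp[1])[0]


def mean_representative(samples):
    """Return the probability-weighted mean of the values at one timestamp."""
    total = sum(p for _, p in samples)
    return sum(v * p for v, p in samples) / total


def to_certain_series(uncertain, pick):
    """Collapse an uncertain series into a certain one, timestamp by timestamp."""
    return [pick(samples) for samples in uncertain]


def haar_transform(series):
    """Orthonormal Haar decomposition, coarsest coefficient first.

    Assumes len(series) is a power of two.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = list(series)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        level_details = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        # Coarser levels are computed later, so prepend them.
        details = level_details + details
    return approx + details


def haar_compress(series, k):
    """Keep only the first k (coarsest) Haar coefficients."""
    return haar_transform(series)[:k]


# Hypothetical uncertain series with four timestamps.
uncertain = [
    [(1.0, 0.2), (2.0, 0.8)],
    [(4.0, 0.3), (6.0, 0.7)],
    [(3.0, 1.0)],
    [(5.0, 0.6), (9.0, 0.4)],
]
by_max = to_certain_series(uncertain, max_probability_representative)
by_mean = to_certain_series(uncertain, mean_representative)
reduced = haar_compress(by_max, 2)
```

Because the transform is orthonormal, the Euclidean distance between two truncated coefficient vectors never exceeds the distance between the original series. That property is commonly relied on when reduced series are indexed for similarity search.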
Keywords/Search Tags:time series, uncertainty, matching, dimensionality reduction, Haar wavelet transform