
Research on Series Data Similarity Search Technology

Posted on: 2019-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Cui
Full Text: PDF
GTID: 2348330542956399
Subject: Computer application technology
Abstract/Summary:
Series data is widely used in medicine, economics and other disciplines, and series data mining has been successfully applied to medical diagnosis, financial data analysis, meteorological forecasting and astronomical observation. Time series data is typically massive and high-dimensional, so analyzing it efficiently is of great significance for revealing the laws by which things develop and for providing a basis for scientific decision-making. This paper studies two key technologies in series data mining: series similarity measurement and series similarity search. Its specific work and contributions are as follows.

(1) Similarity alignment algorithm for series based on an adaptive search window. This paper proposes a similarity alignment algorithm for series based on an adaptive search window (ADTW). The algorithm first uses the Piecewise Aggregate Approximation (PAA) strategy to obtain low-precision series and computes the alignment path on them. It then predicts the path deviation from the gradient variation in the low-precision distance matrix and uses that prediction to limit the scope of the path search window. Finally, ADTW raises the series precision step by step, each time correcting the path within the search window and computing a new search window. In this way ADTW quickly computes the dynamic time warping (DTW) distance and the warping path. Compared with FastDTW, ADTW improves computational efficiency by about 20% at the same measurement precision, and its time complexity is O(n).

(2) Series similarity search algorithm based on multi-level lower bound filtering. To address the low efficiency of series data similarity search, this paper proposes a similarity search algorithm based on a multi-level lower bound filter (Multi_LB). The algorithm selects multiple lower bound distance functions to form a multi-level filter that removes invalid series from the candidate set, and it dynamically adjusts the filtering order according to each filter's success rate to maintain high efficiency. Multi_LB avoids computing time-consuming lower bounds for invalid series that show distinct differences, and it reduces the computation spent on ineffective filtering. Experiments show that, compared with a search algorithm that uses a single lower bound filter, Multi_LB improves search efficiency by about 15% while guaranteeing the completeness of the search.
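The coarse-to-fine idea behind contribution (1) can be illustrated with a minimal Python sketch. It is not the ADTW algorithm itself: the abstract does not specify how the gradient-based deviation prediction works, so this sketch assumes a fixed `radius` around the projected low-precision path (closer to a single-level FastDTW) and uses only one refinement step. The function names `paa`, `dtw`, `expand_window` and `coarse_to_fine_dtw`, and the parameters `factor` and `radius`, are hypothetical.

```python
def paa(series, factor):
    """Piecewise Aggregate Approximation: replace each block of `factor`
    consecutive points by its mean (the last block may be shorter)."""
    blocks = (series[i:i + factor] for i in range(0, len(series), factor))
    return [sum(block) / len(block) for block in blocks]


def dtw(a, b, window=None):
    """DTW with absolute-difference cost, restricted to `window`, a set of
    allowed (i, j) cells (None means the full matrix).
    Returns (distance, warping path)."""
    n, m = len(a), len(b)
    if window is None:
        window = {(i, j) for i in range(n) for j in range(m)}
    cost = {}  # (i, j) -> (accumulated cost, predecessor cell)
    for i, j in sorted(window):  # row-major order: predecessors come first
        d = abs(a[i] - b[j])
        if i == 0 and j == 0:
            cost[i, j] = (d, None)
            continue
        prev = [(cost[p][0], p)
                for p in ((i - 1, j - 1), (i - 1, j), (i, j - 1))
                if p in cost]
        if prev:  # cells that cannot be reached inside the window are skipped
            best, p = min(prev)
            cost[i, j] = (best + d, p)
    # Backtrack from the last cell to recover the warping path.
    path, cell = [], (n - 1, m - 1)
    while cell is not None:
        path.append(cell)
        cell = cost[cell][1]
    return cost[n - 1, m - 1][0], path[::-1]


def expand_window(path, factor, radius, n, m):
    """Project a low-precision path onto the full-precision matrix and widen
    it by `radius` cells (assumption: fixed radius, not a gradient-based one)."""
    window = set()
    for ci, cj in path:
        for i in range(max(0, ci * factor - radius),
                       min(n, (ci + 1) * factor + radius)):
            for j in range(max(0, cj * factor - radius),
                           min(m, (cj + 1) * factor + radius)):
                window.add((i, j))
    return window


def coarse_to_fine_dtw(a, b, factor=4, radius=2):
    """Align the PAA series first, then refine inside the projected window."""
    _, coarse_path = dtw(paa(a, factor), paa(b, factor))
    window = expand_window(coarse_path, factor, radius, len(a), len(b))
    return dtw(a, b, window)
```

Because the refined search only visits cells inside the window, the distance it returns can never be smaller than the exact DTW distance. How close it gets depends on how well the window covers the true optimal path, which is the part the adaptive window in ADTW is meant to improve.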
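The cascade described in contribution (2) can be sketched in the same hedged way. The abstract does not name the lower bound functions used by Multi_LB, so this Python sketch assumes two simple ones: an O(1) first/last-point bound and an O(n) value-range bound. The re-ranking rule (highest observed prune rate first) is likewise an assumption, and all function names are hypothetical.

```python
def dtw_distance(a, b):
    """Exact DTW distance with absolute-difference cost, O(len(a) * len(b))."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j - 1], prev[j], cur[j - 1])
        prev = cur
    return prev[-1]


def lb_first_last(q, c):
    """O(1) lower bound: every warping path aligns first with first
    and last with last, so it pays at least the larger of the two costs."""
    return max(abs(q[0] - c[0]), abs(q[-1] - c[-1]))


def lb_range(q, c):
    """O(n) lower bound: each point of q is matched to some point of c,
    so it costs at least its distance to the interval [min(c), max(c)]."""
    lo, hi = min(c), max(c)
    return sum(max(lo - x, 0, x - hi) for x in q)


def range_search(query, candidates, eps, bounds=(lb_first_last, lb_range)):
    """Return indices of candidates whose DTW distance to `query` is <= eps.
    Candidates pass through a cascade of lower bounds before exact DTW."""
    stats = {lb: [0, 0] for lb in bounds}  # lb -> [times applied, times pruned]
    order = list(bounds)
    results = []
    for idx, cand in enumerate(candidates):
        pruned = False
        for lb in order:
            stats[lb][0] += 1
            if lb(query, cand) > eps:  # bound exceeds eps, so DTW does too
                stats[lb][1] += 1
                pruned = True
                break
        if not pruned and dtw_distance(query, cand) <= eps:
            results.append(idx)
        # Re-rank the filters: highest observed prune rate goes first.
        order.sort(
            key=lambda lb: stats[lb][1] / stats[lb][0] if stats[lb][0] else 0.0,
            reverse=True,
        )
    return results, stats
```

Since each filter is a true lower bound on the DTW distance, a candidate is only discarded when its exact distance is guaranteed to exceed `eps`. The result set therefore matches a brute-force scan, which is the completeness property the abstract refers to.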
Keywords/Search Tags: series data, similarity search, dynamic time warping, time series metric, lower bound distance