With the development of the economy, technology, and society, information technology is advancing rapidly, and a growing number of researchers are turning their attention to information and data. In the broad sense, a time series is a sequence of data ordered in time, or a sequence of data that varies in space, recorded against the same time scale or in the same distance metric space. Time series are one form of massive data and appear in many fields of everyday life, for example sequences of rising and falling stock prices, shopping mall data, product sales records, and patients' clinical characteristics. How to search large volumes of time series data quickly and efficiently, find sequences similar to a known sequence, and extract the implicit information and knowledge they contain is now an important topic.

Because time series are noisy, high-dimensional, and fluctuating, analyzing them quickly and efficiently and uncovering the latent information and relationships within them is an important problem. Time series mining is divided into two stages: the first is the pattern representation of the time series, and the second is data mining on the represented series. Pattern representation extracts the main features of the original time series and uses them to fit the original curve, yielding a new representation of the data. Sequence pattern mining is the deeper study that follows. This paper takes time series similarity analysis as its main thread and studies two aspects: the pattern representation of time series and the similarity measurement of time series.
The innovations and contributions are as follows:

(1) Pattern representation of time series based on information entropy

The piecewise linear representation of time series based on information entropy mainly aims to remove noise from the time series, reduce the fitting error, and provide an effective solution to the inaccuracy that interference in the data introduces into the analysis. Most traditional piecewise linear representation methods build the pattern representation directly from the differences between data points, and these methods cannot remove noise effectively. Experiments show that, compared with those earlier methods, the proposed piecewise linear representation based on information entropy is clearly superior in noise elimination and fitting error.

(2) An improved method of similarity analysis

For time series similarity, a longest common subsequence measure over marked important points is proposed, which effectively improves the speed and efficiency of similarity analysis. The concepts of the important-point sequence and the pointer matrix are introduced, together with a similarity comparison based on the cosine of the angle between vectors. The method uses the important-point sequence from the pattern representation and, based on the steering angle, combines the idea of piecewise average division, the longest common subsequence similarity principle, and the pointer matrix: the cosine of the steering angle between vectors gives a fast measure of the sequences, and the pointer matrix is used to analyze the similarity of the two sequences. Experiments show that the similarity analysis is fast, efficient, and effective.
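To make the ingredients named in contribution (2) concrete, the following is a minimal Python sketch. It is not the method of this paper. It assumes the important points are already given as (time, value) pairs, it matches steering-angle cosines within a hypothetical tolerance `eps`, and it stores only lengths in the dynamic-programming table rather than the pointer matrix described above. The entropy-based selection of important points and the piecewise average division are not attempted. All function names are illustrative.

```python
import math


def turning_cosines(points):
    """Cosine of the angle between consecutive segment vectors.

    `points` is a list of (time, value) pairs, assumed to be the
    important points of a series. One cosine is produced per interior point.
    """
    cosines = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ux, uy = x1 - x0, y1 - y0  # segment entering the point
        vx, vy = x2 - x1, y2 - y1  # segment leaving the point
        norm = math.hypot(ux, uy) * math.hypot(vx, vy)
        # Degenerate (zero-length) segments are treated as "no turn".
        cosines.append((ux * vx + uy * vy) / norm if norm else 1.0)
    return cosines


def lcss_length(a, b, eps=0.1):
    """Length of the longest common subsequence of two numeric sequences.

    Two elements match when they differ by at most `eps`
    (a hypothetical tolerance, not a value taken from the paper).
    """
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


def similarity(points_a, points_b, eps=0.1):
    """LCSS length over steering-angle cosines, normalized to [0, 1]."""
    cos_a = turning_cosines(points_a)
    cos_b = turning_cosines(points_b)
    if not cos_a or not cos_b:
        return 0.0
    return lcss_length(cos_a, cos_b, eps) / min(len(cos_a), len(cos_b))


zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]
line = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(similarity(zigzag, zigzag), similarity(zigzag, line))
```

Comparing cosines rather than raw values makes the match depend on the shape of the turns, which is one plausible reading of why a steering-angle measure would be fast: each important point contributes a single scalar to the subsequence comparison.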