
Time Series Data Mining Based On Large Margin Theory

Posted on: 2013-03-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Yu
Full Text: PDF
GTID: 1268330392967567
Subject: Control Science and Engineering

Abstract/Summary:
Time series are widely used in many fields, and sequence data analysis and mining have become active research topics of continuing interest. High dimensionality and features such as the temporal dependence among observations make it difficult to use information effectively in knowledge discovery from sequence data, so many traditional machine learning algorithms cannot readily obtain satisfactory results. Targeting these particularities of time series data, this dissertation applies the large margin theory from machine learning to time series data mining. The main contributions are as follows.

A sequential similarity measure is designed based on large margin theory. As a core problem in machine learning, similarity measurement directly determines the effectiveness of time series data mining algorithms. To handle the phase-shift phenomena commonly present in sequential samples, a dynamic time warping (DTW) similarity measure is designed based on large margin theory. Compared with the Euclidean or standard DTW distance, the matching strategy for sequence distortion is improved. To address the distance-instability phenomenon of high-dimensional data, the effectiveness of the distance measure is further optimized through norm learning.

A supervised feature extraction / data re-expression algorithm is designed based on the characteristics of sequence fragments. One of the difficulties in time series data mining is that the effective discriminative information is often hidden in local sequence fragments rather than the entire sequence; this phenomenon is common in sequence problems such as trajectories extracted from image edges. By contrasting the useful information of various fragments, several fragments with the largest discriminative capacity are selected to represent the entire sequence.
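To illustrate the baseline that the large-margin variant improves upon, a minimal dynamic time warping distance can be sketched as follows. This is the classic textbook DTW, not the dissertation's method; all function and variable names are my own.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # match step
    return cost[n][m]

# A phase-shifted copy of a sequence stays close under DTW
# even when the point-wise Euclidean distance is large.
x = [0, 0, 1, 2, 1, 0, 0]
y = [0, 1, 2, 1, 0, 0, 0]
print(dtw_distance(x, y))  # → 0.0
```

The warping path absorbs the one-step phase shift between `x` and `y`, which is exactly the kind of distortion the abstract describes.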
Compared with traditional methods, this fragment-based feature extraction / data re-expression method is especially suited to trajectories obtained from edges or other curve-derived sequence data, and it can improve classification accuracy, efficiency, and interpretability. The method is also compared with the well-known shapelet algorithm, and the classification performance of the model is verified experimentally.

A sequence coarse-graining algorithm is proposed based on large margin theory. The changing relationship between useful and useless information is studied during the transformation of sequence data from numeric values to symbols: although some useful information is lost in the transformation, useless information is also reduced significantly. A supervised discretization method for sequence data is proposed to improve classification accuracy and efficiency, which is also verified experimentally.

A sequential classification model is designed based on critical cases. In constructing the critical sample set, the utility of each sample is evaluated using large margin theory: the weights of samples that produce the largest hypothesis margin are increased, while the weights of outliers and redundant samples are decreased. This improves the generalization ability of the classification model, and reducing redundant training samples also improves its computational efficiency. The validity of the method is confirmed experimentally.
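The hypothesis margin mentioned above can be sketched in its standard nearest-hit/nearest-miss form: the distance from a sample to its closest differently-labeled neighbor minus the distance to its closest same-labeled neighbor. This is a generic illustration of the concept, not the dissertation's weighting scheme; all names are my own.

```python
def hypothesis_margin(x, label, samples):
    """Hypothesis margin of sample x: distance to the nearest miss
    (closest sample of a different class) minus distance to the
    nearest hit (closest *other* sample of the same class)."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    hits = [dist(x, s) for s, y in samples if y == label and s != x]
    misses = [dist(x, s) for s, y in samples if y != label]
    return min(misses) - min(hits)

data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"),
        ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
# A well-separated sample has a large positive margin; outliers
# and borderline samples have a small or negative one, which is
# the intuition behind up- or down-weighting them.
print(hypothesis_margin((0.0, 0.0), "a", data))
```

Samples whose margin is large are the "critical cases" that most constrain the decision boundary, while negative-margin points behave like outliers.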
Keywords/Search Tags: Time series data mining, large margin, dynamic time warping, feature segment, prototype selection