Decision Tree Generation Algorithm Based On Time Series Pairs

Posted on: 2018-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Xu
GTID: 2348330515950475
Subject: Software engineering
Abstract/Summary:
Time series data is a kind of high-dimensional data that is closely related to daily life and is widespread in commercial, medical, meteorological, and other fields. It is characterized by long time spans, ordered real values, and autocorrelation between data points. How to extract knowledge from a large number of complex time series, and how to forecast newly generated data based on that knowledge, is the main research content of time series data mining. Among the many time series classification algorithms that have been studied, those based on decision trees have strong decision analysis ability compared with other classification algorithms, do not rely on the assumption of a normal statistical distribution, and offer high classification accuracy and strong robustness. Depending on whether the time series data is completely labeled, this study discusses new time series classification algorithms based on decision trees and classifier ensemble techniques. The detailed results are as follows:

(1) Research on a decision tree generation algorithm based on time series pairs under supervised learning. This research proposes the concept of sequence entropy, based on the autocorrelation and misalignment between time series, to replace the information entropy used in traditional decision trees as the attribute selection criterion, and uses a sequence pair as the splitting attribute of the decision tree. A new time series decision tree classification algorithm (TSDT) is designed based on sequence entropy, sequence pairs, and Dynamic Time Warping (DTW). Moreover, with TSDT as the base classifier, a new ensemble classifier of time series decision trees (En-TSDT) is constructed using a dynamic classifier ensemble technique. Experiments on the UCR data sets show that, compared with the nearest neighbor classifier with DTW, the best time series classifier, En-TSDT improves the average F1 value by 1.47% and reduces the error rate by 9.80% on the experimental data sets. The experimental results show that the decision tree algorithm based on sequence entropy and sequence information gain can effectively overcome the shortcoming of traditional decision tree algorithms, which ignore the autocorrelation and misalignment of time series data, and improves the classification performance of decision trees on time series data.

(2) Research on a decision tree generation algorithm based on time series pairs under positive-unlabeled learning. The new algorithm is based on the positive-unlabeled decision tree algorithm (POSC4.5) and extends the characteristic attribute to the whole sequence. It uses the sequence pair with the highest information gain as the splitting attribute and builds the new positive-unlabeled time series decision tree (TSPOSC4.5) with this pair. The sequence pair is a random combination of positive data and negative data, where the negative data is the nearest neighbor set of the sample in the unlabeled set that has the farthest DTW distance to the positive set. Finally, the study proposes an ensemble classifier, En-TSPOSC4.5, based on TSPOSC4.5, which uses the mean value of the parameter estimates to reduce the influence of parameter estimation error on classification performance. On 16 UCR data sets that meet the positive-unlabeled condition, the experimental results show that, compared with the F1 values of the current best classifier, the positive-unlabeled Markov (PU Markov) time series classifier, and of the widely used positive 1-nearest neighbor classifier based on DTW distance, the En-TSPOSC4.5 classifier improves on average by 4.95% and 11.45%, respectively, under different labeled rates of positive samples. The results show that the positive-unlabeled time series ensemble decision tree algorithm based on sequence pairs has the strongest classification performance.
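
The idea of splitting a tree node on a sequence pair can be illustrated with a minimal Python sketch. It is not the thesis's TSDT or TSPOSC4.5 implementation. The abstract does not define sequence entropy, so the sketch falls back on standard Shannon information gain. The split rule (send each series to the side of the pair member that is nearer under DTW), the absolute-difference local cost, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with |x - y| as local cost (assumed)."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def entropy(labels):
    """Shannon entropy of a label list (not the thesis's sequence entropy)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def split_by_pair(series, labels, ref_a, ref_b):
    """Hypothetical split rule: a series goes left if it is at least as
    close to ref_a as to ref_b under DTW, otherwise right."""
    left, right = [], []
    for s, y in zip(series, labels):
        near_a = dtw_distance(s, ref_a) <= dtw_distance(s, ref_b)
        (left if near_a else right).append(y)
    return left, right


def information_gain(labels, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children


def best_pair(series, labels):
    """Try every pair of training series with different labels and return
    ((i, j), gain) for the pair with the highest information gain."""
    best = (None, -1.0)
    for i, j in combinations(range(len(series)), 2):
        if labels[i] == labels[j]:
            continue
        left, right = split_by_pair(series, labels, series[i], series[j])
        gain = information_gain(labels, left, right)
        if gain > best[1]:
            best = ((i, j), gain)
    return best


if __name__ == "__main__":
    # Toy data: the same bump appears at different positions within a class,
    # which DTW tolerates but a point-by-point comparison would not.
    toy_series = [[0, 0, 1, 0], [0, 1, 0, 0], [5, 5, 6, 5], [5, 6, 5, 5]]
    toy_labels = ["a", "a", "b", "b"]
    print(best_pair(toy_series, toy_labels))
```

Because DTW aligns the shifted bumps at zero cost, any pair drawn from the two toy classes separates them cleanly. An exhaustive pair search like this is quadratic in the number of training series, so a practical version would need to sample or prune candidate pairs.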
Keywords: time series classification, positive-unlabeled learning, decision tree algorithm, sequence entropy, ensemble learning