Time Series Classification (TSC) has long received close attention from both industry and academia, and it has a wide range of application scenarios. In recent years, time-sensitive applications have required not only accurate classification results but also timely ones. This type of problem is called Early Time Series Classification (ETSC). The core of early time series classification is that the data in all time intervals can be observed in the training phase, whereas in the testing phase highly reliable classification results must be output as early as possible. Although missing data is a common phenomenon, it is rarely discussed in existing early time series classification models, and missing values pose severe challenges to those models. This thesis proposes a dynamic early binary classification model for incomplete time series (DEBITS), which consists of two main parts: constructing a probabilistic classifier for incomplete time series data, and learning a decision region that ensures the reliability of early classification. Experiments are conducted on synthetic datasets, UCR datasets, and real datasets. The experimental results show that the proposed method achieves better performance in both the accuracy and the earliness of the classification results. The main work of this thesis can be summarized as follows:

(1) Designing a recursive construction method for a probabilistic classifier of incomplete time series: this method builds probabilistic classifiers using only the observed data in incomplete time series. This thesis takes the missing mechanism of incomplete time series data into account and theoretically proves the feasibility of the method under the conditions that the features are conditionally independent and the data are missing at random.

(2) Proposing a decision rule for judging the optimal time point of early classification: this rule is constructed from the log-posterior probability ratio. It not only addresses the problem of fitting the distribution of high-dimensional features in incomplete time series, but also makes early classification decisions directly and dynamically interpretable by plotting the decision regions. To handle continuous-valued features, this thesis proposes a feature discretization method based on weighted conditional quantiles. Moreover, this thesis proposes a generalization of the feature discretization method that can be applied when the features have missing values.
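The two components described above (a probabilistic classifier that uses only observed values, and a stopping rule based on the log-posterior ratio) can be illustrated with a minimal sketch. This is not the DEBITS implementation: the function name `early_classify`, the fixed thresholds `upper` and `lower`, and the toy Gaussian likelihood ratio are all assumptions made for illustration, whereas the thesis learns its decision region from data. Under conditional independence and a missing-at-random mechanism, a missing time point simply contributes nothing to the running log-posterior ratio.

```python
from typing import Callable, Optional, Sequence, Tuple


def early_classify(
    series: Sequence[Optional[float]],
    log_lik_ratio: Callable[[int, float], float],
    log_prior_ratio: float = 0.0,
    upper: float = 2.0,   # hypothetical threshold for deciding class 1
    lower: float = -2.0,  # hypothetical threshold for deciding class 0
) -> Tuple[int, int, float]:
    """Return (label, stop_time, log_posterior_ratio).

    `series` holds one value per time point; None marks a missing value.
    `log_lik_ratio(t, x)` is log p(x_t | class 1) - log p(x_t | class 0).
    """
    score = log_prior_ratio
    for t, x in enumerate(series):
        # Recursive update: only observed values change the ratio.
        if x is not None:
            score += log_lik_ratio(t, x)
        # Stop as soon as the ratio leaves the "keep waiting" band.
        if score >= upper:
            return 1, t, score
        if score <= lower:
            return 0, t, score
    # No early decision was reached: fall back to the sign of the ratio
    # (ties go to class 1 in this sketch).
    return (1 if score >= 0 else 0), len(series) - 1, score


# Toy likelihoods (an assumption): unit-variance Gaussians with mean +1 for
# class 1 and mean -1 for class 0, so the log-likelihood ratio is 2 * x.
def toy_ratio(t: int, x: float) -> float:
    return 2.0 * x


label, stop_time, score = early_classify([0.5, None, 0.4, 0.3, 1.0], toy_ratio)
print(label, stop_time)
```

In this toy run the missing value at the second time point leaves the ratio unchanged, and the decision is issued before the last observation arrives. Plotting the running ratio against the two thresholds gives a simple picture of a decision region.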