Research And Analysis Of Uncertain Time Series Clustering Algorithm

Posted on: 2019-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: X P Zhu
Full Text: PDF
GTID: 2428330596450380
Subject: Computer Science and Technology
Abstract/Summary:
As a special kind of big data, time series arise in many fields, such as economics, medical treatment, and speech recognition, and are a common form of data in daily life. Because data uncertainty is ubiquitous, researchers have begun to study how uncertainty affects the information that can be mined from data. This thesis studies the clustering of continuous uncertain time series and improves traditional static-data clustering algorithms to obtain algorithms better suited to uncertain time series.

First, regarding the representation of uncertain time series, the thesis compares the dynamic characteristics of time series with traditional static data and summarizes two representations: continuous uncertain time series, represented by probability density functions, and discrete uncertain time series, represented by probability values. Building on traditional static-data clustering algorithms, it then analyzes how different time series similarity measures and different initial centroid selection methods affect the clustering results. Using the concept of the minimum spanning tree, it improves the existing maximum-minimum initial centroid selection algorithm so that the selected initial centroids are more evenly distributed.

Second, the thesis improves the traditional uncertain data clustering algorithm UK-Means by using a probabilistic error function to represent the difference between the observed value and the true value of the time series at each time point. To handle the time-shift errors present in uncertain time series datasets, it proposes ULDTW, a dynamic time warping algorithm with a limited window width, to capture the complex similarity relationships among uncertain time series. It also replaces the traditional mean-based computation of cluster centers with a 1ToNCenter algorithm to improve clustering quality. Experiments indicate that the resulting UKMeansULDTW algorithm achieves a significantly higher ARI (Adjusted Rand Index) than the traditional UK-Means algorithm on uncertain time series.

Finally, to address the high computational complexity of ULDTW within UKMeansULDTW, the thesis combines UK-Means with agglomerative hierarchical clustering. Using the concept of micro-clusters, UK-Means based on the expected Euclidean distance first divides the uncertain time series into densely distributed micro-clusters; a hierarchical clustering algorithm based on ULDTW then gradually merges these micro-clusters, which reduces the computational cost of ULDTW. This method also reduces the dependence of the clustering on the initial centroids and makes the clustering results more stable.
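To illustrate the idea of dynamic time warping with a limited window width, the following is a minimal sketch of classic window-constrained DTW on ordinary (deterministic) sequences. It is not the ULDTW algorithm itself: the abstract does not specify how ULDTW incorporates the probabilistic error function, so the function name `windowed_dtw`, the squared-difference point cost, and the band handling are all assumptions made for illustration.

```python
import math


def windowed_dtw(a, b, window):
    """DTW distance between sequences a and b, restricted to |i - j| <= window.

    Illustrative only: plain DTW with a band constraint, using a squared
    difference as the point cost. Uncertainty is not modeled here.
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide, otherwise no path reaches the end.
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band around the diagonal are evaluated.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return math.sqrt(cost[n][m])


# Two copies of the same pulse, shifted by one time step.
x = [0, 0, 1, 2, 1, 0]
y = [0, 1, 2, 1, 0, 0]
print(windowed_dtw(x, y, 0))  # window 0 reduces to the Euclidean distance
print(windowed_dtw(x, y, 1))  # a window of 1 absorbs the one-step shift
```

With a window of 0 the measure reduces to the point-by-point Euclidean distance, while a small window tolerates small time shifts; limiting the window also bounds the number of cells evaluated, which is the usual motivation for constraining the warping path.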
Keywords/Search Tags:Uncertain time series, UK-Means clustering, Hierarchical clustering, DTW similarity, ARI