Research On Unsupervised Clustering Algorithm And Applications On Series Data Analysis

Posted on: 2017-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2428330569998887
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, as unlabeled data has become readily available, the need to analyze it has grown. Unsupervised clustering has been a hot topic in the research and application of machine learning algorithms for unlabeled data. Multiple kernel learning has had an important influence on clustering algorithms, and extreme learning machines for clustering and classification have also emerged in recent years. Based on multiple kernel learning and the extreme learning machine, this paper proposes clustering algorithms for data with different characteristics. To handle partially missing data and scarce feature representations, an extreme learning machine based multiple kernel k-means algorithm with diversity-induced regularization is proposed. To handle noise and redundant information, this paper proposes a multiple kernel learning algorithm via low-rank and matrix-induced regularization. Because the two algorithms have different objective functions, this paper proposes a separate iterative optimization algorithm for each, and verifies that both converge well and remain effective over a wide range of parameter settings. The proposed algorithms outperform both classical clustering algorithms and recent state-of-the-art ones.

A common form of unlabeled data is series data: measurements recorded continuously over a period of time. Analysis methods differ considerably across types of series data. In this paper, we tailor the clustering algorithms to the characteristics of each type in order to apply them to series data. For the circuit series data, we extract the trend of each series, determine the similarity between the data, cluster the data with the extreme learning machine based multiple kernel k-means algorithm with diversity-induced regularization, and then visualize the dimensionality-reduced series data and each sample. Compared with shapelets and single-kernel spectral clustering methods, the visualization results show that similar data are densely grouped and that clustering the circuit series data is of great help to data analysis. For the sound series data, we extract Mel-frequency cepstral coefficients (MFCCs) and the spectrogram as features, construct the kernel matrices, and cluster the data with the multiple kernel learning algorithm via low-rank and matrix-induced regularization. The clustering performance shows that our algorithm groups the sound series data into the corresponding classes better than other algorithms.
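
As a rough illustration of the "construct the kernel matrices, then cluster" step described in the abstract, the Python sketch below combines several RBF kernels with fixed uniform weights and runs plain kernel k-means on the combined kernel. It is not the thesis's algorithm: the extreme learning machine features, the diversity-induced regularization, the low-rank and matrix-induced regularization, and any learning of the kernel weights are all omitted. The function names (rbf_kernel, combine_kernels, kernel_kmeans), the RBF bandwidths, the farthest-first initialization, and the toy data are assumptions made for this example only.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """RBF (Gaussian) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def combine_kernels(kernels, weights=None):
    """Weighted sum of base kernels. Uniform weights are an assumption here;
    multiple kernel learning methods would learn these weights instead."""
    kernels = [np.asarray(K, dtype=float) for K in kernels]
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * K for w, K in zip(weights, kernels))


def farthest_first_labels(K, n_clusters):
    """Deterministic initialization: choose centers that are far apart in the
    kernel-induced feature space, then assign each sample to its nearest center."""
    diag = np.diag(K)
    centers = [0]
    d = diag + diag[0] - 2.0 * K[:, 0]  # squared distance to the nearest center
    for _ in range(1, n_clusters):
        nxt = int(np.argmax(d))
        centers.append(nxt)
        d = np.minimum(d, diag + diag[nxt] - 2.0 * K[:, nxt])
    dist = diag[:, None] + diag[centers][None, :] - 2.0 * K[:, centers]
    return dist.argmin(axis=1)


def kernel_kmeans(K, n_clusters, n_iter=50):
    """Lloyd-style kernel k-means using only the kernel matrix K."""
    n = K.shape[0]
    diag = np.diag(K)
    labels = farthest_first_labels(K, n_clusters)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            m = int(mask.sum())
            if m == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mean_c||^2 expanded in terms of kernel entries
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): two well-separated groups of four 2-D points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
K = combine_kernels([rbf_kernel(X, gamma) for gamma in (0.05, 0.5)])
labels = kernel_kmeans(K, n_clusters=2)
print(labels)
```

On real series data, the base kernels would be built from the extracted features (for example, one kernel per feature type) rather than from raw 2-D points, and the fixed uniform weights would be replaced by whatever weighting scheme the chosen multiple kernel method prescribes.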
Keywords/Search Tags: Clustering, Extreme Learning Machine, Multiple Kernel Learning, Series Data