Research On Panel Data Clustering Method Based On Dimension Reduction

Posted on: 2022-06-14 | Degree: Master | Type: Thesis
Country: China | Candidate: T T Tang | Full Text: PDF
GTID: 2518306521952429 | Subject: Statistics
Abstract/Summary:
Dimension reduction is a common approach to panel data clustering. The panel data are first reduced to cross-sectional data or time series data, and a suitable clustering algorithm is then chosen for the cluster analysis. This approach is easy to understand and widely applicable, and it lets the clustering method be chosen according to the purpose of the clustering. Principal component analysis and feature extraction are the two dimension reduction methods most commonly used in panel data clustering. However, when the data contain outliers, principal component analysis cannot effectively identify and handle them, which affects the accuracy of the extracted time series. Feature extraction can effectively avoid the impact of outliers, but the analytic hierarchy process (AHP) used to weight the features is sensitive to the weights assigned by experts and lacks an objective basis, which leads to an unsatisfactory dimension reduction result. Building on existing dimension reduction methods, this thesis analyzes the shortcomings of principal component analysis and feature extraction and proposes two improved panel data clustering methods based on dimension reduction.

First, in panel data clustering based on dynamic time warping, the influence of outliers on the dynamic time warping result is not considered when the panel data are reduced by principal components, so the resulting time series are not robust. A robust dynamic time warping panel data clustering method is proposed, which removes the influence of outliers on the dynamic time warping result by introducing robust statistics and thereby improves the clustering of the panel data. Using population data for 31 provinces and autonomous regions, the thesis carries out an empirical analysis of the robust dynamic time warping method, obtains a population clustering of the provinces and autonomous regions, and compares the result with that of the method before improvement.

Second, in panel data clustering based on feature extraction, the extracted features are weighted by AHP, which is easily affected by the experts' weights and causes the clustering result to deviate from the actual situation. Drawing on Bayesian ideas, the AHP weighting problem is transformed into a probability problem for calculating the weights, and a panel data clustering method based on Bayesian-modified feature extraction is proposed. This method allows researchers to express their weight preferences for the practical problem at hand while improving the rationality and effectiveness of the weights. The empirical analysis shows that feature extraction with the Bayesian correction gives a better clustering result.

Finally, the applicability of the two improved clustering methods is examined by cross-testing them on different data sets. The empirical results show that the robust dynamic time warping method is better suited to panel data with a small amount of data and weak volatility, while the Bayesian-modified feature extraction method is suited to panel data with a large amount of data and strong volatility.
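The idea behind robust dynamic time warping can be illustrated with a minimal Python sketch. It is not the thesis's method: the abstract does not say which robust statistics are used, so the median/MAD standardization, the clipping threshold `clip=3.0`, and the function names `robust_standardize`, `dtw_distance`, and `pairwise_dtw` are all assumptions made for illustration. The sketch also assumes each individual (for example, a province) has already been reduced to a single time series.

```python
from statistics import median


def robust_standardize(series, clip=3.0):
    """Center by the median, scale by the MAD, and clip extreme values.

    Assumption: median/MAD with clipping stands in for the unspecified
    robust statistics; 1.4826 makes the MAD comparable to a standard
    deviation under normality.
    """
    med = median(series)
    mad = median([abs(x - med) for x in series])
    scale = 1.4826 * mad if mad > 0 else 1.0
    return [max(-clip, min(clip, (x - med) / scale)) for x in series]


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def pairwise_dtw(series_list):
    """Symmetric matrix of DTW distances between robustly standardized series."""
    scaled = [robust_standardize(s) for s in series_list]
    k = len(scaled)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(scaled[i], scaled[j])
    return dist
```

The resulting distance matrix could then be passed to any clustering algorithm that accepts precomputed distances, such as hierarchical clustering. Clipping bounds the contribution of a single extreme observation to the warping cost, which is one simple way to limit the influence of outliers.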
Keywords/Search Tags: panel data, robust statistics, feature extraction, Bayes' theorem, cluster analysis