
Research And Application Of Panel Data Clustering Based On Feature Extraction

Posted on: 2019-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Dai
Full Text: PDF
GTID: 2428330626450174
Subject: Statistics
Abstract/Summary:
In the era of big data, new types of data emerge in an endless stream, and panel data has attracted considerable attention and active research. As a technique from multivariate statistical analysis and data mining, clustering frequently appears in the preprocessing and basic analysis of panel data. Because the background and emphasis of real problems differ, existing panel data clustering methods may fail. Extracting the data features that suit the actual problem and clustering on those features makes the clustering method more targeted and more effective. Based on the characteristics of actual data, this thesis studies the panel data clustering problem and proposes two panel data clustering methods suitable for different types of data. Empirical analysis, which tests the application effect and adaptability of the two methods, shows that both achieve good results. Specifically, the main content of this thesis consists of the following four parts.

Firstly, the development history and research status of panel data clustering methods are reviewed, and the research route adopted in this thesis is designed. Panel data types and data standardization methods are summarized, and the principles of principal component analysis, wavelet analysis, the entropy method, hierarchical clustering and other related techniques are presented.

Secondly, an improved feature-extraction clustering method for panel data (the PCA clustering method) is proposed. First, principal component analysis is used to perform a second extraction on the indicator features extracted in previous work. These features are then weighted by the entropy method, and the weighted features are clustered by hierarchical clustering. The validity of the method is verified on panel data from the real estate industry.

Thirdly, a clustering method for panel data based on wavelet feature extraction (the WLT clustering method) is put forward. The panel data is first reduced to time series data by the principal component method. On this basis, important features of the time series are extracted using wavelet theory. These features are weighted, and the weighted features are clustered by hierarchical clustering. An application to stock panel data shows the method to be effective.

Fourthly, the applicability of the PCA clustering method and the WLT clustering method is tested by cross-control experiments on different data. The experiments show that the WLT clustering method is more suitable for multi-indicator panel data with longer periods and frequent fluctuations, while the PCA clustering method is more suitable for multi-indicator panel data with shorter periods, infrequent fluctuations and a small amount of missing data.
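The two pipelines described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the first-stage indicator features, the wavelet family, the wavelet features, or the linkage rule. The sketch assumes a panel stored as an array of shape (entities, periods, indicators), uses per-indicator mean, standard deviation and mean first difference as stand-in features, a hand-written Haar transform with log-energies per level as the wavelet features, a pooled first principal component to reduce the panel to one series per entity, and Ward linkage. All function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def pca_scores(X, n_components):
    """Scores of the rows of X on the leading principal components."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardize columns
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T


def minmax(F):
    """Scale each column of F to [0, 1]."""
    return (F - F.min(axis=0)) / (F.max(axis=0) - F.min(axis=0) + 1e-12)


def entropy_weights(F):
    """Entropy-method weights: columns with lower entropy get larger weight."""
    Z = minmax(F) + 1e-6               # small shift keeps proportions positive
    P = Z / Z.sum(axis=0)
    E = -(P * np.log(P)).sum(axis=0) / np.log(F.shape[0])
    d = 1.0 - E                        # degree of diversification
    return d / d.sum()


def weighted_hierarchical(F, k):
    """Weight the features by entropy, then cut a Ward dendrogram into k clusters."""
    Z = linkage(minmax(F) * entropy_weights(F), method="ward")
    return fcluster(Z, t=k, criterion="maxclust")


def haar_log_energies(series, levels):
    """Log-energy of Haar detail coefficients per level, plus the final approximation."""
    a = np.asarray(series, dtype=float)
    energies = []
    for _ in range(levels):
        a = a[: len(a) // 2 * 2]       # drop a trailing sample if length is odd
        approx = (a[0::2] + a[1::2]) / np.sqrt(2)
        detail = (a[0::2] - a[1::2]) / np.sqrt(2)
        energies.append(np.sum(detail ** 2))
        a = approx
    energies.append(np.sum(a ** 2))
    return np.log(np.array(energies) + 1e-12)


def pca_clustering(panel, k, n_components=2):
    """PCA-style pipeline: summary features -> PCA -> entropy weights -> clustering."""
    feats = np.concatenate(
        [panel.mean(axis=1), panel.std(axis=1), np.diff(panel, axis=1).mean(axis=1)],
        axis=1,
    )  # assumed stand-ins for the first-stage indicator features
    return weighted_hierarchical(pca_scores(feats, n_components), k)


def wlt_clustering(panel, k, levels=3):
    """WLT-style pipeline: PCA to one series per entity -> wavelet features -> clustering."""
    n, T, p = panel.shape
    series = pca_scores(panel.reshape(n * T, p), 1).reshape(n, T)
    feats = np.array([haar_log_energies(s, levels) for s in series])
    return weighted_hierarchical(feats, k)
```

Both functions return one integer cluster label per entity. In practice a wavelet library and a data-driven choice of the number of components and clusters would replace the fixed defaults used here.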
Keywords/Search Tags: panel data, principal component analysis, wavelet analysis, entropy method, cluster analysis