
The Application Of Clustering Analysis Based On Principal Component Analysis And Rough Set In Financial Index Data

Posted on: 2013-09-19, Degree: Master, Type: Thesis
Country: China, Candidate: S Y Tao
GTID: 2248330395459468, Subject: Software engineering
Abstract/Summary:
The China Statistical Yearbook is an authoritative statistical publication compiled by the National Bureau of Statistics of China. It records data on China's population, energy, economy, and other areas that reflect the state of Chinese society. These data give a macroscopic view of China's economic situation, and summarizing past economic conditions provides a basis for assessing future economic development.

This thesis focuses on Part Eleven of the China Statistical Yearbook 2010, which gives a general overview of cities. It covers 36 cities in total (the provincial capitals and the cities specifically designated in the state plan) and lists 22 economic indicators for them. The aim of the thesis is to study these indicators and to cluster the 36 cities on the basis of them.

Cluster analysis is a common data analysis method in data mining. It groups objects with the same or similar properties into the same class, which helps extract useful information from data, and it is widely applied in everyday life. Principal component analysis is a dimensionality reduction method; here it is used to reduce the dimensionality of the sample data, which has 22 economic indicators. The upper and lower approximation sets of rough set theory address the problem that cluster boundaries are not clear-cut in cluster analysis.

Combining principal component analysis, rough sets, and cluster analysis, this thesis proposes a clustering model. The model reduces dimensionality with principal component analysis and uses the rough set upper and lower approximation sets to handle the unclear boundaries of traditional clustering methods.
The model applies principal component analysis and rough set ideas to cluster analysis, and uses the three methods together to cluster the China Statistical Yearbook data. Specifically, principal component analysis is performed first, and its output is then clustered with a rough-set-based clustering algorithm. The model includes the following steps:

(1) Principal component analysis is applied to the data for the 36 cities, each described by 22 economic indicators, to reduce dimensionality.
(2) Rough-set-based cluster analysis is applied to the output of step (1), yielding the cluster centers of the upper and lower approximation sets and the clustering results, including the upper and lower approximation sets.
(3) The results of the model are compared with those of traditional clustering methods; the comparison shows the advantages of combining principal component analysis with rough sets and verifies the validity of the model.
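Steps (1) and (2) above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say which rough clustering variant is used, so the sketch swaps in a ratio-threshold rough k-means, one common formulation. The function names, the 85% variance cutoff, `threshold`, `w_lower`, and the number of clusters are all hypothetical choices, and the 36 x 22 matrix is synthetic placeholder data, not the yearbook figures.

```python
import numpy as np


def pca_reduce(X, var_kept=0.85):
    """Standardize the columns, then project onto the leading principal
    components that together explain at least `var_kept` of the variance."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The correlation matrix is symmetric, so eigh is appropriate.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]  # sort components by variance, descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(eigvals))
    return Z @ eigvecs[:, :k]


def assign(Y, centers, threshold):
    """Return boolean (n_objects, n_clusters) masks for the lower and
    upper approximations of each cluster."""
    dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
    d_min = dist.min(axis=1, keepdims=True)
    # An object is "close" to every cluster whose distance is within
    # `threshold` times the distance to its nearest center.
    upper = dist <= threshold * d_min
    certain = upper.sum(axis=1) == 1  # close to exactly one cluster
    lower = upper & certain[:, None]
    return lower, upper


def rough_kmeans(Y, n_clusters=3, threshold=1.2, w_lower=0.7, n_iter=50, seed=0):
    """Ratio-threshold rough k-means (an assumed variant, see lead-in)."""
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        lower, upper = assign(Y, centers, threshold)
        new_centers = centers.copy()
        for j in range(n_clusters):
            low = Y[lower[:, j]]
            boundary = Y[upper[:, j] & ~lower[:, j]]
            if len(low) and len(boundary):
                # Weighted mix of certain members and boundary members.
                new_centers[j] = (w_lower * low.mean(axis=0)
                                  + (1 - w_lower) * boundary.mean(axis=0))
            elif len(low):
                new_centers[j] = low.mean(axis=0)
            elif len(boundary):
                new_centers[j] = boundary.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    lower, upper = assign(Y, centers, threshold)
    return centers, lower, upper


# Synthetic stand-in for 36 cities x 22 indicators (NOT the yearbook data):
# three latent factors plus noise, so the indicators are correlated.
rng = np.random.default_rng(42)
X = rng.normal(size=(36, 3)) @ rng.normal(size=(3, 22)) \
    + 0.3 * rng.normal(size=(36, 22))

Y = pca_reduce(X)                          # step (1): dimensionality reduction
centers, lower, upper = rough_kmeans(Y)    # step (2): rough clustering
boundary = upper.sum(axis=1) >= 2          # objects without a certain cluster
```

In this sketch an object belongs to the lower approximation of a cluster only when no other center is comparably close; otherwise it is placed in the upper approximations of all nearby clusters, which is how the unclear boundary is represented. Step (3), the comparison with a traditional method, could then be done by running an ordinary k-means on `Y` and comparing its hard assignments with `lower` and `boundary`.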
Keywords/Search Tags: Rough Set, Clustering Analysis, Principal Component Analysis, Dimension Reduction, SPSS