The Application Of Cluster Analysis And Principal Component Regression In Industrial Statistics Data

Posted on: 2015-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: J L Liu
Full Text: PDF
GTID: 2268330428985565
Subject: Software engineering
Abstract/Summary:
In the current environment, the world economy is undergoing deep adjustment, and the conditions for development at home and abroad are very complex. World economic growth is slowing, while national macro-control policies are being implemented to sustain economic development. Industry plays a key role in this: it is the driving force behind a country's development and take-off, and an important part of the foundation of the national economy.

This year, China's industrial development faces an important opportunity. Judging by current economic indicators, China's employment situation is basically stable, the overall price level is basically flat, and the economy has maintained a steady pace of growth.

The data for this thesis come from the "2013 China Statistical Yearbook". The China Statistical Yearbook summarizes economic data for the three decades since reform and opening up, organized by time, by region, and by other dimensions, to support the study of China's economy.

Data mining is a comprehensive, interdisciplinary field that combines knowledge from mathematics, probability, databases, biology, and other disciplines. It has a wide range of applications in areas such as economics, mathematics, biology, and the sciences.

By studying the industrial and consumption statistics in the "China Statistical Yearbook", this thesis establishes two models.

The first model is a clustering model. Cluster analysis is a commonly used data mining technique, and K-means is a classical clustering algorithm.
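The abstract does not say how the K-means step was implemented, so the following is only a minimal sketch of the standard algorithm (Lloyd's iteration) in plain Python. The function name `kmeans`, the region names R1 to R6, and the two indicator values per region are all hypothetical placeholders, not yearbook data.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Plain K-means (Lloyd's algorithm) on a list of numeric tuples.

    Hypothetical helper for illustration; returns (labels, centroids).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return labels, centroids

# Invented indicator values for six placeholder regions (not real data).
regions = {
    "R1": (1.0, 1.2), "R2": (0.9, 1.0), "R3": (1.1, 0.8),
    "R4": (5.0, 5.2), "R5": (5.1, 4.9), "R6": (4.8, 5.0),
}
labels, centroids = kmeans(list(regions.values()), k=2)

# Collect region names per cluster label.
groups = {}
for name, lab in zip(regions, labels):
    groups.setdefault(lab, []).append(name)
```

With real indicators measured on different scales, the columns would normally be standardized before clustering, since K-means relies on Euclidean distance.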
This thesis applies K-means clustering to the industrial statistics in the "2013 China Statistical Yearbook", clusters the country's 31 provinces and municipalities, and analyzes the clustering results against the actual state of industrial development in different regions.

The second model is a principal component regression model. Principal component analysis is a dimension reduction method that uses a linear transformation to convert multiple variables into a small number of principal components. Multiple linear regression builds a regression model from the linear relationship between multiple independent variables and a dependent variable. This thesis combines multiple linear regression and principal component analysis into a principal component regression model and applies this model to the data in the China Statistical Yearbook.

Industrial development has a certain impact on society and on per capita consumption levels, so the principal component regression model is used to study the relationship between industrial statistics and per capita consumption levels. To build the multiple regression model between the per capita consumption level and the principal components of the industrial statistics, principal component analysis is first performed on the industrial statistics data, and the per capita consumption level is then regressed on the resulting principal components using multiple linear regression. The principal component regression results are compared with those of other regression methods, and principal component regression is found to give better results.
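The two-stage procedure described above (principal component analysis first, then multiple linear regression on the components) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: it assumes NumPy is available, the names `pcr_fit` and `pcr_predict` are hypothetical, and the data are randomly generated stand-ins for industrial indicators and per capita consumption.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: PCA on standardized X, then OLS on the scores.

    Hypothetical helper; assumes no column of X is constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std  # standardize each indicator
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components].T      # shape (p, n_components)
    scores = Z @ components               # principal component scores
    # Ordinary least squares of y on an intercept plus the scores.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {
        "mean": mean,
        "std": std,
        "components": components,
        "intercept": coef[0],
        "gamma": coef[1:],                # coefficients on the components
    }

def pcr_predict(model, X):
    """Predict y for new rows of X using a fitted PCR model."""
    Z = (np.asarray(X, dtype=float) - model["mean"]) / model["std"]
    return model["intercept"] + Z @ model["components"] @ model["gamma"]

# Synthetic stand-in data: 31 rows, 4 correlated indicators (not real data).
rng = np.random.default_rng(0)
base = rng.normal(size=(31, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.05 * rng.normal(size=31),  # nearly collinear with column 0
    base[:, 1],
    base[:, 1] + 0.05 * rng.normal(size=31),  # nearly collinear with column 2
])
y = 10.0 + 2.0 * base[:, 0] - 1.0 * base[:, 1] + 0.1 * rng.normal(size=31)

model = pcr_fit(X, y, n_components=2)
y_hat = pcr_predict(model, X)
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Regressing on a few components instead of the raw indicators avoids the unstable coefficients that strongly correlated predictors produce in ordinary multiple regression. If all components are kept, the fitted values coincide with those of ordinary least squares on the original variables.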
Keywords/Search Tags: cluster analysis, principal component analysis, multiple linear regression, principal component regression