Improvement And Empirical Study Of K-means Clustering Algorithm On Panel Data Analysis

Posted on:2016-03-18

Degree:Master

Type:Thesis

Country:China

Candidate:X Gao

Full Text:PDF

GTID:2308330470451818

Subject:Statistics

Abstract/Summary:

With the continuing development of science and technology and the steady expansion of industry databases, big data has gradually come into public view in recent years. Big data is complex: it is a collection of multidimensional variables and of samples with diverse characteristics. It attracts growing research interest because of the complex and novel information and knowledge it can yield. A key point in big data research is the continued development and refinement of data mining technology. In the final analysis, the value of big data lies in the knowledge it offers rather than in the data itself. The main tasks of big data mining are description and prediction. Because clustering models serve both descriptive and predictive functions in data mining, they play an important role in the pre-processing step and the class-division step. This thesis therefore takes the clustering model in data mining as its main subject and sets out the ideas and methods of the data mining procedure for the reader.

Panel data are continuous in the time dimension. To account for this, the thesis proposes an improved clustering method based on traditional k-means clustering. The method makes three changes: it defines a new similarity index between objects that combines the time and spatial dimensions into an overall similarity; it splits the clustering procedure along the time dimension of the samples and reports each object's class assignment in each period; and, following the membership principle, it calculates the weight with which an object belongs to each class, reflecting the likelihood of that membership. These changes are designed to avoid the loss of the time dimension that affects earlier clustering methods, and thus to obtain a better and more objective result. Compared with the traditional clustering method, which considers only the spatial development trend of the samples, the improved method considers both the time and the spatial development trends, so it is better suited to panel data.

Chapter 1 briefly introduces big data and data mining and the research significance of clustering for panel data. Chapter 2 briefly describes multivariate cluster analysis, including its underlying ideas, principles, and steps. Chapter 3, the core of the thesis, presents the improved multivariate clustering model for panel data. Finally, the method is applied to an empirical analysis of listed companies' share data and compared with traditional clustering methods in several respects. After verification, the results obtained by the improved method are superior to those of the traditional method.
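
The abstract does not give the formulas behind the three changes, so the following Python sketch is only one possible reading of them, not the thesis's actual method. Everything specific in it is an assumption: the function name `panel_kmeans`, the mixing parameter `alpha`, the use of period-to-period increments as the "time" part of the similarity, the plain per-period mean as the centroid update, and the definition of a membership weight as the share of periods in which an object is assigned to a class.

```python
import numpy as np

def panel_kmeans(X, k, alpha=0.5, n_iter=50, init_idx=None, seed=0):
    """Hypothetical panel-data k-means sketch.

    X has shape (n_objects, n_periods, n_features).
    Returns per-period labels (n_objects, n_periods) and
    membership weights (n_objects, k).
    """
    X = np.asarray(X, dtype=float)
    n, T, _ = X.shape
    if init_idx is None:
        init_idx = np.random.default_rng(seed).choice(n, size=k, replace=False)
    # Each centroid is a whole trajectory, seeded from one object (assumption).
    C = X[np.asarray(init_idx)].copy()              # (k, T, p)

    # "Time" part: change from the previous period (zero for the first period).
    dX = np.diff(X, axis=1, prepend=X[:, :1, :])
    labels = np.full((n, T), -1)
    for _ in range(n_iter):
        dC = np.diff(C, axis=1, prepend=C[:, :1, :])
        # "Spatial" part: distance between levels in the same period.
        level = np.linalg.norm(X[:, None] - C[None], axis=-1)    # (n, k, T)
        trend = np.linalg.norm(dX[:, None] - dC[None], axis=-1)  # (n, k, T)
        dist = alpha * level + (1 - alpha) * trend  # assumed overall similarity
        new_labels = dist.argmin(axis=1)            # class per object and period
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Simple update: per-period mean of the assigned objects.
        for j in range(k):
            for t in range(T):
                members = labels[:, t] == j
                if members.any():
                    C[j, t] = X[members, t].mean(axis=0)
    # Membership weight: fraction of periods spent in each class (assumption).
    weights = np.stack([(labels == j).mean(axis=1) for j in range(k)], axis=1)
    return labels, weights
```

Calling `labels, weights = panel_kmeans(X, k=3)` on an array of, say, companies by periods by indicators returns one class label per object and period, plus one row of weights per object that sums to 1. An object that changes class over time gets fractional weights, which is the kind of information a single overall k-means label would hide.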

Keywords/Search Tags:

Big data and data mining, Clustering model, K-means clustering algorithm, Stock

Related items

1	Semi-supervised K-means Clustering Algorithm In Data Mining
2	Research And Implementation Of Stock Volatility Prediction Based On Improved K-Means Algorithm
3	No Default Categories For Large Amount Of Data Clustering Algorithm Research
4	Data Mining Research And Application Of Orders In The Tobacco Industry CRM
5	The Research Of Clustering Data Mining Based On Swarm Intelligence Algorithm
6	The Application Of Data Mining And Wavelet Theory In Stock Market
7	Clustering Data Mining Applications In Department Store And K-means Clustering Algorithm Improvement
8	Research And Improvement Of K-means Clustering Algorithm
9	Research Of Improving For K-means Clustering Algorithm
10	Research On Ensemble-Initialized K-Means Clustering Algorithms