
Application of the K-Means Algorithm in Microblog Data Mining

Posted on: 2017-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Yang
Full Text: PDF
GTID: 2278330485952972
Subject: Computer Science and Technology
Abstract/Summary:
In the twenty-first century, microblogging has become an essential part of people's lives, and it has developed very rapidly. As a new form of social media centered on following and sharing information, it is characterized by fast publication, diversity, and short content. These characteristics meet the demand for information that is accurate, real-time, and diverse, which is why microblogs are so popular. People can share and follow their favorite things on a microblog at any time, and they follow different friends and information according to their own habits. What a user follows therefore reveals that user's interests and concerns. Different users have different tastes and, as the saying goes, "birds of a feather flock together," so the shared preferences of a group of users can be used for promotion and marketing. Microblog data is enormous, so finding the needed data quickly and effectively is of great importance. Data mining extracts valuable information from huge data sets. The mathematical algorithms it relies on are mature and widely applied in industries such as telecommunications, finance, and other fields, but many problems remain unsolved in mining the interest groups of Weibo users. In this thesis, mathematical analysis and data mining methods are applied to microblog data to explore the hobbies and habits of users, in the hope that data mining can be applied to the study of Weibo and offer new ideas and approaches for analyzing Weibo data. Sina Weibo is selected as the object of study, and cluster analysis is performed to mine the interest groups of its users. The process first visualizes the microblog data so that its distribution can be seen clearly, which in turn guides data preprocessing.
Because the volume of Weibo data used here is very large, and most of it has three or more dimensions, evaluating the microblog data visually is fairly complex. This thesis uses the k-means algorithm for the cluster analysis of the Weibo data. However, when applied to Sina Weibo data, the traditional k-means algorithm is sensitive to the choice of initial cluster centers, and its iterative solution process easily falls into a local optimum. To address these shortcomings, particle swarm optimization (PSO) is introduced into the k-means algorithm. The resulting PSO-kmeans algorithm remains relatively simple and has few parameters to set. It accelerates convergence, reduces the influence of the initial cluster centers, and escapes local optima, thereby improving the clustering result. Finally, three different evaluation metrics are used to assess the clustering results on the microblog data, and they show that the improved PSO-kmeans algorithm clusters better than the traditional k-means algorithm.
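The abstract does not give the exact formulation of PSO-kmeans, so the following is only a minimal sketch of one common way to combine the two methods, not the thesis's implementation. It assumes that each particle encodes a full set of k cluster centers, that fitness is the within-cluster sum of squared errors (SSE), that the standard inertia-weight PSO update is used, and that the best particle found is then refined with ordinary k-means iterations. The function names and all parameter values (`w`, `c1`, `c2`, `n_particles`, `pso_iters`, `iters`) are illustrative assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans(points, centers, iters=20):
    """Plain k-means (Lloyd) iterations starting from the given centers."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def pso_kmeans(points, k, n_particles=10, pso_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """PSO search over center sets, followed by k-means refinement.

    All default parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    size = k * dim

    def unflatten(pos):
        # A particle is a flat list of k * dim coordinates.
        return [pos[i * dim:(i + 1) * dim] for i in range(k)]

    # Initialise each particle from k randomly sampled data points.
    pos = [[x for p in rng.sample(points, k) for x in p]
           for _ in range(n_particles)]
    vel = [[0.0] * size for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [sse(points, unflatten(p)) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(pso_iters):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cost = sse(points, unflatten(pos[i]))
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost

    # Refine the swarm's best center set with ordinary k-means.
    return kmeans(points, unflatten(gbest))
```

Calling `pso_kmeans(points, k=3)` on a list of equal-length numeric tuples returns three centers, and `sse(points, centers)` can then be compared with the SSE of `kmeans` started from an arbitrary set of initial centers. Because the swarm evaluates many candidate center sets before the refinement step, the final result depends less on any single initialization, which is the weakness of traditional k-means that the abstract describes.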
Keywords/Search Tags: Microblog, data mining, user interest, k-means algorithm, PSO-kmeans algorithm