A Clustering Study of the Blogosphere Based on Data Mining

Posted on: 2012-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: W T Ji
Full Text: PDF
GTID: 2218330368476543
Subject: Computer software and theory
Abstract/Summary:
With the emergence of Web 2.0, many systems based on Web 2.0 technology have appeared, and the blog is one of the most typical of these applications. In recent years the blog, as a new phenomenon, has helped drive the rapid development of the Internet industry, and media reports indicate that the number of blogs on the Internet is growing rapidly. The modern blog evolved from the personal diary. People use blogs to record their daily lives, and use blog posts to express their experiences, feelings, and views on the topics that interest them. The blogosphere is the collective term for all blogs together with the communities and social networks they form. As a huge store of information, the blogosphere is attracting close attention from business leaders, government officials, and researchers. Although the study of blogs has only just begun, methods that have proved effective in other fields can be borrowed to study the blogosphere. Data mining designed for the blogosphere can have a positive effect on many aspects of society. In particular, it can help government departments understand public opinion and people's hardships in a timely and accurate way, which plays an important role in preventing mass incidents and containing serious ones.

Because blog content is diverse and personalized, this thesis applies mining and cluster analysis to the blogosphere. The work covers analysis of how information spreads through blog content, analysis of blog trends, comparative analysis of blogs, and clustering by interest. Taking a very large blogosphere as the original database, the thesis sets up a database environment for blogosphere clustering, builds a large-scale blogosphere matrix, and determines the matrix elements by analyzing the weight of each blog index in the matrix.
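The matrix-building step described above can be illustrated with a minimal sketch. The abstract does not say which weighting scheme the thesis uses, so TF-IDF is an assumption here, as are the function name `tfidf_matrix`, the whitespace tokenizer, and the three toy posts.

```python
import math
from collections import Counter


def tfidf_matrix(posts):
    """Return (vocabulary, matrix): one row per post, one column per term.

    TF-IDF is an assumed weighting; the thesis may weight its indexes differently.
    """
    tokenized = [post.lower().split() for post in posts]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_posts = len(posts)
    # Document frequency: number of posts that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        row = []
        for term in vocab:
            tf = counts[term] / len(tokens) if tokens else 0.0
            idf = math.log(n_posts / doc_freq[term])
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix


# Hypothetical toy posts, not data from the thesis.
posts = [
    "travel photos from the mountain trip",
    "mountain travel diary and photos",
    "stock market news and market analysis",
]
vocab, weights = tfidf_matrix(posts)
```

Each row of `weights` is the weighted vector for one post; terms that occur in few posts receive larger weights than terms shared by many.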
Blog authors are often non-professionals, so many blog posts are not well standardized: the language is incomplete and may contain errors such as misnomers, synonyms, and typos from careless typing, all of which are noise in the data. The thesis uses singular value decomposition (SVD) to remove noise from the text vector matrix. Matrix decomposition with SVD and non-negative matrix factorization (NMF) reduces the dimension of the matrix and denoises it, and the k-means clustering algorithm is then used to cluster the resulting data and discover useful knowledge and information.

The results of this research therefore have considerable research value and practical significance for keeping track of and controlling developments on the network, providing personalized Web services, and improving the user experience. The study opens up a new research area, provides effective blogosphere data mining algorithms for social services, and is significant for promoting blog-related research and application development.
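The pipeline of reducing the matrix with SVD and then clustering with k-means can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis's implementation: the top singular vectors are approximated by power iteration in place of a library SVD routine, NMF is not shown, the farthest-point initialization of k-means is a choice made for determinism, and the small count matrix and all function names are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def top_right_singular_vectors(a, k, iters=200, seed=0):
    """Approximate the top-k right singular vectors of `a` by power
    iteration on A^T A, removing components already found (deflation)."""
    rng = random.Random(seed)
    at = [list(col) for col in zip(*a)]
    found = []
    for _ in range(k):
        v = [rng.random() for _ in at]
        for _ in range(iters):
            w = matvec(at, matvec(a, v))
            for u in found:
                proj = dot(w, u)
                w = [x - proj * y for x, y in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        found.append(v)
    return found


def project(a, vectors):
    """Coordinates of each row of `a` in the reduced (denoised) space."""
    return [[dot(row, v) for v in vectors] for row in a]


def kmeans(points, k, iters=50):
    """Lloyd's k-means with farthest-point initialization (an assumption)."""
    def dist2(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical post-by-term counts. Columns: travel, photos, mountain,
# market, stock, news. Rows 3 and 5 each contain one off-topic term (noise).
posts_by_terms = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 1],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 1, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
]
vectors = top_right_singular_vectors(posts_by_terms, k=2)
reduced = project(posts_by_terms, vectors)
labels = kmeans(reduced, k=2)
```

Keeping only the leading singular directions discards the low-variance components where isolated typos and stray terms tend to sit, so k-means works on a two-dimensional representation of each post instead of the full term vector.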
Keywords: blogosphere, data mining, k-means clustering, singular value decomposition