Microblog Data Mining Based On Complex Network

Posted on: 2015-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: Q N Liu
GTID: 2298330467963736
Subject: Signal and Information Processing
Abstract/Summary:
In recent years, data mining technology has attracted growing attention. Data mining usually refers to the process of using technical means to find hidden, specifically related, or potentially valuable information in vast amounts of data. It is already used in customer analysis, risk control, financial investment, economic forecasting and monitoring, and several other areas, where it has created substantial economic and social value. With the rapid development of microblogging in recent years, a great deal of significant information has become hidden in these platforms. Given the variety and complexity of information on microblogs, finding the useful data efficiently is a real problem, so research on data mining for microblogs is particularly urgent.
Drawing on the theory of complex networks and the specific characteristics of microblog information, this thesis analyzes the complex network characteristics of Sina microblog and, based on those characteristics, proposes a community structure discovery algorithm suited to microblog users. It also studies the modeling of microblog users' interest graphs. Previous work on modeling microblog users' interests relied mainly on the content users post. However, most users post only content that shows them to advantage, which hardly reflects their real interests. This thesis therefore introduces the concept of edge information and proposes a new user interest modeling algorithm. The algorithm first extracts a user's interest features with the LDA algorithm, then adjusts those features according to the complex network characteristics among users and the principle of feature propagation, and finally obtains the probability distribution of each user's interest over different areas, which forms the user's interest model.
The main innovation and contribution of this thesis is that the community detection algorithm and the interest modeling algorithm are better suited to microblog data analysis. Compared with traditional interest modeling methods, the new method performs better in user interest analysis and addresses the problem of oversimplified user interest models.
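The abstract does not give the update rule used for feature propagation, so the following is only a minimal sketch of the general idea under stated assumptions: each user starts with a topic distribution (such as LDA would produce), and that distribution is repeatedly blended with the average distribution of the users they follow. The function name `propagate_interests`, the blending weight `alpha`, the iteration count, and the toy data are all hypothetical and are not taken from the thesis.

```python
def propagate_interests(topics, follows, alpha=0.5, iterations=10):
    """Blend each user's topic distribution with those of followed users.

    topics:  dict mapping user -> list of topic probabilities (sums to 1),
             assumed here to come from a topic model such as LDA.
    follows: dict mapping user -> list of users that user follows.
    alpha:   assumed weight kept on the user's own content-based distribution.
    """
    num_topics = len(next(iter(topics.values())))
    current = {user: list(dist) for user, dist in topics.items()}

    for _ in range(iterations):
        updated = {}
        for user, own in topics.items():
            # Only followed users with a known distribution contribute.
            neighbours = [v for v in follows.get(user, []) if v in current]
            if not neighbours:
                updated[user] = list(current[user])
                continue
            # Average of the followed users' current distributions.
            mean = [
                sum(current[v][i] for v in neighbours) / len(neighbours)
                for i in range(num_topics)
            ]
            # Convex combination of own content signal and network signal.
            mixed = [
                alpha * own[i] + (1 - alpha) * mean[i]
                for i in range(num_topics)
            ]
            # Renormalize to guard against floating-point drift.
            total = sum(mixed)
            updated[user] = [x / total for x in mixed]
        current = updated
    return current


# Toy example with two topics: user "b" posts neutral content but follows "a".
toy_topics = {"a": [0.9, 0.1], "b": [0.5, 0.5]}
toy_follows = {"b": ["a"]}
toy_model = propagate_interests(toy_topics, toy_follows)
```

In this toy case, the distribution for "b" shifts toward the first topic because "b" follows "a", while "a", who follows nobody, keeps the content-based distribution. That is the kind of correction the abstract attributes to edge information.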
Keywords/Search Tags: Complex Networks, Data Mining, Clustering, Interest Modeling, LDA, Feature Propagation