In recent years, microblogging social networks have grown rapidly in China and have become the main venue where Internet users express public opinion. On a microblog, users can follow the topics they are interested in and express their opinions, and the resulting data can be used for research on collective social behavior. At the same time, more and more users are keen to express their attitudes and emotions on microblog platforms, such as support for or opposition to a prominent figure, or their positions and views on a hot topic. Research on microblog topic classification and identification, and on the sentiment orientation of microblogs, is therefore of considerable significance. This paper proposes a sentiment analysis model for different topics on a microblog website and uses it to study microblogs collected over a period of time. The main contributions are as follows.

(1) This paper proposes a model for microblog topic classification and recognition. First, it extracts features from the microblogs. Next, it clusters the preprocessed microblogs with the K-means algorithm so that microblogs sharing the same theme fall into one class. It then identifies the topic of each class with the LDA topic model, which completes the topic classification and recognition. Two experiments are presented: one evaluating the effectiveness of the clustering, and one evaluating microblog topic classification and recognition. The results demonstrate the effectiveness and stability of the topic classification and identification model.

(2) This paper establishes a sentiment dictionary. First, it builds a basic sentiment dictionary of commendatory and derogatory terms from existing sentiment word resources, and expands it by judging the sentiment polarity of unknown sentiment words through word similarity calculation. It then computes the sentiment value of new words discovered in microblogs, using dependency parsing and word frequency counts, to judge their sentiment orientation. An experiment on judging the sentiment orientation of unknown sentiment words is presented, based on the COAE2008 microblog data set. The results show that the model for judging the sentiment orientation of unknown sentiment words is both accurate and stable.

(3) This paper constructs a model for microblog sentiment analysis. First, it computes the sentiment value of each microblog using the established sentiment dictionary and syntactic parsing, and divides the microblogs into positive, negative, and neutral ones. It then selects a certain proportion of the positive and negative microblogs as a training set and applies a support vector machine to classify the microblogs outside the training set, which yields the sentiment orientation of every microblog. To verify the effectiveness of the model, this paper presents an experiment on the COAE2013 microblog data set and compares the results with the best results of COAE2013. The comparison shows that the proposed model for Chinese microblog sentiment analysis is more effective.

(4) This paper presents a sentiment analysis experiment based on topic division. It analyzes the sentiment orientation of microblogs within each topic and obtains the sentiment orientation of all users in each topic over one week. Finally, it calculates the ratios of positive, negative, and neutral microblogs and derives the sentiment orientation of active users in each hot topic.
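
As a rough illustration of the dictionary-based labeling in (3) and the per-topic ratio calculation in (4), the following minimal Python sketch scores tokenized posts against a toy lexicon and aggregates the labels by topic. Everything in it is an assumption made for illustration: the English word lists, the simple rule that a negator flips only the next word, and the function names are hypothetical and are not taken from the paper, which builds its dictionary from Chinese sentiment resources and also relies on syntactic parsing and an SVM stage that are omitted here.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): stands in for the paper's
# sentiment dictionary of commendatory and derogatory terms.
POSITIVE = {"good", "great", "support"}
NEGATIVE = {"bad", "terrible", "oppose"}
NEGATORS = {"not", "never"}

LABELS = ("positive", "negative", "neutral")


def sentiment_score(tokens):
    """Sum word polarities; a negator flips only the word right after it."""
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        polarity = int(tok in POSITIVE) - int(tok in NEGATIVE)
        score += -polarity if negate else polarity
        negate = False  # the negation applies to one following word only
    return score


def label(tokens):
    """Map a score to positive / negative / neutral."""
    score = sentiment_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def topic_ratios(posts):
    """posts: iterable of (topic, tokens). Returns {topic: {label: ratio}}."""
    counts = {}
    for topic, tokens in posts:
        counts.setdefault(topic, Counter())[label(tokens)] += 1
    return {
        topic: {lab: c[lab] / sum(c.values()) for lab in LABELS}
        for topic, c in counts.items()
    }


if __name__ == "__main__":
    sample = [
        ("film", ["great", "movie"]),
        ("film", ["not", "good"]),
        ("film", ["a", "movie"]),
        ("policy", ["i", "oppose", "it"]),
    ]
    for topic, ratios in topic_ratios(sample).items():
        print(topic, ratios)
```

In a fuller pipeline, the topic key would come from the clustering and topic identification step in (1), and only the confidently labeled positive and negative posts would be passed on as training data for a classifier.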