
Research On Sentiment Analysis And Opinion Mining Of Weibo

Posted on: 2016-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ju
Full Text: PDF
GTID: 2308330503976328
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the arrival of the mobile internet era, people have become more willing to share their opinions and life experiences online. As a result, subjective information is accumulating explosively on social networks such as Weibo. In this context, sentiment analysis of Weibo has both social and commercial value: it helps governments understand public opinion, and it helps merchants track consumers' attitudes.

This thesis aims to develop a set of methods for sentiment analysis and opinion mining of Weibo. The work comprises four parts: unknown word recognition, Weibo corpus pre-processing, sentiment lexicon expansion, and algorithms for sentiment analysis. Through this work, we significantly improved the recognition rate of the word segmentation system, built a sentiment lexicon for Weibo, and constructed a mining method for sentiment polarity analysis of Weibo. Finally, we built a prototype system for sentiment analysis and opinion mining of hot topics on Weibo, which yields the sentiment polarity of a topic as well as the actual opinions of Weibo users.

We first give a brief introduction to current research on sentiment analysis and opinion mining and to its social and commercial value. After analyzing the linguistic characteristics of Weibo content, we evaluate the effectiveness of data pre-processing methods and sentiment analysis algorithms for Weibo.

For data pre-processing, we propose an unknown word recognition algorithm based on co-occurrence frequency, which effectively recognizes unknown words in Weibo and improves the performance of the word segmentation system. Sentiment lexicon expansion starts with pre-processing the Weibo corpus using the word segmentation system. The raw data is then cleaned so that it is well structured, and finally the base sentiment lexicons are expanded using the processed data. In this process, we compare the advantages and disadvantages of two sentiment lexicon expansion algorithms, one based on the Weibo corpus and one based on SO-PMI.

For the sentiment analysis algorithms, we first compare several methods for feature extraction, which consists of feature selection and weight calculation. We find that the chi-square statistic is best suited to feature selection and Boolean weighting is best suited to weight calculation. We then discuss self-training algorithms. We also investigate two unsupervised algorithms for sentiment analysis, namely a lexicon-based sentiment analysis algorithm and a spectral clustering dichotomous algorithm, and evaluate their advantages and weaknesses.

As an application, we built a system for sentiment analysis and opinion mining of hot topics on Weibo, based on our analysis and revision of several core techniques in sentiment analysis. The system performs data acquisition, data pre-processing, sentiment polarity judgment, and opinion mining for hot topics on Weibo. It applies the LDA topic model to extract keywords from topics, and it mines the actual opinions of Weibo users by analyzing these keywords and their sentiment polarity, so as to help its users make sound decisions. The system has academic significance and practical value.
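To make the SO-PMI-based lexicon expansion mentioned in the abstract more concrete, the following is a minimal sketch, not the thesis's implementation. It assumes the standard definition of SO-PMI (the sum of a word's pointwise mutual information with positive seed words minus the sum with negative seed words), estimates probabilities from document-level co-occurrence in an already segmented corpus, and adds a small smoothing constant to avoid taking the logarithm of zero. The function name `so_pmi`, the `smoothing` parameter, the seed words, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.01):
    """Score every non-seed word by SO-PMI from document-level co-occurrence.

    docs: list of token lists (assumed to be already word-segmented posts).
    pos_seeds / neg_seeds: hypothetical seed words with known polarity.
    smoothing: assumed constant added to joint counts to avoid log(0).
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    def pmi(a, b):
        # PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ), with counts as estimates
        joint = pair_df[frozenset((a, b))] + smoothing
        return math.log2(joint * n_docs / (word_df[a] * word_df[b]))

    # Only seeds that actually occur in the corpus can contribute
    pos = [s for s in pos_seeds if word_df[s] > 0]
    neg = [s for s in neg_seeds if word_df[s] > 0]
    seeds = set(pos_seeds) | set(neg_seeds)

    scores = {}
    for word in word_df:
        if word in seeds:
            continue
        scores[word] = (sum(pmi(word, s) for s in pos)
                        - sum(pmi(word, s) for s in neg))
    return scores


# Toy corpus (illustrative data only)
docs = [
    ["good", "great", "movie"],
    ["good", "great"],
    ["bad", "awful", "movie"],
    ["bad", "awful"],
]
scores = so_pmi(docs, pos_seeds=["good"], neg_seeds=["bad"])
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{score:+.3f}")
```

In this sketch, words with a clearly positive score would be candidates for the positive lexicon and words with a clearly negative score for the negative lexicon. The cut-off for accepting a candidate is a design choice that the abstract does not specify.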
Keywords/Search Tags: sentiment analysis and opinion mining, Weibo, sentiment lexicons, unknown words recognition, SO-PMI, spectral clustering dichotomous algorithm