The Research Of Target-dependent Sentiment Analysis Of Chinese Micro-blog

Posted on: 2016-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Wei
Full Text: PDF
GTID: 2428330473464962
Subject: Computer Science and Technology
Abstract/Summary:
The emergence of Web 2.0 has changed the traditional way information is disseminated and shared, leading to an explosion of user-generated data on the Internet. Micro-blogging is one of the most popular forms of social networking in the Web 2.0 era. Because a micro-blog post is convenient to write and can be published in real time, micro-blog platforms have attracted a large number of users since their launch. Today, micro-blogs have become an important way for people to record their lives, share their opinions, and discuss popular events. A large number of posts are published and disseminated every day, and they contain a great deal of useful subjective information. Sentiment analysis of micro-blog data is valuable in many fields, such as business surveys, public opinion monitoring, and sociological research, which has made it a hot research topic in natural language processing. Micro-blog posts are limited in length, non-standard in writing, casual in style, and divergent in topic, so sentiment analysis of micro-blog text is much more difficult and challenging than that of ordinary text. This thesis focuses on target-dependent sentiment analysis of Chinese micro-blogs. Its innovations and contributions are as follows.

Firstly, an opinion lexicon is an important tool for extracting sentiment information from text, but micro-blog posts contain many online emotional words that basic opinion lexicons cannot detect. To solve this problem, this thesis proposes a method based on dependency relations for detecting new emotional words. Based on the sentence structures in which emotional words occur, we build dependency-relation templates to detect new emotional words, and then use a method based on Pointwise Mutual Information (PMI) to judge their semantic orientation. In the experiments, we use different opinion lexicons to extract sentiment features for micro-blog sentiment analysis. The results show that, compared with the basic opinion lexicons, our new opinion lexicon performs much better: the average classification accuracy is 6% to 12% higher, which demonstrates the validity of our opinion lexicon.

Secondly, traditional sentiment analysis methods generally ignore structured semantic information, which lowers their accuracy. They also tend to ignore the target of a sentiment expression and adopt a target-independent strategy to analyze sentiment, which leads to errors. This thesis uses the syntax tree of a sentence as the structured feature and uses the convolution kernel of a support vector machine to obtain structured information from syntax trees. We then propose a target-dependent syntax tree pruning strategy based on a domain ontology and a library of syntactic paths of appraisal expressions. This strategy eliminates the interference of irrelevant appraisal expressions and analyzes the sentiment toward a specific target. Finally, we use a compound kernel to extract information from both structured and flat features. Experimental results on two corpora with different targets show that the average classification accuracy reaches 86.6% and 86.1%, which is better than that of the traditional methods.
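
The PMI-based orientation step mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes a common seed-word formulation: a candidate word's orientation is its summed PMI with positive seed words minus its summed PMI with negative seed words. The post-level co-occurrence window, the additive smoothing constant `eps`, the seed lists, the toy corpus, and all function names here are illustrative assumptions, not details taken from the thesis.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(posts):
    """Count in how many posts each word and each unordered word pair occurs."""
    word_df = Counter()
    pair_df = Counter()
    for tokens in posts:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        # Pairs are stored in sorted order so lookups are order-independent.
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df


def pmi(w1, w2, word_df, pair_df, n_posts, eps=0.5):
    """Smoothed PMI; eps (an assumed constant) avoids log(0) for unseen pairs."""
    pair = tuple(sorted((w1, w2)))
    p_xy = (pair_df[pair] + eps) / (n_posts + eps)
    p_x = (word_df[w1] + eps) / (n_posts + eps)
    p_y = (word_df[w2] + eps) / (n_posts + eps)
    return math.log2(p_xy / (p_x * p_y))


def so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n_posts):
    """Orientation score: positive seeds' PMI minus negative seeds' PMI."""
    pos = sum(pmi(word, s, word_df, pair_df, n_posts) for s in pos_seeds if s != word)
    neg = sum(pmi(word, s, word_df, pair_df, n_posts) for s in neg_seeds if s != word)
    return pos - neg


# Toy, hand-made tokenized posts (hypothetical data for illustration only).
posts = [
    ["给力", "好", "开心"],
    ["给力", "好"],
    ["坑爹", "差", "难过"],
    ["坑爹", "差"],
    ["好", "开心"],
    ["差", "难过"],
]
pos_seeds = ["好"]  # assumed positive seed word
neg_seeds = ["差"]  # assumed negative seed word

word_df, pair_df = build_counts(posts)
score_geili = so_pmi("给力", pos_seeds, neg_seeds, word_df, pair_df, len(posts))
score_kengdie = so_pmi("坑爹", pos_seeds, neg_seeds, word_df, pair_df, len(posts))
print(score_geili > 0, score_kengdie < 0)
```

In this sketch a score above zero marks the candidate as positive and a score below zero as negative. A real system would need a decision threshold, larger seed sets, and a much larger corpus than this toy example.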
Keywords/Search Tags: Chinese micro-blog, Sentiment analysis, Emotional words detection, Dependency analysis, Tree kernel, Compound kernel, Pruning strategy, Support vector machine