Study On Chinese Semantic Orientation Analysis Based On Hownet

Posted on: 2009-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Zhou
GTID: 2198360308478560
Subject: Computer application technology
Abstract/Summary:
The highest level of artificial intelligence is for computers to understand human feelings. Predicting the semantic orientation of documents is a fundamental but technically challenging task of great value. In the Web 2.0 era in particular, the large volume of reviews on the Web contains much potentially useful information. Finding this information automatically, intelligently, and in a timely manner calls for semantic orientation analysis.

The greatest value of semantic orientation analysis lies in generating summaries from many reviews on the same topic, which raises the question of how to download large numbers of reviews scattered across the Web. Parallel crawlers are an available and well-studied strategy, but existing research focuses on the page-gathering component of general search engines, which targets web sites worldwide. Reviews on a given topic, by contrast, are concentrated on a few sites, and their content is highly structured. This thesis therefore designs a parallel crawling model with dynamic task assignment for gathering Web reviews and implements it with a relational database on the www.douban.com web site. The experimental results show the model's advantages: good scalability, simpler crawler design, and low hardware requirements for the machines that run the crawlers. The model can also serve as the page-gathering component of a vertical search engine.

A sentiment repository is the basis of semantic orientation analysis, yet no sentiment dictionary exists for Chinese. Although HowNet contains many sentiment words, it is hard to apply to sentiment analysis directly. This thesis therefore studies the construction of a sentiment dictionary based on HowNet.

For document-level orientation analysis, this thesis holds that linguistic knowledge and rules are useful to statistical learning algorithms. It therefore proposes an attribute-weighted statistical learning method for semantic orientation analysis, which increases the contribution of sentiment words to document orientation classification. Weighted Naive Bayes and a weighted score algorithm are implemented, and the experimental results show that the attribute-weighted statistical learning method effectively improves the accuracy of document orientation classification.
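
The attribute-weighting idea described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give the weighting formula, so the sketch assumes one common form of attribute-weighted Naive Bayes, in which the log-likelihood of each token found in a sentiment lexicon is multiplied by a fixed factor. The function names, the `weight` parameter, and the toy English data are all hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model with Laplace smoothing.

    docs is a list of token lists; labels is the matching list of classes.
    Returns log-priors and per-class log-likelihoods.
    """
    vocab = {tok for doc in docs for tok in doc}
    priors, likelihoods = {}, {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        priors[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(tok for d in class_docs for tok in d)
        total = sum(counts.values()) + len(vocab)  # add-one smoothing
        likelihoods[c] = {
            tok: math.log((counts[tok] + 1) / total) for tok in vocab
        }
    return priors, likelihoods


def classify(doc, priors, likelihoods, lexicon, weight=2.0):
    """Score each class; tokens in the sentiment lexicon count `weight` times.

    Assumption: the weight is a single constant applied to every lexicon
    word. Tokens unseen in training are ignored.
    """
    scores = {}
    for c in priors:
        score = priors[c]
        for tok in doc:
            if tok in likelihoods[c]:
                w = weight if tok in lexicon else 1.0
                score += w * likelihoods[c][tok]
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data standing in for segmented reviews and a sentiment dictionary.
train_docs = [
    ["good", "movie"],
    ["great", "plot"],
    ["bad", "movie"],
    ["boring", "plot"],
]
train_labels = ["pos", "pos", "neg", "neg"]
sentiment_lexicon = {"good", "great", "bad", "boring"}

priors, likelihoods = train_nb(train_docs, train_labels)
print(classify(["good", "plot"], priors, likelihoods, sentiment_lexicon))
```

With `weight=1.0` the classifier reduces to ordinary multinomial Naive Bayes, so the effect of the weighting can be measured by comparing accuracy at both settings on held-out reviews.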
Keywords/Search Tags: orientation analysis, sentiment analysis, parallel crawlers, weighted Naive Bayes, sentiment dictionary, HowNet