
Research on Sentiment Target Extraction for Chinese Microblogging Based on CRFs

Posted on: 2015-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Z Du
GTID: 2268330428497338
Subject: Computer application technology
Abstract/Summary:
In recent years, with the rapid development of social networks, more and more people use microblogging to exchange and share information. Because microblogs are concise, easy to use, and spread rapidly, they have become popular and have penetrated every aspect of people's lives. Users readily share their views and experiences on microblogs, which produces a large number of reviews carrying sentiment polarity. As this data grows, collecting and processing the massive volume of comments manually becomes impractical. How to use computer technology to process and mine microblog reviews effectively has therefore become a research hotspot, and sentiment target recognition is an effective way to address this problem.

This thesis studies sentiment target recognition for Chinese microblog text. Extracting sentiment targets from unstructured text such as microblogs is difficult, and existing studies have some inadequacies. On the one hand, microblog text differs from traditional text in that it is freely expressed, brief, and often written in non-standard Chinese, which makes recognition harder; existing methods usually cannot avoid the parsing errors that this kind of text causes. We present an approach that normalizes microblog text to improve word segmentation and syntactic parsing. On the other hand, the target extraction model takes contextual information, syntactic rules, and an opinion lexicon into account. When the sentiment target appears explicitly in the text, we combine a conditional random field model with a classification model. When the sentiment target does not appear in the text, we propose an improved model based on conditional random fields that abstracts the implied targets and adds global hidden nodes to the CRF in order to identify them.

The core idea of this research is to treat sentiment target recognition in Chinese microblog text as a sequence labeling problem: we use conditional random fields (CRFs) to label the text at the sentence level and combine a variety of features to improve the model's accuracy. In the experiments, we verify and evaluate the model on two datasets: the NLP&CC 2012 dataset and a self-built dataset. Results on both datasets show that the method outperforms state-of-the-art methods: it better identifies explicit sentiment targets in microblogs and can also identify implicit sentiment targets.
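
The sequence labeling formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the label names (`B-TARGET`, `I-TARGET`, `O`), the feature names, the example sentence, and the tiny opinion lexicon are all assumptions chosen for illustration, and the thesis's actual feature set is not given here. The sketch only shows how a segmented sentence could be turned into per-token labels and feature dictionaries of the kind a CRF toolkit typically consumes.

```python
from typing import Dict, List, Set, Tuple


def bio_labels(tokens: List[str], target_spans: List[Tuple[int, int]]) -> List[str]:
    """Assign B-TARGET / I-TARGET / O labels from [start, end) token spans."""
    labels = ["O"] * len(tokens)
    for start, end in target_spans:
        labels[start] = "B-TARGET"          # first token of a sentiment target
        for i in range(start + 1, end):
            labels[i] = "I-TARGET"          # remaining tokens of the same target
    return labels


def token_features(
    tokens: List[str],
    pos_tags: List[str],
    i: int,
    opinion_lexicon: Set[str],
) -> Dict[str, object]:
    """Hypothetical per-token features: word, POS, context, lexicon membership."""
    last = len(tokens) - 1
    return {
        "word": tokens[i],
        "pos": pos_tags[i],
        "in_opinion_lexicon": tokens[i] in opinion_lexicon,
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < last else "<EOS>",
        "next_is_opinion": i < last and tokens[i + 1] in opinion_lexicon,
    }


# Illustrative, already-segmented sentence: "this phone screen very clear".
tokens = ["这", "手机", "屏幕", "很", "清晰"]
pos_tags = ["r", "n", "n", "d", "a"]        # assumed POS tags
opinion_lexicon = {"清晰"}                   # assumed one-word opinion lexicon

# The target "手机 屏幕" covers tokens 1 and 2.
labels = bio_labels(tokens, [(1, 3)])
features = [
    token_features(tokens, pos_tags, i, opinion_lexicon)
    for i in range(len(tokens))
]
# labels == ["O", "B-TARGET", "I-TARGET", "O", "O"]
```

Pairs of `features` and `labels` like these would serve as one training sequence for a linear-chain CRF. Implicit targets, which the abstract handles with global hidden nodes, are outside the scope of this sketch.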
Keywords/Search Tags: Microblogging, CRFs, Sentiment Analysis, Sentiment Target, Entity Recognition