Research On Identifying Opinion Holder In Opinionated Sentences

Posted on: 2012-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Kuang
Full Text: PDF
GTID: 2178330335460778
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Sentiment analysis aims to automatically extract useful sentiment knowledge and related information from subjective texts. With the development of the Internet and the information industry, users can post reviews on forums, blogs, and other platforms, and the topics they discuss are nearly all-inclusive. Within sentiment analysis, it is important to identify the opinion holder, that is, the author or source of an expressed view, in order to understand clearly what people think about social or public issues, to devise better measures, and to handle the reviews properly. Identifying opinion holders with natural language processing technology is therefore of great value.

In this thesis, opinion holders in texts from different domains are identified separately with a statistical method and with a rule-based method, and the statistical and rule-based results are then combined to obtain the final identification result. The main results are as follows.

Firstly, by analyzing the definition of an opinion holder, six relevant features are proposed and extracted: lexical features, opinionated trigger words, POS tags, named entities, dependency relations, and sentence structure. Feature observation windows are designed to capture the contextual information of these features as precisely as possible.

Secondly, by analyzing the hierarchical structure of a large number of parse trees, we propose two novel syntactic rules based on opinionated trigger words, together with an opinion holder extraction algorithm that applies the two rules to identify opinion holders directly from the parse trees.

Finally, a method combining CRF with the syntactic rules is proposed to identify opinion holders, in which the syntactic rules are treated as three additional CRF features obtained through a feature extraction algorithm we designed. The combined method achieves high precision and recall, indicating satisfactory results.

However, anaphora resolution is not used in this study. In future work, anaphora resolution combined with semantic disambiguation will be used to further improve the accuracy of opinion holder identification.
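The feature observation windows mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the trigger lexicon `TRIGGER_WORDS`, the function names, the window size, and the feature key names are all assumptions made for illustration, and only three of the six feature types (lexical, trigger word, POS tag) are shown.

```python
# Hypothetical trigger lexicon; the thesis does not list its actual trigger words.
TRIGGER_WORDS = {"said", "believes", "claimed", "thinks"}


def token_features(tokens, pos_tags, i, window=2):
    """Build a feature dict for token i from a symmetric observation window."""
    feats = {"bias": 1.0}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            word = tokens[j].lower()
            feats[f"word[{offset}]"] = word                      # lexical feature
            feats[f"pos[{offset}]"] = pos_tags[j]                # POS tag feature
            feats[f"trigger[{offset}]"] = word in TRIGGER_WORDS  # trigger-word flag
        else:
            # The window extends past the sentence boundary at this offset.
            feats[f"pad[{offset}]"] = True
    return feats


def sentence_features(tokens, pos_tags, window=2):
    """Return one feature dict per token, suitable as input to a CRF toolkit."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    return [token_features(tokens, pos_tags, i, window) for i in range(len(tokens))]


# Illustrative sentence with hand-assigned POS tags (assumed, not from the thesis).
tokens = ["The", "minister", "said", "prices", "will", "fall"]
pos_tags = ["DT", "NN", "VBD", "NNS", "MD", "VB"]
features = sentence_features(tokens, pos_tags, window=1)
```

With a window of 1, the feature dict for "minister" records that the next token is a trigger word, which is the kind of contextual cue a sequence labeler could use when tagging opinion holders. Per-token feature dictionaries of this shape are a common input format for CRF toolkits.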
Keywords/Search Tags: natural language processing, opinion holder identification, conditional random fields, syntactic rule, feature