
Research on Integrating Domain Knowledge to Extract Opinion Targets from Chinese Sentences

Posted on: 2014-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Lei
GTID: 2308330461972633
Subject: Computer application technology
Abstract/Summary:
Extracting opinion targets (sentiment objects) from Chinese sentences is an important research direction in Chinese opinion mining, and also a difficult task. It has drawn considerable attention from scholars in China and abroad, and some progress has been made. The task remains hard for three reasons: opinion targets are domain-related; compound targets and out-of-vocabulary (unregistered) words are difficult to extract; and some opinion targets involve long-distance dependencies. Extraction accuracy suffers considerably if these characteristics are not handled effectively. This paper focuses on these characteristics with the aim of extracting opinion targets more effectively. The research is as follows:

(1) We propose a new method that integrates domain knowledge to handle domain-related opinion targets. We first build a domain dictionary. Then, in linear-chain, skip-chain and cascaded Conditional Random Fields (CRFs) models that already use the features token, part of speech, grammatical dependency and nearest noun, we add a domain dictionary feature to help extract domain-related targets. After the opinion targets are extracted, we apply domain rules to refine the result. Experimental results in the Digital, Entertainment and Finance domains show that the method helps extract domain-related opinion targets and improves precision.

(2) We use domain knowledge to improve the middle-level model of Cascaded CRFs (CCRFs), and introduce an improved Cascaded CRFs model to handle long-distance dependencies. The model combines the merits of skip-chain CRFs and Cascaded CRFs, and is suited to compound targets, out-of-vocabulary words and long-distance dependencies. First, the linear-chain model identifies candidate opinion targets. Then the improved middle-level model filters noise and supplements missing candidates. Next, the candidates are passed to the skip-chain model, which outputs the opinion targets. Finally, domain rules are used to optimize the set of opinion targets. Experimental results on the COAE 2011 corpus in the Digital, Entertainment and Finance domains are promising, and show that the model is useful for handling long-distance dependencies across different domains.

(3) Finally, we combine the work of (1) and (2) and design a Chinese opinion target extraction system that integrates domain knowledge. The system first preprocesses reviews crawled from the web and analyzes their grammatical information, and builds the domain dictionary and rules. It then uses the improved Cascaded CRFs to extract opinion targets. Finally, it ranks the opinion targets and generates a summary of each domain's reviews, which shows the hot topics of online reviews.
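The per-token feature set named in (1) (token, part of speech, grammatical dependency, nearest noun, plus the added domain dictionary feature) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the toy dictionary, the dependency labels, and the assumption that noun tags start with "n" are all hypothetical, and the output is simply a list of feature dicts of the kind a CRF toolkit could consume.

```python
from typing import Dict, List, Optional, Set


def nearest_noun(tokens: List[str], pos_tags: List[str], i: int) -> Optional[str]:
    """Return the noun closest to position i (excluding i), checking left before right."""
    for dist in range(1, len(tokens)):
        for j in (i - dist, i + dist):
            # Assumption: noun tags start with "n", as in common Chinese tag sets.
            if 0 <= j < len(tokens) and pos_tags[j].startswith("n"):
                return tokens[j]
    return None


def token_features(
    tokens: List[str],
    pos_tags: List[str],
    dep_rels: List[str],
    i: int,
    domain_dict: Set[str],
) -> Dict[str, str]:
    """Build the feature dict for token i, including the domain dictionary feature."""
    noun = nearest_noun(tokens, pos_tags, i)
    return {
        "token": tokens[i],
        "pos": pos_tags[i],
        "dep": dep_rels[i],
        "nearest_noun": noun if noun is not None else "NONE",
        # Domain knowledge enters here: is the token listed in the domain dictionary?
        "in_domain_dict": str(tokens[i] in domain_dict),
    }


def sentence_features(
    tokens: List[str],
    pos_tags: List[str],
    dep_rels: List[str],
    domain_dict: Set[str],
) -> List[Dict[str, str]]:
    """Feature dicts for every token of one segmented, tagged, parsed sentence."""
    return [
        token_features(tokens, pos_tags, dep_rels, i, domain_dict)
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    # Hypothetical toy input for the Digital domain ("the screen is very clear").
    digital_dict = {"屏幕", "电池", "像素"}
    feats = sentence_features(
        ["屏幕", "很", "清晰"], ["n", "d", "a"], ["SBV", "ADV", "HED"], digital_dict
    )
    for f in feats:
        print(f)
```

Segmentation, tagging and dependency parsing are assumed to happen upstream; the sketch only shows where a dictionary lookup would sit alongside the grammatical features.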
Keywords/Search Tags:Extracting Opinion Targets, Conditional Random Fields, Domain Knowledge, Domain Related, Cascaded CRFs