
Research On Ellipsis Phenomenon In Chinese Opinion Target Extraction

Posted on: 2015-02-28  Degree: Master  Type: Thesis
Country: China  Candidate: M Dai  Full Text: PDF
GTID: 2268330428998565  Subject: Computer application technology
Abstract/Summary:
Recently, with the rapid development of Internet technology in China, people's lifestyles have been changed by e-business websites and social networks such as Taobao and Weibo. Chinese users publish a large number of comments about people, events and products on the Internet, and the volume of Chinese opinion text is expanding rapidly. This opinion information is highly valuable in practice, and it poses a new challenge for Chinese Opinion Information Extraction (OIE) technology.

Opinion information extraction is the task of extracting the important elements of sentiment expression. In-depth study of this task is highly valuable for both practical applications and theoretical research. In this paper, we focus on Chinese OIE, and in particular on Opinion Target Extraction (OTE). Our study covers the following three aspects:

First, this paper proposes an annotation framework for Chinese OIE to address the data sparseness problem in Chinese OIE, and builds a large Chinese corpus with abundant information. Specifically, besides the popular elements, including sentiment orientation, opinion target and opinion keyword, our corpus contains other important information, including opinion target ellipsis, opinion expression without sentiment words, and sentiment polarity shifting. The statistics show that opinion target ellipsis is common in Chinese text and that investigating it is necessary. We believe that our corpus can become a very useful data resource for Chinese OIE.

Second, this paper employs a method to identify opinion target ellipsis in Chinese text. We treat the identification of opinion target ellipsis as a binary classification problem. We propose three kinds of features to solve this problem: sentence position-independent features, sentence position-dependent features and contextual features. To further improve performance, a greedy algorithm is used to find the best feature set. The experimental results demonstrate that the machine learning-based method is effective for the task of target ellipsis recognition. The selected features yield about 80% accuracy across three different domains.

Third, this paper proposes a novel Chinese OTE approach that fully considers target ellipsis information. The main idea of our approach is to use meta-learning to fuse the CRFs-based model of OTE and the ME-based model of target ellipsis identification, so that ellipsis information is used to improve the performance of Chinese OTE. Experimental results demonstrate that our method can greatly improve target extraction performance when the training data is small in scale.
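The abstract states that a greedy algorithm is used to find the best feature set but does not describe the procedure. The sketch below shows one common variant, greedy forward selection, purely as an illustration. The function name, the feature-group names, and the toy scoring table are all hypothetical, and the numbers are invented for the example rather than taken from the thesis. In practice the `score` callback would train and evaluate the ellipsis classifier on a held-out set.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def greedy_forward_selection(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Add one feature group at a time, keeping the one that improves the score most.

    Stops as soon as no remaining candidate improves on the current best score.
    """
    selected: List[str] = []
    remaining = list(candidates)
    best = score(frozenset())  # baseline score with no features selected

    while remaining:
        # Score every one-step extension of the current selection.
        scored = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate helps, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score

    return selected, best


# Hypothetical per-group accuracy gains (invented numbers, NOT from the thesis).
TOY_GAINS = {
    "position_independent": 0.10,
    "position_dependent": 0.05,
    "contextual": 0.08,
    "noise": -0.02,
}


def toy_score(features: FrozenSet[str]) -> float:
    """Stand-in for cross-validated classifier accuracy on a feature subset."""
    return 0.5 + sum(TOY_GAINS[f] for f in features)


selected, best = greedy_forward_selection(TOY_GAINS, toy_score)
print(selected, round(best, 2))
```

With this toy scorer, the three helpful groups are added in order of their gain and the harmful `noise` group is rejected. Greedy search evaluates only a quadratic number of subsets instead of all of them, at the cost of possibly missing a better combination.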
Keywords/Search Tags: Sentiment Analysis, Opinion Information Extraction, Chinese Corpus, Opinion Target Extraction, Opinion Target Ellipsis